CA2286306A1 - Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome - Google Patents


Publication number
CA2286306A1
Authority
CA
Canada
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002286306A
Other languages
French (fr)
Inventor
Harold Kleanthous
Amal Al-Garawi
Charles Miller
Jean-Francois Tomb
Raymond Peter Oomen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MERIEUX ORAVAX
Human Genome Sciences Inc
Original Assignee
Individual


Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K16/00 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
    • C07K16/12 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria
    • C07K16/1203 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria from Gram-negative bacteria
    • C07K16/121 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from bacteria from Gram-negative bacteria from Helicobacter (Campylobacter) (G)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P1/00 Drugs for disorders of the alimentary tract or the digestive system
    • A61P1/04 Drugs for disorders of the alimentary tract or the digestive system for ulcers, gastritis or reflux esophagitis, e.g. antacids, inhibitors of acid secretion, mucosal protectants
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/04 Antibacterial agents
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/205 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Campylobacter (G)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • General Chemical & Material Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Animal Behavior & Ethology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Engineering & Computer Science (AREA)
  • Oncology (AREA)
  • Communicable Diseases (AREA)
  • Immunology (AREA)
  • Peptides Or Proteins (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)

Abstract

The invention provides Helicobacter polypeptides that can be used in vaccination methods for preventing or treating Helicobacter infection, and polynucleotides that encode these polypeptides. The invention also provides diagnostic methods employing these polypeptides.

Description

CA 02286306 1999-10-01
VOLUMINOUS APPLICATIONS OR PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.
NOTE: For additional volumes, please contact the Canadian Patent Office.

IDENTIFICATION OF POLYNUCLEOTIDES ENCODING NOVEL
HELICOBACTER POLYPEPTIDES IN THE HELICOBACTER GENOME
The invention relates to Helicobacter antigens and corresponding polynucleotide molecules that can be used in methods to prevent or treat Helicobacter infection in mammals, such as humans.
Background of the Invention
Helicobacter is a genus of spiral, gram-negative bacteria that colonize the gastrointestinal tracts of mammals. Several species colonize the stomach, most notably H. pylori, H. heilmannii, H. felis, and H. mustelae. Although H. pylori is the species most commonly associated with human infection, H. heilmannii and H. felis have also been isolated from humans, but at lower frequencies than H. pylori. Helicobacter infects over 50% of adult populations in developed countries and nearly 100% in developing countries and some Pacific rim countries, making it one of the most prevalent infections worldwide.
Helicobacter is routinely recovered from gastric biopsies of humans with histological evidence of gastritis and peptic ulceration. Indeed, H. pylori is now recognized as an important pathogen of humans, in that the chronic gastritis it causes is a risk factor for the development of peptic ulcer diseases and gastric carcinoma. It is thus highly desirable to develop safe and effective vaccines for preventing and treating Helicobacter infection.
A number of Helicobacter antigens have been characterized or isolated. These include urease, which is composed of two structural subunits of approximately 30 and 67 kDa (Hu et al., Infect. Immun. 58:992, 1990; Dunn et al., J. Biol. Chem. 265:9464, 1990; Evans et al., Microbial Pathogenesis 10:15, 1991; Labigne et al., J. Bact. 173:1920, 1991); the 87 kDa vacuolar cytotoxin (VacA) (Cover et al., J. Biol. Chem. 267:10570, 1992; Phadnis et al., Infect. Immun. 62:1557, 1994; WO 93/18150); a 128 kDa immunodominant antigen associated with the cytotoxin (CagA, also called TagA; WO 93/18150; U.S. Patent No. 5,403,924); 13 and 58 kDa heat shock proteins HspA and HspB (Suerbaum et al., Mol. Microbiol. 14:959, 1994; WO 93/18150); a 54 kDa catalase (Hazell et al., J. Gen. Microbiol. 137:57, 1991); a 15 kDa histidine-rich protein (Hpn) (Gilbert et al., Infect. Immun. 63:2682, 1995); a 20 kDa membrane-associated lipoprotein (Kostrzynska et al., J. Bact. 176:5938, 1994); a 30 kDa outer membrane protein (Bolin et al., J. Clin. Microbiol. 33:381, 1995); a lactoferrin receptor (FR 2,724,936); and several porins, designated HopA, HopB, HopC, HopD, and HopE, which have molecular weights of 48-67 kDa (Exner et al., Infect. Immun. 63:1567, 1995; Doig et al., J. Bact. 177:5447, 1995). Some of these proteins have been proposed as potential vaccine antigens. In particular, urease is believed to be a vaccine candidate (WO 94/9823; WO 95/22987; WO 95/3824; Michetti et al., Gastroenterology 107:1002, 1994). Nevertheless, it is thought that several antigens may ultimately be necessary in a vaccine.
Summary of the Invention
The invention provides polynucleotide molecules that encode Helicobacter polypeptides, designated GHPO 35 (SEQ ID NO:2), GHPO 55 (SEQ ID NO:4), GHPO 78 (SEQ ID NO:6), GHPO 89 (SEQ ID NO:8), GHPO
129 (SEQ ID NO:10), GHPO 541 (SEQ ID N0:12), GHPO 607 (SEQ ID
NO:14), GHPO 635 (SEQ ID NO:16), GHPO 701 (SEQ ID NO:18), GHPO
712 (SEQ ID N0:20), GHPO 761 (SEQ ID N0:22), GHPO 838 (SEQ ID
N0:24), GHPO 1034 (SEQ ID N0:26), GHPO 1085 (SEQ ID N0:28), GHPO
1213 (SEQ ID N0:30), GHPO 1255 (SEQ ID N0:32), GHPO 1308 (SEQ ID
N0:34), GHPO 1389 (SEQ ID N0:36), GHPO 1706 (SEQ ID N0:38), GHPO
234 (SEQ ID N0:40), GHPO 314 (SEQ ID N0:42), GHPO 510 (SEQ ID
N0:44), GHPO 603 (SEQ ID N0:46), GHPO 937 (SEQ ID N0:48), GHPO
1027 (SEQ ID N0:50), GHPO 1099 (SEQ ID N0:52), GHPO 1151 (SEQ ID
N0:54), GHPO 1275 (SEQ ID N0:56), GHPO 1365 (SEQ ID N0:58), GHPO
1578 (SEQ ID N0:60), GHPO 22 (SEQ ID N0:62), GHPO 58 (SEQ ID
N0:64), GHPO 200 (SEQ ID N0:66), GHPO 558 (SEQ ID N0:68), GHPO
563 (SEQ ID N0:70), GHPO 695 (SEQ ID N0:72), GHPO 699 (SEQ ID
N0:74), GHPO 702 (SEQ ID N0:76), GHPO 709 (SEQ ID N0:78), GHPO
741 (SEQ ID NO:80), GHPO 762 (SEQ ID NO:82), GHPO 827 (SEQ ID
N0:84), GHPO 852 (SEQ ID N0:86), GHPO 1013 (SEQ ID N0:88), GHPO
1020 (SEQ ID N0:90), GHPO 1031 (SEQ ID N0:92), GHPO 1052 (SEQ ID
N0:94), GHPO 1127 (SEQ ID N0:96), GHPO 1149 (SEQ ID N0:98), GHPO
1176 (SEQ ID NO:100), GHPO 1250 (SEQ ID N0:102), GHPO 1312 (SEQ ID
NO:104), GHPO 1358 (SEQ ID NO:106), GHPO 1490 (SEQ ID NO:108), GHPO 1559 (SEQ ID NO:110), GHPO 1651 (SEQ ID NO:112), GHPO 1726 (SEQ ID NO:114), GHPO 1780 (SEQ ID NO:116), GHPO 895 (SEQ ID
N0:118), GHPO 1447 (SEQ ID N0:120), GHPO 28 (SEQ ID N0:122), GHPO
86 (SEQ ID NO:124), GHPO 155 (SEQ ID NO:126), GHPO 157 (SEQ ID
N0:128), GHPO 237 (SEQ ID N0:130), GHPO 290 (SEQ ID N0:132), GHPO
293 (SEQ ID N0:134), GHPO 335 (SEQ ID N0:136), GHPO 374 (SEQ ID
N0:138), GHPO 442 (SEQ ID N0:140), GHPO 480 (SEQ ID N0:142), GHPO
523 (SEQ ID N0:144), GHPO 610 (SEQ ID N0:146), GHPO 675 (SEQ ID
N0:148), GHPO 690 (SEQ ID N0:150), GHPO 829 (SEQ ID N0:152), GHPO
850 (SEQ ID N0:154), GHPO 876 (SEQ ID N0:156), GHPO 984 (SEQ ID
N0:158), GHPO 989 (SEQ ID N0:160), GHPO 1111 (SEQ ID N0:162), GHPO 1145 (SEQ ID N0:164), GHPO 1256 (SEQ ID N0:166), GHPO 1264 (SEQ ID N0:168), GHPO 1316 (SEQ ID N0:170), GHPO 1368 (SEQ ID
NO:172), GHPO 1442 (SEQ ID NO:174), GHPO 1506 (SEQ ID NO:176), GHPO 1543 (SEQ ID NO:178), GHPO 1574 (SEQ ID NO:180), GHPO 1627 (SEQ ID NO:182), GHPO 1657 (SEQ ID NO:184), GHPO 1664 (SEQ ID
N0:186), GHPO 1694 (SEQ ID N0:188), GHPO 1704 (SEQ ID N0:190), GHPO 1763 (SEQ ID N0:192), GHPO 616 (SEQ ID N0:194), GHPO 76 (SEQ ID N0:196), GHPO 109 (SEQ ID N0:198), GHPO 163 (SEQ ID
NO:200), GHPO 169 (SEQ ID NO:202), GHPO 208 (SEQ ID NO:204), GHPO
219 (SEQ ID NO:206), GHPO 445 (SEQ ID NO:208), GHPO 479 (SEQ ID
NO:210), GHPO 525 (SEQ ID NO:212), GHPO 535 (SEQ ID NO:214), GHPO
731 (SEQ ID NO:216), GHPO 836 (SEQ ID NO:218), GHPO 879 (SEQ ID
N0:220), GHPO 881 (SEQ ID N0:222), GHPO 886 (SEQ ID N0:224), GHPO
893 (SEQ ID N0:226), GHPO 894 (SEQ ID N0:228), GHPO 976 (SEQ ID
N0:230), GHPO 1011 (SEQ ID N0:232), GHPO 1024 (SEQ ID N0:234), GHPO 1084 (SEQ ID N0:236), GHPO 1329 (SEQ ID N0:238), GHPO 1330 (SEQ ID N0:240), GHPO 1346 (SEQ ID N0:242), GHPO 1360 (SEQ ID
N0:244), GHPO 1388 (SEQ ID N0:246), GHPO 1411 (SEQ ID N0:248), GHPO 1419 (SEQ ID N0:250), GHPO 1446 (SEQ ID N0:252), GHPO 1469 (SEQ ID N0:254), GHPO 1501 (SEQ ID N0:256), GHPO 1505 (SEQ ID
NO:258), GHPO 1522 (SEQ ID NO:260), GHPO 1525 (SEQ ID NO:262), GHPO 1615 (SEQ ID NO:264), GHPO 1689 (SEQ ID NO:266), GHPO 1733 (SEQ ID NO:268), GHPO 18 (SEQ ID NO:270), GHPO 139 (SEQ ID
N0:272), GHPO 142 (SEQ ID N0:274), GHPO 250 (SEQ ID N0:276), GHPO
257 (SEQ ID N0:278), GHPO 325 (SEQ ID N0:280), GHPO 355 (SEQ ID
N0:282), GHPO 357 (SEQ ID N0:284), GHPO 454 (SEQ ID N0:286), GHPO

475 (SEQ ID NO:288), GHPO 515 (SEQ ID NO:290), GHPO 527 (SEQ ID
NO:292), GHPO 551 (SEQ ID NO:294), GHPO 602 (SEQ ID NO:296), GHPO
626 (SEQ ID NO:298), GHPO 646 (SEQ ID NO:300), GHPO 653 (SEQ ID
NO:302), GHPO 655 (SEQ ID NO:304), GHPO 670 (SEQ ID NO:306), GHPO
739 (SEQ ID NO:308), GHPO 798 (SEQ ID NO:310), GHPO 1102 (SEQ ID
NO:312), GHPO 1114 (SEQ ID NO:314), GHPO 1152 (SEQ ID NO:316), GHPO 1272 (SEQ ID NO:318), GHPO 1345 (SEQ ID NO:320), GHPO 1377 (SEQ ID NO:322), GHPO 1424 (SEQ ID NO:324), GHPO 1430 (SEQ ID
NO:326), GHPO 1502 (SEQ ID NO:328), GHPO 1600 (SEQ ID NO:330), GHPO 1714 (SEQ ID NO:332), GHPO 359 (SEQ ID NO:334), GHPO 678 (SEQ ID NO:336), GHPO 708 (SEQ ID NO:338), GHPO 759 (SEQ ID
NO:340), GHPO 847 (SEQ ID NO:342), GHPO 1050 (SEQ ID NO:344), GHPO 1101 (SEQ ID NO:346), GHPO 1120 (SEQ ID NO:348), GHPO 1138 (SEQ ID NO:350), GHPO 1310 (SEQ ID NO:352), GHPO 1320 (SEQ ID
NO:354), GHPO 1375 (SEQ ID NO:356), GHPO 1432 (SEQ ID NO:358), GHPO 21 (SEQ ID NO:360), GHPO 282 (SEQ ID NO:362), GHPO 1089 (SEQ ID NO:364), GHPO 1141 (SEQ ID NO:366), GHPO 1280 (SEQ ID
NO:368), GHPO 1608 (SEQ ID NO:370), GHPO 15 (SEQ ID NO:372), GHPO
16 (SEQ ID NO:374), GHPO 36 (SEQ ID NO:376), GHPO 38 (SEQ ID
NO:378), GHPO 52 (SEQ ID NO:380), GHPO 57 (SEQ ID NO:382), GHPO
64 (SEQ ID NO:384), GHPO 79 (SEQ ID NO:386), GHPO 84 (SEQ ID
NO:388), GHPO 86 (SEQ ID NO:390), GHPO 99 (SEQ ID NO:392), GHPO
106 (SEQ ID NO:394), GHPO 118 (SEQ ID NO:396), GHPO 122 (SEQ ID
NO:398), GHPO 128 (SEQ ID NO:400), GHPO 138 (SEQ ID NO:402), GHPO
153 (SEQ ID NO:404), GHPO 160 (SEQ ID NO:406), GHPO 168 (SEQ ID
NO:408), GHPO 179 (SEQ ID NO:410), GHPO 189 (SEQ ID NO:412), GHPO
229 (SEQ ID NO:414), GHPO 243 (SEQ ID NO:416), GHPO 244 (SEQ ID
NO:418), GHPO 251 (SEQ ID NO:420), GHPO 267 (SEQ ID NO:422), GHPO
269 (SEQ ID N0:424), GHPO 279 (SEQ ID N0:426), GHPO 284 (SEQ ID
N0:428), GHPO 296 (SEQ ID N0:430), GHPO 300 (SEQ ID N0:432), GHPO
305 (SEQ ID N0:434), GHPO 319 (SEQ ID N0:436), GHPO 330 (SEQ ID
N0:438), GHPO 340 (SEQ ID N0:440), GHPO 342 (SEQ ID N0:442), GHPO
344 (SEQ ID N0:444), GHPO 358 (SEQ ID N0:446), GHPO 373 (SEQ ID
N0:448), GHPO 382 (SEQ ID N0:450), GHPO 384 (SEQ ID N0:452), GHPO
398 (SEQ ID N0:454), GHPO 409 (SEQ ID N0:456), GHPO 422 (SEQ ID
N0:458), GHPO 430 (SEQ ID N0:460), GHPO 446 (SEQ ID N0:462), GHPO
447 (SEQ ID N0:464), GHPO 450 (SEQ ID N0:466), GHPO 451 (SEQ ID
N0:468), GHPO 452 (SEQ ID N0:470), GHPO 456 (SEQ ID N0:472), GHPO
461 (SEQ ID NO:474), GHPO 476 (SEQ ID NO:476), GHPO 478 (SEQ ID
NO:478), GHPO 491 (SEQ ID NO:480), GHPO 511 (SEQ ID NO:482), GHPO
519 (SEQ ID N0:484), GHPO 526 (SEQ ID N0:486), GHPO 534 (SEQ ID
N0:488), GHPO 536 (SEQ ID N0:490), GHPO 542 (SEQ ID N0:492), GHPO
544 (SEQ ID N0:494), GHPO 576 (SEQ ID N0:496), GHPO 578 (SEQ ID
N0:498), GHPO 580 (SEQ ID NO:500), GHPO 585 (SEQ ID N0:502), GHPO
599 (SEQ ID N0:504), GHPO 639 (SEQ ID N0:506), GHPO 642 (SEQ ID
N0:508), GHPO 647 (SEQ ID NO:510), GHPO 654 (SEQ ID N0:512), GHPO
669 (SEQ ID N0:514), GHPO 710 (SEQ ID N0:516), GHPO 713 (SEQ ID
N0:518), GHPO 716 (SEQ ID N0:520), GHPO 718 (SEQ ID N0:522), GHPO
726 (SEQ ID NO:524), GHPO 734 (SEQ ID NO:526), GHPO 740 (SEQ ID
N0:528), GHPO 770 (SEQ ID N0:530), GHPO 782 (SEQ ID N0:532), GHPO
786 (SEQ ID N0:534), GHPO 792 (SEQ ID N0:536), GHPO 797 (SEQ ID
N0:538), GHPO 816 (SEQ ID N0:540), GHPO 828 (SEQ ID N0:542), GHPO
839 (SEQ ID N0:544), GHPO 840 (SEQ ID N0:546), GHPO 842 (SEQ ID
N0:548), GHPO 885 (SEQ ID NO:550), GHPO 889 (SEQ ID N0:552), GHPO

903 (SEQ ID NO:554), GHPO 912 (SEQ ID NO:556), GHPO 946 (SEQ ID
N0:558), GHPO 958 (SEQ ID N0:560), GHPO 968 (SEQ ID N0:562), GHPO
987 (SEQ ID N0:564), GHPO 992 (SEQ ID N0:566), GHPO 996 (SEQ ID
NO:568), GHPO 997 (SEQ ID NO:570), GHPO 1002 (SEQ ID NO:572), GHPO 1026 (SEQ ID NO:574), GHPO 1028 (SEQ ID NO:576), GHPO 1034 (SEQ ID NO:578), GHPO 1038 (SEQ ID NO:580), GHPO 1059 (SEQ ID
N0:582), GHPO 1065 (SEQ ID N0:584), GHPO 1072 (SEQ ID N0:586), GHPO 1073 (SEQ ID N0:588), GHPO 1088 (SEQ ID N0:590), GHPO 1091 (SEQ ID N0:592), GHPO 1105 (SEQ ID N0:594), GHPO 1115 (SEQ ID
NO:596), GHPO 1159 (SEQ ID NO:598), GHPO 1177 (SEQ ID NO:600), GHPO 1187 (SEQ ID NO:602), GHPO 1192 (SEQ ID NO:604), GHPO 1195 (SEQ ID NO:606), GHPO 1224 (SEQ ID NO:608), GHPO 1225 (SEQ ID
N0:610), GHPO 1228 (SEQ ID N0:612), GHPO 1229 (SEQ ID N0:614), GHPO 1231 (SEQ ID N0:616), GHPO 1236 (SEQ ID N0:618), GHPO 1242 (SEQ ID N0:620), GHPO 1248 (SEQ ID N0:622), GHPO 1270 (SEQ ID
N0:624), GHPO 1271 (SEQ ID N0:626), GHPO 1298 (SEQ ID N0:628), GHPO 1301 (SEQ ID N0:630), GHPO 1304 (SEQ ID N0:632), GHPO 1315 (SEQ ID N0:634), GHPO 1319 (SEQ ID N0:636), GHPO 1323 (SEQ ID
NO:638), GHPO 1331 (SEQ ID NO:640), GHPO 1332 (SEQ ID NO:642), GHPO 1347 (SEQ ID NO:644), GHPO 1373 (SEQ ID NO:646), GHPO 1376 (SEQ ID NO:648), GHPO 1380 (SEQ ID NO:650), GHPO 1394 (SEQ ID
NO:652), GHPO 1407 (SEQ ID NO:654), GHPO 1415 (SEQ ID NO:656), GHPO 1425 (SEQ ID NO:658), GHPO 1427 (SEQ ID NO:660), GHPO 1444 (SEQ ID NO:662), GHPO 1449 (SEQ ID NO:664), GHPO 1465 (SEQ ID
N0:666), GHPO 1475 (SEQ ID N0:668), GHPO 1479 (SEQ ID N0:670), GHPO 1483 (SEQ ID N0:672), GHPO 1488 (SEQ ID N0:674), GHPO 1496 (SEQ ID N0:676), GHPO 1524 (SEQ ID N0:678), GHPO 1536 (SEQ ID

NO:680), GHPO 1539 (SEQ ID NO:682), GHPO 1540 (SEQ ID NO:684), GHPO 1542 (SEQ ID NO:686), GHPO 1555 (SEQ ID NO:688), GHPO 1560 (SEQ ID NO:690), GHPO 1564 (SEQ ID NO:692), GHPO 1570 (SEQ ID
N0:694), GHPO 1588 (SEQ ID N0:696), GHPO 1604 (SEQ ID N0:698), GHPO 1605 (SEQ ID N0:700), GHPO 1619 (SEQ ID N0:702), GHPO 1629 (SEQ ID N0:704), GHPO 1642 (SEQ ID N0:706), GHPO 1654 (SEQ ID
N0:708), GHPO 1661 (SEQ ID N0:710), GHPO 1673 (SEQ ID N0:712), GHPO 1687 (SEQ ID N0:714), GHPO 1692 (SEQ ID N0:716), GHPO 1693 (SEQ ID N0:718), GHPO 1699 (SEQ ID N0:720), GHPO 1738 (SEQ ID
N0:722), GHPO 1745 (SEQ ID N0:724), GHPO 1746 (SEQ ID N0:726), GHPO 1754 (SEQ ID N0:728), GHPO 1792 (SEQ ID N0:730), GHPO 1795 (SEQ ID N0:732), GHPO 1796 (SEQ ID N0:734), GHPO 7 (SEQ ID
N0:736), GHPO 8 (SEQ ID N0:738), GHPO 9 (SEQ ID N0:740), GHPO 10 (SEQ ID N0:742), GHPO 12 (SEQ ID N0:744), GHPO 25 (SEQ ID N0:746), GHPO 27 (SEQ ID N0:748), GHPO 29 (SEQ ID N0:750), GHPO 30 (SEQ ID
NO:752), GHPO 37 (SEQ ID NO:754), GHPO 49 (SEQ ID NO:756), GHPO
51 (SEQ ID N0:758), GHPO 54 (SEQ ID N0:760), GHPO 65 (SEQ ID
N0:762), GHPO 66 (SEQ ID N0:764), GHPO 68 (SEQ ID N0:766), GHPO
70 (SEQ ID N0:768), GHPO 77 (SEQ ID N0:770), GHPO 83 (SEQ ID
NO:772), GHPO 85 (SEQ ID NO:774), GHPO 87 (SEQ ID NO:776), GHPO
91 (SEQ ID N0:778), GHPO 92 (SEQ ID N0:780), GHPO 96 (SEQ ID
NO:782), GHPO 97 (SEQ ID NO:784), GHPO 111 (SEQ ID NO:786), GHPO
115 (SEQ ID NO:788), GHPO 117 (SEQ ID NO:790), GHPO 123 (SEQ ID
N0:792), GHPO 124 (SEQ ID N0:794), GHPO 126 (SEQ ID N0:796), GHPO
127 (SEQ ID N0:798), GHPO 128 (SEQ ID N0:800), GHPO 131 (SEQ ID
N0:802), GHPO 133 (SEQ ID N0:804), GHPO 140 (SEQ ID N0:806), GHPO
141 (SEQ ID N0:808), GHPO 145 (SEQ ID N0:810), GHPO 147 (SEQ ID

N0:812), GHPO 166 (SEQ ID N0:814), GHPO 181 (SEQ ID N0:816), GHPO
187 (SEQ ID N0:818), GHPO 188 (SEQ ID N0:820), GHPO 192 (SEQ ID
NO:822), GHPO 202 (SEQ ID NO:824), GHPO 204 (SEQ ID NO:826), GHPO
205 (SEQ ID N0:828), GHPO 212 (SEQ ID N0:830), GHPO 218 (SEQ ID
N0:832), GHPO 226 (SEQ ID N0:834), GHPO 231 (SEQ ID N0:836), GHPO
236 (SEQ ID N0:838), GHPO 239 (SEQ ID N0:840), GHPO 245 (SEQ ID
NO:842), GHPO 246 (SEQ ID NO:844), GHPO 248 (SEQ ID NO:846), GHPO
253 (SEQ ID N0:848), GHPO 265 (SEQ ID N0:850), GHPO 266 (SEQ ID
NO:852), GHPO 271 (SEQ ID N0:854), GHPO 272 (SEQ ID N0:856), GHPO
286 (SEQ ID NO:858), GHPO 291 (SEQ ID NO:860), GHPO 292 (SEQ ID
N0:862), GHPO 297 (SEQ ID N0:864), GHPO 304 (SEQ ID N0:866), GHPO
307 (SEQ ID N0:868), GHPO 324 (SEQ ID N0:870), GHPO 326 (SEQ ID
N0:872), GHPO 331 (SEQ ID N0:874), GHPO 343 (SEQ ID N0:876), GHPO
345 (SEQ ID N0:878), GHPO 346 (SEQ ID N0:880), GHPO 352 (SEQ ID
NO:882), GHPO 355 (SEQ ID NO:884), GHPO 363 (SEQ ID NO:886), GHPO
369 (SEQ ID N0:888), GHPO 376 (SEQ ID N0:890), GHPO 378 (SEQ ID
NO:892), GHPO 388 (SEQ ID NO:894), GHPO 396 (SEQ ID NO:896), GHPO
403 (SEQ ID N0:898), GHPO 410 (SEQ ID N0:900), GHPO 415 (SEQ ID
N0:902), GHPO 421 (SEQ ID N0:904), GHPO 439 (SEQ ID N0:906), GHPO
441 (SEQ ID N0:908), GHPO 443 (SEQ ID N0:910), GHPO 453 (SEQ ID
N0:912), GHPO 455 (SEQ ID N0:914), GHPO 464 (SEQ ID N0:916), GHPO
467 (SEQ ID N0:918), GHPO 468 (SEQ ID N0:920), GHPO 470 (SEQ ID
N0:922), GHPO 486 (SEQ ID N0:924), GHPO 487 (SEQ ID N0:926), GHPO
488 (SEQ ID N0:928), GHPO 489 (SEQ ID N0:930), GHPO 498 (SEQ ID
N0:932), GHPO 501 (SEQ ID N0:934), GHPO 504 (SEQ ID N0:936), GHPO
512 (SEQ ID N0:938), GHPO 517 (SEQ ID N0:940), GHPO 520 (SEQ ID
N0:942), GHPO 528 (SEQ ID N0:944), GHPO 530 (SEQ ID N0:946), GHPO

532 (SEQ ID N0:948), GHPO 548 (SEQ ID N0:950), GHPO 561 (SEQ ID
NO:952), GHPO 564 (SEQ ID NO:954), GHPO 572 (SEQ ID NO:956), GHPO
573 (SEQ ID N0:958), GHPO 574 (SEQ ID N0:960), GHPO 577 (SEQ ID
N0:962), GHPO 579 (SEQ ID N0:964), GHPO 583 (SEQ ID N0:966), GHPO
588 (SEQ ID N0:968), GHPO 593 (SEQ ID N0:970), GHPO 597 (SEQ ID
NO:972), GHPO 598 (SEQ ID NO:974), GHPO 604 (SEQ ID NO:976), GHPO
606 (SEQ ID N0:978), GHPO 611 (SEQ ID N0:980), GHPO 612 (SEQ ID
N0:982), GHPO 615 (SEQ ID N0:984), GHPO 632 (SEQ ID N0:986), GHPO
633 (SEQ ID N0:988), GHPO 637 (SEQ ID N0:990), GHPO 651 (SEQ ID
NO:992), GHPO 663 (SEQ ID NO:994), GHPO 686 (SEQ ID NO:996), GHPO
693 (SEQ ID N0:998), GHPO 698 (SEQ ID NO:1000), GHPO 703 (SEQ ID
NO:1002), GHPO 704 (SEQ ID N0:1004), GHPO 705 (SEQ ID N0:1006), GHPO 707 (SEQ ID N0:1008), GHPO 721 (SEQ ID NO:1010), GHPO 727 (SEQ ID N0:1012), GHPO 728 (SEQ ID N0:1014), GHPO 733 (SEQ ID
NO:1016), GHPO 758 (SEQ ID NO:1018), GHPO 763 (SEQ ID NO:1020), GHPO 771 (SEQ ID NO:1022), GHPO 774 (SEQ ID NO:1024), GHPO 776 (SEQ ID NO:1026), GHPO 783 (SEQ ID NO:1028), GHPO 800 (SEQ ID
NO:1030), GHPO 806 (SEQ ID NO:1032), GHPO 807 (SEQ ID NO:1034), GHPO 808 (SEQ ID NO:1036), GHPO 809 (SEQ ID NO:1038), GHPO 811 (SEQ ID NO:1040), GHPO 815 (SEQ ID NO:1042), GHPO 819 (SEQ ID
NO:1044), GHPO 841 (SEQ ID NO:1046), GHPO 843 (SEQ ID NO:1048), GHPO 846 (SEQ ID NO:1050), GHPO 875 (SEQ ID NO:1052), GHPO 892 (SEQ ID NO:1054), GHPO 902 (SEQ ID NO:1056), GHPO 904 (SEQ ID
N0:1058), GHPO 906 (SEQ ID N0:1060), GHPO 908 (SEQ ID N0:1062), GHPO 921 (SEQ ID N0:1064), GHPO 923 (SEQ ID N0:1066), GHPO 926 (SEQ ID N0:1068), GHPO 933 (SEQ ID N0:1070), GHPO 939 (SEQ ID
NO:1072), GHPO 940 (SEQ ID NO:1074), GHPO 943 (SEQ ID NO:1076), GHPO 951 (SEQ ID NO:1078), GHPO 961 (SEQ ID NO:1080), GHPO 965 (SEQ ID NO:1082), GHPO 990 (SEQ ID NO:1084), GHPO 991 (SEQ ID
N0:1086), GHPO 998 (SEQ ID N0:1088), GHPO 1001 (SEQ ID N0:1090), GHPO 1005 (SEQ ID N0:1092), GHPO 1033 (SEQ ID N0:1094), GHPO
1039 (SEQ ID NO:1096), GHPO 1041 (SEQ ID NO:1098), GHPO 1043 (SEQ
ID NO:1100), GHPO 1044 (SEQ ID NO:1102), GHPO 1051 (SEQ ID
N0:1104), GHPO 1058 (SEQ ID N0:1106), GHPO 1060 (SEQ ID N0:1108), GHPO 1075 (SEQ ID NO:1110), GHPO 1077 (SEQ ID N0:1112), GHPO
1082 (SEQ ID N0:1114), GHPO 1083 (SEQ ID NO:1116), GHPO 1086 (SEQ
ID NO:1118), GHPO 1087 (SEQ ID NO:1120), GHPO 1090 (SEQ ID
NO:1122), GHPO 1097 (SEQ ID NO:1124), GHPO 1098 (SEQ ID NO:1126), GHPO 1103 (SEQ ID NO:1128), GHPO 1113 (SEQ ID NO:1130), GHPO
1116 (SEQ ID N0:1132), GHPO 1123 (SEQ ID N0:1134), GHPO 1125 (SEQ
ID N0:1136), GHPO 1129 (SEQ ID N0:1138), GHPO 1130 (SEQ ID
NO:1140), GHPO 1134 (SEQ ID NO:1142), GHPO 1161 (SEQ ID NO:1144), GHPO 1166 (SEQ ID NO:1146), GHPO 1170 (SEQ ID NO:1148), GHPO
1175 (SEQ ID NO:1150), GHPO 1181 (SEQ ID N0:1152), GHPO 1186 (SEQ
ID N0:1154), GHPO 1188 (SEQ ID N0:1156), GHPO 1191 (SEQ ID
NO:1158), GHPO 1193 (SEQ ID NO:1160), GHPO 1196 (SEQ ID NO:1162), GHPO 1204 (SEQ ID NO:1164), GHPO 1210 (SEQ ID NO:1166), GHPO
1211 (SEQ ID N0:1168), GHPO 1216 (SEQ ID N0:1170), GHPO 1218 (SEQ
ID N0:1172), GHPO 1220 (SEQ ID N0:1174), GHPO 1223 (SEQ ID
N0:1176), GHPO 1226 (SEQ ID N0:1178), GHPO 1240 (SEQ ID N0:1180), GHPO 1246 (SEQ ID N0:1182), GHPO 1251 (SEQ ID N0:1184), GHPO
1252 (SEQ ID N0:1186), GHPO 1261 (SEQ ID N0:1188), GHPO 1265 (SEQ
ID N0:1190), GHPO 1267 (SEQ ID N0:1192), GHPO 1278 (SEQ ID
N0:1194), GHPO 1282 (SEQ ID N0:1196), GHPO 1283 (SEQ ID N0:1198), GHPO 1287 (SEQ ID N0:1200), GHPO 1292 (SEQ ID N0:1202), GHPO
1293 (SEQ ID N0:1204), GHPO 1302 (SEQ ID N0:1206), GHPO 1309 (SEQ
ID N0:1208), GHPO 1317 (SEQ ID N0:1210), GHPO 1318 (SEQ ID
N0:1212), GHPO 1321 (SEQ ID N0:1214), GHPO 1325 (SEQ ID N0:1216), GHPO 1341 (SEQ ID N0:1218), GHPO 1351 (SEQ ID N0:1220), GHPO
1354 (SEQ ID N0:1222), GHPO 1363 (SEQ ID N0:1224), GHPO 1371 (SEQ
ID NO:1226), GHPO 1381 (SEQ ID NO:1228), GHPO 1401 (SEQ ID
N0:1230), GHPO 1402 (SEQ ID N0:1232), GHPO 1403 (SEQ ID N0:1234), GHPO 1408 (SEQ ID N0:1236), GHPO 1416 (SEQ ID N0:1238), GHPO
1420 (SEQ ID N0:1240), GHPO 1428 (SEQ ID N0:1242), GHPO 1437 (SEQ
ID N0:1244), GHPO 1439 (SEQ ID N0:1246), GHPO 1460 (SEQ ID
N0:1248), GHPO 1463 (SEQ ID N0:1250), GHPO 1472 (SEQ ID N0:1252), GHPO 1474 (SEQ ID N0:1254), GHPO 1484 (SEQ ID N0:1256), GHPO
1489 (SEQ ID N0:1258), GHPO 1494 (SEQ ID N0:1260), GHPO 1495 (SEQ
ID N0:1262), GHPO 1498 (SEQ ID N0:1264), GHPO 1499 (SEQ ID
N0:1266), GHPO 1500 (SEQ ID N0:1268), GHPO 1503 (SEQ ID N0:1270), GHPO 1504 (SEQ ID N0:1272), GHPO 1510 (SEQ ID N0:1274), GHPO
1518 (SEQ ID NO:1276), GHPO 1533 (SEQ ID N0:1278), GHPO 1541 (SEQ
ID N0:1280), GHPO 1544 (SEQ ID N0:1282), GHPO 1548 (SEQ ID
N0:1284), GHPO 1565 (SEQ ID N0:1286), GHPO 1575 (SEQ ID N0:1288), GHPO 1582 (SEQ ID N0:1290), GHPO 1595 (SEQ ID N0:1292), GHPO
1597 (SEQ ID N0:1294), GHPO 1599 (SEQ ID N0:1296), GHPO 1601 (SEQ
ID N0:1298), GHPO 1609 (SEQ ID N0:1300), GHPO 1613 (SEQ ID
N0:1302), GHPO 1614 (SEQ ID N0:1304), GHPO 1626 (SEQ ID N0:1306), GHPO 1628 (SEQ ID N0:1308), GHPO 1639 (SEQ ID N0:1310), GHPO
1640 (SEQ ID N0:1312), GHPO 1641 (SEQ ID N0:1314), GHPO 1646 (SEQ
ID NO:1316), GHPO 1662 (SEQ ID NO:1318), GHPO 1667 (SEQ ID

N0:1320), GHPO 1668 (SEQ ID N0:1322), GHPO 1670 (SEQ ID N0:1324), GHPO 1671 (SEQ ID N0:1326), GHPO 1672 (SEQ ID N0:1328), GHPO
1678 (SEQ ID N0:1330), GHPO 1684 (SEQ ID N0:1332), GHPO 1695 (SEQ
ID NO:1334), GHPO 1697 (SEQ ID NO:1336), GHPO 1701 (SEQ ID
N0:1338), GHPO 1719 (SEQ ID N0:1340), GHPO 1723 (SEQ ID N0:1342), GHPO 1732 (SEQ ID N0:1344), GHPO 1739 (SEQ ID N0:1346), GHPO
1741 (SEQ ID N0:1348), GHPO 1747 (SEQ ID N0:1350), GHPO 1749 (SEQ
ID N0:1352), GHPO 1750 (SEQ ID N0:1354), GHPO 1751 (SEQ ID
NO:1356), GHPO 1755 (SEQ ID NO:1358), GHPO 1771 (SEQ ID NO:1360), GHPO 1786 (SEQ ID NO:1362), and GHPO 1789 (SEQ ID NO:1364), which can be used, e.g., in methods to prevent, treat, or diagnose Helicobacter infection. The sequences of polynucleotides that encode these polypeptides are shown in the sequence listing (odd numbers, up to SEQ ID NO:1363). Those skilled in the art will understand that the invention also includes polynucleotide molecules that encode mutants and derivatives of these polypeptides, which can result from the addition, deletion, or substitution of non-essential amino acids, as is described further below.
In addition to the polynucleotide molecules described above, the invention includes the corresponding polypeptides (i.e., polypeptides encoded by the polynucleotide molecules of the invention, or fragments thereof), and monospecific antibodies that specifically bind to these polypeptides. The polypeptides of the invention include those having the amino acid sequences shown in the sequence listing (even numbers, up to SEQ ID NO:1364), as well as mature forms of proteins having sequences shown in the sequence listing in their unprocessed forms, and fragments thereof.
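The odd/even numbering of the sequence listing can be illustrated with a minimal Python sketch. It assumes, as the text suggests but does not state outright, that each polypeptide sequence with an even SEQ ID NO is paired with the polynucleotide sequence bearing the immediately preceding odd number. The function name and the constant below are hypothetical and serve only as illustration.

```python
# Highest SEQ ID NO named in the text (GHPO 1789, SEQ ID NO:1364).
LAST_SEQ_ID = 1364


def encoding_polynucleotide_seq_id(polypeptide_seq_id: int) -> int:
    """Return the SEQ ID NO of the polynucleotide assumed to encode a polypeptide.

    Assumption (not stated explicitly in the source): the polynucleotide with
    SEQ ID NO 2k-1 encodes the polypeptide with SEQ ID NO 2k.
    """
    is_even = polypeptide_seq_id % 2 == 0
    in_range = 2 <= polypeptide_seq_id <= LAST_SEQ_ID
    if not (is_even and in_range):
        raise ValueError(
            f"polypeptide SEQ ID NO must be an even number from 2 to {LAST_SEQ_ID}"
        )
    return polypeptide_seq_id - 1


if __name__ == "__main__":
    # GHPO 35 is listed as SEQ ID NO:2; GHPO 1789 as SEQ ID NO:1364.
    print(encoding_polynucleotide_seq_id(2))     # 1
    print(encoding_polynucleotide_seq_id(1364))  # 1363
```

Under this assumed pairing, the last polypeptide in the list (GHPO 1789, SEQ ID NO:1364) corresponds to the last polynucleotide in the listing (SEQ ID NO:1363).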
The present invention has many applications and includes expression cassettes, vectors, and cells transformed or transfected with the polynucleotides of the invention. Accordingly, the present invention provides (i) methods for producing polypeptides of the invention in recombinant host systems and related expression cassettes, vectors, and transformed or transfected cells;
(ii) live vaccine vectors, such as pox virus, Salmonella typhimurium, and Vibrio cholerae vectors, that contain polynucleotides of the invention (such vaccine vectors being useful in, e.g., methods for preventing or treating Helicobacter infection) in combination with a diluent or carrier, and related pharmaceutical compositions and associated therapeutic and/or prophylactic methods; (iii) therapeutic and/or prophylactic methods involving administration of polynucleotide molecules, either in a naked form or formulated with a delivery vehicle, polypeptides or mixtures of polypeptides, or monospecific antibodies of the invention, and related pharmaceutical compositions; (iv) methods for detecting the presence of Helicobacter in biological samples, which can involve the use of polynucleotide molecules, monospecific antibodies, or polypeptides of the invention; and (v) methods for purifying polypeptides of the invention by antibody-based affinity chromatography.
Detailed Description
Open reading frames (ORFs) encoding new polypeptides, designated GHPO 35 (SEQ ID NO:2), GHPO 55 (SEQ ID NO:4), GHPO 78 (SEQ ID
N0:6), GHPO 89 (SEQ ID N0:8), GHPO 129 (SEQ ID NO:10), GHPO 541 (SEQ ID N0:12), GHPO 607 (SEQ ID N0:14), GHPO 635 (SEQ ID N0:16), GHPO 701 (SEQ ID N0:18), GHPO 712 (SEQ ID N0:20), GHPO 761 (SEQ
ID N0:22), GHPO 838 (SEQ ID N0:24), GHPO 1034 (SEQ ID N0:26), GHPO 1085 (SEQ ID N0:28), GHPO 1213 (SEQ ID N0:30), GHPO 1255 (SEQ ID N0:32), GHPO 1308 (SEQ ID N0:34), GHPO 1389 (SEQ ID
N0:36), GHPO 1706 (SEQ ID N0:38), GHPO 234 (SEQ ID N0:40), GHPO

314 (SEQ ID N0:42), GHPO 510 (SEQ ID N0:44), GHPO 603 (SEQ ID
N0:46), GHPO 937 (SEQ ID N0:48), GHPO 1027 (SEQ ID N0:50), GHPO
1099 (SEQ ID N0:52), GHPO 1151 (SEQ ID N0:54), GHPO 1275 (SEQ ID
N0:56), GHPO 1365 (SEQ ID N0:58), GHPO 1578 (SEQ ID N0:60), GHPO
22 (SEQ ID N0:62), GHPO 58 (SEQ ID N0:64), GHPO 200 (SEQ ID N0:66), GHPO 558 (SEQ ID N0:68), GHPO 563 (SEQ ID N0:70), GHPO 695 (SEQ
ID N0:72), GHPO 699 (SEQ ID N0:74), GHPO 702 (SEQ ID N0:76), GHPO
709 (SEQ ID N0:78), GHPO 741 (SEQ ID N0:80), GHPO 762 (SEQ ID
N0:82), GHPO 827 (SEQ ID N0:84), GHPO 852 (SEQ ID N0:86), GHPO
1013 (SEQ ID N0:88), GHPO 1020 (SEQ ID N0:90), GHPO 1031 (SEQ ID
N0:92), GHPO 1052 (SEQ ID N0:94), GHPO 1127 (SEQ ID N0:96), GHPO
1149 (SEQ ID N0:98), GHPO 1176 (SEQ ID NO:100), GHPO 1250 (SEQ ID
N0:102), GHPO 1312 (SEQ ID N0:104), GHPO 1358 (SEQ ID N0:106), GHPO 1490 (SEQ ID N0:108), GHPO 1559 (SEQ ID NO:110), GHPO 1651 (SEQ ID N0:112), GHPO 1726 (SEQ ID N0:114), GHPO 1780 (SEQ ID
N0:116), GHPO 895 (SEQ ID N0:118), GHPO 1447 (SEQ ID NO:120), GHPO 28 (SEQ ID N0:122), GHPO 86 (SEQ ID N0:124), GHPO 155 (SEQ
ID NO:126), GHPO 157 (SEQ ID NO:128), GHPO 237 (SEQ ID NO:130), GHPO 290 (SEQ ID NO:132), GHPO 293 (SEQ ID NO:134), GHPO 335 (SEQ ID NO:136), GHPO 374 (SEQ ID NO:138), GHPO 442 (SEQ ID
N0:140), GHPO 480 (SEQ ID N0:142), GHPO 523 (SEQ ID N0:144), GHPO
610 (SEQ ID N0:146), GHPO 675 (SEQ ID N0:148), GHPO 690 (SEQ ID
N0:150), GHPO 829 (SEQ ID N0:152), GHPO 850 (SEQ ID N0:154), GHPO
876 (SEQ ID N0:156), GHPO 984 (SEQ ID N0:158), GHPO 989 (SEQ ID
NO:160), GHPO 1111 (SEQ ID NO:162), GHPO 1145 (SEQ ID NO:164), GHPO 1256 (SEQ ID NO:166), GHPO 1264 (SEQ ID NO:168), GHPO 1316 (SEQ ID NO:170), GHPO 1368 (SEQ ID NO:172), GHPO 1442 (SEQ ID

N0:174), GHPO 1506 (SEQ ID N0:176), GHPO 1543 (SEQ ID N0:178), GHPO 1574 (SEQ ID N0:180), GHPO 1627 (SEQ ID N0:182), GHPO 1657 (SEQ ID N0:184), GHPO 1664 (SEQ ID N0:186), GHPO 1694 (SEQ ID
N0:188), GHPO 1704 (SEQ ID N0:190), GHPO 1763 (SEQ ID N0:192), GHPO 616 (SEQ ID N0:194), GHPO 76 (SEQ ID N0:196), GHPO 109 (SEQ
ID N0:198), GHPO 163 (SEQ ID NO:200), GHPO 169 (SEQ ID N0:202), GHPO 208 (SEQ ID N0:204), GHPO 219 (SEQ ID N0:206), GHPO 445 (SEQ ID N0:208), GHPO 479 (SEQ ID N0:210), GHPO 525 (SEQ ID
N0:212), GHPO 535 (SEQ ID N0:214), GHPO 731 (SEQ ID N0:216), GHPO
836 (SEQ ID N0:218), GHPO 879 (SEQ ID N0:220), GHPO 881 (SEQ ID
N0:222), GHPO 886 (SEQ ID N0:224), GHPO 893 (SEQ ID N0:226), GHPO
894 (SEQ ID N0:228), GHPO 976 (SEQ ID N0:230), GHPO 1011 (SEQ ID
N0:232), GHPO 1024 (SEQ ID N0:234), GHPO 1084 (SEQ ID N0:236), GHPO 1329 (SEQ ID N0:238), GHPO 1330 (SEQ ID N0:240), GHPO 1346 (SEQ ID N0:242), GHPO 1360 (SEQ ID N0:244), GHPO 1388 (SEQ ID
N0:246), GHPO 1411 (SEQ ID N0:248), GHPO 1419 (SEQ ID N0:250), GHPO 1446 (SEQ ID N0:252), GHPO 1469 (SEQ ID N0:254), GHPO 1501 (SEQ ID N0:256), GHPO 1505 (SEQ ID N0:258), GHPO 1522 (SEQ ID
NO:260), GHPO 1525 (SEQ ID NO:262), GHPO 1615 (SEQ ID NO:264), GHPO 1689 (SEQ ID NO:266), GHPO 1733 (SEQ ID NO:268), GHPO 18 (SEQ ID NO:270), GHPO 139 (SEQ ID NO:272), GHPO 142 (SEQ ID
N0:274), GHPO 250 (SEQ ID N0:276), GHPO 257 (SEQ ID N0:278), GHPO
325 (SEQ ID N0:280), GHPO 355 (SEQ ID N0:282), GHPO 357 (SEQ ID
N0:284), GHPO 454 (SEQ ID N0:286), GHPO 475 (SEQ ID N0:288), GHPO
515 (SEQ ID N0:290), GHPO 527 (SEQ ID N0:292), GHPO 551 (SEQ ID
NO:294), GHPO 602 (SEQ ID NO:296), GHPO 626 (SEQ ID NO:298), GHPO
646 (SEQ ID N0:300), GHPO 653 (SEQ ID N0:302), GHPO 655 (SEQ ID

N0:304), GHPO 670 (SEQ ID N0:306), GHPO 739 (SEQ ID N0:308), GHPO
798 (SEQ ID N0:310), GHPO 1102 (SEQ ID N0:312), GHPO 1114 (SEQ ID
NO:314), GHPO 1152 (SEQ ID NO:316), GHPO 1272 (SEQ ID NO:318), GHPO 1345 (SEQ ID NO:320), GHPO 1377 (SEQ ID NO:322), GHPO 1424 (SEQ ID NO:324), GHPO 1430 (SEQ ID NO:326), GHPO 1502 (SEQ ID
N0:328), GHPO 1600 (SEQ ID NO:330), GHPO 1714 (SEQ ID N0:332), GHPO 359 (SEQ ID N0:334), GHPO 678 (SEQ ID N0:336), GHPO 708 (SEQ ID N0:338), GHPO 759 (SEQ ID N0:340), GHPO 847 (SEQ ID
N0:342), GHPO 1050 (SEQ ID N0:344), GHPO 1101 (SEQ ID N0:346), GHPO 1120 (SEQ ID N0:348), GHPO 1138 (SEQ ID N0:350), GHPO 1310 (SEQ ID N0:352), GHPO 1320 (SEQ ID N0:354), GHPO 1375 (SEQ ID
N0:356), GHPO 1432 (SEQ ID N0:358), GHPO 21 (SEQ ID N0:360), GHPO
282 (SEQ ID N0:362), GHPO 1089 (SEQ ID N0:364), GHPO 1141 (SEQ ID
NO:366), GHPO 1280 (SEQ ID NO:368), GHPO 1608 (SEQ ID NO:370), GHPO 15 (SEQ ID NO:372), GHPO 16 (SEQ ID NO:374), GHPO 36 (SEQ ID
N0:376), GHPO 38 (SEQ ID N0:378), GHPO 52 (SEQ ID N0:380), GHPO
57 (SEQ ID N0:382), GHPO 64 (SEQ ID N0:384), GHPO 79 (SEQ ID
N0:386), GHPO 84 (SEQ ID N0:388), GHPO 86 (SEQ ID N0:390), GHPO
99 (SEQ ID N0:392), GHPO 106 (SEQ ID N0:394), GHPO 118 (SEQ ID
N0:396), GHPO 122 (SEQ ID N0:398), GHPO 128 (SEQ ID N0:400), GHPO
138 (SEQ ID N0:402), GHPO 153 (SEQ ID N0:404), GHPO 160 (SEQ ID
N0:406), GHPO 168 (SEQ ID N0:408), GHPO 179 (SEQ ID N0:410), GHPO
189 (SEQ ID N0:412), GHPO 229 (SEQ ID N0:414), GHPO 243 (SEQ ID
N0:416), GHPO 244 (SEQ ID N0:418), GHPO 251 (SEQ ID N0:420), GHPO
267 (SEQ ID N0:422), GHPO 269 (SEQ ID N0:424), GHPO 279 (SEQ ID
N0:426), GHPO 284 (SEQ ID N0:428), GHPO 296 (SEQ ID N0:430), GHPO
300 (SEQ ID N0:432), GHPO 305 (SEQ ID N0:434), GHPO 319 (SEQ ID

N0:436), GHPO 330 (SEQ ID N0:438), GHPO 340 (SEQ ID N0:440), GHPO
342 (SEQ ID N0:442), GHPO 344 (SEQ ID N0:444), GHPO 358 (SEQ ID
N0:446), GHPO 373 (SEQ ID N0:448), GHPO 382 (SEQ ID N0:450), GHPO
384 (SEQ ID N0:452), GHPO 398 (SEQ ID N0:454), GHPO 409 (SEQ ID
N0:456), GHPO 422 (SEQ ID N0:458), GHPO 430 (SEQ ID N0:460), GHPO
446 (SEQ ID N0:462), GHPO 447 (SEQ ID N0:464), GHPO 450 (SEQ ID
N0:466), GHPO 451 (SEQ ID N0:468), GHPO 452 (SEQ ID N0:470), GHPO
456 (SEQ ID N0:472), GHPO 461 (SEQ ID N0:474), GHPO 476 (SEQ ID
N0:476), GHPO 478 (SEQ ID N0:478), GHPO 491 (SEQ ID N0:480), GHPO
511 (SEQ ID N0:482), GHPO 519 (SEQ ID N0:484), GHPO 526 (SEQ ID
N0:486), GHPO 534 (SEQ ID N0:488), GHPO 536 (SEQ ID N0:490), GHPO
542 (SEQ ID N0:492), GHPO 544 (SEQ ID N0:494), GHPO 576 (SEQ ID
N0:496), GHPO 578 (SEQ ID N0:498), GHPO 580 (SEQ ID NO:500), GHPO
585 (SEQ ID N0:502), GHPO 599 (SEQ ID N0:504), GHPO 639 (SEQ ID
N0:506), GHPO 642 (SEQ ID N0:508), GHPO 647 (SEQ ID NO:510), GHPO
654 (SEQ ID NO:512), GHPO 669 (SEQ ID NO:514), GHPO 710 (SEQ ID
NO:516), GHPO 713 (SEQ ID NO:518), GHPO 716 (SEQ ID NO:520), GHPO
718 (SEQ ID N0:522), GHPO 726 (SEQ ID N0:524), GHPO 734 (SEQ ID
N0:526), GHPO 740 (SEQ ID N0:528), GHPO 770 (SEQ ID N0:530), GHPO
782 (SEQ ID N0:532), GHPO 786 (SEQ ID N0:534), GHPO 792 (SEQ ID
N0:536), GHPO 797 (SEQ ID N0:538), GHPO 816 (SEQ ID N0:540), GHPO
828 (SEQ ID N0:542), GHPO 839 (SEQ ID N0:544), GHPO 840 (SEQ ID
NO:546), GHPO 842 (SEQ ID NO:548), GHPO 885 (SEQ ID NO:550), GHPO
889 (SEQ ID NO:552), GHPO 903 (SEQ ID NO:554), GHPO 912 (SEQ ID
N0:556), GHPO 946 (SEQ ID N0:558), GHPO 958 (SEQ ID N0:560), GHPO
968 (SEQ ID N0:562), GHPO 987 (SEQ ID N0:564), GHPO 992 (SEQ ID
N0:566), GHPO 996 (SEQ ID N0:568), GHPO 997 (SEQ ID N0:570), GHPO

1002 (SEQ ID NO:572), GHPO 1026 (SEQ ID NO:574), GHPO 1028 (SEQ ID
NO:576), GHPO 1034 (SEQ ID NO:578), GHPO 1038 (SEQ ID NO:580), GHPO 1059 (SEQ ID NO:582), GHPO 1065 (SEQ ID NO:584), GHPO 1072 (SEQ ID NO:586), GHPO 1073 (SEQ ID NO:588), GHPO 1088 (SEQ ID
NO:590), GHPO 1091 (SEQ ID NO:592), GHPO 1105 (SEQ ID NO:594), GHPO 1115 (SEQ ID NO:596), GHPO 1159 (SEQ ID NO:598), GHPO 1177 (SEQ ID NO:600), GHPO 1187 (SEQ ID NO:602), GHPO 1192 (SEQ ID
NO:604), GHPO 1195 (SEQ ID NO:606), GHPO 1224 (SEQ ID NO:608), GHPO 1225 (SEQ ID NO:610), GHPO 1228 (SEQ ID NO:612), GHPO 1229 (SEQ ID NO:614), GHPO 1231 (SEQ ID NO:616), GHPO 1236 (SEQ ID
N0:618), GHPO 1242 (SEQ ID N0:620), GHPO 1248 (SEQ ID N0:622), GHPO 1270 (SEQ ID N0:624), GHPO 1271 (SEQ ID N0:626), GHPO 1298 (SEQ ID N0:628), GHPO 1301 (SEQ ID N0:630), GHPO 1304 (SEQ ID
N0:632), GHPO 1315 (SEQ ID N0:634), GHPO 1319 (SEQ ID N0:636), GHPO 1323 (SEQ ID N0:638), GHPO 1331 (SEQ ID N0:640), GHPO 1332 (SEQ ID N0:642), GHPO 1347 (SEQ ID N0:644), GHPO 1373 (SEQ ID
N0:646), GHPO 1376 (SEQ ID N0:648), GHPO 1380 (SEQ ID N0:650), GHPO 1394 (SEQ ID N0:652), GHPO 1407 (SEQ ID N0:654), GHPO 1415 (SEQ ID N0:656), GHPO 1425 (SEQ ID N0:658), GHPO 1427 (SEQ ID
N0:660), GHPO 1444 (SEQ ID N0:662), GHPO 1449 (SEQ ID N0:664), GHPO 1465 (SEQ ID N0:666), GHPO 1475 (SEQ ID N0:668), GHPO 1479 (SEQ ID N0:670), GHPO 1483 (SEQ ID N0:672), GHPO 1488 (SEQ ID
N0:674), GHPO 1496 (SEQ ID N0:676), GHPO 1524 (SEQ ID N0:678), GHPO 1536 (SEQ ID N0:680), GHPO 1539 (SEQ ID N0:682), GHPO 1540 (SEQ ID N0:684), GHPO 1542 (SEQ ID NO:686), GHPO 1555 (SEQ ID
NO:688), GHPO 1560 (SEQ ID NO:690), GHPO 1564 (SEQ ID NO:692), GHPO 1570 (SEQ ID NO:694), GHPO 1588 (SEQ ID NO:696), GHPO 1604 (SEQ ID NO:698), GHPO 1605 (SEQ ID NO:700), GHPO 1619 (SEQ ID
N0:702), GHPO 1629 (SEQ ID N0:704), GHPO 1642 (SEQ ID N0:706), GHPO 1654 (SEQ ID N0:708), GHPO 1661 (SEQ ID N0:710), GHPO 1673 (SEQ ID N0:712), GHPO 1687 (SEQ ID N0:714), GHPO 1692 (SEQ ID
NO:716), GHPO 1693 (SEQ ID NO:718), GHPO 1699 (SEQ ID NO:720), GHPO 1738 (SEQ ID NO:722), GHPO 1745 (SEQ ID NO:724), GHPO 1746 (SEQ ID NO:726), GHPO 1754 (SEQ ID NO:728), GHPO 1792 (SEQ ID
NO:730), GHPO 1795 (SEQ ID NO:732), GHPO 1796 (SEQ ID NO:734), GHPO 7 (SEQ ID NO:736), GHPO 8 (SEQ ID NO:738), GHPO 9 (SEQ ID
N0:740), GHPO 10 (SEQ ID N0:742), GHPO 12 (SEQ ID N0:744), GHPO
25 (SEQ ID N0:746), GHPO 27 (SEQ ID N0:748), GHPO 29 (SEQ ID
N0:750), GHPO 30 (SEQ ID N0:752), GHPO 37 (SEQ ID N0:754), GHPO
49 (SEQ ID NO:756), GHPO 51 (SEQ ID NO:758), GHPO 54 (SEQ ID
N0:760), GHPO 65 (SEQ ID N0:762), GHPO 66 (SEQ ID N0:764), GHPO
68 (SEQ ID N0:766), GHPO 70 (SEQ ID N0:768), GHPO 77 (SEQ ID
N0:770), GHPO 83 (SEQ ID N0:772), GHPO 85 (SEQ ID N0:774), GHPO
87 (SEQ ID N0:776), GHPO 91 (SEQ ID N0:778), GHPO 92 (SEQ ID
N0:780), GHPO 96 (SEQ ID N0:782), GHPO 97 (SEQ ID N0:784), GHPO
111 (SEQ ID NO:786), GHPO 115 (SEQ ID NO:788), GHPO 117 (SEQ ID
N0:790), GHPO 123 (SEQ ID N0:792), GHPO 124 (SEQ ID N0:794), GHPO
126 (SEQ ID N0:796), GHPO 127 (SEQ ID N0:798), GHPO 128 (SEQ ID
NO:800), GHPO 131 (SEQ ID NO:802), GHPO 133 (SEQ ID NO:804), GHPO
140 (SEQ ID N0:806), GHPO 141 (SEQ ID N0:808), GHPO 145 (SEQ ID
N0:810), GHPO 147 (SEQ ID N0:812), GHPO 166 (SEQ ID N0:814), GHPO
181 (SEQ ID N0:816), GHPO 187 (SEQ ID N0:818), GHPO 188 (SEQ ID
N0:820), GHPO 192 (SEQ ID N0:822), GHPO 202 (SEQ ID N0:824), GHPO
204 (SEQ ID NO:826), GHPO 205 (SEQ ID NO:828), GHPO 212 (SEQ ID

N0:830), GHPO 218 (SEQ ID N0:832), GHPO 226 (SEQ ID N0:834), GHPO
231 (SEQ ID N0:836), GHPO 236 (SEQ ID N0:838), GHPO 239 (SEQ ID
N0:840), GHPO 245 (SEQ ID N0:842), GHPO 246 (SEQ ID N0:844), GHPO
248 (SEQ ID N0:846), GHPO 253 (SEQ ID N0:848), GHPO 265 (SEQ ID
N0:850), GHPO 266 (SEQ ID N0:852), GHPO 271 (SEQ ID N0:854), GHPO
272 (SEQ ID N0:856), GHPO 286 (SEQ ID N0:858), GHPO 291 (SEQ ID
NO:860), GHPO 292 (SEQ ID NO:862), GHPO 297 (SEQ ID NO:864), GHPO
304 (SEQ ID N0:866), GHPO 307 (SEQ ID N0:868), GHPO 324 (SEQ ID
N0:870), GHPO 326 (SEQ ID N0:872), GHPO 331 (SEQ ID N0:874), GHPO
343 (SEQ ID N0:876), GHPO 345 (SEQ ID N0:878), GHPO 346 (SEQ ID
N0:880), GHPO 352 (SEQ ID N0:882), GHPO 355 (SEQ ID N0:884), GHPO
363 (SEQ ID N0:886), GHPO 369 (SEQ ID N0:888), GHPO 376 (SEQ ID
N0:890), GHPO 378 (SEQ ID N0:892), GHPO 388 (SEQ ID N0:894), GHPO
396 (SEQ ID N0:896), GHPO 403 (SEQ ID N0:898), GHPO 410 (SEQ ID
N0:900), GHPO 415 (SEQ ID N0:902), GHPO 421 (SEQ ID N0:904), GHPO
439 (SEQ ID N0:906), GHPO 441 (SEQ ID N0:908), GHPO 443 (SEQ ID
N0:910), GHPO 453 (SEQ ID N0:912), GHPO 455 (SEQ ID N0:914), GHPO
464 (SEQ ID N0:916), GHPO 467 (SEQ ID N0:918), GHPO 468 (SEQ ID
N0:920), GHPO 470 (SEQ ID N0:922), GHPO 486 (SEQ ID N0:924), GHPO
487 (SEQ ID N0:926), GHPO 488 (SEQ ID N0:928), GHPO 489 (SEQ ID
N0:930), GHPO 498 (SEQ ID N0:932), GHPO 501 (SEQ ID N0:934), GHPO
504 (SEQ ID N0:936), GHPO 512 (SEQ ID N0:938), GHPO 517 (SEQ ID
N0:940), GHPO 520 (SEQ ID N0:942), GHPO 528 (SEQ ID N0:944), GHPO
530 (SEQ ID N0:946), GHPO 532 (SEQ ID N0:948), GHPO 548 (SEQ ID
N0:950), GHPO 561 (SEQ ID N0:952), GHPO 564 (SEQ ID N0:954), GHPO
572 (SEQ ID N0:956), GHPO 573 (SEQ ID N0:958), GHPO 574 (SEQ ID
N0:960), GHPO 577 (SEQ ID N0:962), GHPO 579 (SEQ ID N0:964), GHPO

583 (SEQ ID N0:966), GHPO 588 (SEQ ID N0:968), GHPO 593 (SEQ ID
N0:970), GHPO 597 (SEQ ID N0:972), GHPO 598 (SEQ ID N0:974), GHPO
604 (SEQ ID N0:976), GHPO 606 (SEQ ID N0:978), GHPO 611 (SEQ ID
N0:980), GHPO 612 (SEQ ID N0:982), GHPO 615 (SEQ ID N0:984), GHPO
632 (SEQ ID N0:986), GHPO 633 (SEQ ID N0:988), GHPO 637 (SEQ ID
N0:990), GHPO 651 (SEQ ID N0:992), GHPO 663 (SEQ ID N0:994), GHPO
686 (SEQ ID N0:996), GHPO 693 (SEQ ID N0:998), GHPO 698 (SEQ ID
NO:1000), GHPO 703 (SEQ ID NO:1002), GHPO 704 (SEQ ID NO:1004), GHPO 705 (SEQ ID NO:1006), GHPO 707 (SEQ ID NO:1008), GHPO 721
(SEQ ID NO:1010), GHPO 727 (SEQ ID NO:1012), GHPO 728 (SEQ ID
NO:1014), GHPO 733 (SEQ ID NO:1016), GHPO 758 (SEQ ID NO:1018), GHPO 763 (SEQ ID NO:1020), GHPO 771 (SEQ ID NO:1022), GHPO 774 (SEQ ID NO:1024), GHPO 776 (SEQ ID NO:1026), GHPO 783 (SEQ ID
NO:1028), GHPO 800 (SEQ ID NO:1030), GHPO 806 (SEQ ID NO:1032), GHPO 807 (SEQ ID NO:1034), GHPO 808 (SEQ ID NO:1036), GHPO 809 (SEQ ID NO:1038), GHPO 811 (SEQ ID NO:1040), GHPO 815 (SEQ ID
NO:1042), GHPO 819 (SEQ ID NO:1044), GHPO 841 (SEQ ID NO:1046), GHPO 843 (SEQ ID NO:1048), GHPO 846 (SEQ ID NO:1050), GHPO 875 (SEQ ID NO:1052), GHPO 892 (SEQ ID NO:1054), GHPO 902 (SEQ ID
NO:1056), GHPO 904 (SEQ ID NO:1058), GHPO 906 (SEQ ID NO:1060), GHPO 908 (SEQ ID NO:1062), GHPO 921 (SEQ ID NO:1064), GHPO 923 (SEQ ID NO:1066), GHPO 926 (SEQ ID NO:1068), GHPO 933 (SEQ ID
NO:1070), GHPO 939 (SEQ ID NO:1072), GHPO 940 (SEQ ID NO:1074), GHPO 943 (SEQ ID NO:1076), GHPO 951 (SEQ ID NO:1078), GHPO 961 (SEQ ID NO:1080), GHPO 965 (SEQ ID NO:1082), GHPO 990 (SEQ ID
N0:1084), GHPO 991 (SEQ ID N0:1086), GHPO 998 (SEQ ID N0:1088), GHPO 1001 (SEQ ID N0:1090), GHPO 1005 (SEQ ID N0:1092), GHPO

1033 (SEQ ID N0:1094), GHPO 1039 (SEQ ID N0:1096), GHPO 1041 (SEQ
ID N0:1098), GHPO 1043 (SEQ ID NO:1100), GHPO 1044 (SEQ ID
NO:1102), GHPO 1051 (SEQ ID NO:1104), GHPO 1058 (SEQ ID NO:1106), GHPO 1060 (SEQ ID NO:1108), GHPO 1075 (SEQ ID NO:1110), GHPO
1077 (SEQ ID NO:1112), GHPO 1082 (SEQ ID NO:1114), GHPO 1083 (SEQ
ID NO:1116), GHPO 1086 (SEQ ID NO:1118), GHPO 1087 (SEQ ID
N0:1120), GHPO 1090 (SEQ ID N0:1122), GHPO 1097 (SEQ ID N0:1124), GHPO 1098 (SEQ ID N0:1126), GHPO 1103 (SEQ ID N0:1128), GHPO
1113 (SEQ ID N0:1130), GHPO 1116 (SEQ ID N0:1132), GHPO 1123 (SEQ
ID NO:1134), GHPO 1125 (SEQ ID NO:1136), GHPO 1129 (SEQ ID
N0:1138), GHPO 1130 (SEQ ID N0:1140), GHPO 1134 (SEQ ID N0:1142), GHPO 1161 (SEQ ID N0:1144), GHPO 1166 (SEQ ID N0:1146), GHPO
1170 (SEQ ID NO:1148), GHPO 1175 (SEQ ID NO:1150), GHPO 1181 (SEQ
ID NO:1152), GHPO 1186 (SEQ ID NO:1154), GHPO 1188 (SEQ ID
NO:1156), GHPO 1191 (SEQ ID NO:1158), GHPO 1193 (SEQ ID NO:1160), GHPO 1196 (SEQ ID NO:1162), GHPO 1204 (SEQ ID NO:1164), GHPO
1210 (SEQ ID N0:1166), GHPO 1211 (SEQ ID N0:1168), GHPO 1216 (SEQ
ID NO:1170), GHPO 1218 (SEQ ID NO:1172), GHPO 1220 (SEQ ID
NO:1174), GHPO 1223 (SEQ ID NO:1176), GHPO 1226 (SEQ ID NO:1178), GHPO 1240 (SEQ ID NO:1180), GHPO 1246 (SEQ ID NO:1182), GHPO
1251 (SEQ ID NO:1184), GHPO 1252 (SEQ ID NO:1186), GHPO 1261 (SEQ
ID NO:1188), GHPO 1265 (SEQ ID NO:1190), GHPO 1267 (SEQ ID
N0:1192), GHPO 1278 (SEQ ID N0:1194), GHPO 1282 (SEQ ID N0:1196), GHPO 1283 (SEQ ID N0:1198), GHPO 1287 (SEQ ID N0:1200), GHPO
1292 (SEQ ID N0:1202), GHPO 1293 (SEQ ID N0:1204), GHPO 1302 (SEQ
ID N0:1206), GHPO 1309 (SEQ ID N0:1208), GHPO 1317 (SEQ ID
N0:1210), GHPO 1318 (SEQ ID N0:1212), GHPO 1321 (SEQ ID N0:1214), GHPO 1325 (SEQ ID N0:1216), GHPO 1341 (SEQ ID N0:1218), GHPO
1351 (SEQ ID N0:1220), GHPO 1354 (SEQ ID N0:1222), GHPO 1363 (SEQ
ID N0:1224), GHPO 1371 (SEQ ID N0:1226), GHPO 1381 (SEQ ID
N0:1228), GHPO 1401 (SEQ ID N0:1230), GHPO 1402 (SEQ ID N0:1232), GHPO 1403 (SEQ ID N0:1234), GHPO 1408 (SEQ ID N0:1236), GHPO
1416 (SEQ ID NO:1238), GHPO 1420 (SEQ ID NO:1240), GHPO 1428 (SEQ
ID NO:1242), GHPO 1437 (SEQ ID NO:1244), GHPO 1439 (SEQ ID
NO:1246), GHPO 1460 (SEQ ID NO:1248), GHPO 1463 (SEQ ID NO:1250), GHPO 1472 (SEQ ID NO:1252), GHPO 1474 (SEQ ID NO:1254), GHPO
1484 (SEQ ID N0:1256), GHPO 1489 (SEQ ID N0:1258), GHPO 1494 (SEQ
ID N0:1260), GHPO 1495 (SEQ ID N0:1262), GHPO 1498 (SEQ ID
N0:1264), GHPO 1499 (SEQ ID N0:1266), GHPO 1500 (SEQ ID N0:1268), GHPO 1503 (SEQ ID N0:1270), GHPO 1504 (SEQ ID N0:1272), GHPO
1510 (SEQ ID NO:1274), GHPO 1518 (SEQ ID NO:1276), GHPO 1533 (SEQ
ID N0:1278), GHPO 1541 (SEQ ID N0:1280), GHPO 1544 (SEQ ID
N0:1282), GHPO 1548 (SEQ ID N0:1284), GHPO 1565 (SEQ ID N0:1286), GHPO 1575 (SEQ ID N0:1288), GHPO 1582 (SEQ ID N0:1290), GHPO
1595 (SEQ ID N0:1292), GHPO 1597 (SEQ ID N0:1294), GHPO 1599 (SEQ
ID N0:1296), GHPO 1601 (SEQ ID N0:1298), GHPO 1609 (SEQ ID
N0:1300), GHPO 1613 (SEQ ID N0:1302), GHPO 1614 (SEQ ID N0:1304), GHPO 1626 (SEQ ID N0:1306), GHPO 1628 (SEQ ID N0:1308), GHPO
1639 (SEQ ID N0:1310), GHPO 1640 (SEQ ID N0:1312), GHPO 1641 (SEQ
ID NO:1314), GHPO 1646 (SEQ ID NO:1316), GHPO 1662 (SEQ ID NO:1318), GHPO 1667 (SEQ ID NO:1320), GHPO 1668 (SEQ ID NO:1322), GHPO 1670 (SEQ ID NO:1324), GHPO 1671 (SEQ ID NO:1326), GHPO
1672 (SEQ ID N0:1328), GHPO 1678 (SEQ ID N0:1330), GHPO 1684 (SEQ
ID N0:1332), GHPO 1695 (SEQ ID N0:1334), GHPO 1697 (SEQ ID

N0:1336), GHPO 1701 (SEQ ID N0:1338), GHPO 1719 (SEQ ID N0:1340), GHPO 1723 (SEQ ID N0:1342), GHPO 1732 (SEQ ID N0:1344), GHPO
1739 (SEQ ID N0:1346), GHPO 1741 (SEQ ID N0:1348), GHPO 1747 (SEQ
ID N0:1350), GHPO 1749 (SEQ ID N0:1352), GHPO 1750 (SEQ ID
N0:1354), GHPO 1751 (SEQ ID N0:1356), GHPO 1755 (SEQ ID N0:1358), GHPO 1771 (SEQ ID N0:1360), GHPO 1786 (SEQ ID N0:1362), and GHPO
1789 (SEQ ID NO:1364), have been identified in the H. pylori genome. These polypeptides can be used, for example, in vaccination methods for preventing or treating Helicobacter infection. For example, GHPO 1320, GHPO 523, GHPO 792, GHPO 639, GHPO 669, GHPO 992, GHPO 576, GHPO 109, GHPO 129, GHPO 234, GHPO 257, GHPO 525, GHPO 626, GHPO 1034, GHPO 1275, GHPO 1308, GHPO 1600, GHPO 1615, GHPO 536, GHPO 66, GHPO 1363, GHPO 1595, and GHPO 1166 have been shown to be protective antigens that can be used in methods for preventing Helicobacter infection. By "protective antigen" is meant an antigen that is capable of reducing the infection level after challenge, relative to a positive control. Absolute protection from infection, although included in the invention, is not required.
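For readers who need to work with the designations listed above in machine-readable form, the following is a minimal sketch, not part of the specification, of how "GHPO n (SEQ ID NO:m)" entries could be extracted from the text. The function names `parse_designations` and `index_by_seq_id` are hypothetical, and the pattern assumes that scanned copies may show a zero in place of the letter O or a brace in place of a parenthesis. The index is keyed by SEQ ID NO rather than by GHPO number because a GHPO number can occur more than once in the list (for example, GHPO 86 appears with both SEQ ID NO:124 and SEQ ID NO:390).

```python
import re

# Matches one designation such as "GHPO 35 (SEQ ID NO:2)".
# Assumption: scanned text may show "N0" for "NO" and "{" or "}" for "(" or ")".
ENTRY = re.compile(r"GHPO\s+(\d+)\s*[({]\s*SEQ\s+ID\s+N[O0]:\s*(\d+)\s*[)}]")


def parse_designations(text):
    """Return (ghpo_number, seq_id_no) pairs in the order they appear."""
    # Join wrapped lines so that an entry split across a line break still matches.
    flat = " ".join(text.split())
    return [(int(ghpo), int(seq)) for ghpo, seq in ENTRY.findall(flat)]


def index_by_seq_id(pairs):
    """Build a lookup from SEQ ID NO to GHPO number."""
    return {seq: ghpo for ghpo, seq in pairs}


# Illustrative input only: a short excerpt in the style of the list above.
sample = """GHPO 35 (SEQ ID N0:2), GHPO 55 (SEQ ID N0:4), GHPO 78 (SEQ ID
N0:6), GHPO 290 {SEQ ID N0:132)"""

pairs = parse_designations(sample)
by_seq = index_by_seq_id(pairs)
print(pairs)
print(by_seq[6])
```

Entries whose digits were damaged in scanning (for example, a letter standing in for a digit) are deliberately not matched by this pattern, so they surface as gaps to be checked by hand rather than being silently misread.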
Some of the new polypeptides are secreted polypeptides that can be produced in their mature forms (i.e., as polypeptides that have been exported through class II or class III secretion pathways) or as precursors that include signal peptides, which can be removed in the course of excretion/secretion by cleavage at the N-terminal end of the mature form. (The cleavage site is located at the C-terminal end of the signal peptide, adjacent to the mature form.)
According to a first aspect of the invention, there are provided isolated polynucleotides that encode the precursor and mature forms of the Helicobacter GHPO proteins listed above. Examples of such polynucleotides are those encoding GHPO 35 (SEQ ID NO:1), GHPO 55 (SEQ ID NO:3), GHPO 78 (SEQ ID NO:5), GHPO 89 (SEQ ID NO:7), GHPO 129 (SEQ ID NO:9), GHPO 541 (SEQ ID NO:11), GHPO 607 (SEQ ID NO:13), GHPO 635 (SEQ
ID NO:15), GHPO 701 (SEQ ID N0:17), GHPO 712 (SEQ ID N0:19), GHPO
761 (SEQ ID N0:21), GHPO 838 (SEQ ID N0:23), GHPO 1034 (SEQ ID
N0:25), GHPO 1085 (SEQ ID N0:27), GHPO 1213 (SEQ ID N0:29), GHPO
1255 (SEQ ID N0:31), GHPO 1308 (SEQ ID N0:33), GHPO 1389 (SEQ ID
N0:35), GHPO 1706 (SEQ ID N0:37), GHPO 234 (SEQ ID N0:39), GHPO
314 (SEQ ID N0:41), GHPO 510 (SEQ ID N0:43), GHPO 603 (SEQ ID
NO:45), GHPO 937 (SEQ ID NO:47), GHPO 1027 (SEQ ID NO:49), GHPO
1099 (SEQ ID NO:51), GHPO 1151 (SEQ ID NO:53), GHPO 1275 (SEQ ID
NO:55), GHPO 1365 (SEQ ID N0:57), GHPO 1578 (SEQ ID N0:59), GHPO
22 (SEQ ID N0:61), GHPO 58 (SEQ ID N0:63), GHPO 200 (SEQ ID N0:65), GHPO 558 (SEQ ID N0:67), GHPO 563 (SEQ ID N0:69), GHPO 695 (SEQ
ID N0:71), GHPO 699 (SEQ ID N0:73), GHPO 702 (SEQ ID N0:75), GHPO
709 (SEQ ID NO:77), GHPO 741 (SEQ ID NO:79), GHPO 762 (SEQ ID
NO:81), GHPO 827 (SEQ ID NO:83), GHPO 852 (SEQ ID NO:85), GHPO
1013 (SEQ ID NO:87), GHPO 1020 (SEQ ID NO:89), GHPO 1031 (SEQ ID
N0:91), GHPO 1052 (SEQ ID N0:93), GHPO 1127 (SEQ ID N0:95), GHPO
1149 (SEQ ID N0:97), GHPO 1176 (SEQ ID N0:99), GHPO 1250 (SEQ ID
NO:101), GHPO 1312 (SEQ ID NO:103), GHPO 1358 (SEQ ID NO:105), GHPO 1490 (SEQ ID NO:107), GHPO 1559 (SEQ ID NO:109), GHPO 1651 (SEQ ID NO:111), GHPO 1726 (SEQ ID NO:113), GHPO 1780 (SEQ ID
NO:115), GHPO 895 (SEQ ID N0:117), GHPO 1447 (SEQ ID N0:119), GHPO 28 (SEQ ID N0:121), GHPO 86 (SEQ ID N0:123), GHPO 155 (SEQ
ID NO:125), GHPO 157 (SEQ ID NO:127), GHPO 237 (SEQ ID NO:129), GHPO 290 (SEQ ID NO:131), GHPO 293 (SEQ ID NO:133), GHPO 335 (SEQ ID NO:135), GHPO 374 (SEQ ID NO:137), GHPO 442 (SEQ ID
NO:139), GHPO 480 (SEQ ID NO:141), GHPO 523 (SEQ ID NO:143), GHPO
610 (SEQ ID N0:145), GHPO 675 (SEQ ID N0:147), GHPO 690 (SEQ ID
N0:149), GHPO 829 (SEQ ID NO:151), GHPO 850 (SEQ ID N0:153), GHPO
876 (SEQ ID NO:155), GHPO 984 (SEQ ID N0:157), GHPO 989 (SEQ ID
NO:159), GHPO 1111 (SEQ ID NO:161), GHPO 1145 (SEQ ID NO:163), GHPO 1256 (SEQ ID NO:165), GHPO 1264 (SEQ ID NO:167), GHPO 1316 (SEQ ID NO:169), GHPO 1368 (SEQ ID NO:171), GHPO 1442 (SEQ ID
NO:173), GHPO 1506 (SEQ ID NO:175), GHPO 1543 (SEQ ID NO:177), GHPO 1574 (SEQ ID NO:179), GHPO 1627 (SEQ ID NO:181), GHPO 1657 (SEQ ID NO:183), GHPO 1664 (SEQ ID NO:185), GHPO 1694 (SEQ ID
N0:187), GHPO 1704 (SEQ ID N0:189), GHPO 1763 (SEQ ID N0:191), GHPO 616 (SEQ ID N0:193), GHPO 76 (SEQ ID N0:195), GHPO 109 (SEQ
ID NO:197), GHPO 163 (SEQ ID NO:199), GHPO 169 (SEQ ID NO:201), GHPO 208 (SEQ ID NO:203), GHPO 219 (SEQ ID NO:205), GHPO 445 (SEQ ID NO:207), GHPO 479 (SEQ ID NO:209), GHPO 525 (SEQ ID
N0:211), GHPO 535 (SEQ ID N0:213), GHPO 731 (SEQ ID N0:215), GHPO
836 (SEQ ID NO:217), GHPO 879 (SEQ ID NO:219), GHPO 881 (SEQ ID
N0:221), GHPO 886 (SEQ ID N0:223), GHPO 893 (SEQ ID N0:225), GHPO
894 (SEQ ID N0:227), GHPO 976 (SEQ ID N0:229), GHPO 1011 (SEQ ID
N0:231), GHPO 1024 (SEQ ID N0:233), GHPO 1084 (SEQ ID N0:235), GHPO 1329 (SEQ ID N0:237), GHPO 1330 (SEQ ID N0:239), GHPO 1346 (SEQ ID N0:241), GHPO 1360 (SEQ ID N0:243), GHPO 1388 (SEQ ID
NO:245), GHPO 1411 (SEQ ID NO:247), GHPO 1419 (SEQ ID NO:249), GHPO 1446 (SEQ ID NO:251), GHPO 1469 (SEQ ID NO:253), GHPO 1501 (SEQ ID NO:255), GHPO 1505 (SEQ ID NO:257), GHPO 1522 (SEQ ID
NO:259), GHPO 1525 (SEQ ID NO:261), GHPO 1615 (SEQ ID NO:263), GHPO 1689 (SEQ ID NO:265), GHPO 1733 (SEQ ID NO:267), GHPO 18 (SEQ ID NO:269), GHPO 139 (SEQ ID NO:271), GHPO 142 (SEQ ID
N0:273), GHPO 250 (SEQ ID N0:275), GHPO 257 (SEQ ID N0:277), GHPO
325 (SEQ ID N0:279), GHPO 355 (SEQ ID N0:281), GHPO 357 (SEQ ID
N0:283), GHPO 454 (SEQ ID N0:285), GHPO 475 (SEQ ID N0:287), GHPO
515 (SEQ ID NO:289), GHPO 527 (SEQ ID NO:291), GHPO 551 (SEQ ID
N0:293), GHPO 602 (SEQ ID N0:295), GHPO 626 (SEQ ID N0:297), GHPO
646 (SEQ ID N0:299), GHPO 653 (SEQ ID N0:301), GHPO 655 (SEQ ID
N0:303), GHPO 670 (SEQ ID N0:305), GHPO 739 (SEQ ID N0:307), GHPO
798 (SEQ ID NO:309), GHPO 1102 (SEQ ID NO:311), GHPO 1114 (SEQ ID
NO:313), GHPO 1152 (SEQ ID NO:315), GHPO 1272 (SEQ ID NO:317), GHPO 1345 (SEQ ID NO:319), GHPO 1377 (SEQ ID NO:321), GHPO 1424 (SEQ ID NO:323), GHPO 1430 (SEQ ID NO:325), GHPO 1502 (SEQ ID
NO:327), GHPO 1600 (SEQ ID NO:329), GHPO 1714 (SEQ ID NO:331), GHPO 359 (SEQ ID NO:333), GHPO 678 (SEQ ID NO:335), GHPO 708 (SEQ ID NO:337), GHPO 759 (SEQ ID NO:339), GHPO 847 (SEQ ID
N0:341), GHPO 1050 (SEQ ID N0:343), GHPO 1101 (SEQ ID N0:345), GHPO 1120 (SEQ ID N0:347), GHPO 1138 (SEQ ID N0:349), GHPO 1310 (SEQ ID N0:351), GHPO 1320 (SEQ ID N0:353), GHPO 1375 (SEQ ID
N0:355), GHPO 1432 (SEQ ID N0:357), GHPO 21 (SEQ ID N0:359), GHPO
282 (SEQ ID N0:361), GHPO 1089 (SEQ ID N0:363), GHPO 1141 (SEQ ID
N0:365), GHPO 1280 (SEQ ID N0:367), GHPO 1608 (SEQ ID N0:369), GHPO 15 (SEQ ID N0:371), GHPO 16 (SEQ ID N0:373), GHPO 36 (SEQ ID
N0:375), GHPO 38 (SEQ ID N0:377), GHPO 52 (SEQ ID N0:379), GHPO
57 (SEQ ID N0:381), GHPO 64 (SEQ ID N0:383), GHPO 79 (SEQ ID
N0:385), GHPO 84 (SEQ ID N0:387), GHPO 86 (SEQ ID N0:389), GHPO
99 (SEQ ID N0:391), GHPO 106 (SEQ ID N0:393), GHPO 118 (SEQ ID

NO:395), GHPO 122 (SEQ ID NO:397), GHPO 128 (SEQ ID NO:399), GHPO
138 (SEQ ID NO:401), GHPO 153 (SEQ ID NO:403), GHPO 160 (SEQ ID
NO:405), GHPO 168 (SEQ ID NO:407), GHPO 179 (SEQ ID NO:409), GHPO
189 (SEQ ID NO:411), GHPO 229 (SEQ ID NO:413), GHPO 243 (SEQ ID
NO:415), GHPO 244 (SEQ ID NO:417), GHPO 251 (SEQ ID NO:419), GHPO
267 (SEQ ID NO:421), GHPO 269 (SEQ ID NO:423), GHPO 279 (SEQ ID
NO:425), GHPO 284 (SEQ ID NO:427), GHPO 296 (SEQ ID NO:429), GHPO
300 (SEQ ID NO:431), GHPO 305 (SEQ ID NO:433), GHPO 319 (SEQ ID
NO:435), GHPO 330 (SEQ ID NO:437), GHPO 340 (SEQ ID NO:439), GHPO
342 (SEQ ID NO:441), GHPO 344 (SEQ ID NO:443), GHPO 358 (SEQ ID
NO:445), GHPO 373 (SEQ ID NO:447), GHPO 382 (SEQ ID NO:449), GHPO
384 (SEQ ID NO:451), GHPO 398 (SEQ ID NO:453), GHPO 409 (SEQ ID
NO:455), GHPO 422 (SEQ ID NO:457), GHPO 430 (SEQ ID NO:459), GHPO
446 (SEQ ID NO:461), GHPO 447 (SEQ ID NO:463), GHPO 450 (SEQ ID
NO:465), GHPO 451 (SEQ ID NO:467), GHPO 452 (SEQ ID NO:469), GHPO
456 (SEQ ID NO:471), GHPO 461 (SEQ ID NO:473), GHPO 476 (SEQ ID
NO:475), GHPO 478 (SEQ ID NO:477), GHPO 491 (SEQ ID NO:479), GHPO
511 (SEQ ID NO:481), GHPO 519 (SEQ ID NO:483), GHPO 526 (SEQ ID
NO:485), GHPO 534 (SEQ ID NO:487), GHPO 536 (SEQ ID NO:489), GHPO
542 (SEQ ID NO:491), GHPO 544 (SEQ ID NO:493), GHPO 576 (SEQ ID
NO:495), GHPO 578 (SEQ ID NO:497), GHPO 580 (SEQ ID NO:499), GHPO
585 (SEQ ID NO:501), GHPO 599 (SEQ ID NO:503), GHPO 639 (SEQ ID
NO:505), GHPO 642 (SEQ ID NO:507), GHPO 647 (SEQ ID NO:509), GHPO
654 (SEQ ID NO:511), GHPO 669 (SEQ ID NO:513), GHPO 710 (SEQ ID
NO:515), GHPO 713 (SEQ ID NO:517), GHPO 716 (SEQ ID NO:519), GHPO
718 (SEQ ID NO:521), GHPO 726 (SEQ ID NO:523), GHPO 734 (SEQ ID
NO:525), GHPO 740 (SEQ ID NO:527), GHPO 770 (SEQ ID NO:529), GHPO

782 (SEQ ID N0:531), GHPO 786 (SEQ ID N0:533), GHPO 792 (SEQ ID
N0:535), GHPO 797 (SEQ ID N0:537), GHPO 816 (SEQ ID N0:539), GHPO
828 (SEQ ID N0:541), GHPO 839 (SEQ ID N0:543), GHPO 840 (SEQ ID
NO:545), GHPO 842 (SEQ ID NO:547), GHPO 885 (SEQ ID NO:549), GHPO 889 (SEQ ID NO:551), GHPO 903 (SEQ ID NO:553), GHPO 912 (SEQ ID
N0:555), GHPO 946 (SEQ ID N0:557), GHPO 958 (SEQ ID N0:559), GHPO
968 (SEQ ID N0:561), GHPO 987 (SEQ ID N0:563), GHPO 992 (SEQ ID
N0:565), GHPO 996 (SEQ ID N0:567), GHPO 997 (SEQ ID N0:569), GHPO
1002 (SEQ ID N0:571), GHPO 1026 (SEQ ID N0:573), GHPO 1028 (SEQ ID
N0:575), GHPO 1034 (SEQ ID N0:577), GHPO 1038 (SEQ ID N0:579), GHPO 1059 (SEQ ID N0:581), GHPO 1065 (SEQ ID N0:583), GHPO 1072 (SEQ ID N0:585), GHPO 1073 (SEQ ID N0:587), GHPO 1088 (SEQ ID
N0:589), GHPO 1091 (SEQ ID N0:591), GHPO 1105 (SEQ ID N0:593), GHPO 1115 (SEQ ID N0:595), GHPO 1159 (SEQ ID N0:597), GHPO 1177 (SEQ ID N0:599), GHPO 1187 (SEQ ID N0:601), GHPO 1192 (SEQ ID
NO:603), GHPO 1195 (SEQ ID NO:605), GHPO 1224 (SEQ ID NO:607), GHPO 1225 (SEQ ID NO:609), GHPO 1228 (SEQ ID NO:611), GHPO 1229 (SEQ ID NO:613), GHPO 1231 (SEQ ID NO:615), GHPO 1236 (SEQ ID
NO:617), GHPO 1242 (SEQ ID NO:619), GHPO 1248 (SEQ ID NO:621), GHPO 1270 (SEQ ID NO:623), GHPO 1271 (SEQ ID NO:625), GHPO 1298 (SEQ ID NO:627), GHPO 1301 (SEQ ID NO:629), GHPO 1304 (SEQ ID
N0:631), GHPO 1315 (SEQ ID N0:633), GHPO 1319 (SEQ ID N0:635), GHPO 1323 (SEQ ID N0:637), GHPO 1331 (SEQ ID N0:639), GHPO 1332 (SEQ ID N0:641), GHPO 1347 (SEQ ID N0:643), GHPO 1373 (SEQ ID
N0:645), GHPO 1376 (SEQ ID N0:647), GHPO 1380 (SEQ ID N0:649), GHPO 1394 (SEQ ID N0:651), GHPO 1407 (SEQ ID N0:653), GHPO 1415 (SEQ ID N0:655), GHPO 1425 (SEQ ID N0:657), GHPO 1427 (SEQ ID

N0:659), GHPO 1444 (SEQ ID N0:661), GHPO 1449 (SEQ ID N0:663), GHPO 1465 (SEQ ID N0:665), GHPO 1475 (SEQ ID N0:667), GHPO 1479 (SEQ ID N0:669), GHPO 1483 (SEQ ID N0:671), GHPO 1488 (SEQ ID
NO:673), GHPO 1496 (SEQ ID NO:675), GHPO 1524 (SEQ ID NO:677), GHPO 1536 (SEQ ID NO:679), GHPO 1539 (SEQ ID NO:681), GHPO 1540 (SEQ ID NO:683), GHPO 1542 (SEQ ID NO:685), GHPO 1555 (SEQ ID
NO:687), GHPO 1560 (SEQ ID NO:689), GHPO 1564 (SEQ ID NO:691), GHPO 1570 (SEQ ID NO:693), GHPO 1588 (SEQ ID NO:695), GHPO 1604 (SEQ ID NO:697), GHPO 1605 (SEQ ID NO:699), GHPO 1619 (SEQ ID
NO:701), GHPO 1629 (SEQ ID NO:703), GHPO 1642 (SEQ ID NO:705), GHPO 1654 (SEQ ID NO:707), GHPO 1661 (SEQ ID NO:709), GHPO 1673 (SEQ ID NO:711), GHPO 1687 (SEQ ID NO:713), GHPO 1692 (SEQ ID
NO:715), GHPO 1693 (SEQ ID NO:717), GHPO 1699 (SEQ ID NO:719), GHPO 1738 (SEQ ID NO:721), GHPO 1745 (SEQ ID NO:723), GHPO 1746 (SEQ ID NO:725), GHPO 1754 (SEQ ID NO:727), GHPO 1792 (SEQ ID
N0:729), GHPO 1795 (SEQ ID N0:731), GHPO 1796 (SEQ ID N0:733), GHPO 7 (SEQ ID N0:735), GHPO 8 (SEQ ID N0:737), GHPO 9 (SEQ ID
N0:739), GHPO 10 (SEQ ID N0:741), GHPO 12 (SEQ ID N0:743), GHPO
25 (SEQ ID NO:745), GHPO 27 (SEQ ID NO:747), GHPO 29 (SEQ ID
N0:749), GHPO 30 (SEQ ID N0:751), GHPO 37 (SEQ ID N0:753), GHPO
49 (SEQ ID NO:755), GHPO 51 (SEQ ID NO:757), GHPO 54 (SEQ ID
NO:759), GHPO 65 (SEQ ID NO:761), GHPO 66 (SEQ ID NO:763), GHPO
68 (SEQ ID N0:765), GHPO 70 (SEQ ID N0:767), GHPO 77 (SEQ ID
N0:769), GHPO 83 (SEQ ID N0:771), GHPO 85 (SEQ ID N0:773), GHPO
87 (SEQ ID N0:775), GHPO 91 (SEQ ID N0:777), GHPO 92 (SEQ ID
NO:779), GHPO 96 (SEQ ID NO:781), GHPO 97 (SEQ ID NO:783), GHPO
111 (SEQ ID N0:785), GHPO 115 (SEQ ID N0:787), GHPO 117 (SEQ ID

N0:789), GHPO 123 (SEQ ID N0:791), GHPO 124 (SEQ ID N0:793), GHPO
126 (SEQ ID N0:795), GHPO 127 (SEQ ID N0:797), GHPO 128 (SEQ ID
N0:799), GHPO 131 (SEQ ID N0:801), GHPO 133 (SEQ ID N0:803), GHPO
140 (SEQ ID NO:805), GHPO 141 (SEQ ID N0:807), GHPO 145 (SEQ ID
N0:809), GHPO 147 (SEQ ID N0:811), GHPO 166 (SEQ ID N0:813), GHPO
181 (SEQ ID N0:815), GHPO 187 (SEQ ID N0:817), GHPO 188 (SEQ ID
NO:819), GHPO 192 (SEQ ID NO:821), GHPO 202 (SEQ ID NO:823), GHPO
204 (SEQ ID N0:825), GHPO 205 (SEQ ID N0:827), GHPO 212 (SEQ ID
N0:829), GHPO 218 (SEQ ID N0:831), GHPO 226 (SEQ ID N0:833), GHPO
231 (SEQ ID N0:835), GHPO 236 (SEQ ID N0:837), GHPO 239 (SEQ ID
N0:839), GHPO 245 (SEQ ID N0:841), GHPO 246 (SEQ ID N0:843), GHPO
248 (SEQ ID NO:845), GHPO 253 (SEQ ID NO:847), GHPO 265 (SEQ ID
N0:849), GHPO 266 (SEQ ID N0:851), GHPO 271 (SEQ ID N0:853), GHPO
272 (SEQ ID N0:855), GHPO 286 (SEQ ID N0:857), GHPO 291 (SEQ ID
N0:859), GHPO 292 (SEQ ID N0:861), GHPO 297 (SEQ ID N0:863), GHPO
304 (SEQ ID NO:865), GHPO 307 (SEQ ID NO:867), GHPO 324 (SEQ ID
N0:869), GHPO 326 (SEQ ID N0:871), GHPO 331 (SEQ ID N0:873), GHPO
343 (SEQ ID N0:875), GHPO 345 (SEQ ID N0:877), GHPO 346 (SEQ ID
N0:879), GHPO 352 (SEQ ID N0:881), GHPO 355 (SEQ ID N0:883), GHPO
363 (SEQ ID N0:885), GHPO 369 (SEQ ID N0:887), GHPO 376 (SEQ ID
N0:889), GHPO 378 (SEQ ID N0:891), GHPO 388 (SEQ ID N0:893), GHPO
396 (SEQ ID N0:895), GHPO 403 (SEQ ID N0:897), GHPO 410 (SEQ ID
N0:899), GHPO 415 (SEQ ID N0:901), GHPO 421 (SEQ ID N0:903), GHPO
439 (SEQ ID N0:905), GHPO 441 (SEQ ID N0:907), GHPO 443 (SEQ ID
N0:909), GHPO 453 (SEQ ID N0:911), GHPO 455 (SEQ ID N0:913), GHPO
464 (SEQ ID NO:915), GHPO 467 (SEQ ID NO:917), GHPO 468 (SEQ ID
N0:919), GHPO 470 (SEQ ID N0:921), GHPO 486 (SEQ ID N0:923), GHPO

487 (SEQ ID NO:925), GHPO 488 (SEQ ID NO:927), GHPO 489 (SEQ ID
NO:929), GHPO 498 (SEQ ID NO:931), GHPO 501 (SEQ ID NO:933), GHPO
504 (SEQ ID NO:935), GHPO 512 (SEQ ID NO:937), GHPO 517 (SEQ ID
NO:939), GHPO 520 (SEQ ID NO:941), GHPO 528 (SEQ ID NO:943), GHPO
530 (SEQ ID NO:945), GHPO 532 (SEQ ID NO:947), GHPO 548 (SEQ ID
NO:949), GHPO 561 (SEQ ID NO:951), GHPO 564 (SEQ ID NO:953), GHPO
572 (SEQ ID NO:955), GHPO 573 (SEQ ID NO:957), GHPO 574 (SEQ ID
NO:959), GHPO 577 (SEQ ID NO:961), GHPO 579 (SEQ ID NO:963), GHPO
583 (SEQ ID NO:965), GHPO 588 (SEQ ID NO:967), GHPO 593 (SEQ ID
NO:969), GHPO 597 (SEQ ID NO:971), GHPO 598 (SEQ ID NO:973), GHPO
604 (SEQ ID NO:975), GHPO 606 (SEQ ID NO:977), GHPO 611 (SEQ ID
NO:979), GHPO 612 (SEQ ID NO:981), GHPO 615 (SEQ ID NO:983), GHPO
632 (SEQ ID NO:985), GHPO 633 (SEQ ID NO:987), GHPO 637 (SEQ ID
NO:989), GHPO 651 (SEQ ID NO:991), GHPO 663 (SEQ ID NO:993), GHPO
686 (SEQ ID NO:995), GHPO 693 (SEQ ID NO:997), GHPO 698 (SEQ ID
NO:999), GHPO 703 (SEQ ID NO:1001), GHPO 704 (SEQ ID NO:1003), GHPO 705 (SEQ ID NO:1005), GHPO 707 (SEQ ID NO:1007), GHPO 721 (SEQ ID NO:1009), GHPO 727 (SEQ ID NO:1011), GHPO 728 (SEQ ID
NO:1013), GHPO 733 (SEQ ID NO:1015), GHPO 758 (SEQ ID NO:1017), GHPO 763 (SEQ ID NO:1019), GHPO 771 (SEQ ID NO:1021), GHPO 774 (SEQ ID NO:1023), GHPO 776 (SEQ ID NO:1025), GHPO 783 (SEQ ID
NO:1027), GHPO 800 (SEQ ID NO:1029), GHPO 806 (SEQ ID NO:1031), GHPO 807 (SEQ ID NO:1033), GHPO 808 (SEQ ID NO:1035), GHPO 809 (SEQ ID NO:1037), GHPO 811 (SEQ ID NO:1039), GHPO 815 (SEQ ID
NO:1041), GHPO 819 (SEQ ID NO:1043), GHPO 841 (SEQ ID NO:1045), GHPO 843 (SEQ ID NO:1047), GHPO 846 (SEQ ID NO:1049), GHPO 875
(SEQ ID NO:1051), GHPO 892 (SEQ ID NO:1053), GHPO 902 (SEQ ID

NO:1055), GHPO 904 (SEQ ID NO:1057), GHPO 906 (SEQ ID NO:1059), GHPO 908 (SEQ ID NO:1061), GHPO 921 (SEQ ID NO:1063), GHPO 923 (SEQ ID NO:1065), GHPO 926 (SEQ ID NO:1067), GHPO 933 (SEQ ID
NO:1069), GHPO 939 (SEQ ID NO:1071), GHPO 940 (SEQ ID NO:1073), GHPO 943 (SEQ ID NO:1075), GHPO 951 (SEQ ID NO:1077), GHPO 961 (SEQ ID NO:1079), GHPO 965 (SEQ ID NO:1081), GHPO 990 (SEQ ID
NO:1083), GHPO 991 (SEQ ID NO:1085), GHPO 998 (SEQ ID NO:1087), GHPO 1001 (SEQ ID NO:1089), GHPO 1005 (SEQ ID NO:1091), GHPO
1033 (SEQ ID NO:1093), GHPO 1039 (SEQ ID NO:1095), GHPO 1041 (SEQ
ID N0:1097), GHPO 1043 (SEQ ID N0:1099), GHPO 1044 (SEQ ID
NO:1101), GHPO 1051 (SEQ ID N0:1103), GHPO 1058 (SEQ ID NO:1105), GHPO 1060 (SEQ ID N0:1107), GHPO 1075 (SEQ ID N0:1109), GHPO
1077 (SEQ ID NO:1111 ), GHPO 1082 (SEQ ID N0:1113), GHPO 1083 (SEQ
ID NO:I 115), GHPO 1086 (SEQ ID N0:1117), GHPO 1087 (SEQ ID
N0:1119), GHPO 1090 (SEQ ID N0:1121), GHPO 1097 (SEQ ID N0:1123), GHPO 1098 (SEQ ID N0:1125), GHPO 1103 (SEQ ID NO:l 127), GHPO
1113 (SEQ ID N0:1129), GHPO 1116 (SEQ ID N0:1131 ), GHPO 1123 (SEQ
ID N0:1133), GHPO 1125 (SEQ ID N0:1135), GHPO 1129 {SEQ ID
N0:1137), GHPO 1130 (SEQ ID N0:1139), GHPO 1134 (SEQ ID N0:1141), GHPO 1161 (SEQ ID N0:1143), GHPO 1166 (SEQ ID N0:1145), GHPO
1170 (SEQ ID N0:1147), GHPO 1175 (SEQ ID N0:1149), GHPO 1181 (SEQ
ID NO:1151 ), GHPO 1186 (SEQ ID N0:1153), GHPO 1188 (SEQ ID
NO: l 155), GHPO 1191 (SEQ ID N0:1157}, GHPO 1193 (SEQ ID NO: l 159), GHPO 1196 (SEQ ID N0:1161}, GHPO 1204 (SEQ ID N0:1163}, GHPO
1210 (SEQ ID N0:1165), GHPO 1211 (SEQ ID N0:1167), GHPO 1216 (SEQ
ID N0:1169), GHPO 1218 (SEQ ID N0:1171), GHPO 1220 (SEQ ID
N0:1173), GHPO 1223 (SEQ ID N0:1175), GHPO 1226 (SEQ ID N0:1177), GHPO 1240 (SEQ ID N0:1179), GHPO 1246 {SEQ ID N0:1181), GHPO
1251 (SEQ ID N0:1183), GHPO 1252 (SEQ ID N0:1185), GHPO 1261 (SEQ
ID N0:1187), GHPO 1265 (SEQ ID N0:1189), GHPO 1267 (SEQ ID
. NO: i 191), GHPO 1278 (SEQ ID N0:1193), GHPO 1282 (SEQ ID N0:1195), GHPO 1283 (SEQ ID N0:1197), GHPO 1287 (SEQ ID N0:1199), GHPO
1292 (SEQ ID NO:1201), GHPO 1293 (SEQ ID NO:1203), GHPO 1302 (SEQ
ID NO:1205), GHPO 1309 (SEQ ID NO:1207), GHPO 1317 (SEQ ID
NO:1209), GHPO 1318 (SEQ ID NO:1211), GHPO 1321 (SEQ ID NO:1213), GHPO 1325 (SEQ ID NO:1215), GHPO 1341 (SEQ ID NO:1217), GHPO
1351 (SEQ ID NO:1219), GHPO 1354 (SEQ ID NO:1221), GHPO 1363 (SEQ
ID NO:1223), GHPO 1371 (SEQ ID NO:1225), GHPO 1381 (SEQ ID
NO:1227), GHPO 1401 (SEQ ID NO:1229), GHPO 1402 (SEQ ID NO:1231), GHPO 1403 (SEQ ID NO:1233), GHPO 1408 (SEQ ID NO:1235), GHPO
1416 (SEQ ID NO:1237), GHPO 1420 (SEQ ID NO:1239), GHPO 1428 (SEQ
ID NO:1241), GHPO 1437 (SEQ ID NO:1243), GHPO 1439 (SEQ ID
NO:1245), GHPO 1460 (SEQ ID NO:1247), GHPO 1463 (SEQ ID NO:1249), GHPO 1472 (SEQ ID NO:1251), GHPO 1474 (SEQ ID NO:1253), GHPO
1484 (SEQ ID NO:1255), GHPO 1489 (SEQ ID NO:1257), GHPO 1494 (SEQ
ID NO:1259), GHPO 1495 (SEQ ID NO:1261), GHPO 1498 (SEQ ID
NO:1263), GHPO 1499 (SEQ ID NO:1265), GHPO 1500 (SEQ ID NO:1267), GHPO 1503 (SEQ ID NO:1269), GHPO 1504 (SEQ ID NO:1271), GHPO
1510 (SEQ ID NO:1273), GHPO 1518 (SEQ ID NO:1275), GHPO 1533 (SEQ
ID NO:1277), GHPO 1541 (SEQ ID NO:1279), GHPO 1544 (SEQ ID
NO:1281), GHPO 1548 (SEQ ID NO:1283), GHPO 1565 (SEQ ID NO:1285), GHPO 1575 (SEQ ID NO:1287), GHPO 1582 (SEQ ID NO:1289), GHPO
1595 (SEQ ID NO:1291), GHPO 1597 (SEQ ID NO:1293), GHPO 1599 (SEQ
ID NO:1295), GHPO 1601 (SEQ ID NO:1297), GHPO 1609 (SEQ ID
NO:1299), GHPO 1613 (SEQ ID NO:1301), GHPO 1614 (SEQ ID NO:1303), GHPO 1626 (SEQ ID NO:1305), GHPO 1628 (SEQ ID NO:1307), GHPO
1639 (SEQ ID NO:1309), GHPO 1640 (SEQ ID NO:1311), GHPO 1641 (SEQ
ID NO:1313), GHPO 1646 (SEQ ID NO:1315), GHPO 1662 (SEQ ID NO:1317), GHPO 1667 (SEQ ID NO:1319), GHPO 1668 (SEQ ID NO:1321), GHPO 1670 (SEQ ID NO:1323), GHPO 1671 (SEQ ID NO:1325), GHPO
1672 (SEQ ID NO:1327), GHPO 1678 (SEQ ID NO:1329), GHPO 1684 (SEQ
ID NO:1331), GHPO 1695 (SEQ ID NO:1333), GHPO 1697 (SEQ ID
NO:1335), GHPO 1701 (SEQ ID NO:1337), GHPO 1719 (SEQ ID NO:1339), GHPO 1723 (SEQ ID NO:1341), GHPO 1732 (SEQ ID NO:1343), GHPO
1739 (SEQ ID NO:1345), GHPO 1741 (SEQ ID NO:1347), GHPO 1747 (SEQ
ID NO:1349), GHPO 1749 (SEQ ID NO:1351), GHPO 1750 (SEQ ID
NO:1353), GHPO 1751 (SEQ ID NO:1355), GHPO 1755 (SEQ ID NO:1357), GHPO 1771 (SEQ ID NO:1359), GHPO 1786 (SEQ ID NO:1361), and GHPO
1789 (SEQ ID NO:1363).
An isolated polynucleotide of the invention encodes (i) a polypeptide having an amino acid sequence that is homologous to a Helicobacter amino acid sequence of a polypeptide, the Helicobacter amino acid sequence being selected from the group consisting of the amino acid sequences shown in the sequence listing (even numbers, up to SEQ ID NO:1364), or (ii) a derivative of the polypeptide.
In addition to the full-length polypeptides encoded by the polynucleotides of the invention, as set forth above, polynucleotides included in the invention can also encode polypeptides that lack signal sequences, as well as other polypeptide or peptide fragments of the full-length polypeptides.
The term "isolated polynucleotide" is defined as a polynucleotide that is removed from the environment in which it naturally occurs. For example, a naturally-occurring DNA molecule present in the genome of a living bacterium or as part of a gene bank is not isolated, but the same molecule, separated from the remaining part of the bacterial genome, as a result of, e.g., a cloning event (amplification), is "isolated." Typically, an isolated DNA molecule is free from DNA regions (e.g., coding regions) with which it is immediately contiguous, at the 5' or 3' ends, in the naturally occurring genome. Such isolated polynucleotides can be part of a vector or a composition and still be isolated, as such a vector or composition is not part of its natural environment.
A polynucleotide of the invention can consist of RNA or DNA (e.g., cDNA, genomic DNA, or synthetic DNA), or modifications or combinations of RNA or DNA. The polynucleotide can be double-stranded or single-stranded and, if single-stranded, can be the coding (sense) strand or the non-coding (anti-sense) strand. The sequences that encode polypeptides of the invention, as shown in the sequence listing (even numbers, up to SEQ ID NO:1364), can be (a) the coding sequence as shown in any of the nucleotide sequences of the sequence listing (odd numbers, up to SEQ ID NO:1363); (b) a ribonucleotide sequence derived by transcription of (a); or (c) a different coding sequence that, as a result of the redundancy or degeneracy of the genetic code, encodes the same polypeptides as the polynucleotide molecules having the sequences illustrated in any of the nucleotide sequences of the sequence listing (odd numbers, up to SEQ ID NO:1363). The polypeptide can be one that is naturally secreted or excreted by, e.g., H. felis, H. mustelae, H. heilmannii, or H. pylori.
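The degeneracy referred to in (c), and the transcription referred to in (b), can be illustrated with a minimal sketch. The two DNA strings below are hypothetical and are not taken from the sequence listing, the function names are illustrative only, and the codon assignments of the standard genetic code are assumed to apply.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding (sense) DNA strand, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def transcribe(dna):
    """Return the ribonucleotide sequence corresponding to a coding DNA strand."""
    return dna.upper().replace("T", "U")


# Hypothetical example sequences (assumptions, not from the sequence listing).
seq_a = "ATGGGTCTGAAC"
seq_b = "ATGGGCTTAAAT"

print(translate(seq_a), translate(seq_b))  # both encode the same short peptide
print(transcribe(seq_a))
```

The two strings differ at three nucleotide positions yet translate to the same amino acid sequence, which is the situation contemplated in (c).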
By "polypeptide" or "protein" is meant any chain of amino acids, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation). Both terms are used interchangeably in the present application.

By "homologous amino acid sequence" is meant an amino acid sequence that differs from an amino acid sequence shown in the sequence listing (even numbers, up to SEQ ID NO:1364), or an amino acid sequence encoded by a nucleotide sequence shown in the sequence listing (odd numbers, up to SEQ ID NO:1363), by one or more non-conservative amino acid substitutions, deletions, or additions located at positions at which they do not destroy the specific antigenicity of the polypeptide. Preferably, such a sequence is at least 75%, more preferably at least 80%, and most preferably at least 90% identical to an amino acid sequence shown in the sequence listing (even numbers, up to SEQ ID NO:1364). Homologous amino acid sequences include sequences that are identical or substantially identical to an amino acid sequence as shown in the sequence listing (even numbers, up to SEQ ID NO:1364). By "amino acid sequence that is substantially identical" is meant a sequence that is at least 90%, preferably at least 95%, more preferably at least 97%, and most preferably at least 99% identical to an amino acid sequence of reference and that differs from the sequence of reference, if at all, by a majority of conservative amino acid substitutions.
Conservative amino acid substitutions typically include substitutions among amino acids of the same class. These classes include, for example, amino acids having uncharged polar side chains, such as asparagine, glutamine, serine, threonine, and tyrosine; amino acids having basic side chains, such as lysine, arginine, and histidine; amino acids having acidic side chains, such as aspartic acid and glutamic acid; and amino acids having nonpolar side chains, such as glycine, alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan, and cysteine.
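The class-based definition of a conservative substitution can be expressed as a simple lookup. The sketch below uses the four classes exactly as listed above, with one-letter amino acid codes; the dictionary and function names are hypothetical and chosen only for illustration.

```python
# Side-chain classes as enumerated in the text, in one-letter code.
SIDE_CHAIN_CLASSES = {
    "uncharged polar": set("NQSTY"),
    "basic": set("KRH"),
    "acidic": set("DE"),
    "nonpolar": set("GAVLIPFMWC"),
}


def side_chain_class(residue):
    """Return the name of the class that contains the given residue."""
    residue = residue.upper()
    for name, members in SIDE_CHAIN_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown amino acid: {residue!r}")


def is_conservative(original, replacement):
    """True if the two residues differ but belong to the same class."""
    if original.upper() == replacement.upper():
        return False  # identical residues are not a substitution
    return side_chain_class(original) == side_chain_class(replacement)


print(is_conservative("K", "R"))  # lysine -> arginine, both basic
print(is_conservative("D", "K"))  # aspartic acid -> lysine, acidic vs. basic
```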
Homology can be measured using sequence analysis software (e.g., Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, WI 53705). Similar amino acid sequences are aligned to obtain the maximum degree of homology (i.e., identity). To this end, it may be necessary to artificially introduce gaps into the sequence. Once the optimal alignment has been set up, the degree of homology (i.e., identity) is established by recording all of the positions in which the amino acids of both sequences are identical, relative to the total number of positions.
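As a rough illustration of the identity calculation just described, the following sketch scores two sequences that have already been aligned (with "-" marking an introduced gap). It is not the algorithm of the software package named above; it assumes that "total number of positions" means the number of alignment columns, and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical aligned fragments; one gap was introduced in the first sequence.
reference = "MKT-LLGA"
candidate = "MKTALLSA"
print(percent_identity(reference, candidate))
```

Other conventions for the denominator exist (for example, the length of the shorter sequence), and they give different percentages for the same alignment, so the convention should be stated whenever a threshold such as 75% or 90% is applied.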
Homologous polynucleotide sequences are defined in a similar way.
Preferably, a homologous sequence is one that is at least 45%, more preferably at least 60%, and most preferably at least 85% identical to a coding sequence of any of the nucleotide sequences set forth in the sequence listing (odd numbers, up to SEQ ID NO:1363).
Polypeptides having a sequence homologous to any one of the sequences shown in the sequence listing (even numbers, up to SEQ ID NO:1364) include naturally-occurring allelic variants, as well as mutants or any other non-naturally occurring variants that are analogous, in terms of antigenicity, to a polypeptide having a sequence as shown in the sequence listing (even numbers, up to SEQ ID NO:1364).
As is known in the art, an allelic variant is an alternate form of a polypeptide that is characterized as having a substitution, deletion, or addition of one or more amino acids that does not alter the biological function of the polypeptide. By "biological function" is meant a function of the polypeptide in the cells in which it naturally occurs, even if the function is not necessary for the growth or survival of the cells. For example, the biological function of a porin is to allow the entry into cells of compounds present in the extracellular medium. The biological function is distinct from the antigenic function. A polypeptide can have more than one biological function.

Allelic variants are very common in nature. For example, a bacterial species, e.g., H. pylori, is usually represented by a variety of strains that differ from each other by minor allelic variations. Indeed, a polypeptide that fulfills the same biological function in different strains can have an amino acid sequence that is not identical in each of the strains. Such an allelic variation can be equally reflected at the polynucleotide level.
Support for the use of allelic variants of polypeptide antigens comes from, e.g., studies of the Helicobacter urease antigen. The amino acid sequence of Helicobacter urease varies widely from species to species, yet cross-species protection occurs, indicating that the urease molecule, when used as an immunogen, is highly tolerant of amino acid variations. Even among different strains of the single species H. pylori, there are amino acid sequence variations.
For example, although the amino acid sequences of the UreA and UreB subunits of H. pylori and H. felis ureases differ from one another by 26.5% and 11.8%, respectively (Ferrero et al., Molecular Microbiology 9(2):323-333, 1993), it has been shown that H. pylori urease protects mice from H. felis infection (Michetti et al., Gastroenterology 107:1002, 1994). In addition, it has been shown that the individual structural subunits of urease, UreA and UreB, which contain distinct amino acid sequences, are both protective antigens against Helicobacter infection (Michetti et al., supra). Similarly, Cuenca et al. (Gastroenterology 110:1770, 1996) showed that therapeutic immunization of H. mustelae-infected ferrets with H. pylori urease was effective at eradicating H. mustelae infection. Further, several urease variants have been reported to be effective vaccine antigens, including, e.g., recombinant UreA + UreB apoenzyme expressed from pORV142 (UreA and UreB sequences derived from H. pylori strain CPM630; Lee et al., J. Infect. Dis. 172:161, 1995); recombinant UreA + UreB apoenzyme expressed from pORV214 (UreA and UreB sequences differ from H. pylori strain CPM630 by one and two amino acid changes, respectively; Lee et al., supra, 1995); a UreA-glutathione-S-transferase fusion protein (UreA sequence from H. pylori strain ATCC 43504; Thomas et al., Acta Gastro-Enterologica Belgica 56:54, 1993); UreA + UreB holoenzyme purified from H. pylori strain NCTC11637 (Marchetti et al., Science 267:1655, 1995); a UreA-MBP fusion protein (UreA from H. pylori strain 85P; Ferrero et al., Infection and Immunity 62:4981, 1994); a UreB-MBP fusion protein (UreB from H. pylori strain 85P; Ferrero et al., supra); a UreA-MBP fusion protein (UreA from H. felis strain ATCC 49179; Ferrero et al., supra); a UreB-MBP fusion protein (UreB from H. felis strain ATCC 49179; Ferrero et al., supra); and a 37 kDa fragment of UreB containing amino acids 220-569 (Dore-Davin et al., "A 37 kD fragment of UreB is sufficient to confer protection against Helicobacter felis infection in mice"). Finally, Thomas et al. (supra) showed that oral immunization of mice with crude sonicates of H. pylori protected mice from subsequent challenge with H. felis.
Polynucleotides, e.g., DNA molecules, encoding allelic variants can easily be obtained by polymerase chain reaction (PCR) amplification of genomic bacterial DNA extracted by conventional methods. This involves the use of synthetic oligonucleotide primers matching sequences that are upstream and downstream of the 5' and 3' ends of the coding region. Suitable primers can be designed based on the nucleotide sequence information provided in the sequence listing (odd numbers, up to SEQ ID NO:1363). Typically, a primer consists of 10 to 40, preferably 15 to 25 nucleotides. It can also be advantageous to select primers containing C and G nucleotides in proportions sufficient to ensure efficient hybridization, e.g., an amount of C and G nucleotides of at least 40%, preferably 50%, of the total nucleotide amount.
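The length and G+C guidelines above lend themselves to a quick automated check. The sketch below is only an illustration: the function names and the example primers are hypothetical, and the thresholds are the ones stated in the preceding paragraph.

```python
def gc_fraction(primer):
    """Fraction of the primer's nucleotides that are G or C."""
    primer = primer.upper()
    if not primer:
        raise ValueError("primer is empty")
    return (primer.count("G") + primer.count("C")) / len(primer)


def check_primer(primer, min_len=10, max_len=40, min_gc=0.40):
    """Report whether a primer meets the length and G+C guidelines."""
    length = len(primer)
    return {
        "length_ok": min_len <= length <= max_len,   # 10 to 40 nucleotides
        "preferred_length": 15 <= length <= 25,      # preferably 15 to 25
        "gc_ok": gc_fraction(primer) >= min_gc,      # at least 40% C and G
    }


# Hypothetical primers, not derived from the sequence listing.
print(check_primer("ATGGCTAAAGGTCTGAACGC"))
print(check_primer("ATATATATAT"))
```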

Those skilled in the art can readily design primers that can be used to isolate the polynucleotides of the invention from different Helicobacter strains.
Experimental conditions for carrying out PCR can readily be determined by one skilled in the art and an illustration of carrying out PCR is provided in Example 2. As is well known in the art, restriction endonuclease recognition sites that contain, typically, 4 to 6 nucleotides (for example, the sequences 5'-GGATCC-3' (BamHI) or 5'-CTCGAG-3' (XhoI)), can be included on the 5' ends of the primers. Restriction sites can be selected by those skilled in the art so that the amplified DNA can be conveniently cloned into an appropriately digested vector, such as a plasmid.
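A minimal sketch of adding restriction sites to the 5' ends of a primer pair follows. The template string and the function names are hypothetical. For simplicity the primers are taken from the two ends of the template string itself, whereas in practice they would match regions flanking the coding sequence, and extra 5' nucleotides are often added so that the enzyme cuts efficiently near the end of the PCR product.

```python
RESTRICTION_SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primers(template, fwd_len, rev_len, fwd_site="BamHI", rev_site="XhoI"):
    """Build forward and reverse primers carrying restriction sites at their 5' ends."""
    template = template.upper()
    forward = RESTRICTION_SITES[fwd_site] + template[:fwd_len]
    # The reverse primer anneals to the sense strand, so it is the reverse
    # complement of the 3' end of the template.
    reverse = RESTRICTION_SITES[rev_site] + reverse_complement(template[-rev_len:])
    return forward, reverse


# Hypothetical template; real primers would be longer (see the length guideline above).
fwd, rev = tailed_primers("ATGGCTAAAGGTCTGAACTAA", fwd_len=6, rev_len=6)
print(fwd, rev)
```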
Useful homologs that do not occur naturally can be designed using known methods for identifying regions of an antigen that are likely to be tolerant of amino acid sequence changes and/or deletions. For example, sequences of the antigen from different species can be compared to identify conserved sequences.
Polypeptide derivatives that are encoded by polynucleotides of the invention include, e.g., fragments, polypeptides having large internal deletions derived from full-length polypeptides, and fusion proteins. Polypeptide fragments of the invention can be derived from a polypeptide having a sequence homologous to any of the sequences of the sequence listing (even numbers, up to SEQ ID NO:1364), to the extent that the fragments retain the substantial antigenicity of the parent polypeptide (specific antigenicity).
Polypeptide derivatives can also be constructed by large internal deletions that remove a substantial part of the parent polypeptide, while retaining specific antigenicity. Generally, polypeptide derivatives should be at least about 12 amino acids in length to maintain antigenicity. Advantageously, they can be at least 20 amino acids, preferably at least 50 amino acids, more preferably at least 75 amino acids, and most preferably at least 100 amino acids in length.
Useful polypeptide derivatives, e.g., polypeptide fragments, can be designed using computer-assisted analysis of amino acid sequences in order to identify sites in protein antigens having potential as surface-exposed, antigenic regions (Hughes et al., Infect. Immun. 60(9):3497, 1992). For example, the Laser Gene Program from DNA Star can be used to obtain hydrophilicity, antigenic index, and intensity index plots for the polypeptides of the invention.
This program can also be used to obtain information about homologies of the polypeptides with known protein motifs. One skilled in the art can readily use the information provided in such plots to select peptide fragments for use as vaccine antigens. For example, fragments spanning regions of the plots in which the antigenic index is relatively high can be selected. One can also select fragments spanning regions in which both the antigenic index and the intensity plots are relatively high. Fragments containing conserved sequences, particularly hydrophilic conserved sequences, can also be selected.
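To give a sense of how a hydrophilicity plot can guide fragment selection, the sketch below computes a sliding-window average using the Hopp-Woods hydrophilicity scale. This is a stand-in chosen for illustration and is not the algorithm of the program named above; the window size of six residues, the function names, and the example sequence are all assumptions.

```python
# Hopp-Woods hydrophilicity values, one-letter amino acid code.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def window_scores(protein, window=6):
    """Average hydrophilicity of every window, as (start index, score) pairs."""
    protein = protein.upper()
    if len(protein) < window:
        raise ValueError("sequence is shorter than the window")
    return [
        (start, sum(HOPP_WOODS[aa] for aa in protein[start:start + window]) / window)
        for start in range(len(protein) - window + 1)
    ]


def most_hydrophilic_fragment(protein, window=6):
    """Return (start, fragment, score) for the highest-scoring window."""
    start, score = max(window_scores(protein, window), key=lambda item: item[1])
    return start, protein.upper()[start:start + window], score


# Hypothetical sequence with a charged stretch flanked by leucines.
print(most_hydrophilic_fragment("LLLLLLKDEKRDLLLLLL"))
```

A high-scoring window is only a candidate; as the text notes, antigenic index, conservation, and other plots would normally be considered together before a fragment is chosen.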
Polypeptide fragments and polypeptides having large internal deletions can be used for revealing epitopes that are otherwise masked in the parent polypeptide and that may be of importance for inducing a protective T cell-dependent immune response. Deletions can also remove immunodominant regions of high variability among strains.
It is an accepted practice in the field of immunology to use fragments and variants of protein immunogens as vaccines, as all that is required to induce an immune response to a protein is a small (e.g., 8 to 10 amino acids) immunogenic region of the protein. This has been done for a number of vaccines against pathogens other than Helicobacter. For example, short synthetic peptides corresponding to surface-exposed antigens of pathogens such as murine mammary tumor virus (peptide containing 11 amino acids; Dion et al., Virology 179:474-477, 1990), Semliki Forest virus (peptide containing 16 amino acids; Snijders et al., J. Gen. Virol. 72:557-565, 1991), and canine parvovirus (2 overlapping peptides, each containing 15 amino acids; Langeveld et al., Vaccine 12(15):1473-1480, 1994) have been shown to be effective vaccine antigens against their respective pathogens.
Polynucleotides encoding polypeptide fragments and polypeptides having large internal deletions can be constructed using standard methods (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons Inc., 1994), for example, by PCR, including inverse PCR, by restriction enzyme treatment of the cloned DNA molecules, or by the method of Kunkel et al. (Proc. Natl. Acad. Sci. USA 82:448, 1985; biological material available at Stratagene).
A polypeptide derivative can also be produced as a fusion polypeptide that contains a polypeptide or a polypeptide derivative of the invention fused, e.g., at the N- or C-terminal end, to any other polypeptide (hereinafter referred to as a peptide tail). Such a product can be easily obtained by translation of a genetic fusion, i.e., a hybrid gene. Vectors for expressing fusion polypeptides are commercially available, and include the pMal-c2 or pMal-p2 systems of New England Biolabs, in which the peptide tail is a maltose binding protein, the glutathione-S-transferase system of Pharmacia, or the His-Tag system available from Novagen. These and other expression systems provide convenient means for further purification of polypeptides and derivatives of the invention.
Another particular example of a fusion polypeptide included in the invention is a polypeptide or polypeptide derivative of the invention fused to a polypeptide having adjuvant activity, such as, e.g., subunit B of either cholera toxin or E. coli heat-labile toxin. Several possibilities can be used for producing such fusion proteins. First, the polypeptide of the invention can be fused to the N-terminal end or, preferably, to the C-terminal end of the polypeptide having adjuvant activity. Second, a polypeptide fragment of the invention can be fused within the amino acid sequence of the polypeptide having adjuvant activity. Spacer sequences can also be included, if desired.
As stated above, the polynucleotides of the invention encode Helicobacter polypeptides in precursor or mature form. They can also encode hybrid precursors containing heterologous signal peptides, which can mature into polypeptides of the invention. By "heterologous signal peptide" is meant a signal peptide that is not found in the naturally-occurring precursor of a polypeptide of the invention.
A polynucleotide of the invention hybridizes, preferably under stringent conditions, to a polynucleotide having a sequence as shown in the sequence listing (odd numbers, up to SEQ ID NO:1363). Hybridization procedures are, e.g., described by Ausubel et al. (supra); Silhavy et al. (Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1984); and Davis et al. (A Manual for Genetic Engineering: Advanced Bacterial Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1980). Important parameters that can be considered for optimizing hybridization conditions are reflected in the following formula, which facilitates calculation of the melting temperature (Tm), which is the temperature above which two complementary DNA strands separate from one another (Casey et al., Nucl. Acid Res. 4:1539, 1977): Tm = 81.5 + 0.5 x (% G+C) + 1.6 log (positive ion concentration) - 0.6 x (% formamide). Under appropriate stringency conditions, hybridization temperature (Th) is approximately 20 to 40°C, 20 to 25°C, or, preferably, 30 to 40°C below the calculated Tm. Those skilled in the art will understand that optimal temperature and salt conditions can be readily determined empirically in preliminary experiments using conventional procedures. For example, stringent conditions can be achieved, both for pre-hybridizing and hybridizing incubations, (i) within 4-16 hours at 42°C, in 6 x SSC containing 50% formamide or (ii) within 4-16 hours at 65°C in an aqueous 6 x SSC solution (1 M NaCl, 0.1 M sodium citrate (pH 7.0)). For polynucleotides containing 30 to 600 nucleotides, the above formula is used and then is corrected by subtracting (600/polynucleotide size in base pairs). Stringency conditions are defined by a Th that is 5 to 10°C below Tm.
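The melting-temperature formula and the length correction can be written out as a short calculation. The sketch below uses the coefficients exactly as given in the text; it assumes that "log" denotes the base-10 logarithm and that the positive ion concentration is expressed in mol/L, and the function names are hypothetical. Other references give this type of formula with different coefficients, so the output is best treated as illustrative rather than as a laboratory setting.

```python
import math


def melting_temperature(gc_percent, ion_molar, formamide_percent=0.0, length_bp=None):
    """Tm in degrees C from the formula in the text, with optional length correction."""
    tm = (
        81.5
        + 0.5 * gc_percent
        + 1.6 * math.log10(ion_molar)   # assumption: base-10 log, mol/L
        - 0.6 * formamide_percent
    )
    # Correction stated for polynucleotides of 30 to 600 nucleotides.
    if length_bp is not None and 30 <= length_bp <= 600:
        tm -= 600.0 / length_bp
    return tm


def stringent_th_range(tm, low_offset=5.0, high_offset=10.0):
    """Hybridization temperature range lying 5 to 10 degrees C below Tm."""
    return tm - high_offset, tm - low_offset


# Example: 50% G+C, 1 M positive ions, 50% formamide, 300 bp probe.
tm = melting_temperature(50.0, 1.0, formamide_percent=50.0, length_bp=300)
print(tm, stringent_th_range(tm))
```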
Hybridization conditions with oligonucleotides shorter than 20-30 bases do not precisely follow the rules set forth above. In such cases, the formula for calculating the Tm is as follows: Tm = 4 x (G+C) + 2 x (A+T). For example, an 18-nucleotide fragment of 50% G+C would have an approximate Tm of 54°C.
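The short-oligonucleotide rule can be checked directly. In the sketch below the function name and the 18-mer are hypothetical; the oligonucleotide was chosen to contain nine G or C and nine A or T residues so that it reproduces the worked example in the text.

```python
def short_oligo_tm(oligo):
    """Approximate Tm in degrees C: 4 per G or C plus 2 per A or T."""
    oligo = oligo.upper()
    gc = oligo.count("G") + oligo.count("C")
    at = oligo.count("A") + oligo.count("T")
    if gc + at != len(oligo):
        raise ValueError("oligonucleotide contains characters other than A, C, G, T")
    return 4 * gc + 2 * at


# Hypothetical 18-mer with 50% G+C: 4 x 9 + 2 x 9 = 54.
print(short_oligo_tm("ACGTACGTACGTACGTAC"))
```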
A polynucleotide molecule of the invention, containing RNA, DNA, or modifications or combinations thereof, can have various applications. For example, a polynucleotide molecule can be used (i) in a process for producing the encoded polypeptide in a recombinant host system, (ii) in the construction of vaccine vectors such as poxviruses, which are further used in methods and compositions for preventing and/or treating Helicobacter infection, (iii) as a vaccine agent, in a naked form or formulated with a delivery vehicle, and (iv) in the construction of attenuated Helicobacter strains that can over-express a polynucleotide of the invention or express it in a non-toxic, mutated form.
According to a second aspect of the invention, there is therefore provided (i) an expression cassette containing a polynucleotide molecule of the invention placed under the control of elements (e.g., a promoter) required for expression; (ii) an expression vector containing an expression cassette of the invention; (iii) a procaryotic or eucaryotic cell transformed or transfected with an expression cassette and/or vector of the invention; and (iv) a process for producing a polypeptide or polypeptide derivative encoded by a polynucleotide of the invention, which involves culturing a procaryotic or eucaryotic cell transformed or transfected with an expression cassette and/or vector of the invention, under conditions that allow expression of the polynucleotide molecule of the invention, and recovering the encoded polypeptide or polypeptide derivative from the cell culture.
A recombinant expression system can be selected from procaryotic and eucaryotic hosts. Eucaryotic hosts include, for example, yeast cells (e.g., Saccharomyces cerevisiae or Pichia pastoris), mammalian cells (e.g., COS1, NIH3T3, or JEG3 cells), arthropod cells (e.g., Spodoptera frugiperda (Sf9) cells), and plant cells. Preferably, a procaryotic host such as E. coli is used.
Bacterial and eucaryotic cells are available from a number of different sources that are known to those skilled in the art, e.g., the American Type Culture Collection (ATCC; Rockville, Maryland).
The choice of the expression cassette will depend on the host system selected, as well as the features desired for the expressed polypeptide. For example, it may be useful to produce a polypeptide of the invention in a particular lipidated form or any other form. Typically, an expression cassette includes a constitutive or inducible promoter that is functional in the selected host system; a ribosome binding site; a start codon (ATG); if necessary, a region encoding a signal peptide, e.g., a lipidation signal peptide; a polynucleotide molecule of the invention; a stop codon; and, optionally, a 3' terminal region (translation and/or transcription terminator). The signal peptide-encoding region is adjacent to the polynucleotide of the invention and is placed in the proper reading frame. The signal peptide-encoding region can be homologous or heterologous to the polynucleotide molecule encoding the mature polypeptide and it can be specific to the secretion apparatus of the host used for expression. The open reading frame constituted by the polynucleotide molecule of the invention, alone or together with the signal peptide, is placed under the control of the promoter so that transcription and translation occur in the host system. Promoters and signal peptide-encoding regions are widely known and available to those skilled in the art and include, for example, the promoter of Salmonella typhimurium (and derivatives) that is inducible by arabinose (promoter araB) and is functional in Gram-negative bacteria such as E. coli (U.S. Patent No. 5,028,530; Cagnon et al., Protein Engineering 4(7):843, 1991); the promoter of the bacteriophage T7 RNA polymerase gene, which is functional in a number of E. coli strains expressing T7 polymerase (U.S. Patent No. 4,952,496); the OspA lipidation signal peptide; and the RlpB lipidation signal peptide (Takase et al., J. Bact. 169:5692, 1987).
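The requirement that the signal peptide-encoding region sit adjacent to the coding sequence and in the proper reading frame can be sketched as a simple assembly check. Everything in the example is hypothetical: the function name, the short placeholder sequences, and the assumption that the coding sequence is supplied without its own start and stop codons. Promoter, ribosome binding site, and terminator are left out because only the open reading frame is being checked.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_orf(coding, signal="", stop="TAA"):
    """Assemble start codon + optional signal region + coding sequence + stop codon."""
    coding, signal, stop = coding.upper(), signal.upper(), stop.upper()
    if stop not in STOP_CODONS:
        raise ValueError("stop must be one of TAA, TAG, TGA")
    if len(signal) % 3 or len(coding) % 3:
        raise ValueError("signal and coding regions must be whole codons to stay in frame")
    orf = "ATG" + signal + coding + stop
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    # Any stop codon before the last position would truncate the product.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature in-frame stop codon")
    return orf


# Placeholder sequences for illustration only.
print(build_orf("GGTCTGAAC", signal="AAAGCT"))
```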
The expression cassette is typically part of an expression vector, which is selected for its ability to replicate in the chosen expression system.
Expression vectors (e.g., plasmids or viral vectors) can be chosen from, for example, those described in Pouwels et al. (Cloning Vectors: A Laboratory Manual, 1985, Supp. 1987) and can be purchased from various commercial sources. Methods for transforming or transfecting host cells with expression vectors are well known in the art and will depend on the host system selected, as described in Ausubel et al. (supra).
Upon expression, a recombinant polypeptide of the invention (or a polypeptide derivative) is produced and remains in the intracellular compartment, is secreted/excreted in the extracellular medium or in the periplasmic space, or is embedded in the cellular membrane. The polypeptide can then be recovered in a substantially purified form from the cell extract or from the supernatant after centrifugation of the cell culture. Typically, the recombinant polypeptide can be purified by antibody-based affinity purification or by any other method known to a person skilled in the art, such as by genetic fusion to a small affinity-binding domain. Antibody-based affinity purification methods are also available for purifying a polypeptide of the invention extracted from a Helicobacter strain. Antibodies useful for immunoaffinity purification of the polypeptides of the invention can be obtained using methods described below.
Polynucleotides of the invention can also be used in DNA vaccination methods, using either a viral or bacterial host as gene delivery vehicle (live vaccine vector) or administering the gene in a free form, e.g., inserted into a plasmid. Therapeutic or prophylactic efficacy of a polynucleotide of the invention can be evaluated as is described below.
Accordingly, in a third aspect of the invention, there is provided (i) a vaccine vector such as a poxvirus, containing a polynucleotide molecule of the invention placed under the control of elements required for expression; (ii) a composition of matter containing a vaccine vector of the invention, together with a diluent or carrier; (iii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a vaccine vector of the invention; (iv) a method for inducing an immune response against Helicobacter in a mammal (e.g., a human; alternatively, the method can be used in veterinary applications for treating or preventing Helicobacter infection of animals, e.g., cats or birds), which involves administering to the mammal an immunogenically effective amount of a vaccine vector of the invention to elicit an immune response, e.g., a protective or therapeutic immune response to Helicobacter; and (v) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmannii) infection, which involves administering a prophylactic or therapeutic amount of a vaccine vector of the invention to an individual in need. Additionally, the third aspect of the invention encompasses the use of a vaccine vector of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection.
A vaccine vector of the invention can express one or several polypeptides or derivatives of the invention, as well as at least one additional Helicobacter antigen such as a urease apoenzyme or a subunit, fragment, homolog, mutant, or derivative thereof. In addition, it can express a cytokine, such as interleukin-2 (IL-2) or interleukin-12 (IL-12), that enhances the immune response. Thus, a vaccine vector can include additional polynucleotide molecules encoding, e.g., urease subunit A, B, or both, or a cytokine, placed under the control of elements required for expression in a mammalian cell.
Alternatively, a composition of the invention can include several vaccine vectors, each of which is capable of expressing a polypeptide or derivative of the invention. A composition can also contain a vaccine vector capable of expressing an additional Helicobacter antigen such as urease apoenzyme, a subunit, fragment, homolog, mutant, or derivative thereof, or a cytokine such as IL-2 or IL-12.
In vaccination methods for treating or preventing infection in a mammal, a vaccine vector of the invention can be administered by any conventional route in use in the vaccine field, for example, to a mucosal (e.g., ocular, intranasal, oral, gastric, pulmonary, intestinal, rectal, vaginal, or urinary tract) surface or via a parenteral (e.g., subcutaneous, intradermal, intramuscular, intravenous, or intraperitoneal) route. Preferred routes depend upon the choice of the vaccine vector. The administration can be achieved in a single dose or repeated at intervals. The appropriate dosage depends on various parameters that are understood by those skilled in the art, such as the nature of the vaccine vector itself, the route of administration, and the condition of the mammal to be vaccinated (e.g., the weight, age, and general health of the mammal).
Live vaccine vectors that can be used in the invention include viral vectors, such as adenoviruses and poxviruses, as well as bacterial vectors, e.g., Shigella, Salmonella, Vibrio cholerae, Lactobacillus, Bacille bilié de Calmette-Guérin (BCG), and Streptococcus. An example of an adenovirus vector, as well as a method for constructing an adenovirus vector capable of expressing a polynucleotide molecule of the invention, is described in U.S. Patent No. 4,920,209. Poxvirus vectors that can be used in the invention include, e.g., vaccinia and canary pox viruses, which are described in U.S. Patent No. 4,722,848 and U.S. Patent No. 5,364,773, respectively (also see, e.g., Tartaglia et al., Virology 188:217, 1992, for a description of a vaccinia virus vector, and Taylor et al., Vaccine 13:539, 1995, for a description of a canary poxvirus vector). Poxvirus vectors capable of expressing a polynucleotide of the invention can be obtained by homologous recombination, as described in Kieny et al. (Nature 312:163, 1984), so that the polynucleotide of the invention is inserted in the viral genome under appropriate conditions for expression in mammalian cells. Generally, the dose of viral vector vaccine, for therapeutic or prophylactic use, can be from about 1 x 10^4 to about 1 x 10^11, advantageously from about 1 x 10^7 to about 1 x 10^10, or, preferably, from about 1 x 10^7 to about 1 x 10^9 plaque-forming units per kilogram. Preferably, viral vectors are administered parenterally, for example, in 3 doses that are 4 weeks apart.
Those skilled in the art will recognize that it is preferable to avoid adding a chemical adjuvant to a composition containing a viral vector of the invention, thereby minimizing the immune response to the viral vector itself.

Non-toxicogenic Vibrio cholerae mutant strains that can be used in live oral vaccines are described by Mekalanos et al. (Nature 306:551, 1983) and in U.S. Patent No. 4,882,278 (strain in which a substantial amount of the coding sequence of each of the two ctxA alleles has been deleted so that no functional cholera toxin is produced); WO 92/11354 (strain in which the irgA locus is inactivated by mutation; this mutation can be combined in a single strain with ctxA mutations); and WO 94/1533 (deletion mutant lacking functional ctxA and attRS1 DNA sequences). These strains can be genetically engineered to express heterologous antigens, as described in WO 94/19482. An effective vaccine dose of a V. cholerae strain capable of expressing a polypeptide or polypeptide derivative encoded by a polynucleotide molecule of the invention can contain, e.g., about 1 x 10^5 to about 1 x 10^9, preferably about 1 x 10^6 to about 1 x 10^8 viable bacteria in an appropriate volume for the selected route of administration. Preferred routes of administration include all mucosal routes, but, most preferably, these vectors are administered intranasally or orally.
Attenuated Salmonella typhimurium strains, genetically engineered for recombinant expression of heterologous antigens, and their use as oral vaccines, are described by Nakayama et al. (Bio/Technology 6:693, 1988) and in WO 92/11361. Preferred routes of administration for these vectors include all mucosal routes. Most preferably, the vectors are administered intranasally or orally.
Other bacterial strains useful as vaccine vectors are described by High et al. (EMBO 11:1991, 1992) and Sizemore et al. (Science 270:299, 1995; Shigella flexneri); Medaglini et al. (Proc. Natl. Acad. Sci. USA 92:6868, 1995; Streptococcus gordonii); Flynn (Cell. Mol. Biol. 40 (suppl. I):31, 1994), and in 6, WO 90/0594, WO 91/13157, WO 92/1796, and WO 92/21376 (Bacille Calmette Guerin). In bacterial vectors, a polynucleotide of the invention can be inserted into the bacterial genome or it can remain in a free state, for example, carried on a plasmid.
An adjuvant can also be added to a composition containing a bacterial vector vaccine. A number of adjuvants that can be used are known to those skilled in the art. For example, preferred adjuvants can be selected from the list provided below.
According to a fourth aspect of the invention, there is also provided (i) a composition of matter containing a polynucleotide of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a polynucleotide of the invention; (iii) a method for inducing an immune response against Helicobacter, in a mammal, by administering to the mammal an immunogenically effective amount of a polynucleotide of the invention to elicit an immune response, e.g., a protective immune response to Helicobacter; and (iv) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H.
felis, H. mustelae, or H. heilmanii) infection, by administering a prophylactic or therapeutic amount of a polynucleotide of the invention to an individual in need of such treatment. Additionally, the fourth aspect of the invention encompasses the use of a polynucleotide of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection. The fourth aspect of the invention preferably includes the use of a polynucleotide molecule placed under conditions for expression in a mammalian cell, e.g., in a plasmid that is unable to replicate in mammalian cells and to substantially integrate into a mammalian genome.
Polynucleotides (for example, DNA or RNA molecules) of the invention can also be administered as such to a mammal as a vaccine. When a DNA
molecule of the invention is used, it can be in the form of a plasmid that is unable to replicate in a mammalian cell and unable to integrate into the mammalian genome. Typically, a DNA molecule is placed under the control of a promoter suitable for expression in a mammalian cell. The promoter can function ubiquitously or tissue-specifically. Examples of non-tissue specific promoters include the early Cytomegalovirus (CMV) promoter (U.S. Patent No. 4,168,062) and the Rous Sarcoma Virus promoter (Norton et al., Molec.
Cell Biol. 5:281, 1985). The desmin promoter (Li et al., Gene 78:243, 1989; Li et al., J. Biol. Chem. 266:6562, 1991; Li et al., J. Biol. Chem. 268:10403, 1993) is tissue-specific and drives expression in muscle cells. More generally, useful promoters and vectors are described, e.g., in WO 94/21797 and by Hartikka et al. (Human Gene Therapy 7:1205, 1996).
For DNA/RNA vaccination, the polynucleotide of the invention can encode a precursor or a mature form of a polypeptide of the invention. When it encodes a precursor form, the precursor sequence can be homologous or heterologous. In the latter case, a eucaryotic leader sequence can be used, such as the leader sequence of the tissue-type plasminogen activator (tPA).
A composition of the invention can contain one or several polynucleotides of the invention. It can also contain at least one additional polynucleotide encoding another Helicobacter antigen, such as urease subunit A, B, or both, or a fragment, derivative, mutant, or analog thereof. A polynucleotide encoding a cytokine, such as interleukin-2 (IL-2) or interleukin-12 (IL-12), can also be added to the composition so that the immune response is enhanced. These additional polynucleotides are placed under appropriate control for expression. Advantageously, DNA molecules of the invention and/or additional DNA molecules to be included in the same composition are carried in the same plasmid.

Standard methods can be used in the preparation of therapeutic polynucleotides of the invention. For example, a polynucleotide can be used in a naked form, free of any delivery vehicles, such as anionic liposomes, cationic lipids, microparticles, e.g., gold microparticles, precipitating agents, e.g., calcium phosphate, or any other transfection-facilitating agent. In this case, the polynucleotide can be simply diluted in a physiologically acceptable solution, such as sterile saline or sterile buffered saline, with or without a carrier.
When present, the carrier preferably is isotonic, hypotonic, or weakly hypertonic, and has a relatively low ionic strength, such as provided by a sucrose solution, e.g., a solution containing 20% sucrose.
Alternatively, a polynucleotide can be associated with agents that assist in cellular uptake. It can be, e.g., (i) complemented with a chemical agent that modifies cellular permeability, such as bupivacaine (see, e.g., WO 94/16737), (ii) encapsulated into liposomes, or (iii) associated with cationic lipids or silica, gold, or tungsten microparticles.
Anionic and neutral liposomes are well-known in the art (see, e.g., Liposomes: A Practical Approach, R.R.C. New, ed., IRL Press, 1990, for a detailed description of methods for making liposomes) and are useful for delivering a large range of products, including polynucleotides.
Cationic lipids can also be used for gene delivery. Such lipids include, for example, Lipofectin™, which is also known as DOTMA (N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride), DOTAP (1,2-bis(oleyloxy)-3-(trimethylammonio)propane), DDAB (dimethyldioctadecylammonium bromide), DOGS (dioctadecylamidoglycyl spermine), and cholesterol derivatives. A description of these cationic lipids can be found in EP 187,702, WO 90/11092, U.S. Patent No. 5,283,185, WO 91/15501, WO 95/26356, and U.S. Patent No. 5,527,928. Cationic lipids for gene delivery are preferably used in association with a neutral lipid such as DOPE (dioleyl phosphatidylethanolamine; WO 90/11092). Other transfection-facilitating compounds can be added to a formulation containing cationic liposomes. A number of them are described in, e.g., WO 93/18759, WO 93/19768, WO 94/25608, and WO 95/2397. They include, e.g., spermine derivatives useful for facilitating the transport of DNA through the nuclear membrane (see, for example, WO 93/18759) and membrane-permeabilizing compounds such as GALA, Gramicidine S, and cationic bile salts (see, for example, WO 93/19768).
Gold or tungsten microparticles can also be used for gene delivery, as described in WO 91/359, WO 93/17706, and by Tang et al. (Nature 356:152, 1992). In this case, the microparticle-coated polynucleotides can be injected via intradermal or intraepidermal routes using a needleless injection device ("gene gun"), such as those described in U.S. Patent No. 4,945,050, U.S.
Patent No. 5,015,580, and WO 94/24263.
The amount of DNA to be used in a vaccine recipient depends, e.g., on the strength of the promoter used in the DNA construct, the immunogenicity of the expressed gene product, the condition of the mammal intended for administration (e.g., the weight, age, and general health of the mammal), the mode of administration, and the type of formulation. In general, a therapeutically or prophylactically effective dose from about 1 µg to about 1 mg, preferably, from about 10 µg to about 800 µg, and, more preferably, from about 25 µg to about 250 µg, can be administered to human adults. The administration can be achieved in a single dose or repeated at intervals.
The route of administration can be any conventional route used in the vaccine field. As general guidance, a polynucleotide of the invention can be administered via a mucosal surface, e.g., an ocular, intranasal, pulmonary, oral, intestinal, rectal, vaginal, or urinary tract surface, or via a parenteral route, e.g., by an intravenous, subcutaneous, intraperitoneal, intradermal, intraepidermal, or intramuscular route. The choice of administration route will depend on, e.g., the formulation that is selected. A polynucleotide formulated in association with bupivacaine is advantageously administered into muscle. When a neutral or anionic liposome or a cationic lipid, such as DOTMA, is used, the formulation can be advantageously injected via intravenous, intranasal (for example, by aerosolization), intramuscular, intradermal, and subcutaneous routes. A polynucleotide in a naked form can advantageously be administered via the intramuscular, intradermal, or subcutaneous routes. Although not absolutely required, such a composition can also contain an adjuvant. A
systemic adjuvant that does not require concomitant administration in order to exhibit an adjuvant effect is preferable.
The sequence information provided in the present application enables the design of specific nucleotide probes and primers that can be used in diagnostic methods. Accordingly, in a fifth aspect of the invention, there is provided a nucleotide probe or primer having a sequence found in, or derived by degeneracy of the genetic code from, a sequence shown in the sequence listing (odd numbers, up to SEQ ID NO:1363).
The term "probe" as used in the present application refers to DNA (preferably single stranded) or RNA molecules (or modifications or combinations thereof) that hybridize under the stringent conditions, as defined above, to polynucleotide molecules having sequences homologous to any of those shown in the sequence listing (odd numbers, up to SEQ ID NO:1363), or to a complementary or anti-sense sequence of any of those shown in the sequence listing (odd numbers, up to SEQ ID NO:1363). Generally, probes are significantly shorter than the full-length sequences shown in the sequence listing. For example, they can contain from about 5 to about 100, preferably from about 10 to about 80 nucleotides. In particular, probes have sequences that are at least 75%, preferably at least 85%, more preferably 95% homologous to a portion of a sequence as shown in the sequence listing (odd numbers, up to SEQ ID NO:1363), or a sequence complementary to any of such sequences.
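The length and homology ranges stated above for probes can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that "homologous" is scored as simple ungapped percent identity against the best-matching window of a target sequence; the definition given elsewhere in the specification governs, and the function names and example sequences here are hypothetical.

```python
def percent_identity(probe: str, target: str) -> float:
    """Best ungapped percent identity of `probe` against any
    same-length window of `target` (an illustrative scoring choice)."""
    probe, target = probe.upper(), target.upper()
    if not probe or len(probe) > len(target):
        raise ValueError("probe must be non-empty and no longer than target")
    best = 0
    for start in range(len(target) - len(probe) + 1):
        window = target[start:start + len(probe)]
        matches = sum(a == b for a, b in zip(probe, window))
        best = max(best, matches)
    return 100.0 * best / len(probe)


def classify_probe(probe: str, target: str) -> str:
    """Map a probe onto the tiers named in the text:
    length about 5-100 nt; identity >= 75%, >= 85%, or >= 95%."""
    if not 5 <= len(probe) <= 100:
        return "length outside 5-100 nt"
    identity = percent_identity(probe, target)
    if identity >= 95.0:
        return "more preferred (>= 95%)"
    if identity >= 85.0:
        return "preferred (>= 85%)"
    if identity >= 75.0:
        return "acceptable (>= 75%)"
    return "below 75%"


if __name__ == "__main__":
    target = "TTTACGTACGTACTTT"   # hypothetical target sequence
    print(classify_probe("ACGTACGTAC", target))
    print(classify_probe("ACGTACGTAA", target))
```

A gapped alignment tool would normally be used for real sequences; the window scan is chosen here only because it is short and easy to verify by hand.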
Probes can contain modified bases, such as inosine, methyl-5-deoxycytidine, deoxyuridine, dimethylamino-5-deoxyuridine, or diamino-2,6-purine. Sugar or phosphate residues can also be modified or substituted. For example, a deoxyribose residue can be replaced by a polyamide (Nielsen et al., Science 254:1497, 1991) and phosphate residues can be replaced by ester groups such as diphosphate, alkyl, arylphosphonate, and phosphorothioate esters. In addition, the 2'-hydroxyl group on ribonucleotides can be modified by addition of, e.g., alkyl groups.
Probes of the invention can be used in diagnostic tests, or as capture or detection probes. Such capture probes can be immobilized on solid supports, directly or indirectly, by covalent means or by passive adsorption. A
detection probe can be labeled by a detectable label, for example a label selected from radioactive isotopes; enzymes, such as peroxidase and alkaline phosphatase;
enzymes that are able to hydrolyze a chromogenic, fluorogenic, or luminescent substrate; compounds that are chromogenic, fluorogenic, or luminescent;
nucleotide base analogs; and biotin.
Probes of the invention can be used in any conventional hybridization method, such as in dot blot methods (Maniatis et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1982), Southern blot methods (Southern, J. Mol. Biol.
98:503, 1975), northern blot methods (identical to Southern blot with the exception that RNA is used as a target), or a sandwich method (Dunn et al., Cell 12:23, 1977). As is known in the art, the latter technique involves the use of a specific capture probe and a specific detection probe that have nucleotide sequences that are at least partially different from each other.
Primers used in the invention usually contain about 10 to 40 nucleotides and are used to initiate enzymatic polymerization of DNA in an amplification process (e.g., PCR), an elongation process, or a reverse transcription method.
In a diagnostic method involving PCR, the primers can be labeled.
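As an illustration of the primer length range given above, the following Python sketch derives a forward and a reverse primer from the two ends of a template and rejects lengths outside about 10 to 40 nucleotides. The function names and the template are hypothetical, and the sketch deliberately ignores melting temperature, secondary structure, and specificity, which practical primer design would also consider.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(template: str, length: int = 20) -> tuple[str, str]:
    """Forward primer: first `length` bases of the template.
    Reverse primer: reverse complement of the last `length` bases."""
    if not 10 <= length <= 40:
        raise ValueError("primer length should be about 10 to 40 nucleotides")
    if len(template) < 2 * length:
        raise ValueError("template too short for non-overlapping primers")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse


if __name__ == "__main__":
    template = "ATGC" * 10        # hypothetical 40-nt template
    print(primer_pair(template, length=10))
```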
Thus, the invention also encompasses (i) a reagent containing a probe of the invention for detecting and/or identifying the presence of Helicobacter in a biological material; (ii) a method for detecting and/or identifying the presence of Helicobacter in a biological material, in which (a) a sample is recovered or derived from the biological material, (b) DNA or RNA is extracted from the material and denatured, and (c) the sample is exposed to a probe of the invention, for example, a capture probe, a detection probe, or both, under stringent hybridization conditions, so that hybridization is detected; and (iii) a method for detecting and/or identifying the presence of Helicobacter in a biological material, in which (a) a sample is recovered or derived from the biological material, (b) DNA is extracted therefrom, (c) the extracted DNA is contacted with at least one, or, preferably two, primers of the invention, and amplified by the polymerase chain reaction, and (d) an amplified DNA molecule is produced.
As mentioned above, polypeptides that can be produced by expression of the polynucleotides of the invention can be used as vaccine antigens.
Accordingly, a sixth aspect of the invention features a substantially purified polypeptide or polypeptide derivative having an amino acid sequence encoded by a polynucleotide of the invention.

A "substantially purified polypeptide" is defined as a polypeptide that is separated from the environment in which it naturally occurs and/or a polypeptide that is free of most of the other polypeptides that are present in the environment in which it was synthesized. The polypeptides of the invention can be purified from a natural source, such as a Helicobacter strain, or can be produced using recombinant methods.
Homologous polypeptides or polypeptide derivatives encoded by polynucleotides of the invention can be screened for specific antigenicity by testing cross-reactivity with an antiserum raised against a polypeptide having an amino acid sequence as shown in the sequence listing (even numbers, up to SEQ ID NO:1364). Briefly, a monospecific hyperimmune antiserum can be raised against a purified reference polypeptide as such or as a fusion polypeptide, for example, an expression product of MBP, GST, or His-tag systems, or a synthetic peptide predicted to be antigenic. The homologous polypeptide or derivative that is screened for specific antigenicity can be produced as such or as a fusion polypeptide. In the latter case, and if the antiserum is also raised against a fusion polypeptide, two different fusion systems are employed. Specific antigenicity can be determined using a number of methods, including Western blot (Towbin et al., Proc. Natl. Acad. Sci. USA
76:4350, 1979), dot blot, and ELISA methods, as described below.
In a Western blot assay, the product to be screened, either as a purified preparation or a total E. coli extract, is fractionated by SDS-PAGE, as described, for example, by Laemmli (Nature 227:680, 1970). After being transferred to a filter, such as a nitrocellulose membrane, the material is incubated with the monospecific hyperimmune antiserum, which is diluted in a range of dilutions from about 1:50 to about 1:5000, preferably from about 1:100 to about 1:500. Specific antigenicity is shown once a band corresponding to the product exhibits reactivity at any of the dilutions in the range.
In an ELISA assay, the product to be screened can be used as the coating antigen. A purified preparation is preferred, but a whole cell extract can also be used. Briefly, about 100 µl of a preparation of about 10 µg protein/ml is distributed into wells of a 96-well ELISA plate. The plate is incubated for about 2 hours at 37°C, then overnight at 4°C. The plate is washed with phosphate-buffered saline (PBS) containing 0.05% Tween 20 (PBS/Tween buffer) and the wells are saturated with 250 µl PBS containing 1% bovine serum albumin (BSA), to prevent non-specific antibody binding. After 1 hour of incubation at 37°C, the plate is washed with PBS/Tween buffer. The antiserum is serially diluted in PBS/Tween buffer containing 0.5% BSA, and 100 µl of each dilution is added per well. The plate is incubated for 90 minutes at 37°C, washed, and evaluated using standard methods. For example, a goat anti-rabbit peroxidase conjugate can be added to the wells when the specific antibodies used were raised in rabbits. Incubation is carried out for about 90 minutes at 37°C and the plate is washed. The reaction is developed with the appropriate substrate and the reaction is measured by colorimetry (absorbance measured spectrophotometrically). Under these experimental conditions, a positive reaction is shown once an O.D. value of 1.0 is detected with a dilution of at least about 1:50, preferably of at least about 1:500.
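The positivity criterion stated above for the ELISA can be expressed as a short Python sketch. It assumes that "an O.D. value of 1.0 is detected" means a reading at or above 1.0, and it represents each dilution by its factor (50 for 1:50); the function name, return labels, and sample readings are hypothetical.

```python
def elisa_call(readings: dict[int, float], od_cutoff: float = 1.0) -> str:
    """Classify an antiserum titration.

    `readings` maps dilution factor (e.g. 500 for 1:500) to measured O.D.
    A result is positive when the cutoff is reached at a dilution of at
    least 1:50, and falls in the preferred tier at 1:500 or beyond.
    """
    reactive = [factor for factor, od in readings.items() if od >= od_cutoff]
    if not reactive:
        return "negative"
    highest = max(reactive)       # most dilute sample still at the cutoff
    if highest >= 500:
        return "positive_preferred"
    if highest >= 50:
        return "positive"
    return "negative"


if __name__ == "__main__":
    # Hypothetical readings for one antiserum.
    print(elisa_call({50: 2.1, 100: 1.6, 500: 1.05, 1000: 0.4}))
```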
In a dot blot assay, a purified product is preferred, although a whole cell extract can be used. Briefly, a solution of the product at a concentration of about 100 µg/ml is serially diluted two-fold with 50 mM Tris-HCl (pH 7.5). One hundred µl of each dilution is applied to a filter, such as a 0.45 µm nitrocellulose membrane, set in a 96-well dot blot apparatus (Biorad). The buffer is removed by applying vacuum to the system. Wells are washed by addition of 50 mM Tris-HCl (pH 7.5) and the membrane is air-dried. The membrane is saturated in blocking buffer (50 mM Tris-HCl (pH 7.5), 0.15 M NaCl, 10 g/L skim milk) and incubated with an antiserum diluted from about 1:50 to about 1:5000, preferably about 1:500. The reaction is detected using standard methods. For example, a goat anti-rabbit peroxidase conjugate can be added to the wells when rabbit antibodies are used. Incubation is carried out for about 90 minutes at 37°C and the blot is washed. The reaction is developed with the appropriate substrate and stopped. The reaction is then measured visually by the appearance of a colored spot, e.g., by colorimetry. Under these experimental conditions, a positive reaction is associated with detection of a colored spot for reactions carried out with a dilution of at least about 1:50, preferably, of at least about 1:500. Therapeutic or prophylactic efficacy of a polypeptide or polypeptide derivative of the invention can be evaluated as described below.
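The two-fold serial dilution used in the dot blot assay, and the amount of product applied per dot, can be worked through with a brief Python sketch. The starting concentration of 100 µg/ml and the applied volume of 100 µl come from the description above; the number of dilution steps is not stated there and is an assumption, as are the function names.

```python
def twofold_series(start_ug_per_ml: float = 100.0, steps: int = 8) -> list[float]:
    """Concentrations (µg/ml) of a two-fold serial dilution.
    `steps` is an assumed value; the text does not specify it."""
    return [start_ug_per_ml / 2 ** i for i in range(steps)]


def protein_per_dot_ug(conc_ug_per_ml: float, volume_ul: float = 100.0) -> float:
    """Mass of product (µg) applied to the membrane per dot."""
    return conc_ug_per_ml * volume_ul / 1000.0


if __name__ == "__main__":
    for conc in twofold_series(steps=4):
        print(f"{conc:g} µg/ml -> {protein_per_dot_ug(conc):g} µg per dot")
```

With these figures the first dot carries about 10 µg of product and each subsequent dot carries half the amount of the one before it.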
According to a seventh aspect of the invention, there is provided (i) a composition of matter containing a polypeptide of the invention together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a polypeptide of the invention; (iii) a method for inducing an immune response against Helicobacter in a mammal by administering to the mammal an immunogenically effective amount of a polypeptide of the invention to elicit an immune response, e.g., a protective immune response to Helicobacter; and (iv) a method for preventing and/or treating a Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H.
heilmanii) infection, by administering a prophylactic or therapeutic amount of a polypeptide of the invention to an individual in need of such treatment.
Additionally, this aspect of the invention includes the use of a polypeptide of the invention in the preparation of a medicament for preventing and/or treating Helicobacter infection.
The immunogenic compositions of the invention can be administered by any conventional route in use in the vaccine field, for example, to a mucosal (e.g., ocular, intranasal, pulmonary, oral, gastric, intestinal, rectal, vaginal, or urinary tract) surface or via a parenteral (e.g., subcutaneous, intradermal, intramuscular, intravenous, or intraperitoneal) route. The choice of the administration route depends upon a number of parameters, such as the adjuvant used. For example, if a mucosal adjuvant is used, the intranasal or oral route will be preferred, and if a lipid formulation or an aluminum compound is used, a parenteral route will be preferred. In the latter case, the subcutaneous or intramuscular route is most preferred. The choice of administration route can also depend upon the nature of the vaccine agent. For example, a polypeptide of the invention fused to CTB or to LTB will be best administered to a mucosal surface.
A composition of the invention can contain one or several polypeptides or derivatives of the invention. It can also contain at least one additional Helicobacter antigen, such as the urease apoenzyme, or a subunit, fragment, homolog, mutant, or derivative thereof.
For use in a composition of the invention, a polypeptide or polypeptide derivative can be formulated into or with liposomes, such as neutral or anionic liposomes, microspheres, ISCOMS, or virus-like particles (VLPs), to facilitate delivery and/or enhance the immune response. These compounds are readily available to those skilled in the art; for example, see Liposomes: A Practical Approach (supra). Adjuvants other than liposomes can also be used in the invention and are well known in the art (see, for example, the list provided below).

Administration can be achieved in a single dose or repeated as necessary at intervals that can be determined by one skilled in the art. For example, a priming dose can be followed by three booster doses at weekly or monthly intervals. An appropriate dose depends on various parameters, including the nature of the recipient (e.g., whether the recipient is an adult or an infant), the particular vaccine antigen, the route and frequency of administration, the presence/absence or type of adjuvant, and the desired effect (e.g., protection and/or treatment), and can be readily determined by one skilled in the art. In general, a vaccine antigen of the invention can be administered mucosally in an amount ranging from about 10 µg to about 500 mg, preferably from about 1 mg to about 200 mg. For a parenteral route of administration, the dose usually should not exceed about 1 mg, and is, preferably, about 100 µg.
When used as components of a vaccine, the polynucleotides and polypeptides of the invention can be used sequentially as part of a multi-step immunization process. For example, a mammal can be initially primed with a vaccine vector of the invention, such as a pox virus, e.g., via a parenteral route, and then boosted twice with a polypeptide encoded by the vaccine vector, e.g., via the mucosal route. In another example, liposomes associated with a polypeptide or polypeptide derivative of the invention can be used for priming, with boosting being carried out mucosally using a soluble polypeptide or polypeptide derivative of the invention, in combination with a mucosal adjuvant (e.g., LT).
Polypeptides and polypeptide derivatives of the invention can also be used as diagnostic reagents for detecting the presence of anti-Helicobacter antibodies, e.g., in blood samples. Such polypeptides can be about 5 to about 80, preferably, about 10 to about 50 amino acids in length and can be labeled or unlabeled, depending upon the diagnostic method. Diagnostic methods involving such a reagent are described below.
Upon expression of a polynucleotide molecule of the invention, a polypeptide or polypeptide derivative is produced and can be purified using known methods. For example, the polypeptide or polypeptide derivative can be produced as a fusion protein containing a fused tail that facilitates purification.
The fusion product can be used to immunize a small mammal, e.g., a mouse or a rabbit, in order to raise monospecific antibodies against the polypeptide or polypeptide derivative. The eighth aspect of the invention thus provides a monospecific antibody that binds to a polypeptide or polypeptide derivative of the invention.
By "monospecific antibody" is meant an antibody that is capable of reacting with a unique, naturally-occurring Helicobacter polypeptide. An antibody of the invention can be polyclonal or monoclonal. Monospecific antibodies can be recombinant, e.g., chimeric (e.g., consisting of a variable region of murine origin and a human constant region), humanized (e.g., a human immunoglobulin constant region and a variable region of animal, e.g., murine, origin), and/or single chain. Both polyclonal and monospecific antibodies can also be in the form of immunoglobulin fragments, e.g., F(ab')2 or Fab fragments. The antibodies of the invention can be of any isotype, e.g., IgG or IgA, and polyclonal antibodies can be of a single isotype or can contain a mixture of isotypes.
The antibodies of the invention, which can be raised to a polypeptide or polypeptide derivative of the invention, can be produced and identified using standard immunological assays, e.g., Western blot assays, dot blot assays, or ELISA (see, e.g., Coligan et al., Current Protocols in Immunology, John Wiley & Sons, Inc., New York, NY, 1994). The antibodies can be used in diagnostic methods to detect the presence of Helicobacter antigens in a sample, such as a biological sample. The antibodies can also be used in affinity chromatography methods for purifying a polypeptide or polypeptide derivative of the invention.
As is discussed further below, the antibodies can also be used in prophylactic and therapeutic passive immunization methods.
Accordingly, a ninth aspect of the invention provides (i) a reagent for detecting the presence of Helicobacter in a biological sample that contains an antibody, polypeptide, or polypeptide derivative of the invention; and (ii) a diagnostic method for detecting the presence of Helicobacter in a biological sample, by contacting the biological sample with an antibody, a polypeptide, or a polypeptide derivative of the invention, so that an immune complex is formed, and detecting the complex as an indication of the presence of Helicobacter in the sample or the organism from which the sample was derived. The immune complex is formed between a component of the sample and the antibody, polypeptide, or polypeptide derivative, and any unbound material can be removed prior to detecting the complex. A polypeptide reagent can be used for detecting the presence of anti-Helicobacter antibodies in a sample, e.g., a blood sample, while an antibody of the invention can be used for screening a sample, such as a gastric extract or biopsy sample, for the presence of Helicobacter polypeptides.
For use in diagnostic methods, the reagent (e.g., the antibody, polypeptide, or polypeptide derivative of the invention) can be in a free state or can be immobilized on a solid support, such as, for example, on the interior surface of a tube or on the surface, or within pores, of a bead.
Immobilization can be achieved using direct or indirect means. Direct means include passive adsorption (i.e., non-covalent binding) or covalent binding between the support and the reagent. By "indirect means" is meant that an anti-reagent compound that interacts with the reagent is first attached to the solid support. For example, if a polypeptide reagent is used, an antibody that binds to it can serve as an anti-reagent, provided that it binds to an epitope that is not involved in recognition of antibodies in biological samples. Indirect means can also employ a ligand-receptor system; for example, a molecule, such as a vitamin, can be grafted onto the polypeptide reagent and the corresponding receptor can be immobilized on the solid phase. This concept is illustrated by the well known biotin-streptavidin system. Alternatively, indirect means can be used, e.g., by adding to the reagent a peptide tail, chemically or by genetic engineering, and immobilizing the grafted or fused product by passive adsorption or covalent linkage of the peptide tail.
According to a tenth aspect of the invention, there is provided a process for purifying, from a biological sample, a polypeptide or polypeptide derivative of the invention, which involves carrying out antibody-based affinity chromatography with the biological sample, wherein the antibody is a monospecific antibody of the invention.
For use in a purification process of the invention, the antibody can be polyclonal or monospecific, and preferably is of the IgG type. Purified IgGs can be prepared from an antiserum using standard methods (see, e.g., Coligan et al., supra). Conventional chromatography supports, as well as standard methods for grafting antibodies, are described, for example, by Harlow et al.
(Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1988).
Briefly, a biological sample, such as an H. pylori extract, preferably in a buffer solution, is applied to a chromatography material, which is, preferably, equilibrated with the buffer used to dilute the biological sample, so that the polypeptide or polypeptide derivative of the invention (i.e., the antigen) is allowed to adsorb onto the material. The chromatography material, such as a gel or a resin coupled to an antibody of the invention, can be in batch form or in a column. The unbound components are washed off and the antigen is eluted with an appropriate elution buffer, such as a glycine buffer, a buffer containing a chaotropic agent, e.g., guanidine HCl, or a buffer having high salt concentration (e.g., 3 M MgCl2). Eluted fractions are recovered and the presence of the antigen is detected, e.g., by measuring the absorbance at 280 nm.
An antibody of the invention can be screened for therapeutic efficacy as follows. According to an eleventh aspect of the invention, there is provided (i) a composition of matter containing a monospecific antibody of the invention, together with a diluent or carrier; (ii) a pharmaceutical composition containing a therapeutically or prophylactically effective amount of a monospecific antibody of the invention; and (iii) a method for treating or preventing Helicobacter (e.g., H. pylori, H. felis, H. mustelae, or H. heilmannii) infection, by administering a therapeutic or prophylactic amount of a monospecific antibody of the invention to an individual in need of such treatment. In addition, the eleventh aspect of the invention includes the use of a monospecific antibody of the invention in the preparation of a medicament for treating or preventing Helicobacter infection.
The monospecific antibody can be polyclonal or monoclonal, and is, preferably, predominantly of the IgA isotype. In passive immunization methods, the antibody is administered to a mucosal surface of a mammal, e.g., the gastric mucosa, e.g., orally or intragastrically, optionally, in the presence of a bicarbonate buffer. Alternatively, systemic administration, not requiring a bicarbonate buffer, can be carried out. A monospecific antibody of the invention can be administered as a single active agent or as a mixture with at least one additional monospecific antibody specific for a different Helicobacter polypeptide. The amount of antibody and the particular regimen used can be readily determined by one skilled in the art. For example, daily administration of about 100 to 1,000 mg of antibody over one week, or three doses per day of about 100 to 1,000 mg of antibody over two or three days, can be effective regimens for most purposes.
Therapeutic or prophylactic efficacy can be evaluated using standard methods in the art, e.g., by measuring induction of a mucosal immune response or induction of protective and/or therapeutic immunity, using, e.g., the H. felis mouse model and the procedures described by Lee et al. (Eur. J. Gastroenterology & Hepatology 7:303, 1995) or Lee et al. (J. Infect. Dis. 172:161, 1995). Those skilled in the art will recognize that the H. felis strain of the model can be replaced with another Helicobacter strain. For example, the efficacy of polynucleotide molecules and polypeptides from H. pylori is, preferably, evaluated in a mouse model using an H. pylori strain. Protection can be determined by comparing the degree of Helicobacter infection in the gastric tissue assessed by, for example, urease activity, bacterial counts, or gastritis, to that of a control group. Protection is shown when infection is reduced by comparison to the control group. Such an evaluation can be made for polynucleotides, vaccine vectors, polypeptides, and polypeptide derivatives, as well as for antibodies of the invention.
For example, various doses of an antibody of the invention can be administered to the gastric mucosa of mice previously challenged with an H.
pylori strain, as described, e.g., by Lee et al. (supra). Then, after an appropriate period of time, the bacterial load of the mucosa can be estimated by assessing urease activity, as compared to a control. Reduced urease activity indicates that the antibody is therapeutically effective.

Adjuvants that can be used in any of the vaccine compositions described above are described as follows. Adjuvants for parenteral administration include, for example, aluminum compounds, such as aluminum hydroxide, aluminum phosphate, and aluminum hydroxy phosphate. The antigen can be precipitated with, or adsorbed onto, the aluminum compound using standard methods. Other adjuvants, such as RIBI (ImmunoChem, Hamilton, MT), can also be used in parenteral administration.
Adjuvants that can be used for mucosal administration include, for example, bacterial toxins, e.g., the cholera toxin (CT), the E. coli heat-labile toxin (LT), the Clostridium difficile toxin A, the pertussis toxin (PT), and combinations, subunits, toxoids, or mutants thereof. For example, a purified preparation of native cholera toxin subunit B (CTB) can be used. Fragments, homologs, derivatives, and fusions to any of these toxins can also be used, provided that they retain adjuvant activity. Preferably, a mutant having reduced toxicity is used. Suitable mutants are described, e.g., in WO 95/17211 (Arg-7-Lys CT mutant), WO 96/6627 (Arg-192-Gly LT mutant), and WO 95/34323 (Arg-9-Lys and Glu-129-Gly PT mutant). Additional LT mutants that can be used in the methods and compositions of the invention include, e.g., Ser-63-Lys, Ala-69-Gly, Glu-110-Asp, and Glu-112-Asp mutants. Other adjuvants, such as the bacterial monophosphoryl lipid A (MPLA) of, e.g., E. coli, Salmonella minnesota, Salmonella typhimurium, or Shigella flexneri; saponins; and polylactide glycolide (PLGA) microspheres, can also be used in mucosal administration. Adjuvants useful for both mucosal and parenteral administrations, such as polyphosphazene (WO 95/2415), can also be used.
Any pharmaceutical composition of the invention, containing a polynucleotide, polypeptide, polypeptide derivative, or antibody of the invention, can be manufactured using standard methods. It can be formulated with a pharmaceutically acceptable diluent or carrier, e.g., water or a saline solution, such as phosphate buffer saline, optionally, including a bicarbonate salt, such as sodium bicarbonate, e.g., 0.1 to 0.5 M. Bicarbonate can advantageously be added to compositions intended for oral or intragastric administration. In general, a diluent or carrier can be selected on the basis of the mode and route of administration, and standard pharmaceutical practice.
Suitable pharmaceutical carriers and diluents, as well as pharmaceutical necessities for their use in pharmaceutical formulations, are described in Remington's Pharmaceutical Sciences, a standard reference text in this field and in the USP/NF.
The invention also includes methods in which gastroduodenal infections, such as Helicobacter infection, are treated by oral administration of a Helicobacter polypeptide of the invention and a mucosal adjuvant, in combination with an antibiotic, an antisecretory agent, a bismuth salt, an antacid, sucralfate, or a combination thereof. Examples of such compounds that can be administered with the vaccine antigen and an adjuvant are antibiotics, including, e.g., macrolides, tetracyclines, β-lactams, aminoglycosides, quinolones, penicillins, and derivatives thereof (specific examples of antibiotics that can be used in the invention include, e.g., amoxicillin, clarithromycin, tetracycline, metronidazole, erythromycin, and cefuroxime); antisecretory agents, including, e.g., H2-receptor antagonists (e.g., cimetidine, ranitidine, famotidine, nizatidine, and roxatidine), proton pump inhibitors (e.g., omeprazole, lansoprazole, and pantoprazole), prostaglandin analogs (e.g., misoprostol and enprostil), and anticholinergic agents (e.g., pirenzepine, telenzepine, carbenoxolone, and proglumide); and bismuth salts, including colloidal bismuth subcitrate, tripotassium dicitrate bismuthate, bismuth subsalicylate, bicitropeptide, and Pepto-Bismol (see, e.g., Goodwin et al., Helicobacter pylori, Biology and Clinical Practice, CRC Press, Boca Raton, FL, pp 366-395, 1993; Physicians' Desk Reference, 49th edn., Medical Economics Data Production Company, Montvale, New Jersey, 1995). In addition, compounds containing more than one of the above-listed components coupled together, e.g., ranitidine coupled to bismuth subcitrate, can be used. The invention also includes compositions for carrying out these methods, i.e., compositions containing a Helicobacter antigen (or antigens) of the invention, an adjuvant, and one or more of the above-listed compounds, in a pharmaceutically acceptable carrier or diluent.
Amounts of the above-listed compounds used in the methods and compositions of the invention can readily be determined by one skilled in the art. In addition, one skilled in the art can readily design treatment/immunization schedules. For example, the non-vaccine components can be administered on days 1-14, and the vaccine antigen + adjuvant can be administered on days 7, 14, 21, and 28.
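The example schedule given above can be laid out day by day. The following is only an illustrative sketch: the function name and the two labels are hypothetical, and the days are the ones quoted in the preceding paragraph.

```python
def build_schedule(drug_days=range(1, 15), vaccine_days=(7, 14, 21, 28)):
    """Map each treatment day to what is administered on that day.

    drug_days    -- days on which the non-vaccine components are given (1-14)
    vaccine_days -- days on which vaccine antigen + adjuvant is given
    """
    schedule = {}
    for day in sorted(set(drug_days) | set(vaccine_days)):
        items = []
        if day in drug_days:
            items.append("non-vaccine components")
        if day in vaccine_days:
            items.append("vaccine antigen + adjuvant")
        schedule[day] = items
    return schedule


schedule = build_schedule()
for day, items in schedule.items():
    print(f"day {day:2d}: {', '.join(items)}")
```

On days 7 and 14 both kinds of component fall on the same day; days 21 and 28 carry the vaccine antigen and adjuvant alone.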
Methods and pharmaceutical compositions of the invention can be used to treat or to prevent Helicobacter infections and, accordingly, gastroduodenal diseases associated with these infections, including acute, chronic, and atrophic gastritis, and peptic ulcer diseases, e.g., gastric and duodenal ulcers.
The invention is further illustrated by the following examples. Example 1 describes identification of genes, such as genes that encode the polypeptides of the invention, in the Helicobacter genome, as well as identification of signal sequences, and primer design for amplification of genes lacking signal sequences. Example 2 describes cloning of DNA molecules encoding polypeptides of the invention into a vector that provides a histidine tag, and production and purification of the resulting his-tagged fusion proteins.
Example 3 describes methods for cloning DNA encoding the polypeptides of the invention so that they can be produced without his-tags, and Example 4 describes methods for purifying recombinantly produced polypeptides of the invention.
EXAMPLE 1: Identification of genes in the H. pylori genome, identification of signal sequences, and primer design for amplification of genes lacking signal sequences
1.A. Creating H. pylori genomic databases
The H. pylori genome was provided as a text file containing a single contiguous string of nucleotides that had been determined to be 1.76 Megabases in length. The complete genome was split into 17 separate files using the program SPLIT (Creativity in Action), giving rise to 16 contigs, each containing 100,000 nucleotides, and a 17th contig containing the remaining 76,000 nucleotides. A header was added to each of the 17 files using the format: >hpg0.txt (representing contig 1), >hpg1.txt (representing contig 2), etc. The resulting 17 files, named hpg0 through hpg16, were then copied together to form one file that represented the plus strand of the complete H. pylori genome.
The constructed database was given the designation "H." A negative strand database of the H. pylori genome was created similarly by first creating a reverse complement of the positive strand using the program SeqPup (D.G. Gilbert, Indiana University Biology Department) and then performing the same procedure as described above for the plus strand. This database was given the designation "N."
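The construction of the "H" and "N" databases can be sketched in a few lines. This is an illustration only, not the SPLIT or SeqPup programs themselves: the function names are hypothetical, and the toy sequence is simply sized at 16 x 100,000 + 76,000 nucleotides so that it yields the contig counts described above.

```python
# Complement table for building the minus strand.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (the minus strand)."""
    return seq.translate(COMPLEMENT)[::-1]


def split_into_contigs(genome: str, size: int = 100_000, prefix: str = "hpg"):
    """Cut one contiguous sequence into fixed-size contigs with headers.

    Headers follow the >hpg0.txt, >hpg1.txt, ... pattern quoted in the text.
    """
    contigs = []
    for index, start in enumerate(range(0, len(genome), size)):
        contigs.append((f">{prefix}{index}.txt", genome[start:start + size]))
    return contigs


def build_database(genome: str, size: int = 100_000) -> str:
    """Copy the headed contigs together into a single text."""
    return "\n".join(f"{header}\n{seq}" for header, seq in split_into_contigs(genome, size))


# Toy stand-in for the genome (assumption): 1,676,000 nt of repeated ACGT.
toy_genome = "ACGT" * 419_000
plus_contigs = split_into_contigs(toy_genome)                       # "H"
minus_contigs = split_into_contigs(reverse_complement(toy_genome))  # "N"
print(len(plus_contigs), len(plus_contigs[-1][1]))
```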
The regions predicted to encode open reading frames (ORFs) were defined for the complete H. pylori genome using the program GENEMARK™ (Borodovsky et al., Comp. Chem. 17:123, 1993). A database was created from a text file containing an annotated version of all ORFs predicted to be encoded by the H. pylori genome for both the plus and minus strands, and was given the designation "O." Each ORF was assigned a number indicating its location on the genome and its position relative to other genes. No manipulation of the text file was required.
1.B. Searching the H. pylori databases
The databases constructed as is described above were searched using the program FASTA (Pearson et al., Proc. Natl. Acad. Sci. USA 85:2444-2448, 1988). FASTA was used for searching either a DNA sequence against either of the gene databases ("H" and/or "N"), or a peptide sequence against the ORF library ("O"). TFASTX was used to search a peptide sequence against all possible reading frames of a DNA database ("H" and/or "N" libraries). With potential frameshifts also being resolved, FASTX was used for searching the translated reading frames of a DNA sequence against either a DNA database, or a peptide sequence against the protein database.
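The idea behind searching a peptide against all reading frames of a DNA database can be illustrated with a six-frame translation. The sketch below is not FASTA or TFASTX: it uses the standard genetic code and exact substring matching in place of the heuristic alignment those programs perform, and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon; unknown codons become X."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna: str) -> dict:
    """Return the three plus-strand and three minus-strand translations."""
    dna = dna.upper()
    minus = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(minus[offset:])
    return frames


def frames_containing(peptide: str, dna: str) -> list:
    """List the reading frames whose translation contains the peptide exactly."""
    return [name for name, protein in six_frame_translations(dna).items()
            if peptide in protein]


print(frames_containing("MA", "ATGGCC"))
print(frames_containing("MA", "GGCCAT"))
```

The second call finds the peptide on the minus strand, which is the reason both the "H" and "N" libraries (or all six frames of one of them) need to be covered.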
1.C. Isolation of DNA sequences from the H. pylori genome
The FASTA searches against the constructed DNA databases identified exact nucleotide coordinates on one or more of the isolated contigs, and therefore the location of the target DNA. Once the exact location of the target sequence was known, the contig identified to carry the gene was exported into the software package MapDraw (DNAStar, Inc.) and the gene was isolated. Gene sequences with flanking DNA were then excised and copied into the EditSeq software package (DNAStar, Inc.) for further analysis.

1.D. Identification of signal sequences
The deduced protein encoded by a target gene sequence is analyzed using the PROTEAN software package (DNAStar, Inc.). This analysis predicts those areas of the protein that are hydrophobic by using the Kyte-Doolittle algorithm, and identifies any potential polar residues preceding the hydrophobic core region, which is typical for many signal sequences. For confirmation, the target protein is then searched against a PROSITE database (DNAStar, Inc.) consisting of motifs and signatures. Characteristic of many signal sequences and hydrophobic regions in general, is the identification of predicted prokaryotic lipid attachment sites. Where confirmation between the two approaches is apparent at the N-terminus of any protein, putative cleavage sites are sought. Specifically, this includes the presence of either an Alanine (A), Serine (S), or Glycine (G) residue immediately after the core hydrophobic region. In the case of lipoproteins, a Cysteine (C) residue would be identified as the +1 residue, post-cleavage.
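A Kyte-Doolittle hydropathy scan of the kind referred to above can be sketched as a sliding-window average over the published scale. This is not the PROTEAN implementation: the window size of 7, the 40-residue N-terminal region, the function names, and the toy sequence are all assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list:
    """Mean hydropathy of every window along the sequence, rounded to 2 places."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [round(sum(scores[i:i + window]) / window, 2)
            for i in range(len(scores) - window + 1)]


def most_hydrophobic_window(protein: str, window: int = 7, n_terminal: int = 40):
    """Return (start index, mean score) of the first best window near the N-terminus."""
    profile = hydropathy_profile(protein[:n_terminal], window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]


def classify_residue_after_core(protein: str, core_end: int) -> str:
    """Apply the A/S/G and lipoprotein Cys checks to the residue at core_end."""
    residue = protein[core_end].upper()
    if residue == "C":
        return "possible lipoprotein (+1 Cys)"
    if residue in "ASG":
        return "possible cleavage-site residue"
    return "no match"


toy_protein = "MKKLLLVLLALLASDETK"  # hypothetical sequence, not from the listing
print(most_hydrophobic_window(toy_protein))
```

A positive window mean preceded by polar residues (here the two lysines) is the pattern the text describes; the end of the hydrophobic core still has to be chosen by inspection before the residue check is applied.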
1.E. Rational design of PCR primers based on the identification of signal sequences
In order to clone gene sequences as N-terminus translational fusions for the generation of recombinant proteins with N-terminal Histidine tags, the gene sequence that specifies the signal sequence is omitted. The 5'-end of the gene specific portion of the N-terminal primer is designed to start at the first codon beyond the cleavage site. In the case of lipoproteins, the 5'-end of the N-terminal primer begins at the second codon, immediately after the modifiable residue at position +1 post-cleavage. The omission of the signal sequence from the recombinant allows for one-step purification, and potential problems associated with insertion of signal sequences in the membrane of the host strain carrying the hybrid construct are avoided.
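The primer-start rule described above can be written out as a short sketch. Everything named here is hypothetical: the function names, the 20-nucleotide gene-specific length, the default "CGC" clamp with a BamHI site (chosen to resemble the layout of primers given in Example 2), and the toy ORF, which is not taken from the sequence listing.

```python
def gene_specific_start(signal_codons: int, lipoprotein: bool = False) -> int:
    """Nucleotide index where the gene-specific part of the N-terminal primer begins.

    signal_codons -- number of codons up to and including the cleavage site
    lipoprotein   -- if True, also skip the +1 residue after the cleavage site
    """
    first_codon = signal_codons + (1 if lipoprotein else 0)
    return 3 * first_codon


def n_terminal_primer(orf: str, signal_codons: int, clamp: str = "CGC",
                      site: str = "GGATCC", length: int = 20,
                      lipoprotein: bool = False) -> str:
    """Assemble clamp + restriction site + gene-specific sequence (5' to 3')."""
    start = gene_specific_start(signal_codons, lipoprotein)
    return clamp + site + orf[start:start + length]


# Toy ORF (assumption): 3 signal codons, a +1 Cys codon, 20 nt of mature sequence, stop.
toy_orf = "ATGAAAGCA" + "TGT" + "GGTTCTGAAACCAAAGTTGA" + "TAA"
print(n_terminal_primer(toy_orf, signal_codons=3))
print(n_terminal_primer(toy_orf, signal_codons=3, lipoprotein=True))
```

The first call starts at the codon directly after the cleavage site; the second starts one codon later, after the +1 residue, as described for lipoproteins.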
EXAMPLE 2: Preparation of isolated DNA encoding the polypeptides of the invention, and production of these polypeptides as histidine-tagged fusion proteins
2.A. Preparation of genomic DNA from Helicobacter pylori
H. pylori strain ORV2001, stored in LB medium containing 50% glycerol at -70°C, is grown on Columbia agar containing 7% sheep blood for 48 hours under microaerophilic conditions (8-10% CO2, 5-7% O2, 85-87% N2). Cells are harvested, washed with phosphate-buffered saline (PBS) (pH 7.2), and DNA is then extracted from the cells using the Rapid Prep Genomic DNA Isolation kit (Pharmacia Biotech).
2.B. PCR amplification
DNA molecules encoding the polypeptides of the invention are amplified from genomic DNA, as can be prepared as is described above, by the Polymerase Chain Reaction (PCR) using primers that can readily be designed by one skilled in the art. Specific examples of primers that can be used in the invention are shown in Table 1. As specific examples, to amplify genes encoding GHPO 147, GHPO 615, GHPO 961, GHPO 1282, GHPO 296, and GHPO 840 the following primers can be used:
GHPO 147: 5'-CTGAATTCGAATGAAAAGAATTTTAGTCTCT-3' (SEQ ID NO:1365), and 5'-CCGCTCGAGTTAAAACTCATAATTCAAAT-3' (SEQ ID NO:1366).
GHPO 615: 5'-CGCGGATCCGAAGACATGTGCAACCGATG-3' (SEQ ID NO:1367), and 5'-CCGCTCGAGCTAAAAGTTTTGCAAAATCAC-3' (SEQ ID NO:1368).
GHPO 961: 5'-CGCGGATCCGATTTTACTTGAAAAATTTAAAC-3' (SEQ ID NO:1369), and 5'-CCGCTCGAGTTAGAAAGTGTAGTTCAAATAC-3' (SEQ ID NO:1370).
GHPO 1282: 5'-GCGGATCCTTTTCTTCAATGTTTG-3' (SEQ ID NO:1371), and 5'-CCGCTCGAGTCAAAGTTTTAAACAAATTC-3' (SEQ ID NO:1372).
GHPO 296: 5'-CCGAATTCGGTTATAAAGCCCCT-3' (SEQ ID NO:1373), and 5'-CCGCTCGAGTTAAGGCTGATTTAA-3' (SEQ ID NO:1374).
GHPO 840: 5'-CGCGGATCCGAGGAAATAGCATGTTAATAACC-3' (SEQ ID NO:1375), and 5'-CCGCTCGAGTCACTGCTTGCATGACTTATTCCA-3' (SEQ ID NO:1376).
The N-terminal and C-terminal primers for each clone can each include a 5' clamp and a restriction enzyme recognition sequence for cloning purposes (for example, BamHI (GGATCC) and XhoI (CTCGAG) recognition sequences).
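The clamp / recognition sequence / gene-specific layout can be checked directly on the primers listed above. The sketch below is illustrative only and the function name is hypothetical. Besides the BamHI and XhoI sequences named in the text, it also looks for GAATTC, the EcoRI recognition sequence, which is not named in the text but is visible in the GHPO 147 and GHPO 296 N-terminal primers.

```python
# Recognition sequences searched for; EcoRI is an addition not named in the text.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG", "EcoRI": "GAATTC"}


def dissect_primer(primer: str):
    """Split a primer at the recognition sequence nearest its 5' end.

    Returns a dict with the 5' clamp, the enzyme name, and the gene-specific
    remainder, or None if no known recognition sequence is present.
    """
    hits = [(primer.find(seq), name) for name, seq in SITES.items() if seq in primer]
    if not hits:
        return None
    position, enzyme = min(hits)
    site = SITES[enzyme]
    return {
        "clamp": primer[:position],
        "enzyme": enzyme,
        "gene_specific": primer[position + len(site):],
    }


# Primers copied from the list above (SEQ ID NO:1367, 1368, and 1365).
ghpo615_n = dissect_primer("CGCGGATCCGAAGACATGTGCAACCGATG")
ghpo615_c = dissect_primer("CCGCTCGAGCTAAAAGTTTTGCAAAATCAC")
ghpo147_n = dissect_primer("CTGAATTCGAATGAAAAGAATTTTAGTCTCT")
for parts in (ghpo615_n, ghpo615_c, ghpo147_n):
    print(parts)
```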
Amplification of gene-specific DNA is carried out using Vent DNA Polymerase (New England Biolabs) or Taq DNA polymerase (Appligene), according to the manufacturer's instructions. The reaction mixture, which is brought to a final volume of 100 µl with distilled water, is as follows:

dNTPs mix: 200 µM
10x ThermoPol buffer: 10 µl
Primers: 300 nM each
DNA template: 50 ng
Heat-stable DNA polymerase: 2 units

Appropriate amplification reaction conditions can readily be determined by one skilled in the art. For example, the following conditions can be used for amplification of DNA encoding GHPO 615 using the primers set forth above: initial denaturation at 94°C for 5 minutes, 25 cycles of denaturation at 97°C for 30 seconds, hybridization at 55°C for 1 minute, and elongation at 72°C for 2 minutes, using Vent DNA polymerase. In the case of amplifying DNA encoding GHPO 1282 with the primers set forth above, the following conditions can be used: initial denaturation at 94°C for 5 minutes, 25 cycles of denaturation at 94°C for 30 seconds, hybridization at 45°C for 30 seconds, and elongation at 72°C for 30 seconds, followed by a final elongation at 72°C for 7 minutes, using Vent DNA polymerase. The following conditions can be used for amplification of DNA encoding GHPO 840 using the primers set forth above: 25 cycles of denaturation at 97°C for 30 seconds, hybridization at 55°C for 1 minute, and elongation at 72°C for 2 minutes using Vent DNA polymerase. Table 1 sets forth conditions for using the primers listed therein.
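The three cycling programs quoted above can be compared by their total programmed hold times. The sketch below is plain arithmetic on the step durations stated in the text; the class and field names are hypothetical, and temperature ramping between steps is ignored.

```python
from dataclasses import dataclass


@dataclass
class CyclingProgram:
    """Step durations of a PCR program, all in seconds."""
    initial_denaturation_s: int
    cycles: int
    denaturation_s: int
    hybridization_s: int
    elongation_s: int
    final_elongation_s: int = 0

    def hold_time_minutes(self) -> float:
        """Sum of all programmed holds, excluding ramp times."""
        per_cycle = self.denaturation_s + self.hybridization_s + self.elongation_s
        total = (self.initial_denaturation_s
                 + self.cycles * per_cycle
                 + self.final_elongation_s)
        return total / 60


# Durations as stated in the text for each gene.
ghpo_615 = CyclingProgram(300, 25, 30, 60, 120)
ghpo_1282 = CyclingProgram(300, 25, 30, 30, 30, final_elongation_s=420)
ghpo_840 = CyclingProgram(0, 25, 30, 60, 120)

for name, program in (("GHPO 615", ghpo_615), ("GHPO 1282", ghpo_1282), ("GHPO 840", ghpo_840)):
    print(name, program.hold_time_minutes(), "min")
```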
2.C. Transformation and selection of transformants
A single PCR product is thus amplified and then is digested at 37°C for 2 hours with BamHI and XhoI together in a 20 µl reaction volume. The digested product is ligated to similarly cleaved pET28a (Novagen) that is dephosphorylated prior to the ligation by treatment with Calf Intestinal Alkaline Phosphatase (CIP). The gene fusion constructed in this manner allows one-step affinity purification of the resulting fusion protein because of the presence of histidine residues at the N-terminus of the fusion protein, which are encoded by the vector.
The ligation reaction (20 µl) is carried out at 14°C overnight and then is used to transform 100 µl fresh E. coli XL1-blue competent cells (Novagen). The cells are incubated on ice for 2 hours, heat-shocked at 42°C for 30 seconds, and returned to ice for 90 seconds. The samples are then added to 1 ml LB broth in the absence of selection and grown at 37°C for 2 hours. The cells are plated out on LB agar containing kanamycin (50 µg/ml) at 10x and neat dilutions and incubated overnight at 37°C. The following day, 50 colonies are picked, plated onto secondary plates, and incubated at 37°C overnight.
Five colonies are picked and grown overnight at 37°C in 3 ml LB broth supplemented with kanamycin (100 µg/ml). Plasmid DNA is extracted using the Qiagen mini-prep method and is quantitated by agarose gel electrophoresis.
PCR is performed with the gene-specific primers under the conditions set forth above and transformant DNA is confirmed to contain the desired insert. If PCR-positive, one of the five plasmid DNA samples (500 ng) extracted from the E. coli XL1-blue cells is used to transform competent BL21 (λDE3) E. coli cells (Novagen; as described previously). Transformants (10) are picked, plated onto selective kanamycin (50 µg/ml)-containing LB agar plates, and stored as a research stock in LB containing 50% glycerol.
2.D. Purification of recombinant proteins
One ml of frozen glycerol stock prepared as described in 2.C. is used to inoculate 50 ml of LB medium containing 25 µg/ml kanamycin in a 250 ml Erlenmeyer flask. The flask is incubated at 37°C for 2 hours or until the absorbance at 600 nm (OD600) reaches 0.4-1.0. The culture is stopped from growing by placing the flask at 4°C overnight. The following day, 10 ml of the overnight culture is used to inoculate 240 ml LB medium containing kanamycin (25 µg/ml), with the initial OD600 being about 0.02-0.04. Four flasks are inoculated for each ORF. The cells are grown to an OD600 of 1.0 (about 2 hours at 37°C), a 1 ml sample is harvested by centrifugation, and the sample is analyzed by SDS-PAGE to detect any leaky expression. The remaining culture is induced with 1 mM IPTG and the induced cultures are grown for an additional 2 hours at 37°C.
The final OD600 reading is taken and the cells are harvested by centrifugation at 5,000 x g for 15 minutes at 4°C. The supernatant is discarded and the pellets are resuspended in 50 mM Tris-HCl (pH 8.0), 2 mM EDTA. Two hundred and fifty ml of buffer are used for each 1 L of culture and the cells are recovered by centrifugation at 12,000 x g for 20 minutes. The supernatant is discarded and the pellets are stored at -45°C.
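The volumes quoted in this section can be checked with simple dilution arithmetic. The sketch below assumes that optical density scales linearly with dilution, which is only approximately true; the function names are hypothetical.

```python
def od_after_dilution(od_start: float, inoculum_ml: float, medium_ml: float) -> float:
    """Expected OD600 after adding the inoculum to fresh medium (linear assumption)."""
    return od_start * inoculum_ml / (inoculum_ml + medium_ml)


def wash_buffer_ml(culture_ml: float, buffer_ml_per_litre: float = 250.0) -> float:
    """Volume of Tris-HCl/EDTA buffer used to resuspend the cells from a culture."""
    return culture_ml * buffer_ml_per_litre / 1000.0


# 10 ml of starter culture at OD600 0.4-1.0 added to 240 ml of medium.
low = od_after_dilution(0.4, 10, 240)
high = od_after_dilution(1.0, 10, 240)
print(f"initial OD600 between {low:.3f} and {high:.3f}")

# Four flasks of 250 ml each per ORF.
total_culture_ml = 4 * (10 + 240)
print(total_culture_ml, "ml culture ->", wash_buffer_ml(total_culture_ml), "ml buffer")
```

The 25-fold dilution gives a starting OD600 of roughly 0.016 to 0.04, in line with the "about 0.02-0.04" stated above, and the four flasks together make up the 1 L of culture for which 250 ml of buffer is specified.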
2.E. Protein purification
Pellets obtained using the methods described in 2.D. are thawed and resuspended in 95 ml of 50 mM Tris-HCl (pH 8.0). Pefabloc and lysozyme are added to final concentrations of 100 µM and 100 µg/ml, respectively. The mixture is homogenized with magnetic stirring at 5°C for 30 minutes.
Benzonase (Merck) is added to a final concentration of 1 U/ml, in the presence of 10 mM MgCl2, to ensure total digestion of the DNA. The suspension is sonicated (Branson Sonifier 450) for 3 cycles of 2 minutes each at maximum output. The homogenate is centrifuged at 19,000 x g for 15 minutes and both the supernatant and the pellet are analyzed by SDS-PAGE to detect the cellular location of the target protein in the soluble or insoluble fractions, as is described further below.
2.E.1. Soluble fraction
If the target protein is produced in a soluble form (i.e., in the supernatant obtained using the methods described in 2.E.), NaCl and imidazole are added to the supernatant to final concentrations of 50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, and 10 mM imidazole (buffer A). The mixture is filtered through a 0.45 µm membrane and loaded onto an IMAC column (Pharmacia HiTrap chelating Sepharose; 1 ml), which has been charged with nickel ions according to the manufacturer's recommendations. After loading, the column is washed with 50 column volumes of buffer A and the recombinant protein is eluted with 5 ml of buffer B (50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 500 mM imidazole).
The elution profile is monitored by measuring the absorbance of the fractions at 280 nm. Fractions corresponding to the protein peak are pooled, dialyzed against PBS containing 0.5 M arginine, filtered through a 0.22 µm membrane, and stored at -45°C.
2.E.2. Insoluble fraction
If the target protein is expressed in the insoluble fraction (pellets obtained using the methods described in 2.E.), purification is conducted under denaturing conditions. NaCl, imidazole, and urea are added to the resuspended pellet to final concentrations of 50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 10 mM imidazole, and 6 M urea (buffer C). After complete solubilization, the mixture is filtered through a 0.45 µm membrane and loaded onto an IMAC column.
The purification procedures on the IMAC column are the same as are described in 2.E.1., except that 6 M urea is included in all of the buffers used and 10 column volumes of buffer C are used to wash the column after protein loading, instead of 50 column volumes.

The protein fractions eluted from the IMAC column with buffer D (buffer C containing 500 mM imidazole) are pooled. Arginine is added to the solution to a final concentration of 0.5 M, and the mixture is dialyzed against PBS containing 0.5 M arginine and various concentrations of urea (4 M, 3 M, 2 M, 1 M, and 0.5 M) to progressively decrease the concentration of urea. The final dialysate is filtered through a 0.22 µm membrane and stored at -45°C.
Alternatively, when the above-described purification process is not sufficiently efficient, two other processes can be used and are described as follows. A first alternative involves the use of a mild denaturant, N-octyl glucoside (NOG). Briefly, a pellet obtained as is described in 2.E. is homogenized in a solution of 5 mM imidazole, 500 mM sodium chloride, and 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi, and is clarified by centrifugation at 4,000-5,000 x g. The pellet is recovered, resuspended in 50 mM NaPO4 (pH 7.5) containing 1-2% weight/volume NOG, and homogenized. The NOG-soluble impurities are removed by centrifugation. The pellet is extracted once more by repeating the preceding extraction step. The pellet is dissolved in 8 M urea, 50 mM Tris (pH 8.0). The urea-solubilized protein is diluted with an equal volume of 2 M arginine, 50 mM Tris (pH 8.0), and is dialyzed against 1 M arginine for 24-48 hours to remove the urea. The final dialysate is filtered through a 0.22 µm membrane and stored at -45°C.
A second alternative involves the use of a strong denaturant, such as guanidine hydrochloride. Briefly, a pellet obtained as is described in 2.E. is homogenized in a solution of 5 mM imidazole, 500 mM sodium chloride, and 20 mM Tris-HCl (pH 7.9) by microfluidization at a pressure of 15,000 psi, and is clarified by centrifugation at 4,000-5,000 x g. The pellet is recovered, resuspended in 6 M guanidine hydrochloride, and passed through an IMAC column charged with Ni2+. The bound antigen is eluted with 8 M urea (pH 8.5). β-mercaptoethanol is added to the eluted protein to a final concentration of mM, and then the eluted protein is passed through a Sephadex G-25 column equilibrated in 0.1 M acetic acid. Protein eluted from the column is slowly added to 4 volumes of 50 mM phosphate buffer (pH 7.0), and the protein remains in solution.
2.F. Evaluation of the protective activity of the purified protein
Groups of 10 OF1 mice (IFFA Credo) are immunized rectally with 25 µg of the purified recombinant protein, admixed with 1 µg of cholera toxin (Berna) in physiological buffer. Mice are immunized on days 0, 7, 14, and 21. Fourteen days after the last immunization, the mice are challenged with H. pylori strain ORV2001, grown in liquid media (the cells are grown on agar plates, as described in 2.A., and, after harvest, are resuspended in Brucella broth; the flasks are then incubated overnight at 37°C). Fourteen days after challenge, the mice are sacrificed and their stomachs are removed. The amount of H. pylori is determined by measuring the urease activity in the stomach and by culture.
2.G. Production of monospecific polyclonal antibodies
2.G.1. Hyperimmune rabbit antiserum
New Zealand rabbits are injected both subcutaneously and intramuscularly with 100 µg of a purified fusion polypeptide, as obtained using the methods described in 2.E.1. or 2.E.2., in the presence of Freund's complete adjuvant and in a total volume of approximately 2 ml. Twenty-one and 42 days after the initial injection, booster doses, which are identical to the priming doses, except that Freund's incomplete adjuvant is used, are administered in the same way. Fifteen days after the last injection, animal serum is recovered, decomplemented, and filtered through a 0.45 µm membrane.
2.G.2. Mouse hyperimmune ascites fluid
Ten mice are injected subcutaneously with 10-50 µg of a purified fusion polypeptide as obtained using the methods described in 2.E.1. or 2.E.2., in the presence of Freund's complete adjuvant and in a volume of approximately 200 µl. Seven and 14 days after the initial injection, booster doses, which are identical to the priming doses, except that Freund's incomplete adjuvant is used, are administered in the same way. Twenty-one and 28 days after the initial injection, mice receive 50 µg of the antigen alone intraperitoneally. On day 21, mice are also injected intraperitoneally with sarcoma 180/TG cells CM26684 (Lennette et al., Diagnostic Procedures for Viral, Rickettsial, and Chlamydial Infections, 5th Ed. Washington DC, American Public Health Association, 1979). Ascites fluid is collected 10-13 days after the last injection.
EXAMPLE 3: Methods for producing transcriptional fusions lacking His-tags
Methods for amplification and cloning of DNA encoding the polypeptides of the invention as transcriptional fusions lacking His-tags are described as follows. Two PCR primers for each clone are designed based upon the sequences of the polynucleotides that encode them (see the attached sequence listing, odd numbers, up to SEQ ID NO:1363). These primers can be used to amplify DNA encoding the polypeptides of the invention from any H. pylori strain, including, for example, ORV2001 and the strain deposited as ATCC deposit number 43579, as well as from other Helicobacter species.

The N-terminal primers are designed to include the ribosome binding site of the target gene, the ATG start site, and any signal sequence and cleavage site. The N-terminal primers can include a 5' clamp and a restriction endonuclease recognition site, such as that for BamHI (GGATCC), which facilitates subsequent cloning. Similarly, the C-terminal primers can include a restriction endonuclease recognition site, such as that for XhoI (CTCGAG), which can be used in subsequent cloning, and a TAA stop codon.
Amplification of genes encoding the polypeptides of the invention can be carried out using Thermalase DNA Polymerase under the conditions described above in Example 2. Alternatively, Vent DNA polymerase (New England Biolabs), Pwo DNA polymerase (Boehringer Mannheim), or Taq DNA polymerase (Appligene) can be used, according to instructions provided by the manufacturers.
A single PCR product for each clone is amplified and cloned into appropriately cleaved pET 24 (e.g., BamHI-XhoI cleaved pET 24), resulting in the construction of a transcriptional fusion that permits expression of the proteins without His-tags. The expressed products can be purified as denatured proteins that are refolded by dialysis into 1 M arginine.
Cloning into pET 24 allows transcription of the genes from the T7 promoter, which is supplied by the vector, but relies upon binding of the RNA-specific DNA polymerase to the intrinsic ribosome binding sites of the genes, and thereby expression of the complete ORF. The amplification, digestion, and cloning protocols that can be used in this method are as described above for constructing translational fusions.

EXAMPLE 4: Purification of the polypeptides of the invention by immunoaffinity
4.A. Purification of specific IgGs
An immune serum, as prepared as is described in section 2.G., is applied to a protein A Sepharose Fast Flow column (Pharmacia) equilibrated in 100 mM Tris-HCl (pH 8.0). The resin is washed by applying 10 column volumes of 100 mM Tris-HCl and 10 volumes of 10 mM Tris-HCl (pH 8.0) to the column. IgG antibodies are eluted with 0.1 M glycine buffer (pH 3.0) and are collected as 5 ml fractions to each of which is added 0.25 ml 1 M Tris-HCl (pH 8.0). The optical density of the eluate is measured at 280 nm and fractions containing the IgG antibodies are pooled, dialyzed against 50 mM Tris-HCl (pH 8.0), and, if necessary, stored frozen at -70°C.
4.B. Preparation of the column
An appropriate amount of CNBr-activated Sepharose 4B gel (1 g of dried gel provides for approximately 3.5 ml of hydrated gel; gel capacity is from 5 to 10 mg coupled IgG/ml of gel) manufactured by Pharmacia (17-0430-01) is suspended in 1 mM HCl buffer and washed on a Buchner funnel by adding small quantities of 1 mM HCl buffer. The total volume of buffer is 200 ml per gram of gel.
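The gel quantities given above lend themselves to a short calculation. The following sketch assumes the figures stated in the text (about 3.5 ml of hydrated gel per gram of dried gel, a capacity of 5 to 10 mg of coupled IgG per ml of gel, and 200 ml of wash buffer per gram of gel); the function name and the example IgG mass are hypothetical.

```python
# Figures taken from the text.
ML_HYDRATED_GEL_PER_G_DRY = 3.5
CAPACITY_RANGE_MG_PER_ML = (5.0, 10.0)
WASH_BUFFER_ML_PER_G_DRY = 200.0


def plan_gel(igg_mg: float, capacity_mg_per_ml: float = 5.0) -> dict:
    """Estimate gel and wash-buffer quantities for a given mass of IgG.

    capacity_mg_per_ml defaults to the conservative end of the stated
    5-10 mg/ml range.
    """
    low, high = CAPACITY_RANGE_MG_PER_ML
    if not low <= capacity_mg_per_ml <= high:
        raise ValueError("capacity outside the 5-10 mg/ml range given in the text")
    hydrated_gel_ml = igg_mg / capacity_mg_per_ml
    dry_gel_g = hydrated_gel_ml / ML_HYDRATED_GEL_PER_G_DRY
    return {
        "hydrated_gel_ml": hydrated_gel_ml,
        "dry_gel_g": dry_gel_g,
        "wash_buffer_ml": dry_gel_g * WASH_BUFFER_ML_PER_G_DRY,
    }


if __name__ == "__main__":
    # Hypothetical example: 35 mg of purified IgG to be coupled.
    print(plan_gel(35.0))
```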
Purified IgG antibodies are dialyzed for 4 hours at 20±5 °C against 50 volumes of 500 mM sodium phosphate buffer (pH 7.5). The antibodies are then diluted in 500 mM phosphate buffer (pH 7.5) to a final concentration of 3 mg/ml.
IgG antibodies are mixed with the gel overnight at 5±3 °C. The gel is packed into a chromatography column and is washed with 2 column volumes of 500 mM phosphate buffer (pH 7.5), and 1 column volume of 50 mM sodium phosphate buffer containing 500 mM NaCl (pH 7.5). The gel is then transferred to a tube, mixed with 100 mM ethanolamine (pH 7.5) for 4 hours at room temperature, and washed twice with 2 column volumes of PBS. The gel is then stored in 1/10,000 PBS/merthiolate. The amount of IgG antibodies coupled to the gel is determined by measuring the optical density (OD) at 280 nm of the IgG solution and the direct eluate, plus washings.
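The determination of coupled IgG by difference can be illustrated as follows. This is a sketch only: it assumes dilution-corrected readings at a 1 cm path length and an absorbance of about 1.4 at 280 nm for a 1 mg/ml IgG solution, a commonly used approximation that is not stated in the text. The function names and example readings are hypothetical.

```python
# Assumption (not from the text): A280 of a 1 mg/ml IgG solution, 1 cm path.
A280_PER_MG_PER_ML = 1.4


def igg_mass_mg(a280: float, volume_ml: float,
                a280_per_mg_per_ml: float = A280_PER_MG_PER_ML) -> float:
    """Convert an A280 reading and a volume into a mass of IgG in mg."""
    return a280 / a280_per_mg_per_ml * volume_ml


def coupled_igg_mg(loaded, unbound) -> float:
    """IgG retained on the gel, computed by difference.

    loaded  -- (a280, volume_ml) of the IgG solution mixed with the gel
    unbound -- list of (a280, volume_ml) for the direct eluate and washings
    """
    total_in = igg_mass_mg(*loaded)
    total_out = sum(igg_mass_mg(a, v) for a, v in unbound)
    return total_in - total_out


if __name__ == "__main__":
    # Hypothetical readings: 10 ml of IgG at 3 mg/ml, one eluate, one wash.
    coupled = coupled_igg_mg((4.2, 10.0), [(0.14, 20.0), (0.07, 10.0)])
    print(f"coupled IgG: {coupled:.1f} mg")
```

Dividing the result by the hydrated gel volume gives the coupling density, which can be compared with the 5 to 10 mg/ml capacity stated for the gel.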
4.C. Adsorption and elution of the antigen
An antigen solution in 50 mM Tris-HCl (pH 8.0), 2 mM EDTA (for example, the supernatant or the solubilized pellet obtained using the methods described in 3.E., after centrifugation and filtration through a 0.45 µm membrane) is applied to a column equilibrated with 50 mM Tris-HCl (pH 8.0), 2 mM EDTA, at a flow rate of about 10 ml/hour. The column is then washed with 20 volumes of 50 mM Tris-HCl (pH 8.0), 2 mM EDTA. Alternatively, adsorption can be achieved by mixing overnight at 5±3 °C.
The adsorbed gel is washed with 2 to 6 volumes of 10 mM sodium phosphate buffer (pH 6.8) and the antigen is eluted with 100 mM glycine buffer (pH 2.5). The eluate is recovered in 3 ml fractions, to each of which is added 150 µl of 1 M sodium phosphate buffer (pH 8.0). Absorbance is measured at 280 nm for each fraction; those fractions containing the antigen are pooled and stored at -20 °C.
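Selection and pooling of the antigen-containing fractions can be sketched as below. The fraction volume (3 ml) and the neutralizing addition (150 µl) are taken from the text; the absorbance threshold used to decide which fractions contain antigen is an assumption, as the text gives no cut-off, and the function name and readings are hypothetical.

```python
# Volumes taken from the text.
FRACTION_ML = 3.0
NEUTRALIZER_ML = 0.150


def pool_fractions(a280_readings, threshold: float = 0.05):
    """Return (fraction numbers to pool, pooled volume in ml).

    threshold is an assumed A280 cut-off and should be set from the
    baseline of the actual chromatogram.
    """
    selected = [
        number
        for number, a280 in enumerate(a280_readings, start=1)
        if a280 >= threshold
    ]
    pooled_ml = len(selected) * (FRACTION_ML + NEUTRALIZER_ML)
    return selected, pooled_ml


if __name__ == "__main__":
    # Hypothetical A280 readings for six consecutive fractions.
    fractions, volume = pool_fractions([0.01, 0.02, 0.30, 0.85, 0.40, 0.04])
    print(fractions, f"{volume:.2f} ml")
```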

[Table omitted: content is illegible in the source scan.]
[Page(s) MISSING UPON TIME OF PUBLICATION]

WO 98/43478 PCT/US98/06371
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: MERIEUX ORAVAX SOCIETE EN NOM COLLECTIF PASTEUR
MERIEUX SERUMS ET VACCINS S.A. HUMAN GENOME
SCIENCES, INC.
(ii) TITLE OF THE INVENTION: Identification of Polynucleotides Encoding Novel Helicobacter Polypeptides in the Helicobacter Genome
(iii) NUMBER OF SEQUENCES: 1376
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Clark & Elbing LLP
(B) STREET: 176 Federal Street
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: USA
(F) ZIP: 02110
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: PCT/US98/----
(B) FILING DATE: 01-APR-98
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/833,457
(B) FILING DATE: 01-APR-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/881,227
(B) FILING DATE: 24-JUN-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/902,615
(B) FILING DATE: 29-JUL-1997
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Clark, Paul T.
(B) REGISTRATION NUMBER: 30,162
(C) REFERENCE/DOCKET NUMBER: 06132/041W01
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 617-428-0200
(B) TELEFAX: 617-428-7045
(C) TELEX:

(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 51...212
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Met Glu Phe Leu Gly Leu Ile Leu Ser Leu Ala Ala Ile Leu Ile Ala Phe Lys Lys Pro Glu Lys Glu Asn Trp Ala Phe Gly Ile Leu Met Val Val Trp Leu Val Glu Leu Ile Ile Phe Ile Ala His Ser Ser Ser Val Leu Pro Asn Met Asn Leu
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Glu Phe Leu Gly Leu Ile Leu Ala Ile Leu Leu Ser Ala Ile Ala Phe Lys Lys Pro Glu Lys Glu Ala Gly Ile Leu Asn Trp Phe Met Val Val Trp Leu Val Glu Leu Ile Ile His Ser Ser Ser Ile Phe Ala Val Leu Pro Asn Met Asn Leu
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 670 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 51...617
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ACAAAATCAA ATG
GCGGTTTTAT AAA
CAAAACCAAA

Met Lys TTA TGG GGC

Lys Ile Ala Phe Ile Leu Ala Val Leu Leu Gly Ala Phe Leu Trp Gly TAT TTT GCT

Glu Pro Lys Lys Ser His Ile Gly Met Val Gly Leu Ala Tyr Phe Ala CCG GCT GAT

Pro Ile Lys Ile Thr Pro Lys Ser Ser Ser Tyr Thr Ala Pro Ala Asp GGG TAT TTC

Phe Leu Trp Gly Ala Lys Gly Gln Ala Phe Phe Lys Ala Gly Tyr Phe TCC TAC ATG

Leu Ala Leu Arg Gly Glu Phe Leu Ala Ile Lys Pro Thr Ser Tyr Met TCT TTA AGC

Ala Leu His Thr Ile Asn Thr Leu Leu Asn Ile Asp Val Ser Leu Ser
Leu Ser Asp Phe Tyr Thr Tyr Lys Lys Tyr Ser Phe Gly Val Tyr Gly Gly Leu Gly Ile Gly Tyr Phe Tyr Gln Ser Asn His Leu Gly Met Lys Asn Ser Ser Phe Met Gly Tyr Asn Gly Leu Phe Asn Val Gly Leu Gly Ser Thr Ile Asp Arg His His Arg Ile Glu Leu Gly Ala Lys Ile Pro Phe Ser Lys Thr Arg Asn Ser Phe Lys Asn Pro Tyr Phe Leu Glu Ser Val Phe Ile His Ala Thr Tyr Ser Tyr Met Phe
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 189 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Lys Lys Ile Ala Phe Ile Leu Ala Leu Trp Val Gly Leu Leu Gly Ala Phe Glu Pro Lys Lys Ser His Ile Tyr Phe Gly Ala Met Val Gly Leu Ala Pro Ile Lys Ile Thr Pro Lys Pro Ala Ser Asp Ser Ser Tyr Thr Ala Phe Leu Trp Gly Ala Lys Gly Gly Tyr Gln Phe Ala Phe Phe Lys Ala Leu Ala Leu Arg Gly Glu Phe Ser Tyr Leu Met Ala Ile Lys Pro Thr Ala Leu His Thr Ile Asn Thr Ser Leu Leu Ser Leu Asn Ile Asp Val Leu Ser Asp Phe Tyr Thr Tyr Lys Lys Tyr Ser Phe Gly Val Tyr Gly Gly Leu Gly Ile Gly Tyr Phe Tyr Gln Ser Asn His Leu Gly Met Lys Ser Ser Phe Met Gly Tyr Gly Leu Phe Asn Val Gly Asn Asn Leu Gly Thr Ile Asp Arg His His Ile Glu Leu Gly Ala Lys Ser Arg Ile Pro Ser Lys Thr Arg Asn Ser Lys Asn Pro Tyr Phe Leu Phe Phe Glu Ser Phe Ile His Ala Thr Tyr Tyr Met Phe Val Ser
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 434 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 51...380
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

AAATTGAGTT ATG
GAATCAAAAC CTC
CTGCATTAAG

Met Leu TTG CTT TTT

Lys Lys Ser Leu Leu Val Leu Val Leu Gln Leu Ser Gly Leu Leu Phe AAC GCC AAA

Ala Glu Glu Asn Gln Pro Asn Thr Pro Pro Glu Leu Asn Asn Ala Lys GCT GCG AAC

Pro Ala Asn Lys Gly Pro Ser Asn Thr Gln Ile Thr Pro Ala Ala Asn AAC CTG GAC

Lys Asn Asp Ser Asn Leu Lys Leu Gly Ser Pro Glu Asn Asn Leu Asp GAG GCC ATT

Ala Gln Thr Leu Ser Gly Asp Leu Ala Lys Lys Gly Asp Glu Ala Ile GCT CTT TCC

Tyr Gln Gly Phe Lys Phe Gln Ser Cys Asp Asn Gly Asn Ala Leu Ser Ala Ala Gly Cys Phe Ala Ser Gly Gly Asp Val Cys
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 110 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Met Leu Lys Lys Ser Leu Leu Leu Leu Val Phe Leu Val Leu Gln Leu Ser Gly Ala Glu Glu Asn Asn Gln Ala Pro Lys Asn Thr Pro Pro Glu Leu Asn Pro Ala Asn Ala Lys Gly Ala Pro Asn Ser Asn Thr Gln Ile Thr Pro Lys Asn Asp Asn Ser Asn Leu Leu Asp Lys Leu Gly Ser Pro Glu Asn Ala Gln Thr Glu Leu Ser Ala Gly Ile Asp Leu Ala Lys Lys Gly Asp Tyr Gln Gly Ala Phe Lys Leu Phe Ser Gln Ser Cys Asp Asn Gly Asn Ala Ala Gly Cys Phe Ala Ser Gly Gly Asp Val Cys
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 575 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 73...522
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

CCACAAAAGC TTAATGAATA TTAAATTAAA
AACATGTTAA TCTTTAGTTA
TTTTTAAAAT

AAT AAA ACT TTT
TTA CCC AGC CAA
TCC

Met His Gln Asn Asn Lys Thr Phe Leu Pro Ser Gln Ser CTC TTT ACC

Ala His Ser Lys Ile Ile Leu Leu Asn Gly Phe Leu Ala Leu Phe Thr TTA AAT ATA

Tyr Leu Ser Ala Cys Gly Ala Val Pro Glu Glu Val Leu Leu Asn Ile GAT GCC GTC

Val Lys Pro Lys Glu Thr Lys Gln Glu Ala Arg Glu Glu Asp Ala Val ATC ACT GCG

Lys Ala Gln Gln Glu Asn Ala Ile Asp Arg Thr Thr Pro Ile Thr Ala AAT AGC GGC

Leu Ile Arg Phe Thr Asn Tyr Ala Tyr Ser Leu Asn Gly Asn Ser Gly AAT AAT ATG

Phe Tyr Ser Val Asp Asn Leu Ser Pro Gln Asn Gly Met Asn Asn Met GGC TAT CCC

Tyr Gly Tyr Tyr Met Pro Tyr Tyr Met Tyr Gly Phe Met Gly Tyr Pro GGG TAT TAT

Pro Tyr Ser Gly Leu Met Pro Gly Pro Gly Tyr Gly Ala Gly Tyr Tyr TAC TAT G

Pro Gly Phe Pro Tyr Ala Phe Tyr Tyr GGTGTTGGTG
TTTTTACTCA
AACACG

(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 150 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
Met His Gln Asn Asn Lys Thr Phe Leu Pro Ser Gln Ser Ala His Leu Ser Lys Ile Ile Leu Phe Leu Asn Thr Gly Phe Leu Ala Tyr Leu Leu Ser Ala Cys Gly Ala Asn Val Pro Ile Glu Glu Val Leu Val Lys Asp Pro Lys Glu Thr Lys Ala Gln Glu Val Ala Arg Glu Glu Lys Ala Ile Gln Gln Glu Asn Ala Thr Ile Asp Ala Arg Thr Thr Pro Leu Ile Asn Arg Phe Thr Asn Tyr Ser Ala Tyr Gly Ser Leu Asn Gly Phe Tyr Asn Ser Val Asp Asn Leu Asn Ser Pro Met Gln Asn Gly Met Tyr Gly Gly Tyr Tyr Met Pro Tyr Tyr Tyr Met Pro Tyr Gly Phe Met Pro Tyr Gly Ser Gly Leu Met Pro Tyr Gly Pro Tyr Gly Tyr Gly Ala Pro Gly Tyr Phe Pro Tyr Ala Phe Tyr
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 910 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 51...860
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Met Lys Lys Phe Val Val Phe Lys Thr Leu Cys Leu Ser Val Val Leu Gly Asn Ser Leu Val Ala Ala Glu Gly Ser Thr Glu Val Gln Lys Gln Leu Glu Lys Pro Lys Glu Tyr Lys Ala Val Lys Gly Glu Lys Asn Ala Trp Tyr ATT AAC

Leu Gly Ser Tyr Gln Val Gly Gln Ala Ser Gln Ser Val Lys Asn Ile AAA

Pro Pro Ser Ser Glu Phe Asn Tyr Pro Lys Phe Pro Val Gly Lys Lys TAT

Thr Asp Leu Ala Val Met Gln Gly Leu Gly Leu Thr Val Gly Tyr Tyr TTT

Lys Gln Phe Gly Glu Lys Arg Trp Phe Gly Ala Arg Tyr Tyr Gly Phe GAT

Phe Met Tyr Gly His Ala Val Phe Gly Ala Asn Ala Leu Thr Ser Asp GGT

Asp Asn Gly Val Cys Glu Leu His Gln Pro Cys Ala Thr Lys Val Gly ATG

Gly Thr Gly Asn Leu Ser Asp Met Phe Thr Tyr Gly Val Gly Ile Met TTA

Asp Thr Tyr Asn Val Ile Asn Lys Glu Asp Ala Ser Phe Gly Phe Leu GGG

Phe Phe Ala Gln Ile Ala Gly Asn Ser Trp Gly Asn Thr Thr Gly Gly TTG

Ala Phe Glu Thr Lys Ser Pro Tyr Lys His Thr Ser Tyr Ser Leu Leu GCG

Asp Pro Ile Phe Gln Phe Leu Phe Asn Leu Gly Ile Arg Thr His Ala CGG

Ile Gly His Gln Glu Phe Asp Phe Gly Val Lys Ile Pro Thr Ile Arg TAT

Asn Val Tyr Phe Asn His Gly Asn Leu Ser Phe Thr Tyr Arg Arg Tyr AGC CGCTTG

Gln Tyr Ser Leu Tyr Val Gly Tyr Arg Tyr Asn Phe
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 270 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Met Lys Lys Phe Val Val Phe Lys Thr Leu Cys Leu Ser Val Val Leu Gly Asn Ser Leu Val Ala Ala Glu Gly Ser Thr Glu Val Gln Lys Gln Leu Glu Lys Pro Lys Glu Tyr Lys Ala Val Lys Gly Glu Lys Asn Ala Trp Tyr Leu Gly Ile Ser Tyr Gln Val Gly Gln Ala Ser Gln Ser Val Lys Asn Pro Pro Lys Ser Ser Glu Phe Asn Tyr Pro Lys Phe Pro Val Gly Lys Thr Asp Tyr Leu Ala Val Met Gln Gly Leu Gly Leu Thr Val Gly Tyr Lys Gln Phe Phe Gly Glu Lys Arg Trp Phe Gly Ala Arg Tyr Tyr Gly Phe Met Asp Tyr Gly His Ala Val Phe Gly Ala Asn Ala Leu Thr Ser Asp Asn Gly Gly Val Cys Glu Leu His Gln Pro Cys Ala Thr Lys Val Gly Thr Met Gly Asn Leu Ser Asp Met Phe Thr Tyr Gly Val Gly Ile Asp Thr Leu Tyr Asn Val Ile Asn Lys Glu Asp Ala Ser Phe Gly Phe Phe Phe Gly Ala Gln Ile Ala Gly Asn Ser Trp Gly Asn Thr Thr Gly Ala Phe Leu Glu Thr Lys Ser Pro Tyr Lys His Thr Ser Tyr Ser Leu Asp Pro Ala Ile Phe Gln Phe Leu Phe Asn Leu Gly Ile Arg Thr His Ile Gly Arg His Gln Glu Phe Asp Phe Gly Val Lys Ile Pro Thr Ile Asn Val Tyr Tyr Phe Asn His Gly Asn Leu Ser Phe Thr Tyr Arg Arg Gln Tyr Ser Leu Tyr Val Gly Tyr Arg Tyr Asn Phe
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1357 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 58...1305
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

AGGAATTTTA ATG
ACTAAAATAT
TGAGTTTAAA
TCCACGATGA

Met GCG

Gln Tyr Lys Lys Asn Lys Lys Arg Tyr Tyr Tyr Leu Leu Gly Ile Ala
ATC

Phe Phe Leu Asn Gly Leu Ser Leu Lys Ala Leu Glu Ala Val Lys Ile
GCG

Pro Phe Gly Tyr Leu Gly Leu Leu Tyr Asn Gln Gly Gln Lys Asn Ala
GTG

Pro His Ser Tyr Val Gly Ala Leu Ala Arg Leu Gly Asp Phe Ser Val
GGG

Tyr Ser Asn Gly Trp Ser Phe Gly Ile Gly Ala Ile Ala Trp Asn Gly
AGT

Ile Tyr Asn Lys Gln Arg Leu Ala Asn Leu Tyr Ile Leu Gly Asn Ser
AGC

Phe Phe Gly Ser Ser Lys Asn Val Lys Pro Tyr Leu Ala Gly Asp Ser
TTT

Val Ser Asp Ala Tyr Val Gln Tyr Thr Asn Gln Arg Lys Ile Ala Phe

Leu Gly Arg Phe Asn Thr Asp Phe Val Asp Phe Asp Trp Ile Gly Gly Asn Ile Gln Gly Val Ser Val Ala Phe Lys Gln Asn Ser Met Arg Tyr Phe Gly Ile Phe Met Asp Ser Met Leu Tyr Asn Gly His Gln Ile Asn Lys Glu Gln Gly Asn Arg Ile Ala Thr Ser Leu Asn Ala Leu Ala Ser Tyr Asp Pro Val Ser Lys Arg Leu Tyr Val Gly Gly Glu Val Phe Val Leu Gly Ala Glu Tyr Arg His Glu Asn Leu Lys Val Val Pro Phe Ile Leu Thr Asp Thr Arg Leu Pro Leu Ser Thr Gln Asn Val Leu Val Gln Val Gly Gly Lys Leu Glu Tyr Asp Ala Ser Leu Ala Lys Gly Phe Thr Ser His Thr Leu Val His Gly Met Tyr Gln Tyr Gly Asn Thr Asp Ala Ala Thr Ser Val Lys Asn Ala Gly Leu Phe Leu Ile Asp Gln Thr Phe Lys Tyr Lys Ile Phe Asn Phe Gly Thr Gly Phe Tyr Ile Val Pro Ala Arg Asn Asn Lys Gly Tyr Leu Trp Thr Phe Asn Asp Arg Thr Lys Phe
TAT GGC CGT GGG ATC AAT GCG CCC GGC GTG CCA GCG ATT TAT TTT GCA 1068
Tyr Gly Arg Gly Ile Asn Ala Pro Gly Val Pro Ala Ile Tyr Phe Ala
Asn Ser Ser Ile Ser Gly Tyr Val Phe Leu Gly Leu Lys Thr Lys Arg

GAC GGG TAT

ValArgLeu Ala MetValAla Phe Asp TyrGlnGlu Ser Asp Gly Tyr AGT TAT TTT

LeuMetSer Phe ArgValTrp Thr Arg SerLeuSer Asp Ser Tyr Phe GGG AAT AGA

MetGlyGly Tyr ValTyrAla Tyr Ser LysAlaThr Lys Gly Asn Arg Ser Leu Gly Asn Ser Ser Phe Val Phe Phe Gly Lys Phe Leu Phe (2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 416 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Met Gln Tyr Lys Lys Asn Lys Lys Arg Tyr Tyr Tyr Leu Ala Leu Gly Ile Phe Phe Leu Asn Gly Leu Ser Leu Lys Ala Leu Glu Ile Ala Val Lys Pro Phe Gly Tyr Leu Gly Leu Leu Tyr Asn Gln Gly Ala Gln Lys Asn Pro His Ser Tyr Val Gly Ala Leu Ala Arg Leu Gly Val Asp Phe Ser Tyr Ser Asn Gly Trp Ser Phe Gly Ile Gly Ala Ile Gly Ala Trp Asn Ile Tyr Asn Lys Gln Arg Leu Ala Asn Leu Tyr Ile Ser Leu Gly Asn Phe Phe Gly Ser Ser Lys Asn Val Lys Pro Tyr Leu Ser Ala Gly Asp Val Ser Asp Ala Tyr Val Gln Tyr Thr Asn Gln Arg Phe Lys Ile Ala Leu Gly Arg Phe Asn Thr Asp Phe Val Asp Phe Asp Trp Ile Gly Gly Asn Ile Gln Gly Val Ser Val Ala Phe Lys Gln Asn Ser Met Arg Tyr Phe Gly Ile Phe Met Asp Ser Met Leu Tyr Asn Gly His Gln Ile Asn Lys Glu Gln Gly Asn Arg Ile Ala Thr Ser Leu Asn Ala Leu Ala Ser Tyr Asp Pro Val Ser Lys Arg Leu Tyr Val Gly Gly Glu Val Phe Val Leu Gly Ala Glu Tyr Arg His Glu Asn Leu Lys Val Val Pro Phe Ile Leu Thr Asp Thr Arg Leu Pro Leu Ser Thr Gln Asn Val Leu Val Gln Val Gly Gly Lys Leu Glu Tyr Asp Ala Ser Leu Ala Lys Gly Phe Thr Ser His Thr Leu Val His Gly Met Tyr Gln Tyr Gly Asn Thr Asp Ala Ala Thr Ser Val Lys Asn Ala Gly Leu Phe Leu Ile Asp Gln Thr Phe Lys Tyr Lys Ile Phe Asn Phe Gly Thr Gly Phe Tyr Ile Val Pro Ala Arg Asn Asn Lys Gly Tyr Leu Trp Thr Phe Asn Asp Arg Thr Lys Phe Tyr Gly Arg Gly Ile Asn Ala Pro Gly Val Pro Ala Ile Tyr Phe Ala Asn Ser Ser Ile Ser Gly Tyr Val Phe Leu Gly Leu Lys Thr Lys Arg Val Arg Leu Asp Ala Met Val Ala Phe Gly Asp Tyr Gln Glu Tyr Ser Leu Met Ser Ser Phe Arg Val Trp Thr Tyr Arg Ser Leu Ser Phe Asp Met Gly Gly Gly Tyr Val Tyr Ala Tyr Asn Ser Lys Ala Thr Arg Lys Ser Leu Gly Asn Ser Ser Phe Val Phe Phe Gly Lys Phe Leu Phe (2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1562 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 73...1509 (D) OTHER INFORMATION:
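As a consistency check, the coding-sequence locations in this listing can be compared with the stated lengths of the corresponding polypeptides. The Python sketch below assumes that LOCATION gives 1-based inclusive nucleotide positions and that the range does not include a stop codon (an assumption that the listed numbers appear to bear out). The function name `expected_protein_length` is illustrative only and is not part of the listing.

```python
def expected_protein_length(start: int, end: int, includes_stop: bool = False) -> int:
    """Residues encoded by a coding sequence at 1-based inclusive positions start..end."""
    n_nucleotides = end - start + 1
    if n_nucleotides <= 0 or n_nucleotides % 3 != 0:
        raise ValueError(f"range {start}...{end} is not a whole number of codons")
    codons = n_nucleotides // 3
    # Assumption: when a stop codon is part of the range it encodes no residue.
    return codons - 1 if includes_stop else codons


# (CDS start, CDS end, stated polypeptide length) as given in the listing
entries = {
    "SEQ ID NO:13 / NO:14": (73, 1509, 479),
    "SEQ ID NO:15 / NO:16": (98, 757, 220),
    "SEQ ID NO:17 / NO:18": (51, 1463, 471),
}

for name, (start, end, stated) in entries.items():
    computed = expected_protein_length(start, end)
    status = "consistent" if computed == stated else "MISMATCH"
    print(f"{name}: computed {computed}, stated {stated} -> {status}")
```

The same check can be applied to any other nucleotide/polypeptide pair in the listing by adding its location and stated length to `entries`.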
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Met Lys Asn Ser Thr Pro Leu Lys Asn Gln Val Phe Cys Gly TyrVal LeuSerLeu SerAla Leu GlnAlaPhe AspTyr Leu Ser ATT GTT TCT AAT

Lys GluVal SerAlaGlu SerPhe Lys ValGlyPhe AsnLys Ile Ser AAG TAT

_ Lys IleAsp IleAlaArg GlyIle Pro ThrGluThr PheVal Lys Tyr GCT GCG

Thr ValGly GlnGlyAsn IleTyr Asp PheLeuPro LysGly Ala Ala AAA GGA

Leu AspGln GlyHisVal LeuGlu Lys IleGlyGly ThrLeu Lys Gly GGG TTC

Gly ValAla TyrAspSer ThrLys Asn GlnGlyGly SerVal Gly Phe TAT GGC

Ile AsnTyr IleGlyTyr TrpAsp Tyr LeuGlyGly LysArg Tyr Gly TTG GAG

Ala LeuAsp GlyThrSer IleHis Cys AlaLeuGly SerAsp Leu Glu AAG GGG

Gly ValIle AspSerIle AlaCys Asn AlaArgAla AsnLys Lys Gly CGC GCT

Ile ArgAsn TyrLeuMet AsnAsn Phe LeuGluTyr ArgTyr Arg Ala GAT CGT

Lys IlePhe LeuAlaLys GlyGly Tyr GlnSerAsn AlaPro Asp Arg ATG GAA

_ Tyr SerGly TyrThrGln GlyPhe Ile SerAlaLys ValLys Met Glu AAA TGG

Asp AsnGlu GlyIleHis LysLeu Trp PheSerSer TrpGly Lys Trp GCG TAT

Arg PheAla TyrGlyGlu TrpIle Asp PheTyrSer ProArg Ala Tyr Thr Val Val Lys Asn Gly Arg Thr Leu Asn Tyr Gly Ile His Leu Val Asn Tyr Thr Tyr Glu Arg Lys Gly Val Ser Val Ser Pro Phe Phe Gln Phe Ser Pro Gly Thr Tyr Tyr Ser Pro Gly Val Val Val Gly Tyr Asp Ser Asn Pro Asn Phe Asn Gly Val Gly Phe Arg Ser Glu Thr Lys Ala Tyr Ile Leu Leu Pro Val His Asp Pro Leu Arg Arg Asp Thr Tyr Arg Tyr Ala Ile Lys Ala Gly Thr Ala Gly Gln Ser Leu Leu Ile Arg Gln Arg Phe Asp Tyr Asn Glu Phe Asn Phe Gly Gly Ala Phe Tyr Lys Val Trp Lys Asn Ala Asn Ala Tyr Ile Gly Thr Thr Gly Asn Pro Leu Gly Ile Asp Phe Trp Thr Asn Ser Val Tyr Asp Ile Gly Gln Ala Leu Ser His Val Val Thr Ala Asp Ala Val Ser Gly Trp Val Phe Gly Gly Gly Val His Lys Lys Trp Leu Trp Gly Thr Leu Trp Arg Trp Thr Ser Gly ~ ACT TTA GCC AAT GAA GCG AGT GCG GCT GTT AAT GTG GGC TAT AAG ATC 1359 Thr Leu Ala Asn Glu Ala Ser Ala Ala Val Asn Val Gly Tyr Lys Ile Ser Lys Ser Leu Thr Ala Ser Val Lys Leu Glu Tyr Leu Gly Val Met Thr His Ala Gly Phe Thr Va1 Gly Ser Tyr Arg Pro Thr Pro Gly Ser Lys Ala Leu Tyr Ser Asp Arg Ser His Leu Met Thr Thr Leu Ser Ala Lys Phe A ' (2) INFORMATION FOR SEQ ID N0:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 479 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
Met Lys Asn Ser Thr Pro Leu Lys Asn Gln Val Phe Cys Gly Leu Tyr Val Leu Ser Leu Ser Ala Ser Leu Gln Ala Phe Asp Tyr Lys Ile Glu Val Ser Ala Glu Ser Phe Ser Lys Val Gly Phe Asn Lys Lys Lys Ile Asp Ile Ala Arg Gly Ile Tyr Pro Thr Glu Thr Phe Val Thr Ala Val Gly Gln Gly Asn Ile Tyr Ala Asp Phe Leu Pro Lys Gly Leu Lys Asp Gln Gly His Val Leu Glu Gly Lys Ile Gly Gly Thr Leu Gly Gly Val Ala Tyr Asp Ser Thr Lys Phe Asn Gln Gly Gly Ser Val Ile Tyr Asn Tyr Ile Gly Tyr Trp Asp Gly Tyr Leu Gly Gly Lys Arg Ala Leu Leu Asp Gly Thr Ser Ile His Glu Cys Ala Leu Gly Ser Asp Gly Lys Val Ile Asp Ser Ile Ala Cys Gly Asn Ala Arg Ala Asn Lys Ile Arg Arg Asn Tyr Leu Met Asn Asn Ala Phe Leu Glu Tyr Arg Tyr Lys Asp Ile Phe Leu Ala Lys Gly Gly Arg Tyr Gln Ser Asn Ala Pro Tyr Met Ser Gly Tyr Thr Gln Gly Phe Glu Ile Ser Ala Lys Val Lys Asp Lys Asn Glu Gly Ile His Lys Leu Trp Trp Phe Ser Ser Trp Gly Arg Ala Phe Ala Tyr Gly Glu Trp Ile Tyr Asp Phe Tyr Ser Pro Arg Thr Val Val Lys Asn Gly Arg Thr Leu Asn Tyr Gly Ile His Leu Val Asn Tyr Thr Tyr Glu Arg Lys Gly Val Ser Val Ser Pro Phe Phe Gln Phe Ser Pro Gly Thr Tyr Tyr Ser Pro Gly Val Val Val Gly Tyr Asp Ser Asn Pro Asn Phe Asn Gly Val Gly Phe Arg Ser Glu Thr Lys Ala Tyr Ile Leu Leu Pro Val His Asp Pro Leu Arg Arg Asp Thr Tyr Arg Tyr Ala Ile Lys Ala Gly Thr Ala Gly Gln Ser Leu Leu Ile Arg Gln Arg Phe Asp Tyr Asn Glu Phe Asn Phe Gly Gly Ala Phe Tyr Lys Val Trp Lys Asn Ala Asn Ala Tyr Ile Gly Thr Thr Gly Asn Pro Leu Gly Ile Asp Phe Trp Thr Asn Ser Val Tyr Asp Ile Gly Gln Ala Leu Ser His Val Val Thr Ala Asp Ala Val Ser Gly Trp Val Phe Gly Gly Gly Val His Lys Lys Trp Leu Trp Gly Thr Leu Trp Arg Trp Thr Ser Gly Thr Leu Ala Asn Glu Ala Ser Ala Ala Val Asn Val Gly Tyr Lys Ile Ser Lys Ser Leu Thr Ala Ser Val Lys Leu Glu Tyr Leu Gly Val Met Thr His Ala Gly Phe Thr Val Gly Ser Tyr Arg Pro Thr Pro Gly Ser Lys Ala Leu Tyr Ser Asp Arg Ser His Leu Met Thr Thr Leu Ser Ala Lys Phe (2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 810 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 98...757 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Met Asn Lys Thr Thr Val Lys IleLeuMet Met AlaLeuLeu SerSer Gln AlaGlu Gly Leu Ala GAA TTT GAC

Ala GluLeuAsp Lys SerLysLys ProLys Ala ArgAsn Glu Phe Asp GGG GCG AAC

Thr PheTyrLeu Val GlyTyrGln LeuSer Ile ThrSer Gly Ala Asn TCT ATG GGC

Phe SerThrGlu Val AspLysSer TyrPhe Thr AsnGly Ser Met Gly TTA AAA CAA

Phe GlyValVal Gly GlyLysPhe ValAla Thr AlaVal Leu Lys Gln TTC GAT ACC

Glu HisValGly Arg TyrGlyLeu PheTyr Gln PheSer Phe Asp Thr ~5 100 TAT GAA AGC

Ser HisLysSer Ile SerThrTyr GlyLeu Phe GlyLeu Tyr Glu Ser AAT GGG GAG

Trp AspAlaPhe Ser ProLysMet PheLeu Leu PheGly Asn Gly Glu GGG GGG ATG

Leu GlyIleAla Ala ThrTyrMet ProGly Ala HisGly Gly Gly Met AAT CTT CAA

Ile IleAlaGln Leu GlyLysGlu AsnSer Phe LeuLeu Asn Leu Gln TTT AAT ATC

Val LysValGly Arg PheGlyPhe LeuHis Glu ThrPhe Phe Asn Ile CCT ACG ATC

Gly LeuLysPhe Val IleProAsn LysArg Glu IleAsp Pro Thr Ile ACT CCG GCT

Gly LeuSerThr Thr LeuTrpHis ArgLeu Val TyrPhe Thr Pro Ala AAT TATTTAGAGG
TTTTAGATTT
GACAAAAT

Asn TyrIleTyr Phe Asn (2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 220 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
Met Asn Lys Thr Thr Val Lys Ile Leu Met Gly Met Ala Leu Leu Ser Ser Leu Gln Ala Ala Glu Ala Glu Leu Asp Glu Lys Ser Lys Lys Pro Lys Phe Ala Asp Arg Asn Thr Phe Tyr Leu Gly Val Gly Tyr Gln Leu Ser Ala Ile Asn Thr Ser Phe Ser Thr Glu Ser Val Asp Lys Ser Tyr Phe Met Thr Gly Asn Gly Phe Gly Val Val Leu Gly Gly Lys Phe Val Ala Lys Thr Gln Ala Val Glu His Val Gly Phe Arg Tyr Gly Leu Phe Tyr Asp Gln Thr Phe Ser Ser His Lys Ser Tyr Ile Ser Thr Tyr Gly Leu Glu Phe Ser Gly Leu Trp Asp Ala Phe Asn Ser Pro Lys Met Phe Leu Gly Leu Glu Phe Gly Leu Gly Ile Ala Gly Ala Thr Tyr Met Pro Gly Gly Ala Met His Gly Ile Ile Ala Gln Asn Leu Gly Lys Glu Asn Ser Leu Phe Gln Leu Leu Val Lys Val Gly Phe Arg Phe Gly Phe Leu His Asn Glu Ile Thr Phe Gly Leu Lys Phe Pro Val Ile Pro Asn Lys Arg Thr Glu Ile Ile Asp Gly Leu Ser Thr Thr Thr Leu Trp His Arg Leu Pro Val Ala Tyr Phe Asn Tyr Ile Tyr Asn Phe (2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1516 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...1463 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

AAAGTATTTG
CATTTATCAA
TCTCATTTTA
GGAGGCATGC

Met Lys .

CAG TTA AGC

LysAlaSer Val PhePheGly AlaPhe Leu SerSer Leu Gln Leu Ser GAA AAG CAA

GlnGlyPhe Ala LeuAsnGly PheVal Asp SerSer Thr Glu Lys Gln AAC CAT GGT

IleGlyPhe Gln LysIleAsn LysGlu Arg IleTyr Pro Asn His Gly TTC ACG CTT

MetGlnGln Ala IleAlaGly TyrLeu Gly GlyPhe Ser Phe Thr Leu AAA GTT GGC

LeuLeuPro Lys SerAspHis ValLeu Lys LysIle Gly Lys Val Gly GGA ATT AAG

GlyMetVal Ser PheTyrAsp GlyThr Lys PheGlu Asp Gly Ile Lys GCT AAC GGG

SerSerVal Tyr LeuPheGly TyrTyr Asp PheMet Gly Ala Asn Gly AAC TTA ACA

GlyTyrThr Ile GlnSerAsp AspLeu Ala GlnAsn Met Asn Leu Thr Lys Tyr Asn Lys Asn Val Arg Asn Tyr Val Phe Ser Asp Ala Tyr Leu Glu Tyr Ala Tyr Lys Asn Tyr Phe Glu Ile Lys Ala Gly Arg Tyr Leu Ser Thr Met Pro Tyr Lys Ser Gly Gln Thr Gln Gly Phe Gln Ile Ser Gly Gln Tyr Lys Lys Ala Arg Leu Thr Trp Phe Ser Ser Phe Gly Arg Ala Phe Ala Tyr Gly Ser Phe Leu Met Asp Trp Phe Ala Ala Arg Thr ' 195 200 205 210 Thr Tyr Ser Gly Gly Phe Thr Lys Asn Asp Lys Gly Gly Tyr Asp Ser His Gly Arg Lys Val Leu Tyr Gly Thr His Ala Val Gln Leu Thr Tyr Lys Pro His Arg Phe Leu Ile Glu Gly Phe Tyr Tyr Leu Ser Pro Gln Ile Phe Asn Ala Pro Gly Val Lys Ile Gly Trp Asp Ser Asn Pro Asn Phe Ser Gly Thr Gly Phe Arg Ser Asp Thr Ala Ile Ile Gly Phe Phe Pro Ile Tyr Tyr Pro Trp Met Ile Val Lys Ser Asn Gly Ser Pro Val Tyr Lys Tyr Asp Thr Pro Ala Thr Gln Asn Gly Gln Asn Leu Ile Ile Leu G1n Arg Phe Asp Ile Asn Asn Tyr Asn Val Ser Ile Ala Phe Tyr Lys Val Phe Gln Asn Ala Asn Gly Trp Ile Gly Asn Met Gly Asn Pro ~ 340 345 350 Ser Gly Val Ile Met Gly Ser Asn Ser Val Tyr Ala Gly Phe Thr Gly Thr Ala Leu Lys Arg Asp Ala Ala Thr Ile Phe Leu Ser Cys Gly Gly ThrHis PheAla LysLysPhe Thr Trp Phe Thr GlnTyrSer Lys Ala GCG ATC

AsnSer ValVal SerTrpGlu Ala Arg Met Ser LeuGlyTyr Ala Ile GTG CTT

LysPhe ThrGlu TyrLeuSer Gly Ser Asp Ala TyrTyrGly Val Leu GGT AAC

ValTyr ThrAsn LysGlyPhe Lys Pro Glu Gly ProValPro Gly Asn AGG GCG

LysAsp PhePro AlaLeuTyr Ser Asp Ser Leu TyrThrAla Arg Ala CTATGATTAT G
GGTGGGCGTC

Leu Val Ala Ser Phe (2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 471 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
Met Lys Lys Ala Ser Gln Val Leu Phe Phe Gly Ala Phe Leu Ser Ser Ser Leu Gln Gly Phe Glu Ala Lys Leu Asn Gly Phe Val Asp Gln Ser Ser Thr Ile Gly Phe Asn Gln His Lys Ile Asn Lys Glu Arg Gly Ile Tyr Pro Met Gln Gln Phe Ala Thr Ile Ala Gly Tyr Leu Gly Leu Gly Phe Ser Leu Leu Pro Lys Lys Val Ser Asp His Val Leu Lys Gly Lys Ile Gly Gly Met Val Gly Ser Ile Phe Tyr Asp Gly Thr Lys Lys Phe Glu Asp Ser Ser Val Ala Tyr Asn Leu Phe Gly Tyr Tyr Asp Gly Phe Met Gly Gly Tyr Thr Asn Ile Leu Gln Ser Asp Asp Leu Ala Thr Gln Asn Met Lys Tyr Asn Lys Asn Val Arg Asn Tyr Val Phe Ser Asp Ala Tyr Leu Glu Tyr Ala Tyr Lys Asn Tyr Phe Glu Ile Lys Ala Gly Arg Tyr Leu Ser Thr Met Pro Tyr Lys Ser Gly Gln Thr Gln Gly Phe Gln Ile Ser Gly Gln Tyr Lys Lys Ala Arg Leu Thr Trp Phe Ser Ser Phe Gly Arg Ala Phe Ala Tyr Gly Ser Phe Leu Met Asp Trp Phe Ala Ala Arg Thr Thr Tyr Ser Gly Gly Phe Thr Lys Asn Asp Lys Gly Gly Tyr Asp Ser His Gly Arg Lys Val Leu Tyr Gly Thr His Ala Val Gln Leu Thr Tyr Lys Pro His Arg Phe Leu Ile Glu Gly Phe Tyr Tyr Leu Ser Pro Gln Ile Phe Asn Ala Pro Gly Val Lys Ile Gly Trp Asp Ser Asn Pro Asn Phe Ser Gly Thr Gly Phe Arg Ser Asp Thr Ala Ile Ile Gly Phe Phe Pro Ile Tyr Tyr Pro Trp Met Ile Val Lys Ser Asn Gly Ser Pro Val Tyr Lys Tyr Asp Thr Pro Ala Thr Gln Asn Gly Gln Asn Leu Ile Ile Leu Gln Arg Phe Asp Ile Asn Asn Tyr Asn Val Ser Ile Ala Phe Tyr Lys Val Phe Gln Asn Ala Asn Gly Trp Ile Gly Asn Met Gly Asn Pro Ser Gly Val Ile Met Gly Ser Asn Ser Val Tyr Ala Gly Phe Thr Gly Thr Ala Leu Lys Arg Asp Ala Ala Thr Ile Phe Leu Ser Cys Gly Gly Thr His Phe Ala Lys Lys Phe Thr Trp Lys Phe Ala Thr Gln Tyr Ser Asn Ser Val Val Ser Trp Glu Ala Arg Ala Met Ile Ser Leu Gly Tyr Lys Phe Thr Glu Tyr Leu Ser Gly Ser Val Asp Leu Ala Tyr Tyr Gly Val Tyr Thr Asn Lys Gly Phe Lys Pro Gly Glu Asn Gly Pro Val Pro Lys Asp Phe Pro Ala Leu Tyr Ser Asp Arg Ser Ala Leu Tyr Thr Ala Leu Val Ala Ser Phe (2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 377 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence (B) LOCATION: 87...323 (D) OTHER INFORMATION:
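The translation of this coding sequence (SEQ ID NO:20) is listed in three-letter amino acid code and contains one undetermined residue, Xaa. Sequence-analysis tools generally expect one-letter code, so a small conversion helper may be useful. The sketch below uses the standard IUPAC one-letter symbols; the helper name `to_one_letter` is hypothetical, and rejecting unknown tokens is a design choice made here so that transcription errors in a three-letter listing surface as errors instead of being silently dropped.

```python
# Standard IUPAC three-letter to one-letter amino acid codes, plus Xaa (unknown).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_sequence: str) -> str:
    """Convert a whitespace-separated three-letter sequence to one-letter code."""
    residues = []
    for position, token in enumerate(three_letter_sequence.split(), start=1):
        if token not in THREE_TO_ONE:
            # Anything that is not a recognised residue code is reported, not skipped.
            raise ValueError(f"unrecognised residue {token!r} at position {position}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


# First nine residues of SEQ ID NO:20 as given in the listing
print(to_one_letter("Met Lys Lys Thr Lys Lys Thr Ile Leu"))
```

Because unknown tokens raise an error that names the position, the helper also doubles as a quick check that a transcribed sequence contains only valid residue codes.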
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

CGCATTGTAA
ATAAATTCTC
ATTTTGATAC
ATTTTTACAA

CATCTT
ATG
AAA
AAA
ACG
AAA
AAA
ACG
ATT
CTG

Met Lys Lys Thr Lys Lys Thr Ile Leu GCG GGC

LeuSer LeuThrLeu Ala Ser Leu Leu His Ala Glu Asp Asn Ala Gly GGT GTG

ValPhe LeuSerVal Tyr Gln Ile Gly Glu Ala Val Gln Lys Gly Val GTG TTA

LysAsn AlaAspLys Gln Lys Leu Ser Asp Thr Tyr Glu Gln Val Leu AAC GCG

SerArg LeuLeuThr Asp Asn Gly Thr Asn Ser Lys Thr Ser Asn Ala GGT AGCCGGTG

Gln Xaa Gln Pro Ser Gly (2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 79 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
Met Lys Lys Thr Lys Lys Thr Ile Leu Leu Ser Leu Thr Leu Ala Ala Ser Leu Leu His Ala Glu Asp Asn Gly Val Phe Leu Ser Val Gly Tyr Gln Ile Gly Glu Ala Val Gln Lys Val Lys Asn Ala Asp Lys Val Gln Lys Leu Ser Asp Thr Tyr Glu Gln Leu Ser Arg Leu Leu Thr Asn Asp Asn Gly Thr Asn Ser Lys Thr Ser Ala Gln Xaa Gln Pro Ser Gly (2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2169 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 60...2039 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Met Ile Tyr Trp Leu Tyr Leu Ala Val Phe Phe Leu Leu Ser Ala Leu Asp Ala Lys Glu Ile Ala Met Gln Arg Phe Asp Lys Gln Asn His Lys Ile Phe G1u Ile Leu Ala Asp Lys Val Ser Ala Lys Asp Asn Val Ile Thr Ala Ser Gly Asn Ala Ile Leu Leu Asn Tyr Asp Val Tyr Ile Leu Ala Asp Lys Val Arg Tyr Asp Thr Lys Thr Lys Glu Ala Leu Leu Glu ' 65 70 75 80 Gly Asn Ile Lys Val Tyr Arg Gly Glu Gly Leu Leu Val Lys Thr Asp Tyr Val Lys Leu Ser Leu Asn Glu Lys Tyr Glu Ile Ile Phe Pro Phe WO

Tyr Asp ValSer GlyIleTrp Val AlaAspIle Ala Val Ser Ser Gln AAG CAA TAT AAA ATG

Ser Gly Asp LysTyr LysValLys Asn SerThrSer Gly Lys Gln Met ATT AAC GCG

Cys Ser Asp ProIle TrpHisVal Asn ThrSerGly Ser Ile Asn Ala ATG AAA AAT

Phe Asn Gln SerHis LeuSerMet Trp ProLysIle Tyr Met Lys Asn GAT CCT ATT

Val Gly Ile ValLeu TyrLeuPro Tyr PheMetSer Thr Asp Pro Ile AAA ACT GAG

Ser Asn Arg ThrGly PheLeuTyr Pro PheGlyThr Ser Lys Thr Glu GAC TTT TAT

Asn Leu Gly IleTyr LeuGlnPro Phe LeuAlaPro Lys Asp Phe Tyr TGG ATG CGC

Asn Ser Asp ThrPhe ThrProGln Ile TyrLysArg Gly Trp Met Arg TTG TTT TCT

Phe Gly Asn GluAla ArgTyrIle Asn LysAsnAsp Arg Leu Phe Ser TTC GCG ACC

Phe Leu Asn ArgTyr PheArgAsn Tyr GlnTyrVal Lys Phe Ala Thr GAT AGG TTT

Arg Tyr Leu AsnGln AsnIleTyr Gly GluPheLeu Ser Asp Arg Phe AGG ACT CTT

_ Ser Ser Asp LeuGln LysTyrPhe His LysSerAsn Ile Arg Thr Leu GGG TAC AAC

Asp Asn His IleAsp PheLeuTyr Met LeuAsp Tyr Gly Tyr Asn Asp TTT AAG AAG GAC

Val Arg Glu ValAsn LysArgIle Thr AlaThrHis Met Phe Lys Asp Ser Arg Ala Asn Tyr Tyr Leu Gln Thr Glu Asn Asn Tyr Tyr Gly Leu Asn Ile Lys Tyr Phe Leu Asn Leu Asn Lys Ile Asn Asn Asn Arg Thr Phe Gln Ser Val Pro Asn Leu Gln Tyr His Lys Tyr Leu Asn Ser Leu Tyr Phe Arg Asn Leu Leu Tyr Ser Val Asp Tyr Gln Phe Arg Asn Thr Ala Arg Glu Ile Gly Tyr Gly Tyr Val Gln Asn Ala Leu Asn Vai Pro Val Gly Leu Gln Phe Ser Leu Phe Lys Lys Tyr Leu Ser Leu Gly Leu Trp Asn Asp Leu Gln Leu Ser Asn Val Ala Leu Met Gln Ser Lys Asn Ser Phe Val Pro Thr Ile Pro Asn Glu Ser Arg Glu Phe Gly Asn Phe Val Ser Ser Asn Phe Ser Met Tyr Val Asn Thr Asp Leu Ala Arg Glu Tyr Asn Lys Leu Phe His Thr Ile Gln Leu Glu Ala Ile Phe Asn Ile Pro Tyr Tyr Thr Phe Lys Asn Gly Leu Phe Ser Gln Asn Met Tyr Ala Leu Ser Ala Gln Ala Leu Asn Ser Tyr Thr Ser Pro Leu Leu Arg Asp Tyr Asp Tyr Gln Gly Arg Leu Tyr Asp Ser Val Trp Asn Pro Ser Ser Ile Leu Pro Ser Asn Ala Ser Asn Lys Thr Val Asp Leu Thr Leu Thr Gln Tyr Leu Tyr Gly Leu Gly Gly Gln Glu Leu Leu Tyr Phe Lys Ile Ser Gln Leu Ile Asn Leu Asp Asp Lys Val Ser Pro Phe Arg Met Pro Leu Glu Ser Lys Ile Gly Phe Ser Pro Leu Thr Gly Leu Asn Ile Phe AAT TTT TTT CGC GAA ATC

Gly Val Tyr Ser Tyr Gln Asn Leu Glu Ser Asn Phe Phe Arg Glu Ile AAC AAT CGC AGC AAC TCT

Val Ala Tyr Gln Lys Phe Leu Phe Leu Tyr Asn Asn Arg Ser Asn Ser TTA AAC AGC AAT ATT GAA

Phe Lys Asn Phe Ser Gly Ile Ser Val Asn Leu Asn Ser Asn Ile Glu CGG ATT TTTTAGCAAC
GACTTTGGCT
ATTTTTCCAT

Leu Ile Arg Ile (2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 660 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
Met Ile Tyr Trp Leu Tyr Leu Ala Val Phe Phe Leu Leu Ser Ala Leu Asp Ala Lys Glu Ile Ala Met Gln Arg Phe Asp Lys Gln Asn His Lys Ile Phe Glu Ile Leu Ala Asp Lys Val Ser Ala Lys Asp Asn Val Ile Thr Ala Ser Gly Asn Ala Ile Leu Leu Asn Tyr Asp Val Tyr Ile Leu Ala Asp Lys Val Arg Tyr Asp Thr Lys Thr Lys Glu Ala Leu Leu Glu Gly Asn Ile Lys Val Tyr Arg Gly Glu Gly Leu Leu Val Lys Thr Asp Tyr Val Lys Leu Ser Leu Asn Glu Lys Tyr Glu Ile Ile Phe Pro Phe Tyr Val Gln Asp Ser Val Ser Gly Ile Trp Val Ser Ala Asp Ile Ala Ser Gly Lys Asp Gln Lys Tyr Lys Val Lys Asn Met Ser Thr Ser Gly Cys Ser Ile Asp Asn Pro Ile Trp His Val Asn Ala Thr Ser Gly Ser Phe Asn Met Gln Lys Ser His Leu Ser Met Trp Asn Pro Lys Ile Tyr Val Gly Asp Ile Pro Val Leu Tyr Leu Pro Tyr Ile Phe Met Ser Thr Ser Asn Lys Arg Thr Thr Gly Phe Leu Tyr Pro Glu Phe Gly Thr Ser Asn Leu Asp Gly Phe Ile Tyr Leu Gln Pro Phe Tyr Leu Ala Pro Lys Asn Ser Trp Asp Met Thr Phe Thr Pro Gln Ile Arg Tyr Lys Arg Gly Phe Gly Leu Asn Phe Glu Ala Arg Tyr Ile Asn Ser Lys Asn Asp Arg Phe Leu Phe Asn Ala Arg Tyr Phe Arg Asn Tyr Thr Gln Tyr Val Lys Arg Tyr Asp Leu Arg Asn Gln Asn Ile Tyr Gly Phe Glu Phe Leu Ser Ser Ser Arg Asp Thr Leu Gln Lys Tyr Phe His Leu Lys Ser Asn Ile Asp Asn Gly His Tyr Ile Asp Phe Leu Tyr Met Asn Asp Leu Asp Tyr Val Arg Phe Glu Lys Val Asn Lys Arg Ile Thr Asp Ala Thr His Met Ser Arg Ala Asn Tyr Tyr Leu Gln Thr Glu Asn Asn Tyr Tyr Gly Leu Asn Ile Lys Tyr Phe Leu Asn Leu Asn Lys Ile Asn Asn Asn Arg Thr Phe Gln Ser Val Pro Asn Leu Gln Tyr His Lys Tyr Leu Asn Ser Leu Tyr Phe Arg Asn Leu Leu Tyr Ser Val Asp Tyr Gln Phe Arg Asn Thr Ala Arg Glu Ile Gly Tyr Gly Tyr Val Gln Asn Ala Leu Asn Val Pro Val Gly Leu Gln Phe Ser Leu Phe Lys Lys Tyr Leu Ser Leu Gly Leu Trp Asn Asp Leu Gln Leu Ser Asn Val Ala Leu Met Gln Ser Lys Asn Ser Phe Val Pro Thr Ile Pro Asn Glu Ser Arg Glu Phe Gly Asn Phe Val Ser Ser Asn Phe Ser Met Tyr Val Asn Thr Asp Leu Ala Arg Glu Tyr Asn Lys Leu Phe His Thr Ile Gln Leu Glu Ala
Ile Phe Asn Ile Pro Tyr Tyr Thr Phe Lys Asn Gly Leu Phe Ser Gln Asn Met Tyr Ala Leu Ser Ala Gln Ala Leu Asn Ser Tyr Thr Ser Pro Leu Leu Arg Asp Tyr Asp Tyr Gln Gly Arg Leu Tyr Asp Ser Val Trp Asn Pro Ser Ser Ile Leu Pro Ser Asn Ala Ser Asn Lys Thr Val Asp Leu Thr Leu Thr Gln Tyr Leu Tyr Gly Leu Gly Gly Gln Glu Leu Leu Tyr Phe Lys Ile Ser Gln Leu Ile Asn Leu Asp Asp Lys Val Ser Pro Phe Arg Met Pro Leu Glu Ser Lys Ile Gly Phe Ser Pro Leu Thr Gly Leu Asn Ile Phe Gly Asn Val Phe Tyr Ser Phe Tyr Gln Asn Arg Leu Glu Glu Ile Ser Val Asn Ala Asn Tyr Gln Arg Lys Phe Leu Ser Phe Asn Leu Ser Tyr Phe Leu Lys Asn Asn Phe Ser Ser Gly Ile Asn Ser Ile Val Glu Asn Leu Arg Ile Ile (2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 454 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...401 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

CTTGAAACTC AAA
AATTTTTGAA
TCTCAATTTT

Met Lys Gly Phe Val Met Ser Gly Leu Lys Ala Phe Ser Cys Val Val Val Leu Cys Gly Ala Met Ala Asn Thr Ala Ile Ala Gly Pro Lys Ile Glu Ala Arg Gly Glu Phe Gly Arg Phe Trp Gly Gly Ala Val Gly Gly Ala Ile Gly Gly Gly Val Gly Gly Ala Val Gly Gly Ala Val Gly Gly Pro Ala Gly Gly Trp Ala Gly Arg Leu Val Gly Gly Ser Val Gly Arg Glu Phe Gly Arg Glu Ile Gly Asp Arg Val Glu Asp Tyr Ile Arg Gly Val Asp Arg Glu Pro Gln Ala Pro Arg Glu Pro Thr Tyr Asp Arg His Phe Val Tyr Asp Arg (2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 117 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
Met Lys Gly Phe Val Met Ser Gly Leu Lys Ala Phe Ser Cys Val Val Val Leu Cys Gly Ala Met Ala Asn Thr Ala Ile Ala Gly Pro Lys Ile Glu Ala Arg Gly Glu Phe Gly Arg Phe Trp Gly Gly Ala Val Gly Gly Ala Ile Gly Gly Gly Val Gly Gly Ala Val Gly Gly Ala Val Gly Gly Pro Ala Gly Gly Trp Ala Gly Arg Leu Val Gly Gly Ser Val Gly Arg Glu Phe Gly Arg Glu Ile Gly Asp Arg Val Glu Asp Tyr Ile Arg Gly Val Asp Arg Glu Pro Gln Ala Pro Arg Glu Pro Thr Tyr Asp Arg His Phe Val Tyr Asp Arg (2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 856 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence (B) LOCATION: 59...802 (D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

AAA
AGA

Met Leu Lys Arg Met Ile Leu Leu Gly Ala Leu Gly Val Leu Ala Ser Ala Glu Glu Ser Ala Ala Phe Val Gly Val Asn Tyr Gln Val Ser Met Ile Gln Asn Gln Thr Lys Met Val Asn Asp Asn Gly Leu Gln Lys Pro Leu Ile Lys Phe Pro Pro Tyr Ala Gly Ala Gly Phe Glu Val Gly Tyr Lys Gln Phe Phe Gly Lys Lys Lys Trp Phe Gly Met Arg Tyr Tyr Gly Phe Phe Asp Tyr Ala His Asn Arg Phe Gly Val Met Lys Lys Gly Ile Pro Val Gly Asp Ser Gly Phe Ile Tyr Asn Ser Phe Ser Phe Gly Gly Asn Thr Leu Thr Glu Arg Asp Ser Tyr Gln Gly Gln Tyr Tyr Val Asn Leu Phe Thr Tyr Gly Val Gly Leu Asp Thr Leu Trp Asn Phe Val Asn Lys Glu Asn Met Val Phe Gly Phe Val Val Gly Ile Gln Leu Ala Gly Asp Ser Trp Ala Thr Ser Ile Ser Lys Glu Ile Ala His Tyr Ala Lys His His Ser Asn Ser Ser Tyr Ser Pro Ala Asn Phe Gln Phe Leu Trp Lys Phe Gly Val Arg Thr His Ile Ala Lys His Asn Ser Leu Glu Leu Gly Ile Lys Val Pro Thr Ile Thr His Gln Leu Phe Ser Leu Thr Asn Glu Lys Gly Tyr Thr Leu Gln Ala Asp Val Arg Arg Val Tyr Ala Phe Gln Ile Ser Tyr Leu Arg Asp Phe (2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 248 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
Met Leu Lys Arg Met Ile Leu Leu Gly Ala Leu Gly Val Leu Ala Ser Ala Glu Glu Ser Ala Ala Phe Val Gly Val Asn Tyr Gln Val Ser Met Ile Gln Asn Gln Thr Lys Met Val Asn Asp Asn Gly Leu Gln Lys Pro Leu Ile Lys Phe Pro Pro Tyr Ala Gly Ala Gly Phe Glu Val Gly Tyr Lys Gln Phe Phe Gly Lys Lys Lys Trp Phe Gly Met Arg Tyr Tyr Gly Phe Phe Asp Tyr Ala His Asn Arg Phe Gly Val Met Lys Lys Gly Ile Pro Val Gly Asp Ser Gly Phe Ile Tyr Asn Ser Phe Ser Phe Gly Gly Asn Thr Leu Thr Glu Arg Asp Ser Tyr Gln Gly Gln Tyr Tyr Val Asn Leu Phe Thr Tyr Gly Val Gly Leu Asp Thr Leu Trp Asn Phe Val Asn Lys Glu Asn Met Val Phe Gly Phe Val Val Gly Ile Gln Leu Ala Gly Asp Ser Trp Ala Thr Ser Ile Ser Lys Glu Ile Ala His Tyr Ala Lys His His Ser Asn Ser Ser Tyr Ser Pro Ala Asn Phe Gln Phe Leu Trp Lys Phe Gly Val Arg Thr His Ile Ala Lys His Asn Ser Leu Glu Leu Gly Ile Lys Val Pro Thr Ile Thr His Gln Leu Phe Ser Leu Thr Asn Glu Lys Gly Tyr Thr Leu Gln Ala Asp Val Arg Arg Val Tyr Ala Phe Gln Ile Ser Tyr Leu Arg Asp Phe (2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2750 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 69...2699 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

Met Leu Arg Asn Gln Phe Arg Ile Val Phe Val Ser Cys Ile Val Ala Ser Asn Leu Gln Ala Gln Glu Thr Thr His Thr Leu Gly Lys Val Thr Thr Lys Gly Glu Arg Thr Phe Glu Tyr Asn Asn Lys Met Tyr Ile Asp Arg Lys Glu Leu Gln Gln Arg Gln Ser Asn Gln Ile Arg Asp Ile Phe Arg Thr Arg Ala Asp Val Asn Val Ala Ser Gly Gly Leu Met Ala Gln Lys Ile Tyr Val Arg Gly Ile Glu Ser Arg Leu Leu Arg Val ..

Thr Ile Asp Gly Val Ala Gln Asn Gly Asn Ile Phe His His Asp Ala Asn Thr Val Ile Asp Pro Asn Met Ile Lys Glu Val Glu Val Ile Lys Gly Ala Ala Asn Ala Ser Ala Gly Pro Gly Ala Val Ala Gly Lys Leu Ser Phe Thr Thr Ile Asp Ala Asn Asp Phe Leu Arg Lys Asn Gln Thr Tyr Gly Ala Lys Ala Giu Ala Ala Phe Tyr Thr Asn Phe Gly Tyr Arg 160 165 3.70 Met Asn Ala Thr Ala Ala Tyr Arg Gly Lys Asn Trp Asp Ile Leu Ala Tyr Tyr Asn His Gln Asn Ile Phe Tyr Tyr Arg Asp Gly Asn Asn Ala Phe Arg Asn Val Phe His Pro Asn Tyr Asp Leu Gln Asp Pro Ser Asn ~ AGC GAT ATG AGC GTA GGG ACT CCC AGT GAA GTC AAT AGE GTT TTA GCT 782 Ser Asp Met Ser Val Gly Thr Pro Ser Glu Val Asn Ser Val Leu Ala Lys Ile Asn Gly Tyr Ile Asn Glu Thr Asp Ser Ile Ser Val Ser Tyr Asn Leu Thr Arg Asp Asn Ser Thr Arg Leu Leu Arg Pro Asn Thr Thr GCC AAA GGA CAG TTT
GCC

Ser LeuSer AlaAsn AspPro Ser Pro Ala Pro Ala Lys Gly Gln Phe ATT GGG CAT ATC CAC

Val AspPhe LysGlu LeuAla Thr Asn Phe Asn Ile Gly His Ile His TTG AAA GGC CCT CAG

Asn SerLeu TyrLys HisGlu Gly Asn Phe Asn Leu Lys Gly Pro Gln 305 310 315 ', CGC TCC GGG AGG TAT

Pro ValGlu ThrAla PheLeu Val Gly Gly Asn Arg Ser Gly Arg Tyr CCT AAT AAT AAC AAC

Asn ValVal ProPhe AlaTyr Ser Glu Pro Ala Pro Asn Asn Asn Asn GAT CCT TGG AAT AAT

Pro TyrIle GluVal LysGlu Cys Asn Pro Asp Asp Pro Trp Asn Asn AGC ACG AGG TCT TAT

Ile GlnCys GlnGly AlaIle Pro Asn Gly Gly Ser Thr Arg Ser Tyr ATA GGC ATT TGG AGC

Gln GlyTyr ThrPro AsnSer Asn Gln Gly Thr Ile Gly Ile Trp Ser TCT GGG TAT CAG ATT

Asp SerGly AlaGln AlaGly Gly Leu Asn Ala Ser Gly Tyr Gln Ile ACA AAC CTT CCT GAT

Ser SerAla ValTyr HisGly Val Lys Asn Pro Thr Asn Leu Pro Asp GAC CCC AAC AGC TGG

Tyr MetThr ProAsn AlaGln Pro Ala Asn Asp Asp Pro Asn Ser Trp TTA GCG ACT GCC TTT

Thr GlyA.sn AspAla GluGly Leu Arg Arg Ile Leu Ala Thr Ala Phe ATC GGC GTA CAC GAA

Leu AsnSer ValAsn PheLys Thr Pro Ile Ser Ile Gly Val His Glu TAT GTG ATG TAT AGC

Asp GlyAsn PheGlu TyrGly Ile Gln Asn Leu Tyr Val Met Tyr Ser Val Phe Ser Gly Leu Asp Lys Gly Lys Asn Gly Tyr Tyr Lys Asn Asn Ile Asp Pro Asn Asp Pro Asn Gly Pro Gly Leu Pro Tyr Arg His Tyr Tyr Thr Asp Gln Ser Ser Gln Tyr Pro Gln Asn Leu Asn Thr Pro Asn Pro Leu Tyr Arg Asn Met Pro Gln Asn Ser His Ala Ile Gly Asn Ile Ile Gly Gly Phe Met Gln Ala Asn Tyr Asn Ile Leu Ser Asn Val Ile Val Gly Ala Gly Thr Arg Tyr Asp Ile Tyr Thr Leu Leu Asp Lys Asn Gly Arg Thr His Val Thr Ser Gly Phe Ser Pro Ser Ala Thr Val Leu Tyr Asn Pro Ile Glu Ser Ile Gly Leu Lys Val Ser Tyr Ala Tyr Val Thr Lys Gly Ala Leu Pro Gly Asp Gly Val Leu Met Arg Asp Pro Thr 625 630 ' 635 Val Ile Tyr Gln Arg Asn Leu Arg Pro Ala Ile Gly Gln Asn Val Glu Phe Asn Val Asp Phe Asn Ser Lys Tyr Phe Asn Val Arg Gly Ala Ala Phe Tyr Gln Val Ile Asn Asn Phe Ile Asn Ser Tyr Gly Gln Asp Thr Ser Lys Asn Gly Gly Gly Asn Ala Thr Ala Lys Asn Met Ser Gly Asn Leu Pro Glu Thr Ile Asn Ile Tyr Gly Tyr Glu Val Ser Gly Asn Val WO

AAG ACT TGG

Arg Tyr Asn Phe Leu Gly PheSer ValAlaArg Ser Pro Lys Thr Trp AGG GCG GCA

Thr Ala Gly His Leu Leu AspThr TyrAlaLeu Ala Thr Arg Ala Ala AAT AAA AGG

Thr Gly Val Phe Ile Leu AlaAsp TyrAspVal Arg Trp Asn Lys Arg 755 760 765 ' ACT TCG TAT

Gly Leu Leu Thr Trp Leu ArgPhe ValThrAsn Met Tyr ' Thr Ser Tyr TAT CCG ATC

Glu Gly Ser Ile Tyr Tyr GlnTyr GlyLeuIle Lys His Tyr Pro Ile GGG AAT CCG

Lys Pro Tyr Gly Val His ValPhe IleAsnTrp Thr Pro Gly Asn Pro AAA AGG AAT

Ser Lys Trp Gln Gly Leu IleSer AlaValPhe Asn Ile Lys Arg Asn AAG CAA AGC

Leu Asn Gln Tyr Val Asp ThrSer ValPheGln Ala Ala Lys Gln Ser CCA ATC GCG

' Asp Ala Ala Ser Asp Met ProLys GlyLysArg Met Leu Pro Ile Ala CCT CGT TTC

Pro Ala Gly Phe Asn Ala PheGlu ValSerTyr Gln Pro Arg Phe CTTAGGATTT
CTTTTTGAAT

(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

Met Leu Arg Asn Gln Phe Arg Ile Val Phe Val Ser Cys Ile Val Ala Ser Asn Leu Gln Ala Gln Glu Thr Thr His Thr Leu Gly Lys Val Thr Thr Lys Gly Glu Arg Thr Phe Glu Tyr Asn Asn Lys Met Tyr Ile Asp Arg Lys Glu Leu Gln Gln Arg Gln Ser Asn Gln Ile Arg Asp Ile Phe Arg Thr Arg Ala Asp Val Asn Val Ala Ser Gly Gly Leu Met Ala Gln Lys Ile Tyr Val Arg Gly Ile Glu Ser Arg Leu Leu Arg Val Thr Ile Asp Gly Val Ala Gln Asn Gly Asn Ile Phe His His Asp Ala Asn Thr Val Ile Asp Pro Asn Met Ile Lys Glu Val Glu Val Ile Lys Gly Ala Ala Asn Ala Ser Ala Gly Pro Gly Ala Val Ala Gly Lys Leu Ser Phe Thr Thr Ile Asp Ala Asn Asp Phe Leu Arg Lys Asn Gln Thr Tyr Gly Ala Lys Ala Glu Ala Ala Phe Tyr Thr Asn Phe Gly Tyr Arg Met Asn Ala Thr Ala Ala Tyr Arg Gly Lys Asn Trp Asp Ile Leu Ala Tyr Tyr Asn His Gln Asn Ile Phe Tyr Tyr Arg Asp Gly Asn Asn Ala Phe Arg Asn Val Phe His Pro Asn Tyr Asp Leu Gln Asp Pro Ser Asn Ser Asp Met Ser Val Gly Thr Pro Ser Glu Val Asn Ser Val Leu Ala Lys Ile Asn Gly Tyr Ile Asn Glu Thr Asp Ser Ile Ser Val Ser Tyr Asn Leu Thr Arg Asp Asn Ser Thr Arg Leu Leu Arg Pro Asn Thr Thr Ser Ala Leu Ser Lys Ala Asn Asp Pro Gly Ser Gln Pro Ala Pro Phe Val Ile Asp Phe Gly Lys Glu Leu Ala His Thr Ile Asn Phe Asn His Asn Leu Ser Leu Lys Tyr Lys His Glu Gly Gly Pro Asn Phe Asn Gln Pro Arg Val Glu Ser Thr Ala Phe Leu Gly Val Arg Gly Gly Asn Tyr Asn Pro Val Val Asn Pro Phe Ala Tyr Asn Ser Asn Glu Pro Ala Asn Pro Asp Tyr Ile Pro Glu Val Lys Glu Trp Cys Asn Asn Pro Asp Asn Ile Ser Gln Cys Thr Gln Gly Ala Ile Arg Pro Ser Asn Gly Gly Tyr Gln Ile Gly Tyr Gly Thr Pro Asn Ser Ile Asn Trp Gln Gly Thr Ser Asp Ser Ser Gly Gly Ala Gln Ala Gly Tyr Gly Gln Leu Asn Ala Ile Ser Thr Ser Ala Asn Val Tyr His Gly Leu Val Pro Lys Asn Pro Asp Tyr Asp Met Thr Pro Pro Asn Ala Gln Asn Pro Ser Ala Asn Asp Trp Thr Leu Gly Asn Ala Asp Ala Glu Gly Thr Leu Ala Arg Arg Ile Phe Leu Ile Asn Ser Gly Val Asn Phe Lys Val Thr His Pro Ile Ser Glu Asp Tyr Gly Asn Val Phe Glu Tyr Gly Met Ile Tyr Gln
Asn Leu Ser Val Phe Ser Gly Leu Asp Lys Gly Lys Asn Gly Tyr Tyr Lys Asn Asn Ile Asp Pro Asn Asp Pro Asn Gly Pro Gly Leu Pro Tyr Arg His Tyr Tyr Thr Asp Gln Ser Ser Gln Tyr Pro Gln Asn Leu Asn Thr Pro Asn Pro Leu Tyr Arg Asn Met Pro Gln Asn Ser His Ala Ile Gly Asn Ile Ile Gly Gly Phe Met Gln Ala Asn Tyr Asn Ile Leu Ser Asn Val Ile Val Gly Ala Gly Thr Arg Tyr Asp Ile Tyr Thr Leu Leu Asp Lys Asn Gly Arg Thr His Val Thr Ser Gly Phe Ser Pro Ser Ala Thr Val Leu Tyr Asn Pro Ile Glu Ser Ile Gly Leu Lys Val Ser Tyr Ala Tyr Val Thr Lys Gly Ala Leu Pro Gly Asp Gly Val Leu Met Arg Asp Pro Thr Val Ile Tyr Gln Arg Asn Leu Arg Pro Ala Ile Gly Gln Asn Val Glu Phe Asn Val Asp Phe Asn Ser Lys Tyr Phe Asn Val Arg Gly Ala Ala Phe Tyr Gln Val Ile Asn Asn Phe Ile Asn Ser Tyr Gly Gln Asp Thr Ser Lys Asn Gly Gly Gly Asn Ala Thr Ala Lys Asn Met Ser Gly Asn Leu Pro Glu Thr Ile Asn Ile Tyr Gly Tyr Glu Val Ser Gly Asn Val Arg Tyr Lys Asn Phe Leu Gly Thr Phe Ser Val Ala Arg Ser Trp Pro Thr Ala Arg Gly His Leu Leu Ala Asp Thr Tyr Ala Leu Ala Ala Thr Thr Gly Asn Val Phe Ile Leu Lys Ala Asp Tyr Asp Val Arg Arg Trp Gly Leu Thr Leu Thr Trp Leu Ser Arg Phe Val Thr Asn Met Tyr Tyr Glu Gly Tyr Ser Ile Tyr Tyr Pro Gln Tyr Gly Leu Ile Lys Ile His Lys Pro Gly Tyr Gly Val His Asn Val Phe Ile Asn Trp Thr Pro Pro Ser Lys Lys Trp Gln Gly Leu Arg Ile Ser Ala Val Phe Asn Asn Ile Leu Asn Lys Gln Tyr Val Asp Gln Thr Ser Val Phe Gln Ala Ser Ala Asp Ala Pro Ala Ser Asp Met Ile Pro Lys Gly Lys Arg Met Ala Leu Pro Ala Pro Gly Phe Asn Ala Arg Phe Glu Val Ser Tyr Gln Phe (2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 370 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...317 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

CCCGTTACAG ATG
CATCACTGAA CGT
ATCACTAATA

Met Arg AAG GTT TTA TAC CTT GTG GGC TTG TTG GCT TTT AGC GCT TTA 104 GCT TTT

Lys ValLeu Tyr Leu Val Gly LeuLeuAla PheSerAla Leu Ala Phe TTT GCG

Lys AlaAsp Asp Leu Glu Glu AsnGluThr AlaProAla His Phe Ala ATG AAC

Leu AsnHis Pro Gln Asp Leu AlaIleGln GlySerPhe Phe Met Asn TCA AAC

Asp LysAsn Arg Lys Met Ser ThrLeuAsn IleAspTyr Phe Ser Asn TAT CTT

Gln GlyGln Thr Lys Ile Pro AlaLeuCys AspGlyXaa Leu Tyr Leu AAA ATAAGGTGGG
TTTT

Ile ValPhe Phe Thr His Lys TTTTAGAAA

(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 89 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
Met Arg Lys Val Leu Tyr Ala Leu Val Gly Phe Leu Leu Ala Phe Ser Ala Leu Lys Ala Asp Asp Phe Leu Glu Glu Ala Asn Glu Thr Ala Pro Ala His Leu Asn His Pro Met Gln Asp Leu Asn Ala Ile Gln Gly Ser Phe Phe Asp Lys Asn Arg Ser Lys Met Ser Asn Thr Leu Asn Ile Asp Tyr Phe Gln Gly Gln Thr Tyr Lys Ile Pro Leu Ala Leu Cys Asp Gly Xaa Leu Ile Val Phe Phe Lys Thr His
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 357 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...305 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

Met Lys Lys Val Phe Leu Gly Met Ala Leu Ala Phe Ser Val Ser Met Ala Glu Lys Ser Gly Ala Phe Leu Gly Gly Gly Phe Gln Tyr Ser Asn Leu Glu AAC CAA AAC ACC ACC CGC ACC CCA GGC GCT AAC AAT AAC ACC CCG ATA 200 Asn Gln Asn Thr Thr Arg Thr Pro Gly Ala Asn Asn Asn Thr Pro Ile Asp Thr Ser Met Phe Gly Ser Asn Lys Thr Ala Pro Ala Gln Glu Thr Gln Ser Ala Ser Lys Pro Asp Thr Lys Val Asn Pro Ser Ala Ser Trp Met Lys Lys
(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
Met Lys Lys Val Phe Leu Gly Met Ala Leu Ala Phe Ser Val Ser Met Ala Glu Lys Ser Gly Ala Phe Leu Gly Gly Gly Phe Gln Tyr Ser Asn Leu Glu Asn Gln Asn Thr Thr Arg Thr Pro Gly Ala Asn Asn Asn Thr Pro Ile Asp Thr Ser Met Phe Gly Ser Asn Lys Thr Ala Pro Ala Gln Glu Thr Gln Ser Ala Ser Lys Pro Asp Thr Lys Val Asn Pro Ser Ala Ser Trp Met Lys Lys
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 961 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 52...908 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

GAATGTAGCA TTTAGAACTC GAAGGAATAC ATG 56
AAGTAGAGAA AAG
AATGTAGAAG

Met Lys TCT GCT GCA AGC

Lys Val Ile Val Gly IleSer Leu MetThr Leu Leu Ser Ala Ala Ser GCA CAA ATT AGC

Ser Glu Thr Pro Lys GluLys Ala LysThr Pro Thr Ala Gln Ile Ser AAA GCT GGG TAC

Lys Gly Glu Arg Asn AlaPhe Ile IleAsp Gln Leu Lys Ala Gly Tyr ATG GCT TCC AAT

Gly Leu Ser Thr Thr GlnAsn Cys HisGly Cys Asn Met Ala Ser Asn AAT TAC ACG ATG

Gly Gln Ser Gly Ala GlySer Asn ProAsn Pro Thr Asn Tyr Thr Met TCA GGG GGC GGG

Ala Asn Pro Thr Gly PheThr His AlaLeu Thr Arg Ser Gly Gly Gly TAT AAC GCT GGT

Gly Lys Gly Leu Ser GlnGln Tyr IleAsn Phe Gly Tyr Asn Ala Gly GTT CAT AAA CAA

Phe Val Gly Tyr Lys PhePhe Lys SerPro Phe Gly Val His Lys Gln CGT TTT AGC TAT

Met Tyr Tyr Gly Phe AspPhe Ala SerTyr Lys Tyr Arg Phe Ser Tyr ACT GGC GCT GGT

Tyr Tyr Asn Asp Tyr MetArg Asp ArgLys Ser Gln Thr Gly Ala Gly TTC GGG GAT TTT

Ser Met Phe Gly Tyr AlaGly Thr ValLeu Asn Pro Phe Gly Asp Phe Ala Ile Phe Asn Arg Glu Asn Leu His Phe Gly Phe Phe Leu Gly Val Ala Ile Gly Gly Thr Ser Trp Gly Pro Thr Asn Tyr Tyr Phe Lys Asp Leu Ala Asp Glu Tyr Arg Gly Ser Phe His Pro Ser Asn Phe Gln Val TTA GTT AAT GGT GGG ATT CGC TTA GGC ACT AAA CAC CAA GGT TTT GAA 776 Leu Val Asn Gly Gly Ile Arg Leu Gly Thr Lys His Gln Gly Phe Glu Ile Gly Leu Lys Ile Gln Thr Ile Arg Asn Asn Tyr Tyr Thr Ala Ser Ala Asp Asn Val Pro Glu Gly Thr Thr Tyr Arg Phe Thr Phe His Arg Pro Tyr Ala Phe Tyr Trp Arg Tyr Ile Val Ser Phe
(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 286 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
Met Lys Lys Ser Val Ile Val Gly Ala Ile Ser Leu Ala Met Thr Ser Leu Leu Ser Ala Glu Thr Pro Lys Gln Glu Lys Ala Ile Lys Thr Ser Pro Thr Lys Lys Gly Glu Arg Asn Ala Ala Phe Ile Gly Ile Asp Tyr Gln Leu Gly Met Leu Ser Thr Thr Ala Gln Asn Cys Ser His Gly Asn Cys Asn Gly Asn Gln Ser Gly Ala Tyr Gly Ser Asn Thr Pro Asn Met Pro Thr Ala Ser Asn Pro Thr Gly Gly Phe Thr His Gly Ala Leu Gly Thr Arg Gly Tyr Lys Gly Leu Ser Asn Gln Gln Tyr Ala Ile Asn Gly Phe Gly Phe Val Val Gly Tyr Lys His Phe Phe Lys Lys Ser Pro Gln Phe Gly Met Arg Tyr Tyr Gly Phe Phe Asp Phe Ala Ser Ser Tyr Tyr Lys Tyr Tyr Thr Tyr Asn Asp Tyr Gly Met Arg Asp Ala Arg Lys Gly Ser Gln Ser Phe Met Phe Gly Tyr Gly Ala Gly Thr Asp Val Leu Phe Asn Pro Ala Ile Phe Asn Arg Glu Asn Leu His Phe Gly Phe Phe Leu Gly Val Ala Ile Gly Gly Thr Ser Trp Gly Pro Thr Asn Tyr Tyr Phe Lys Asp Leu Ala Asp Glu Tyr Arg Gly Ser Phe His Pro Ser Asn Phe Gln Val Leu Val Asn Gly Gly Ile Arg Leu Gly Thr Lys His Gln Gly Phe Glu Ile Gly Leu Lys Ile Gln Thr Ile Arg Asn Asn Tyr Tyr Thr Ala Ser Ala Asp Asn Val Pro Glu Gly Thr Thr Tyr Arg Phe Thr Phe His Arg Pro Tyr Ala Phe Tyr Trp Arg Tyr Ile Val Ser Phe
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 289 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...236 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

Met Phe Arg Asp Ile Val Asp Ile Leu Ile Ser Val Val Ile Ile Gly Leu Val Leu Thr Ala Ile Arg Ala Thr Ile Met Ala Phe Lys Gly Asp Thr Asp Asp Asp Glu Val Glu Ser Asp Gly Phe Phe Ser Arg Ile Trp Asp Lys Phe Val Glu Tyr Phe Gly Tyr Thr Leu Val Thr Ile
(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 62 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
Met Phe Arg Asp Ile Val Asp Ile Leu Ile Ser Val Val Ile Ile Gly Leu Val Leu Thr Ala Ile Arg Ala Thr Ile Met Ala Phe Lys Gly Asp Thr Asp Asp Asp Glu Val Glu Ser Asp Gly Phe Phe Ser Arg Ile Trp Asp Lys Phe Val Glu Tyr Phe Gly Tyr Thr Leu Val Thr Ile
(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1544 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 52...1491 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:

Met Pro ATT ATT ATA AGG TTG ATT
TCT
TTG

PheCys PheIleLeu IleSer GlyVal Val Glu Ile Ile Leu Arg Leu TAT TTT CTT TCT

LysLys PheSerTyr SerLeu PheLeu Phe Ser Leu Tyr Phe Leu Ser TCC AAA ATG ATT

PheLeu LysLeuGln AlaTyr PheAsn Ser Val Gly Ser Lys Met Ile AGC GGC AAC AGA

LysVal SerTyrThr LysPhe PheAsn Gln Tyr Gln Ser Gly Asn Arg AAA GGT ACT TTA

ProSer AspIleTyr ProThr SerTyr Ser Leu Gly Lys Gly Thr Leu AAT TAC TTG GCG

GluLeu LeuSerMet GlyLeu LysGly Arg Glu Val Asn Tyr Leu Ala ATG TAT ACC TAT

GlyAla MetAlaAla LeuPro AspSer Ala Gln Gly Met Tyr Thr Tyr ATC GGC ACC CCT

AsnAsn ProAsnGly GlnPro SerArg Asp Phe Gly Ile Gly Thr Pro ATC GGT GCG CAT

AlaGly PheTrpGln TyrIle TrpTyr Gly Ser Gly Ile Gly Ala His GTG GCC CAT GCT

LeuGln GlnLysPro ArgLeu MetVal Asn Phe Leu Val Ala His Ala AAC TTC GGC AAA

SerTyr TyrLysLys AspLys SerPhe Val Gly Gly Asn Phe Gly Lys GAC TGG TCT ACT

ArgTyr AlaGluGlu TyrAsp PheThr Tyr Gln Gly Asp Trp Ser Thr GGC GAC TTC GTG

ValGlu PheValLys TyrLys ThrArg Arg Met Tyr Gly Asp Phe Val Ser Asp Ala Arg Ala Ser Ala Ser Ser Asp Trp Phe Trp Tyr Phe Gly Arg Tyr Tyr Thr Ser Gly Lys Ala Leu Met Val Ala Asp Leu Lys Tyr Glu Lys Asp Asn Leu Lys Ile Asn Pro Tyr Phe Tyr Ala Ile Phe Gln Arg Met Tyr Ala Pro Gly Ile Asn Ile Thr Tyr Asp Thr Asn Pro Asn Phe Asn Asn Lys Gly Phe Arg Phe Val Gly Thr Phe Val Gly Phe Phe Pro Ile Phe Ala Thr Pro Ala Asn Gln Asn Asp Ile Ile Leu Phe Gln Gln Val Pro Leu Gly Lys Ser Gly Gln Thr Tyr Phe Phe Arg Thr Arg Phe Tyr Tyr Asn Lys Trp Gln Phe Gly Gly Ser Val Tyr Lys Asn Ile Gly Asn Ala Asn Gly Asp Ile Gly Ile Tyr Gly Asp Pro Leu Gly Tyr Asn Ile Trp Thr Asn Ser Ile Tyr Asp Ala Glu Ile Asn Asn Ile Val Gly Ala Asp Val Ile Asn Gly Phe Leu Tyr Val Gly Ser Gln Tyr Arg GGG TTT AGT TGG AAA ATT TTA GGC CGT TGG ACG GAT AGC CCA AGG GCT 1257 Gly Phe Ser Trp Lys Ile Leu Gly Arg Trp Thr Asp Ser Pro Arg Ala Asp Glu Arg Ser Leu Ala Leu Phe Leu Ser Tyr Phe Ser Asn Lys Tyr Asn Ile Arg Met Asp Leu Lys Leu Glu Tyr Tyr Gly Asn Ile Thr Lys TAT GTT

LysGlyTyr CysIleGly CysGly MetTyr ProValAsp Pro Tyr Val CCT GTG

AsnGlyPro GlyThrGln LeuThr HisAsn TyrSerAsp Arg Pro Val ATT AGG

SerHisIle MetPheAsn AlaTyr GlyPhe IleTyr Ile Arg (2) INFORMATION FOR SEQ ID N0:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 480 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
Met Pro Phe Cys Ile Phe Ile Leu Ile Ser Leu Gly Val Arg Val Leu Glu Ile Lys Lys Tyr Phe Ser Tyr Ser Leu Phe Phe Leu Leu Phe Ser Ser Leu Phe Leu Ser Lys Leu Gln Ala Tyr Lys Phe Asn Met Ser Ile Val Gly Lys Val Ser Ser Tyr Thr Lys Phe Gly Phe Asn Asn Gln Arg Tyr Gln Pro Ser Lys Asp Ile Tyr Pro Thr Gly Ser Tyr Thr Ser Leu Leu Gly Glu Leu Asn Leu Ser Met Gly Leu Tyr Lys Gly Leu Arg Ala Glu Val Gly Ala Met Met Ala Ala Leu Pro Tyr Asp Ser Thr Ala Tyr Gln Gly Asn Asn Ile Pro Asn Gly Gln Pro Gly Ser Arg Thr Asp Pro Phe Gly Ala Gly Ile Phe Trp Gln Tyr Ile Gly Trp Tyr Ala Gly His Ser Gly Leu Gln Val Gln Lys Pro Arg Leu Ala Met Val His Asn Ala Phe Leu Ser Tyr Asn Tyr Lys Lys Asp Lys Phe Ser Phe Gly Val Lys Gly Gly Arg Tyr Asp Ala Glu Glu Tyr Asp Trp Phe Thr Ser Tyr Thr Gln Gly Val Glu Gly Phe Val Lys Tyr Lys Asp Thr Arg Phe Arg Val Met Tyr Ser Asp Ala Arg Ala Ser Ala Ser Ser Asp Trp Phe Trp Tyr Phe Gly Arg Tyr Tyr Thr Ser Gly Lys Ala Leu Met Val Ala Asp Leu Lys Tyr Glu Lys Asp Asn Leu Lys Ile Asn Pro Tyr Phe Tyr Ala Ile Phe Gln Arg Met Tyr Ala Pro Gly Ile Asn Ile Thr Tyr Asp Thr Asn Pro Asn Phe Asn Asn Lys Gly Phe Arg Phe Val Gly Thr Phe Val Gly Phe Phe Pro Ile Phe Ala Thr Pro Ala Asn Gln Asn Asp Ile Ile Leu Phe Gln Gln Val Pro Leu Gly Lys Ser Gly Gln Thr Tyr Phe Phe Arg Thr Arg Phe Tyr Tyr Asn Lys Trp Gln Phe Gly Gly Ser Val Tyr Lys Asn Ile Gly Asn Ala Asn Gly Asp Ile Gly Ile Tyr Gly Asp Pro Leu Gly Tyr Asn Ile Trp Thr Asn Ser Ile Tyr Asp Ala Glu Ile Asn Asn Ile Val Gly Ala Asp Val Ile Asn Gly Phe Leu Tyr Val Gly Ser Gln Tyr Arg Gly Phe Ser Trp Lys Ile Leu Gly Arg Trp Thr Asp Ser Pro Arg Ala Asp Glu Arg Ser Leu Ala Leu Phe Leu Ser Tyr Phe Ser Asn Lys Tyr Asn Ile Arg Met Asp Leu Lys Leu Glu Tyr Tyr Gly Asn Ile Thr Lys Lys Gly Tyr Cys Ile Gly Tyr Cys Gly Met Tyr Val Pro Val Asp Pro Asn Gly Pro Gly Thr Gln Pro Leu Thr His Asn Val Tyr Ser Asp Arg Ser His Ile Met Phe Asn Ile Ala Tyr Gly Phe Arg Ile Tyr
(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 658 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...605 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

Met Arg AAG TTT TTT ATC GTT ATT

IleLysAla Tyr Leu Arg IleAlaLeuVal Leu Val Leu Phe Phe Ile GCT AAT GAT

LeuGlyPhe Ser Cys Lys SerGlnLysSer Gln Ser Gln Ala Asn Asp _ AACAATACC CCC CAA GAT CCTAAAACCTAC ACC ATG GAT 200 ' CAA AGC GCT

AsnAsnThr Pro Gln Asp ProLysThrTyr Thr Met Asp Gln Ser Ala GAA ATC TCT

LeuAsnAsn Gln Tyr Thr ThrGlyAspLeu Asp Leu Asn Glu Ile Ser TCC CCT AGC

IleSerPro Asp Asn Thr ThrLeuLeuVal Leu Ala Leu Ser Pro Ser AAA GCC TTA

AspAsnSer Leu Asp Tyr ProSerPheAsn Ile Lys Lys Lys Ala Leu CGT GTG AAA

ThrPheLys Asp Leu Arg LeuIleLeuLeu Asn Pro Tyr Arg Val Lys ATC TTT GCT

SerSerAsp Ala Lys Asp SerAlaHisPhe Gln Asp Leu Ile Phe Ala CCT ACC TTA

MetIleLeu Asn Lys Asp AlaLeuPheAsp His Lys Tyr Pro Thr Leu CAT AAC AAA

AspAlaLeu Asn Ser Phe MetLeuLeuTyr His His Gln His Asn Lys TAT ATC CTC

LeuIleLys Met Gln Gly ValProIleGlu Met Gln Phe Tyr Ile Leu TTA TAAAAAAAAC
CATGTTTAAT
TTTTTCAAAA
AAAT

AspIleSer Asn Lys Asp Leu TGTCAATAAA TTAAGGGT
A

(2) INFORMATION FOR SEQ ID NO:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
Met Arg Ile Lys Ala Tyr Phe Leu Arg Phe Ile Ala Leu Val Leu Ile Val Leu Leu Gly Phe Ser Ala Cys Lys Asn Ser Gln Lys Ser Gln Asp Ser Gln Asn Asn Thr Pro Gln Gln Asp Ser Pro Lys Thr Tyr Thr Ala Met Asp Leu Asn Asn Gln Glu Tyr Thr Ile Thr Gly Asp Leu Asp Ser Leu Asn Ile Ser Pro Asp Ser Asn Thr Pro Thr Leu Leu Val Leu Ser Ala Leu Asp Asn Ser Leu Lys Asp Tyr Ala Pro Ser Phe Asn Ile Leu Lys Lys Thr Phe Lys Asp Arg Leu Arg Val Leu Ile Leu Leu Asn Lys Pro Tyr Ser Ser Asp Ala Ile Lys Asp Phe Ser Ala His Phe Gln Ala Asp Leu Met Ile Leu Asn Pro Lys Asp Thr Ala Leu Phe Asp His Leu Lys Tyr Asp Ala Leu Asn His Ser Phe Asn Met Leu Leu Tyr His Lys His Gln Leu Ile Lys Met Tyr Gln Gly Ile Val Pro Ile Glu Met Leu Gln Phe Asp Ile Ser Asn Leu Lys Asp (2) INFORMATION FOR SEQ ID N0:41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 460 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...407 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

AAAATGATAT
AATAGACTTG
ATGAACTCAT

Met Pro CGT GCC GGT AAT

Met Leu His Thr Phe Phe IleAsnSer LeuLeuVal Ala Arg Ala Gly CTT GGT CTC

Ser Leu Ile Ser Cys Ser PheLysLys ArgAsnThr Asn Leu Gly Leu CAG CCT AAT

Ala Leu Ile Pro Ser Ala GlyLeuGln AlaProIle Tyr Gln Pro Asn CCA ACC AAG

Pro Thr Asn Phe Pro Arg SerIleGln ProLeuPro Ser Pro Thr Lys CGC AAC CCC

Pro Leu Glu Asn Asp Gln ValIleSer SerAsnPro Thr Arg Asn Pro GCT ACC CTC

Asn Ile Pro Asn Pro Ile ThrProAsn AsnValIle Glu Ala Thr Leu AAC TGG CTC

Leu Ala Trp Ala Ala Trp GlnAsnPro ProPheHis Pro Asn Trp Leu AAG TAGCCAAGCG
GGCGGCTATC
GTTGATGGCT
ACCGCCAGTT
G

Leu Pro Trp Leu Lys GGTGAAAAAA
TG

(2) INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

Met Met Arg Leu Thr Ala PheGlyIle AsnSerLeu Leu Pro His Phe Val Ser Leu Leu Ser Gly SerLeuPhe LysLysArg Asn Ala Ile Cys Thr Asn Ala Gln Leu Ile Pro Pro Ser Ala Asn Gly Leu Gln Ala Pro Ile Tyr Pro Pro Thr Asn Phe Thr Pro Arg Lys Ser Ile Gln Pro Leu Pro Ser Pro Arg Leu Glu Asn Asn Asp Gln Pro Val Ile Ser Ser Asn Pro Thr Asn Ala Ile Pro Asn Thr Pro Ile Leu Thr Pro Asn Asn Val Ile Glu Leu Asn Ala Trp Ala Trp Ala Trp Leu Gln Asn Pro Pro Phe His Pro Leu Lys Pro Trp Leu
(2) INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1285 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...1232 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

Met Lys Glu Thr Arg Leu Leu Lys Leu Arg Ala Leu Ser Leu Ala Cys Leu Met Gly Leu Gly Val Ser Gly Cys Ala Phe Leu Asp Lys Gln Ile Leu Asn Asp His Leu Thr Lys Ala Lys Asn Asn Pro Lys Tyr Asp Cys Gln Lys Glu Met Trp Ser Phe Pro Lys Lys Tyr Asp Gly Ile Asn Gln Cys Leu Lys Ala Gln Glu Glu Leu Ile Glu Pro Ile Ile Thr Lys Lys Ile Asp TAT GAT GAA AAA
GAT

Gln GlnCys AspPhe ThrAsn GlyLeuLys Lys Cys Tyr Asp Glu Asp AAA GAT ACC CCC

Phe ArgAsn AlaTyr LeuAsn LeuLeuThr Ile Ile Lys Asp Thr Pro _ CAA CAAGAG CGTTTT AGCTGC GATTTCCAT CCA GAG 440 ' AAA CGT TCT AAC

Gln GlnGlu ArgPhe SerCys AspPheHis Pro Glu Lys Arg Ser Asn AAA TGC AAC AAG

Leu GluGln MetAsp LysThr AlaTyrGlu Gln Lys Lys Cys Asn Lys CGA AGA GTG GCG

Asp GlnLys LeuIle AsnLeu GlnLeuGlu Phe Glu Arg Arg Val Ala GAA CAA ATT TTC

Lys TyrAla TyrLys ProTyr IleProTyr Thr Lys Glu Gln Ile Phe TGC AAT GCC AGA

Glu ValLys AlaPro HisLeu AsnLysGlu Leu Cys Cys Asn Ala Arg AAA CAT GAC AGC

Gln GluVal GluLys PheAsp ProTyrSer Ser Lys Lys His Asp Ser GAA AGCGTT TCGGCT ATTTCT TGCATTAAA GT"..' 728 TTG CAA TTT AAA GAT

Glu SerVal SerAla IleSer CysIleLys Val Asp Leu Gln Phe Lys AAA AAA AAT ATA

Ala LeuGlu AlaAla LeuMet GlyValTyr Ser Pro Lys Lys Asn Ile AAA ACC ACG AAT

Tyr LysSer HisCys GlnArg HisLeuGlu Lys Ser Lys Thr Thr Asn AAA GCT CCT AAG

Leu GluIle LeuAsn MetAsn LysLeuGlu Gln Ser Lys Ala Pro Lys TTT GCG ATG GGG

Pro IleAsp AspLys MetAla GlnSerAla Leu Leu Phe Ala Met Gly AAG GGT TTT ATT

Arg Lys Asn Lys Gly Val Leu Ile Ala Phe Ala Thr Asp Ile Cys Met Glu Arg Asn Glu His Lys Lys Glu Glu Phe Ile Ser Leu Lys Asp Ser Cys Thr Gln Ser Gln Ala Lys Ile Tyr Asn Asn Lys Glu Arg Phe Asp Lys Phe Ile Gln Asp Tyr Gln Lys Asp Leu Lys Thr Cys Leu Leu Asp Thr Ser Asn Thr Lys Glu Glu Val Glu Gln Asn Phe Ser Gln Cys Gln Lys Glu Gln Leu Arg Asp Asp Asn Lys Gly Leu Gly Phe Thr Leu Glu Glu Leu Val Lys Lys Tyr Ala Lys
(2) INFORMATION FOR SEQ ID NO:44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 394 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
Met Lys Glu Thr Arg Leu Leu Lys Leu Arg Ala Leu Ser Leu Ala Cys Leu Met Gly Leu Gly Val Ser Gly Cys Ala Phe Leu Asp Lys Gln Ile Leu Asn Asp His Leu Thr Lys Ala Lys Asn Asn Pro Lys Tyr Asp Cys Gln Lys Glu Met Trp Ser Phe Pro Lys Lys Tyr Asp Gly Ile Asn Gln Cys Leu Lys Ala Gln Glu Glu Leu Ile Glu Pro Ile Ile Thr Lys Lys Ile Asp Gln Tyr Gln Cys Asp Asp Phe Thr Asn Glu Gly Leu Lys Asp Lys Cys Phe Lys Arg Asn Asp Ala Tyr Leu Asn Thr Leu Leu Thr Pro Ile Ile Gln Lys Gln Glu Arg Arg Phe Ser Cys Ser Asp Phe His Asn Pro Glu Leu Lys Glu Gln Cys Met Asp Lys Thr Asn Ala Tyr Glu Lys Gln Lys Asp Arg Gln Lys Arg Leu Ile Asn Leu Val Gln Leu Glu Ala Phe Glu Lys Glu Tyr Ala Gln Tyr Lys Pro Tyr Ile Ile Pro Tyr Phe Thr Lys Glu Cys Val Lys Asn Ala Pro His Leu Ala Asn Lys Glu Arg Leu Cys Gln Lys Glu Val His Glu Lys Phe Asp Asp Pro Tyr Ser Ser Ser Lys Glu Leu Ser Val Gln Ser Ala Ile Ser Phe Cys Ile Lys Lys Val Asp Ala Lys Leu Glu Lys Ala Ala Leu Met Asn Gly Val Tyr Ile Ser Pro Tyr Lys Lys Ser Thr His Cys Gln Arg Thr His Leu Glu Asn Lys Ser Leu Lys Glu Ile Ala Leu Asn Met Asn Pro Lys Leu Glu Lys Gln Ser Pro Phe Ile Asp Ala Asp Lys Met Ala Met Gln Ser Ala Gly Leu Leu Arg Lys Asn Lys Gly Val Leu Ile Ala Phe Ala Thr Asp Ile Cys Met Glu Arg Asn Glu His Lys Lys Glu Glu Phe Ile Ser Leu Lys Asp Ser Cys Thr Gln Ser Gln Ala Lys Ile Tyr Asn Asn Lys Glu Arg Phe Asp Lys Phe Ile Gln Asp Tyr Gln Lys Asp Leu Lys Thr Cys Leu Leu Asp Thr Ser Asn Thr Lys Glu Glu Val Glu Gln Asn Phe Ser Gln Cys Gln Lys Glu Gln Leu Arg Asp Asp Asn Lys Gly Leu Gly Phe Thr Leu Glu Glu Leu Val Lys Lys Tyr Ala Lys
(2) INFORMATION FOR SEQ ID NO:45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 835 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic RNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 84...704 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

Met Leu Arg Val Leu Ser Val Gly Val Ala Phe Ile Leu Leu Gly Cys Gln Phe Phe Asn Lys Thr Thr Leu His Leu Lys Tyr Lys Asp Tyr Pro Lys Asn Ser Ala Leu Lys Thr Ala Phe Thr Leu Thr Pro Pro Lys Ile Phe Phe Asn Ala Arg Phe Val Pro Pro Phe Tyr Gln Lys Glu Phe Lys Lys Ala Ile Thr Gln Gln Ile Ala Tyr Phe Leu Lys Asp Lys Ser Ala Phe Ile Leu Asn Val Ser Gly Asn Val Phe Phe Ser Phe Glu Glu Asn Pro Lys Asp Leu Lys Ala Ile Lys Glu Arg Leu Lys Lys Thr Ile Glu Pro Asn Ala Asp Pro Lys Ala Val Met Arg Phe Leu Asn Leu Gln Ala Ser Leu Ile Leu Glu Cys Val Pro Gln Thr Thr Cys Pro Phe Asp Thr Leu Leu Ile Pro Thr Ala Phe Ser Val Pro Val Tyr Tyr Ala Asn Arg Leu Gly Asp Asn Pro Ser Leu Phe Ser Gln GAG GAT AAA ACC TAT CAT AAC GCT TTG ATC AAA GCC CTT AAT AAG GCT 641 Glu Asp Lys Thr Tyr His Asn Ala Leu Ile Lys Ala Leu Asn Lys Ala Tyr Tyr Ser Leu Met Glu Gly Leu Glu Lys Arg Leu Asn Ala Ile Lys Asn Ala Glu Trp Leu GTTTTTGGTA GGGATTTATT TAATTTGGCA TGTTTTATTG GAAAAAGCCC TAGAATTGAA 805
(2) INFORMATION FOR SEQ ID NO:46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 207 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
Met Leu Arg Val Leu Ser Val Gly Val Ala Phe Ile Leu Leu Gly Cys Gln Phe Phe Asn Lys Thr Thr Leu His Leu Lys Tyr Lys Asp Tyr Pro Lys Asn Ser Ala Leu Lys Thr Ala Phe Thr Leu Thr Pro Pro Lys Ile Phe Phe Asn Ala Arg Phe Val Pro Pro Phe Tyr Gln Lys Glu Phe Lys Lys Ala Ile Thr Gln Gln Ile Ala Tyr Phe Leu Lys Asp Lys Ser Ala Phe Ile Leu Asn Val Ser Gly Asn Val Phe Phe Ser Phe Glu Glu Asn Pro Lys Asp Leu Lys Ala Ile Lys Glu Arg Leu Lys Lys Thr Ile Glu Pro Asn Ala Asp Pro Lys Ala Val Met Arg Phe Leu Asn Leu Gln Ala Ser Leu Ile Leu Glu Cys Val Pro Gln Thr Thr Cys Pro Phe Asp Thr Leu Leu Ile Pro Thr Ala Phe Ser Val Pro Val Tyr Tyr Ala Asn Arg Leu Gly Asp Asn Pro Ser Leu Phe Ser Gln Glu Asp Lys Thr Tyr His Asn Ala Leu Ile Lys Ala Leu Asn Lys Ala Tyr Tyr Ser Leu Met Glu Gly Leu Glu Lys Arg Leu Asn Ala Ile Lys Asn Ala Glu Trp Leu
(2) INFORMATION FOR SEQ ID NO:47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 763 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...710 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

CATGAGTTAT ATG
TCAAAAATTT CGT
AACTTTATAA

Met Arg Leu Lys His Phe Lys Thr Phe Leu Phe Ile Thr Met Ala Ile Ile Val Ile Gly Thr Gly Cys Ala Asn Lys Lys Lys Lys Lys Asp Glu Tyr Asn Lys Pro Ala Ile Phe Trp Tyr Gln Gly Ile Leu Arg Glu Ile Leu Phe Ala Asn Leu Glu Thr Ala Asp Asn Tyr Tyr Ser Ser Leu Gln Ser Glu His Ile Asn Ser Pro Leu Val Pro Glu Ala Met Leu Ala Leu Gly Gln Ala His Met Lys Lys Lys Glu Tyr Val Leu Ala Ser Phe Tyr Phe Asp Glu Tyr Ile Lys Arg Phe Gly Thr Lys Asp Asn Val Asp Tyr Leu Thr Phe Leu Lys Leu Gln Ser His Tyr Tyr Ala Phe Lys Asn His Ser Lys Asp Gln Glu Phe Ile Ser Asn Ser Ile Val Ser Leu Gly Glu Phe Ile Glu Lys Tyr Pro Asn Ser Arg Tyr Arg Pro Tyr Val Glu Tyr Met Gln AAA GGG AAT GCG ATC

IleLys PheIleLeu GlyGln GluLeu Asn Arg Ile Ala Asn Asn Ala CCT CGC

ValTyr LysLysArg HisLys GluGly Val Lys Tyr Leu Glu Pro Arg AAA AAA

ArgIle AspGluThr LeuGlu GluThr Lys Pro Pro Ser His Lys Lys TTT CAAAACCATA
CAC

MetPro TrpTyrVal LeuIle AspTrp Phe (2) INFORMATION FOR SEQ ID N0:48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 220 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
Met Arg Leu Lys His Phe Lys Thr Phe Leu Phe Ile Thr Met Ala Ile Ile Val Ile Gly Thr Gly Cys Ala Asn Lys Lys Lys Lys Lys Asp Glu Tyr Asn Lys Pro Ala Ile Phe Trp Tyr Gln Gly Ile Leu Arg Glu Ile Leu Phe Ala Asn Leu Glu Thr Ala Asp Asn Tyr Tyr Ser Ser Leu Gln Ser Glu His Ile Asn Ser Pro Leu Val Pro Glu Ala Met Leu Ala Leu Gly Gln Ala His Met Lys Lys Lys Glu Tyr Val Leu Ala Ser Phe Tyr Phe Asp Glu Tyr Ile Lys Arg Phe Gly Thr Lys Asp Asn Val Asp Tyr Leu Thr Phe Leu Lys Leu Gln Ser His Tyr Tyr Ala Phe Lys Asn His Ser Lys Asp Gln Glu Phe Ile Ser Asn Ser Ile Val Ser Leu Gly Glu Phe Ile Glu Lys Tyr Pro Asn Ser Arg Tyr Arg Pro Tyr Val Glu Tyr Met Gln Ile Lys Phe Ile Leu Gly Gln Asn Glu Leu Asn Arg Ala Ile Ala Asn Val Tyr Lys Lys Arg His Lys Pro Glu Gly Val Lys Arg Tyr Leu Glu Arg Ile Asp Glu Thr Leu Glu Lys Glu Thr Lys Pro Lys Pro Ser His Met Pro Trp Tyr Val Leu Ile Phe Asp Trp
(2) INFORMATION FOR SEQ ID NO:49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 801 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 75...749 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:

Met Arg Ala Thr Ala Ile Lys Ile Phe Ser Leu Ser Ser Ala Leu Ala Leu Leu Leu His Gly Cys Leu Ser Ile Asn Leu Lys Gln Met Leu Pro Glu Ile Arg Thr Tyr Asp Leu Asn Ala Ser Ser Phe Glu Ile Thr Gln Cys Ala Lys Pro Leu Thr Glu Val Arg Leu Ile Ser Ile Leu Ser Ala Asp Leu Phe Asn Thr Lys Glu Ile Val Phe Lys Ala Lys Asp Gly Gln Ile Thr His Gly Lys His Gln Lys Trp Ile Asp Leu Pro Arg Asn Met Leu Lys Thr Met Phe Met Gln Glu Ala Gln Lys Ala CysLeuGly AlaLeuPro ProTyrGly Ala Gly Ala Pro Thr Tyr Val TTT

AlaValArg ThrIleLeu SerPheSer Leu Leu Glu Lys Glu Asn Phe AGG

SerThrTyr AlaGluPhe AlaLeuGly Tyr Asp Ile Ser Val Lys Arg ATT AAG AAT ATT
TCT AGC

Gly Asp Ser His Ser Gly Val Ile His Glu Ile Ser Ser Ile Lys Asn AGT AAA AAT

Leu Glu Asn Lys Thr Thr Lys Thr Asn Gly Gln Asp Phe Ser Lys Asn CAA CAT GTG

Gln Glu Ser Ala Ile Gln Ser Leu Val Ser Gln Ala Ile Gln His Val AAA GCC GCG

Gln Glu Ala Val Ser Leu Ile Lys Ile Glu Gln Ser Val Lys Ala Ala GGAGGAATTG TTTGATTTTA G

Ser Pro Leu Lys Lys AGCAAGCGTT T

(2) INFORMATION FOR SEQ ID NO:50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 225 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:

Met Arg Ala Thr Ala Ile Lys Ile Phe Ser Leu Ser Ser Ala Leu Ala Leu Leu Leu His Gly Cys Leu Ser Ile Asn Leu Lys Gln Met Leu Pro Glu Ile Arg Thr Tyr Asp Leu Asn Ala Ser Ser Phe Glu Ile Thr Gln Cys Ala Lys Pro Leu Thr Glu Val Arg Leu Ile Ser Ile Leu Ser Ala Asp Leu Phe Asn Thr Lys Glu Ile Val Phe Lys Ala Lys Asp Gly Gln Ile Thr His Gly Lys His Gln Lys Trp Ile Asp Leu Pro Arg Asn Met Leu Lys Thr Met Phe Met Gln Glu Ala Gln Lys Ala Cys Leu Gly Val Ala Leu Pro Pro Tyr Gly Ala Gly Ala Pro Thr Tyr Ala Val Arg Phe Thr Ile Leu Ser Phe Ser Leu Leu Glu Lys Glu Asn Ser Thr Tyr Arg Ala Glu Phe Ala Leu Gly Tyr Asp Ile Ser Val Lys Gly Asp Ser His Ser Gly Val Ile Ile Lys His Glu Asn Ile Ser Ser Leu Glu Asn Lys Thr Thr Lys Thr Ser Lys Asn Gly Asn Gln Asp Phe Gln Glu Ser Ala Ile Gln Ser Leu Gln His Val Ser Val Gln Ala Ile Gln Glu Ala Val Ser Leu Ile Lys Lys Ala Ile Glu Ala Gln Ser Val Ser Pro Leu Lys Lys
(2) INFORMATION FOR SEQ ID NO:51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 448 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...395 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:

Met Lys Ser Lys Ile Thr His Phe Ile Ala Ile Ser Phe Val Leu Ser Leu Phe Ser Ala Cys Lys Asp Glu Pro Lys Lys Ser Ser Gln Ser His Gln Asn Asn Thr IleThrLys Asn Asn Pro Ile Asn Gln Ala Asn Lys Asn Asp AAA AAA GAA

Ile Arg IleGluHis Glu Glu Glu Asp Glu Lys Ala Thr Lys Lys Glu GAT AAT AAT

Val Asn LeuIleAsn Asn Glu Asn Lys Ile Asp Glu Ile Asp Asn Asn AAC TTG CAA

Glu Glu AlaAspPro Ser Gln Lys Arg Thr Asn Asn Val Asn Leu Gln ACT AGG AAG

Arg Ala AsnHisGln Asp Asn Leu Asn Ser Pro Leu Asn Thr Arg Lys AACTTTTTTC
AAAGGATTTA
TTTAAAAAAG
TAACCCCTTT
ATT

Tyr (2) INFORMATION FOR SEQ ID N0:52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
Met Lys Ser Lys Ile Thr His Phe Ile Ala Ile Ser Phe Val Leu Ser Leu Phe Ser Ala Cys Lys Asp Glu Pro Lys Lys Ser Ser Gln Ser His Gln Asn Asn Thr Lys Ile Thr Lys Asn Asn Pro Ile Asn Gln Ala Asn Asn Asp Ile Arg Lys Ile Glu His Glu Glu Glu Asp Glu Lys Ala Thr Lys Glu Val Asn Asp Leu Ile Asn Asn Glu Asn Lys Ile Asp Glu Ile Asn Asn Glu Glu Asn Ala Asp Pro Ser Gln Lys Arg Thr Asn Asn Val Leu Gln Arg Ala Thr Asn His Gln Asp Asn Leu Asn Ser Pro Leu Asn Arg Lys Tyr
(2) INFORMATION FOR SEQ ID NO:53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1121 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 121...1065 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:

AGTATTTCCA
ATATCCAATT
AGCCAATGAT
CTCAAAGATT
CTAATATTTT

CGTTTAATCA
TCCCCACCAA
CAAAAAATTA
CTCGCTACAA
GGGAATTTTA

TTG GCG
TTG GAA
AAA GTT
TGT TTT
TTA GGC
GTT ATT
TTT TTG

Met Gly Leu Ala Leu Glu Lys Val Cys Phe Leu Gly Val Ile Phe Leu TGC AAA GGG AAT

Ile Ser Ala ThrVal LysGlu Val Lys LeuSerTyr Cys Lys Gly Asn AGC GCT AAC GAT

Lys His Glu LeuArg TyrGlu Ala Lys TyrAspPro Ser Ala Asn Asp AAA TAT AAT GAA

Thr Thr Lys AlaAla LysArg Phe Phe ArgHisPhe Lys Tyr Asn Glu TCC CAA AAC GAT

Lys Arg Tyr AspSer AspSer Thr Lys GlnProLeu Ser Gln Asn Asp ATG TCT ATC GCC

Asp Asn Gly ArgAsp SerSer Gln Arg ThrMetArg Met Ser Ile Ala GTG AAG TAC AAA

Pro Tyr Gln GlyGly TrpTyr Pro Thr ValAspLeu Val Lys Tyr Lys Gly Glu Lys Phe Asp Gly Val Ala Ser Trp Tyr Gly Pro Asn Phe His Ala Lys Lys Thr Ser Asn Gly Glu Ile Tyr Asn Met Tyr Ala His Thr AAA AAC AAA AAT
ACT GTC

Ala Ala His Lys Thr Leu Pro Met Asn Thr Val Val Lys Val Ile Asn Val Asp Asn Asn Leu Ser Thr Ile Val Arg Ile Asn Asp Arg Gly Pro Phe Val Ser Asp Arg Ile Ile Asp Leu Ser Asn Ala Ala Ala Arg Asp Ile Asp Met Val Lys Lys Gly Thr Ala Ser Val Arg Leu Ile Val Leu Gly Phe Gly Gly Val Ile Ser Thr Gln Tyr Glu Gln Ser Phe Asn Ala Ser Ser Ser Lys Ile Leu His Lys Glu Phe Lys Val Gly Glu Ser Glu Lys Ser Val Ser Gly Gly Lys Phe Ser Leu Gln Met Gly Ala Phe Arg Asn Gln Ile Gly Ala Gln Thr Leu Ala Asp Lys Leu Gln Ala Glu Asn Pro Asn Tyr Ser Val Lys Val Ala Phe Lys Asp Asp Leu Tyr Lys Val Leu Val Gln Gly Phe Gln Ser Glu Glu Glu Ala Arg Asp Phe Met Lys ATTGCTTTTA

Lys Tyr Asn Gln Asn Ala Val Leu Thr Arg Glu GCACGCTCAC
AGACGGATCG

(2) INFORMATION FOR SEQ ID NO:54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 315 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
Met Gly Leu Ala Leu Glu Lys Val Cys Phe Leu Gly Val Ile Phe Leu Ile Ser Ala Cys Thr Val Lys Lys Glu Gly Val Lys Asn Leu Ser Tyr Lys His Glu Ser Leu Arg Ala Tyr Glu Asn Ala Lys Asp Tyr Asp Pro Thr Thr Lys Lys Ala Ala Tyr Lys Arg Asn Phe Phe Glu Arg His Phe Lys Arg Tyr Ser Asp Ser Gln Asp Ser Asn Thr Lys Asp Gln Pro Leu Asp Asn Gly Met Arg Asp Ser Ser Ser Ile Gln Arg Ala Thr Met Arg Pro Tyr Gln Val Gly Gly Lys Trp Tyr Tyr Pro Thr Lys Val Asp Leu Gly Glu Lys Phe Asp Gly Val Ala Ser Trp Tyr Gly Pro Asn Phe His Ala Lys Lys Thr Ser Asn Gly Glu Ile Tyr Asn Met Tyr Ala His Thr Ala Ala His Lys Thr Leu Pro Met Asn Thr Val Val Lys Val Ile Asn Val Asp Asn Asn Leu Ser Thr Ile Val Arg Ile Asn Asp Arg Gly Pro Phe Val Ser Asp Arg Ile Ile Asp Leu Ser Asn Ala Ala Ala Arg Asp Ile Asp Met Val Lys Lys Gly Thr Ala Ser Val Arg Leu Ile Val Leu Gly Phe Gly Gly Val Ile Ser Thr Gln Tyr Glu Gln Ser Phe Asn Ala Ser Ser Ser Lys Ile Leu His Lys Glu Phe Lys Val Gly Glu Ser Glu Lys Ser Val Ser Gly Gly Lys Phe Ser Leu Gln Met Gly Ala Phe Arg Asn Gln Ile Gly Ala Gln Thr Leu Ala Asp Lys Leu Gln Ala Glu Asn Pro Asn Tyr Ser Val Lys Val Ala Phe Lys Asp Asp Leu Tyr Lys Val Leu Val Gln Gly Phe Gln Ser Glu Glu Glu Ala Arg Asp Phe Met Lys Lys Tyr Asn Gln Asn Ala Val Leu Thr Arg Glu
(2) INFORMATION FOR SEQ ID NO:55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 811 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...761 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

AAATTCTAAA
CGAAAATTAA
ACTGAATGAA
AGGAGTTTGA

MetLys LysIle ValLeu ValAlaIle AlaLeu LeuMetSerAla CysAlaSer TyrLys IleThr ProGluHis ValThr SerTyrAsnAsn GlyIleGln ValMet ThrSer ThrGlnAla LysSer LysValGlnLeu GluIleAla GlnSer LysLeu LysGlyLeu AsnGlu SerProLeuVal LeuTyrVal AlaAla GlnVal IleGluGly SerPro ValValPheSer ArgLysAla IleSer ValSer IleAsnGln ThrAsn LeuProValLeu SerLeuArg GlnVal MetLys SerSerPhe AspPhe GluGlyIleLeu GlnSerPhe 100 105 110 AsnIle AlaVal ProThrThr ProIle AspAsnValAsn MetIleThr ProPro MetPhe TyrTyrGly GlnGly GlyPheLeuAla TyrAsnGly MetMet TyrGly GlyMetGly MetTyr GlyProGlyPhe GlyMetMet MetMet AspAsp ValGluGlu GlnGlu ValMetGlnGlu SerArgGln Ala Leu Lys Ile Leu Ala Ile Asn Tyr Leu Lys Asn Asn Thr Leu Asn Val Glu Ser Lys Ala Lys Gly Gly Phe Val Val Val Asp Thr Lys Asn 195 200 205 210 CTT AAA ACC CCG GGT GTG GTG GTG GTT AAA GTC TTT TTA GAA GAT GAA 728 Leu Lys Thr Pro Gly Val Val Val Val Lys Val Phe Leu Glu Asp Glu Ile His Thr Phe Lys Ile Asp Ile Ser Lys Met
(2) INFORMATION FOR SEQ ID NO:56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 237 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
Met Lys Lys Ile Val Leu Val Ala Ile Ala Leu Leu Met Ser Ala Cys Ala Ser Tyr Lys Ile Thr Pro Glu His Val Thr Ser Tyr Asn Asn Gly Ile Gln Val Met Thr Ser Thr Gln Ala Lys Ser Lys Val Gln Leu Glu Ile Ala Gln Ser Lys Leu Lys Gly Leu Asn Glu Ser Pro Leu Val Leu Tyr Val Ala Ala Gln Val Ile Glu Gly Ser Pro Val Val Phe Ser Arg Lys Ala Ile Ser Val Ser Ile Asn Gln Thr Asn Leu Pro Val Leu Ser Leu Arg Gln Val Met Lys Ser Ser Phe Asp Phe Glu Gly Ile Leu Gln Ser Phe Asn Ile Ala Val Pro Thr Thr Pro Ile Asp Asn Val Asn Met Ile Thr Pro Pro Met Phe Tyr Tyr Gly Gln Gly Gly Phe Leu Ala Tyr Asn Gly Met Met Tyr Gly Gly Met Gly Met Tyr Gly Pro Gly Phe Gly Met Met Met Met Asp Asp Val Glu Glu Gln Glu Val Met Gln Glu Ser Arg Gln Ala Leu Lys Ile Leu Ala Ile Asn Tyr Leu Lys Asn Asn Thr Leu Asn Val Glu Ser Lys Ala Lys Gly Gly Phe Val Val Val Asp Thr Lys Asn Leu Lys Thr Pro Gly Val Val Val Val Lys Val Phe Leu Glu Asp Glu Ile His Thr Phe Lys Ile Asp Ile Ser Lys Met
(2) INFORMATION FOR SEQ ID NO:57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1425 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 97...1371 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:

AAAATCAATC ATAGCTATAA
TAAAATTCTT

GTTTCAAAAT
TTAGCC

MetArgLeu LeuLeuPhe AsnGln AsnAla PheLeuLeu AlaCysMet PheValSer SerValTyr ValAsn AlaVal LeuAspAla TyrAlaIle GluAsnPro TyrIleSer IleThr LeuThr SerLeuLeu AlaProLeu SerMetLeu AlaPheLeu 40 45 50 LysThr ProArg AsnSerAla PheAlaLeu GlyPhePhe ValGlyAla LeuLeu PheTyr TrpCysAla LeuSerPhe ArgTyrSer AspPheThr Tyr Leu Leu Pro Leu Ile Ile Val Leu Ile Ala Leu Val Tyr Gly Val Leu Phe Tyr Leu Leu Leu Tyr Phe Glu Asn Pro Tyr Phe Arg Leu Leu Ser Phe Leu Gly Ser Ser Phe Ile His Pro Phe Gly Phe Asp Trp Leu Val Pro Asp Ser Phe Phe Ser Tyr Ser Val Phe Arg Val Asp Lys Leu Ser Leu Gly Leu Val Phe Leu Ala Cys Ile Phe Leu Ser Thr Lys Pro Leu Lys Lys Tyr Arg Ile Ile Gly Val Leu Leu Leu Leu Gly Ala Leu Asp Phe Asn Gly Phe Lys Thr Ser Asp Leu Lys Lys Val Gly Asn Ile Glu Leu Val Ser Thr Lys Thr Pro Gln Asp Leu Lys Phe Asp Ser Ser Tyr Leu Asn Asp Ile Glu Asn Asn Ile Leu Lys Glu Ile Lys Leu Ala Gln Ser Lys Gln Lys Thr Leu Ile Val Phe Pro Glu Thr Ala Tyr Pro Ile Ala Leu Glu Asn Ser Pro Phe Lys Ala Lys Leu Glu Asp Leu Ser Asp Asn Ile Ala Ile Leu Ile Gly Thr Leu Arg Thr Gln Gly Tyr Asn Leu Tyr Asn Ser Ser Phe Leu Phe Ser Lys Glu Ser Val Gln Ile Ala Asp Lys Val Ile Leu Ala Pro Phe Gly Glu Thr Met Pro Leu Pro Glu Phe Leu Gln Lys Pro Leu Glu Lys Leu Phe Phe Gly Glu Ser Thr Tyr Leu Tyr Arg Asn Ala Pro His Phe Ser Asp Phe Thr Leu Asp Asp Phe ACT TTT CGC CCC CTG ATT TGC TAT GAA GGC ACT TCC AAA CCC GCT TAT 1170 Thr Phe Arg Pro Leu Ile Cys Tyr Glu Gly Thr Ser Lys Pro Ala Tyr 345 350 355 TCA AGC CCT TCA AAA ATT TTT GTGATG AGCAAT GCATGG 1218 AAC ATC AAC

SerAsnSer Pro Ser Lys Ile Phe ValMet SerAsn AlaTrp Ile Asn TTA TTA

PheSerPro Ser Ile Glu Pro Thr GlnArg ThrLeu LysTyr Leu Leu ATC AAC

TyrAlaArg Arg Tyr Asp Lys Ile LeuHis SerAla PheSer Ile Asn TTA CTT

ThrSerTyr Ile Leu Ser Pro Ser LeuGly AspIle PheArg Leu Leu TCTCATGCTT

LysArgSer TGGCG
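
The coding-sequence locations and the stated protein lengths in this listing can be cross-checked arithmetically: a coding span running from position s to position e (1-based, inclusive) contains (e - s + 1)/3 codons. The Python sketch below applies that check to two record pairs. The function name `cds_codon_count` and the `records` table are illustrative assumptions rather than part of the listing, and the check assumes the stated LOCATION covers sense codons only (no stop codon), which is consistent with the numbers given here.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of whole codons in a 1-based, inclusive span."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"span {start}...{end} is not a whole number of codons")
    return length // 3


# (CDS start, CDS end, stated protein length), copied from the listing.
records = {
    "SEQ ID NO:55 / NO:56": (51, 761, 237),
    "SEQ ID NO:57 / NO:58": (97, 1371, 425),
}

for name, (start, end, stated_length) in records.items():
    codons = cds_codon_count(start, end)
    status = "consistent" if codons == stated_length else "inconsistent"
    print(f"{name}: {codons} codons vs {stated_length} residues ({status})")
```

Both pairs come out consistent: 711 nucleotides give 237 codons, and 1275 nucleotides give 425 codons.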

(2) INFORMATION FOR SEQ ID NO:58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 425 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
Met Arg Leu Leu Leu Phe Asn Gln Asn Ala Phe Leu Leu Ala Cys Met Phe Val Ser Ser Val Tyr Val Asn Ala Val Leu Asp Ala Tyr Ala Ile Glu Asn Pro Tyr Ile Ser Ile Thr Leu Thr Ser Leu Leu Ala Pro Leu Ser Met Leu Ala Phe Leu Lys Thr Pro Arg Asn Ser Ala Phe Ala Leu Gly Phe Phe Val Gly Ala Leu Leu Phe Tyr Trp Cys Ala Leu Ser Phe Arg Tyr Ser Asp Phe Thr Tyr Leu Leu Pro Leu Ile Ile Val Leu Ile Ala Leu Val Tyr Gly Val Leu Phe Tyr Leu Leu Leu Tyr Phe Glu Asn Pro Tyr Phe Arg Leu Leu Ser Phe Leu Gly Ser Ser Phe Ile His Pro Phe Gly Phe Asp Trp Leu Val Pro Asp Ser Phe Phe Ser Tyr Ser Val Phe Arg Val Asp Lys Leu Ser Leu Gly Leu Val Phe Leu Ala Cys Ile Phe Leu Ser Thr Lys Pro Leu Lys Lys Tyr Arg Ile Ile Gly Val Leu 165 170 175 Leu Leu Leu Gly Ala Leu Asp Phe Asn Gly Phe Lys Thr Ser Asp Leu Lys Lys Val Gly Asn Ile Glu Leu Val Ser Thr Lys Thr Pro Gln Asp Leu Lys Phe Asp Ser Ser Tyr Leu Asn Asp Ile Glu Asn Asn Ile Leu Lys Glu Ile Lys Leu Ala Gln Ser Lys Gln Lys Thr Leu Ile Val Phe Pro Glu Thr Ala Tyr Pro Ile Ala Leu Glu Asn Ser Pro Phe Lys Ala Lys Leu Glu Asp Leu Ser Asp Asn Ile Ala Ile Leu Ile Gly Thr Leu Arg Thr Gln Gly Tyr Asn Leu Tyr Asn Ser Ser Phe Leu Phe Ser Lys Glu Ser Val Gln Ile Ala Asp Lys Val Ile Leu Ala Pro Phe Gly Glu Thr Met Pro Leu Pro Glu Phe Leu Gln Lys Pro Leu Glu Lys Leu Phe Phe Gly Glu Ser Thr Tyr Leu Tyr Arg Asn Ala Pro His Phe Ser Asp Phe Thr Leu Asp Asp Phe Thr Phe Arg Pro Leu Ile Cys Tyr Glu Gly Thr Ser Lys Pro Ala Tyr Ser Asn Ser Pro Ser Lys Ile Phe Ile Val Met Ser Asn Asn Ala Trp Phe Ser Pro Ser Ile Glu Pro Thr Leu Gln Arg Thr Leu Leu Lys Tyr Tyr Ala Arg Arg Tyr Asp Lys Ile Ile Leu His Ser Ala Asn Phe Ser Thr Ser Tyr Ile Leu Ser Pro Ser Leu Leu Gly Asp Ile Leu Phe Arg Lys Arg Ser
(2) INFORMATION FOR SEQ ID NO:59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 766 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...713 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

TCAAATCTAT ATG
TAAAAGATAA TTT

Met Phe AGC TTA

SerLeu SerTyrVal SerLysLys PheLeu Val Leu Leu Ile Ser Leu AAT GAC

SerLeu PheLeuSer AlaCysLys SerAsn Lys Lys Leu Asp Asn Asp TCC GAA

GluAsn LeuLeuSer SerGlySer GlnSer Lys Leu Asn Asp Ser Glu GCT TTA

GluArg AspAsnIle AspLysLys SerTyr Gly Glu Asp Val Ala Leu GAT TAC

PheSer AspAsnLys SerIleSer ProAsn Lys Met Leu Leu Asp Tyr 70 75 80 GAA TTT

ValPhe GlyArgAsn GlyCysSer TyrCys Arg Lys Lys Asp Glu Phe ATT GAG

LeuLys AsnValLys GluLeuArg AspTyr Lys His Phe Ser Ile Glu GAG GAT

AlaTyr TyrValAsn IleSerTyr SerLys His Phe Lys Val Glu Asp ATG ACA

GlyAsp LysAsnAsn GluLysGlu IleLys Ser Glu Glu Leu Met Thr ACG GTT

AlaGln IleTyrAla ValGlnSer ThrPro Ile Leu Ser Asp Thr Val Lys Thr Gly Lys Thr Ile Tyr Glu Leu Pro Gly Tyr Met Pro Ser Thr Gln Phe Leu Ala Val Leu Glu Phe Ile Gly Asp Gly Lys Tyr Gln Asp ACA AAA GAC GAT GAG GAT CTC ACT AAA AAA TTA AAG GCT TAC ATC AAG 680 Thr Lys Asp Asp Glu Asp Leu Thr Lys Lys Leu Lys Ala Tyr Ile Lys Tyr Lys Thr Asn Leu Ser Lys Ser Lys Ser Asn
(2) INFORMATION FOR SEQ ID NO:60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 221 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
Met Phe Ser Leu Ser Tyr Val Ser Lys Lys Phe Leu Ser Val Leu Leu Leu Ile Ser Leu Phe Leu Ser Ala Cys Lys Ser Asn Asn Lys Asp Lys Leu Asp Glu Asn Leu Leu Ser Ser Gly Ser Gln Ser Ser Lys Glu Leu Asn Asp Glu Arg Asp Asn Ile Asp Lys Lys Ser Tyr Ala Gly Leu Glu Asp Val Phe Ser Asp Asn Lys Ser Ile Ser Pro Asn Asp Lys Tyr Met Leu Leu Val Phe Gly Arg Asn Gly Cys Ser Tyr Cys Glu Arg Phe Lys Lys Asp Leu Lys Asn Val Lys Glu Leu Arg Asp Tyr Ile Lys Glu His Phe Ser Ala Tyr Tyr Val Asn Ile Ser Tyr Ser Lys Glu His Asp Phe Lys Val Gly Asp Lys Asn Asn Glu Lys Glu Ile Lys Met Ser Thr Glu Glu Leu Ala Gln Ile Tyr Ala Val Gln Ser Thr Pro Thr Ile Val Leu Ser Asp Lys Thr Gly Lys Thr Ile Tyr Glu Leu Pro Gly Tyr Met Pro Ser Thr Gln Phe Leu Ala Val Leu Glu Phe Ile Gly Asp Gly Lys Tyr Gln Asp Thr Lys Asp Asp Glu Asp Leu Thr Lys Lys Leu Lys Ala Tyr Ile Lys Tyr Lys Thr Asn Leu Ser Lys Ser Lys Ser Asn
(2) INFORMATION FOR SEQ ID NO:61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 980 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 53...931 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:

Met Arg LeuLeuPhe LeuLeuLeu SerAlaAla PheMetLeu LeuAlaGlu Glu LysIleSer LeuAsnAsp AspAlaPro IleLysLeu ValHisTrp Gln AsnAlaLeu LysGluVal GlnProAsp SerAsnAla ProAlaThr Pro ProIleLys AlaValGln ThrThrLeu ThrPheGlu ThrProPhe Asn LysThrPro LysIleMet GluValGlu GlyGlnLys ValIleVal Leu Lys Asn Ala Lys Leu Asp Ser Lys Lys Thr Met Asp Phe Lys Glu Ala Ser Leu Asn Ala Leu Glu Met Phe Ser Tyr Gln Asn Asp Ile Tyr Leu Leu Ser Lys Lys Ala Lys Val Glu Leu Glu Ile Gln Ala Ser Asn Ser Lys Asp Lys Lys Arg Leu Arg Phe Leu Phe Leu Pro Lys Gly Phe His TTA GCC CCA CCG CCT AAC CTG AAA GAA AAA TCT CAG CAA ACT AAC CTT 538 Leu Ala Pro Pro Pro Asn Leu Lys Glu Lys Ser Gln Gln Thr Asn Leu Ala Gln Lys Asp Thr Asn Glu Gln Pro Gln Ser Pro Leu Asn Thr Leu Glu Leu Lys Pro Pro Leu Asn Leu Ser His Ala Tyr Lys Ala Leu Ala Val Ile Ala Ala Leu Leu Leu Ile Leu Tyr Val Ile Lys Lys Lys Ile Val Pro Thr Gln Gly Ser Phe Ser Ala Lys Asp Phe Lys Leu Glu Ile Ser Val Leu Gly Arg Val Asp Ala Asn His Lys Ile Ile Ser Ile Glu Thr Asn Lys Glu Arg Tyr Leu Val Leu Leu Ser Asp Lys Tyr Gly Leu Leu Leu Asp Lys Ile Ser Pro Lys Thr Ser Lys Glu Glu Leu Ile Lys Glu Ala Glu Asn Asn Ile Lys Asn Ser Lys Leu Gly Asn Leu Tyr Ala Gly Lys Phe
(2) INFORMATION FOR SEQ ID NO:62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 293 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
Met Arg Leu Leu Phe Leu Leu Leu Ser Ala Ala Phe Met Leu Leu Ala Glu Glu Lys Ile Ser Leu Asn Asp Asp Ala Pro Ile Lys Leu Val His Trp Gln Asn Ala Leu Lys Glu Val Gln Pro Asp Ser Asn Ala Pro Ala 35 40 45 Thr Pro Pro Ile Lys Ala Val Gln Thr Thr Leu Thr Phe Glu Thr Pro Phe Asn Lys Thr Pro Lys Ile Met Glu Val Glu Gly Gln Lys Val Ile Val Leu Lys Asn Ala Lys Leu Asp Ser Lys Lys Thr Met Asp Phe Lys Glu Ala Ser Leu Asn Ala Leu Glu Met Phe Ser Tyr Gln Asn Asp Ile Tyr Leu Leu Ser Lys Lys Ala Lys Val Glu Leu Glu Ile Gln Ala Ser Asn Ser Lys Asp Lys Lys Arg Leu Arg Phe Leu Phe Leu Pro Lys Gly Phe His Leu Ala Pro Pro Pro Asn Leu Lys Glu Lys Ser Gln Gln Thr Asn Leu Ala Gln Lys Asp Thr Asn Glu Gln Pro Gln Ser Pro Leu Asn Thr Leu Glu Leu Lys Pro Pro Leu Asn Leu Ser His Ala Tyr Lys Ala Leu Ala Val Ile Ala Ala Leu Leu Leu Ile Leu Tyr Val Ile Lys Lys Lys Ile Val Pro Thr Gln Gly Ser Phe Ser Ala Lys Asp Phe Lys Leu Glu Ile Ser Val Leu Gly Arg Val Asp Ala Asn His Lys Ile Ile Ser Ile Glu Thr Asn Lys Glu Arg Tyr Leu Val Leu Leu Ser Asp Lys Tyr Gly Leu Leu Leu Asp Lys Ile Ser Pro Lys Thr Ser Lys Glu Glu Leu Ile Lys Glu Ala Glu Asn Asn Ile Lys Asn Ser Lys Leu Gly Asn Leu Tyr Ala Gly Lys Phe
(2) INFORMATION FOR SEQ ID NO:63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 620 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence (B) LOCATION: 70...567 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
CTAGCGCGAT CTTTGGTCTC ACACAAGCTA TAACAAGCTT GAGAATCGCA TAATATATTC 60 Met Leu Ser Pro Ala Thr Phe Lys Gln Ile Thr Leu Ala Leu Ile Ala Ser Arg Leu Ile Val Val Ile Leu Tyr Ala Phe Ile Phe Ile Val Leu Ser Phe Tyr Met Leu Asn Ile Ile Thr Ile Leu Asn Phe Lys Ala Leu Ile Leu Gly Phe Val Ser Val Phe Ser Ser Ala Leu Phe Cys Phe Cys Leu Ala Ile Phe Val Ala Arg Ile Phe Gln Asn Glu Gln Ser Ile Leu Gly Phe Cys Asn Ile Ile Asn Leu Tyr Ala Leu Met Ser Cys Asn Val Phe Val Pro Leu Glu Tyr Leu Pro Ser Ile Gly Gln Leu Phe Ile Lys Thr Ser Ile Phe Tyr Tyr Leu Asn Gln Leu Leu Ile Lys Ala Phe Gln Gly Ile Asp Thr Ile Leu Val Leu Ala Thr Ser Thr Phe Phe Ile Ile Gly Gly Ile Ile Leu Phe Leu Leu Ser Ala Asn Arg Met Leu CTA ACA CCA AAA GAA CGC ATG CGT TAAAGGCTTA GTCCCACCAT TGATTTATTT 597 Leu Thr Pro Lys Glu Arg Met Arg
(2) INFORMATION FOR SEQ ID NO:64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 166 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
Met Leu Ser Pro Ala Thr Phe Lys Gln Ile Thr Leu Ala Leu Ile Ala Ser Arg Leu Ile Val Val Ile Leu Tyr Ala Phe Ile Phe Ile Val Leu Ser Phe Tyr Met Leu Asn Ile Ile Thr Ile Leu Asn Phe Lys Ala Leu Ile Leu Gly Phe Val Ser Val Phe Ser Ser Ala Leu Phe Cys Phe Cys Leu Ala Ile Phe Val Ala Arg Ile Phe Gln Asn Glu Gln Ser Ile Leu Gly Phe Cys Asn Ile Ile Asn Leu Tyr Ala Leu Met Ser Cys Asn Val Phe Val Pro Leu Glu Tyr Leu Pro Ser Ile Gly Gln Leu Phe Ile Lys Thr Ser Ile Phe Tyr Tyr Leu Asn Gln Leu Leu Ile Lys Ala Phe Gln Gly Ile Asp Thr Ile Leu Val Leu Ala Thr Ser Thr Phe Phe Ile Ile Gly Gly Ile Ile Leu Phe Leu Leu Ser Ala Asn Arg Met Leu Leu Thr Pro Lys Glu Arg Met Arg
(2) INFORMATION FOR SEQ ID NO:65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1405 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 50...1366 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:

Met Gln Val Lys Glu Asn Lys Gln Leu Cys Leu Ile Ser Leu Gly Cys Ser Lys Asn TTG GTG GAT TCA GAG GTG ATG TTA GGC AAG CTT TAT AAT TAC ACG CTC 154 Leu Val Asp Ser Glu Val Met Leu Gly Lys Leu Tyr Asn Tyr Thr Leu Thr Asn Asp Ala Lys Ser Ala Asp Val Ile Leu Ile Asn Thr Cys Gly Phe Ile Glu Ser Ala Lys Gln Glu Ser Ile Gln Thr Ile Leu Asn Ala Ala Lys Asp Lys Lys Glu Gly Ala Ile Leu Ile Ala Ser Gly Cys Leu 70 75 80 Ser Glu Arg Tyr Lys Asp Glu Ile Lys Glu Leu Ile Pro Glu Val Asp Ile Phe Thr Gly Val Gly Asp Tyr Asp Lys Ile Asp Ile Met Ile Ala Lys Lys Gln Asn Gln Phe Ser Glu Gln Val Phe Leu Ser Glu His Tyr Asn Ala Arg Ile Ile Thr Gly Ser Ser Val His Ala Tyr Val Lys Ile Ser Glu Gly Cys Asn Gln Lys Cys Ser Phe Cys Ala Ile Pro Ser Phe AAG GGG AAA TTG CAA AGC AGG GAA TTG GAC TCC ATT TTA AAA GAA GTG 586 Lys Gly Lys Leu Gln Ser Arg Glu Leu Asp Ser Ile Leu Lys Glu Val Glu Asn Leu Ala Leu Lys Gly Tyr Thr Asp Met Thr Phe Ile Ala Gln Asp Ser Ser Ser Phe Leu Tyr Asp Lys Gly Gln Lys Asp Gly Leu Ile CTC CAG GCC

Gln IleArgAla IleAspLys GlnAla LeuLysSer AlaArg Leu Gln TTA ACC

Ile TyrLeuTyr ProSerSer ThrLeu GluLeuIle GlyAla Leu Thr GAA AAT

Ile SerSerPro IlePheGln TyrPhe AspMetPro IleGln Glu Asn 245 250 255 ATC AAG

His SerAspSer MetLeuLys MetArg ArgAsnSer SerGln Ile Lys CAC GCC

Ala HisLeuLys LeuLeuAsp MetLys GlnValLys GluSer His Ala ATC GGG

Phe ArgSerThr IleIleVal HisPro GluGluAsn GluSer Ile Gly TTT TTA

Glu GluGluLeu SerAlaPhe AspGlu PheGlnPhe AspArg Phe Leu AAT GAA

Leu IlePheAla PheSerAla GluAsn ThrHisAla TyrSer Asn Glu GAA ATC

Leu LysValPro LysLysThr AsnAla ArgIleLys AlaLeu Glu Ile AAA AAC

Asn IleAlaLeu LysHisGln HisSer PheLysAla LeuLeu Lys Asn AAG GAA

Asn ProIleLys AlaLeuVal AsnLys GluGlyGlu TyrPhe Lys Glu AAA GCG

Tyr AlaArgAsp LeuArgTrp ProGlu ValAspGly GluIle Lys Ala ATC ACC

Leu AsnAspSer GluLeuThr ProLeu LysProGly HisTyr Ile Thr ATT GAT

Thr AlaProSer GluPheLys AsnIle LeuLeuAla LysVal Ile Asp Leu Ser Pro Phe
(2) INFORMATION FOR SEQ ID NO:66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 439 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
Met Gln Val Lys Glu Asn Lys Gln Leu Cys Leu Ile Ser Leu Gly Cys Ser Lys Asn Leu Val Asp Ser Glu Val Met Leu Gly Lys Leu Tyr Asn Tyr Thr Leu Thr Asn Asp Ala Lys Ser Ala Asp Val Ile Leu Ile Asn Thr Cys Gly Phe Ile Glu Ser Ala Lys Gln Glu Ser Ile Gln Thr Ile Leu Asn Ala Ala Lys Asp Lys Lys Glu Gly Ala Ile Leu Ile Ala Ser Gly Cys Leu Ser Glu Arg Tyr Lys Asp Glu Ile Lys Glu Leu Ile Pro Glu Val Asp Ile Phe Thr Gly Val Gly Asp Tyr Asp Lys Ile Asp Ile Met Ile Ala Lys Lys Gln Asn Gln Phe Ser Glu Gln Val Phe Leu Ser Glu His Tyr Asn Ala Arg Ile Ile Thr Gly Ser Ser Val His Ala Tyr Val Lys Ile Ser Glu Gly Cys Asn Gln Lys Cys Ser Phe Cys Ala Ile Pro Ser Phe Lys Gly Lys Leu Gln Ser Arg Glu Leu Asp Ser Ile Leu Lys Glu Val Glu Asn Leu Ala Leu Lys Gly Tyr Thr Asp Met Thr Phe Ile Ala Gln Asp Ser Ser Ser Phe Leu Tyr Asp Lys Gly Gln Lys Asp Gly Leu Ile Gln Leu Ile Arg Ala Ile Asp Lys Gln Gln Ala Leu Lys Ser Ala Arg Ile Leu Tyr Leu Tyr Pro Ser Ser Thr Thr Leu Glu Leu Ile Gly Ala Ile Glu Ser Ser Pro Ile Phe Gln Asn Tyr Phe Asp Met Pro Ile Gln His Ile Ser Asp Ser Met Leu Lys Lys Met Arg Arg Asn 260 265 270 Ser Ser Gln Ala His His Leu Lys Leu Leu Asp Ala Met Lys Gln Val Lys Glu Ser Phe Ile Arg Ser Thr Ile Ile Val Gly His Pro Glu Glu Asn Glu Ser Glu Phe Glu Glu Leu Ser Ala Phe Leu Asp Glu Phe Gln Phe Asp Arg Leu Asn Ile Phe Ala Phe Ser Ala Glu Glu Asn Thr His Ala Tyr Ser Leu Glu Lys Val Pro Lys Lys Thr Ile Asn Ala Arg Ile Lys Ala Leu Asn Lys Ile Ala Leu Lys His Gln Asn His Ser Phe Lys Ala Leu Leu Asn Lys Pro Ile Lys Ala Leu Val Glu Asn Lys Glu Gly Glu Tyr Phe Tyr Lys Ala Arg Asp Leu Arg Trp Ala Pro Glu Val Asp Gly Glu Ile Leu Ile Asn Asp Ser Glu Leu Thr Thr Pro Leu Lys Pro Gly His Tyr Thr Ile Ala Pro Ser Glu Phe Lys Asp Asn Ile Leu Leu Ala Lys Val Leu Ser Pro Phe
(2) INFORMATION FOR SEQ ID NO:67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1420 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 200...1366 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:

Met Val Phe Leu Phe Phe Arg Cys Tyr Phe Gln Ala Ser Leu Lys Glu Thr Phe Ala Ile Asn His Leu Lys Thr Met Ser Phe Lys Trp Leu Thr Leu Ala Phe Leu Gly Val Phe Leu Ser Ile Phe Pro Asn Met Phe Asn Met His Asp Ser Gln Thr Phe Arg Tyr Asn Leu Phe Ala Leu Asn Met Ser Leu Thr Tyr Ala Cys Gly Ala Leu Cys Leu CTT TTT GCC AGT TGC TTA AGA ATC AAA TTG AAT CAA AAA ATC CTT TTT 472 Leu Phe Ala Ser Cys Leu Arg Ile Lys Leu Asn Gln Lys Ile Leu Phe 80 85 90 Tyr Ser Met Ala Val Ala Asn Phe Ile Asn Gly Leu Leu Ser Leu Val Gln Lys Ile Tyr Phe Asn Met Pro Arg Ala Gln Gly Phe Ser Thr Val Lys Glu Tyr Val Val Leu Val Ser Val Ser Ile Leu Gly Cys Tyr Ile Tyr Ala Leu Tyr Ser His Asn Gln Lys Glu Lys Leu Phe Phe Thr Leu Ser Val Phe Val Gly Phe Leu Val Val Ile Leu Ser Ala Thr Arg Ser Ala Thr Ile Ala Phe Val Ile Thr Phe Leu Ile Leu Ser Cys Phe Ile Leu Tyr Ala Lys Lys Ser Leu Lys Pro Leu Gly Tyr Met Val Val Val Ser Leu Ile Leu Ser Ala Leu Tyr Val Gly Ser Asn Ala Leu Glu Lys AAG GGG GCA ATA GAG CAA TCT AGA GTT CAA AAT CAA AGC TTT GAA GAA 904 Lys Gly Ala Ile Glu Gln Ser Arg Val Gln Asn Gln Ser Phe Glu Glu Asp Leu Lys Arg Tyr Ala Lys Lys Asp Ala Asp Ser Ser Ile Gly Trp Arg Leu Glu Arg Trp Lys Glu Ala Leu Thr Val Leu Arg Leu Arg Pro AAA AGG

PhePheGly Met Ala Ala Ser Glu Cys Gln LeuGlu Glu Ile Lys Arg GCC TTG

LeuSerLeu Ser Lys Ser Tyr Arg Lys Asp IleLeu Cys Tyr Ala Leu CAC GCC

GluArgTyr Asp Asn Gln Ile Ile Ile Leu ThrArg Gly Ile His Ala 300 305 310 315

TTT GTT

IleGlyPhe Leu Ile Trp Leu Phe Leu Leu IleVal Lys Ile Phe Val TCT TCG

PheTrpSer Gly Ile Lys Gln Asn Leu Ile PhePhe Ile Leu Ser Ser TTT GGG

MetThrLeu Ala Phe Tyr Leu Ile Gly Ile PheAsp Pro Phe Phe Gly TTT ATG

AspPhePhe Ile Thr Gly Ser Phe Val Gly IleMet Met Ala Phe Met GCT GGGTTTGACA
TTA

ValPheLeu Lys Lys Asp Lys Ser Phe Ala GTCAAGCGGT
AGTTTCTTGT
GATTCGTTCT
T

(2) INFORMATION FOR SEQ ID NO:68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 389 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
Met Val Phe Leu Phe Phe Arg Cys Tyr Phe Gln Ala Ser Leu Lys Glu Thr Phe Ala Ile Asn His Leu Lys Thr Met Ser Phe Lys Trp Leu Thr Leu Ala Phe Leu Gly Val Phe Leu Ser Ile Phe Pro Asn Met Phe Asn Met His Asp Ser Gln Thr Phe Arg Tyr Asn Leu Phe Ala Leu Asn Met Ser Leu Thr Tyr Ala Cys Gly Ala Leu Cys Leu Leu Phe Ala Ser Cys Leu Arg Ile Lys Leu Asn Gln Lys Ile Leu Phe Tyr Ser Met Ala Val Ala Asn Phe Ile Asn Gly Leu Leu Ser Leu Val Gln Lys Ile Tyr Phe Asn Met Pro Arg Ala Gln Gly Phe Ser Thr Val Lys Glu Tyr Val Val Leu Val Ser Val Ser Ile Leu Gly Cys Tyr Ile Tyr Ala Leu Tyr Ser His Asn Gln Lys Glu Lys Leu Phe Phe Thr Leu Ser Val Phe Val Gly Phe Leu Val Val Ile Leu Ser Ala Thr Arg Ser Ala Thr Ile Ala Phe Val Ile Thr Phe Leu Ile Leu Ser Cys Phe Ile Leu Tyr Ala Lys Lys Ser Leu Lys Pro Leu Gly Tyr Met Val Val Val Ser Leu Ile Leu Ser Ala Leu Tyr Val Gly Ser Asn Ala Leu Glu Lys Lys Gly Ala Ile Glu Gln Ser Arg Val Gln Asn Gln Ser Phe Glu Glu Asp Leu Lys Arg Tyr Ala Lys Lys Asp Ala Asp Ser Ser Ile Gly Trp Arg Leu Glu Arg Trp Lys Glu Ala Leu Thr Val Leu Arg Leu Arg Pro Phe Phe Gly Met Ala Ala Ser Glu Lys Cys Gln Arg Leu Glu Glu Ile Leu Ser Leu Ser Lys Ser Tyr Arg Ala Lys Asp Leu Ile Leu Cys Tyr Glu Arg Tyr Asp Asn Gln Ile Ile His Ile Leu Ala Thr Arg Gly Ile Ile Gly Phe Leu Ile Trp Leu Phe Phe Leu Leu Val Ile Val Lys Ile Phe Trp Ser Gly Ile Lys Gln Asn Ser Leu Ile Ser Phe Phe Ile Leu Met Thr Leu Ala Phe Tyr Leu Ile Phe Gly Ile Gly Phe Asp Pro Phe Asp Phe Phe Ile Thr Gly Ser Phe Phe Val Gly Met Ile Met Met Ala Val Phe Leu Lys Lys Asp Lys Ser Ala Phe
(2) INFORMATION FOR SEQ ID NO:69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1252 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 89...1198 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
AAAAAGGAAA TAGCACGATG AAACCTAAAG GGGATATTAG CGTTAATATG CTAATATAGT 60 Met Leu Ile Ser Ile Ala Phe Leu Leu Val Leu Tyr Leu Leu Asn Tyr Ser Ser Phe Arg Met Leu Lys Ser ACC ATC

PheLeuThr LeuLys LysIleSer GlnTyrAla TyrLeuTrp Phe Phe IleLeuLeu SerIle GlyGluAla AlaPheVal PheTyrArg Asn Ile MetProSer HisLeu PheValLeu ThrSerAla CysSerPhe Val Ser 60 65 70 PheIleIle PheIle LeuSerLeu SerPheTyr GlyPheSer Tyr Ser IleGluLys IleAsp PheLeuHis SerArgArg LysSerLeu Lys Asn PheLeuLys LeuGly PheTyrLeu AlaLeuLeu GlyTyrPhe Trp Arg GlyPheTyr GluGly LeuAlaArg ProLysIle LysGluThr Pro Ile TyrLeuAsp LysLeu AspLysGlu LeuLysIle IleLeuLeu Thr Asp MetHisVal GlySer LeuLeuGln LysAspPhe ValAspTyr Ile Val GluGluVal AsnGln LysGluVal AspMetVal LeuIleGly Gly Asp AGC AAA TCT
GTC

Leu ValAsp GluSerIle GluLys ValLysSer PheLeuLeuPro Leu Asn AsnLeu LysSerThr HisGly ThrPheTyr ValProGlyAsn His Glu TyrTyr HisGlyIle GluPro IleLeuSer PheLeuAspThr Leu Asn LeuThr IleLeuGly AsnGlu CysValHis LeuGlyGlyIle Asn Leu CysGly ValTyrAsp TyrPhe AlaArgLys ArgGlnAsnPhe Ala Pro AspIle AspLysAla LeuLys LysArgAsn GluSerLysPro Thr Ile LeuLeu AlaHisGln ProLys GlnIleArg SerLeuLysGlu Ser His SerVal AspLeuVal LeuSer GlyHisThr HisAlaGlyGln Ile Phe ProPhe SerLeuLeu ValLys LeuAlaGln ThrTyrLeuHis Gly Leu TyrLys HisSerPro ThrThr GlnIleTyr ValSerSerGly Ala Gly TyrTrp GlyIlePro LeuArg PheLeuAla ProSerGluIle Ala AAATCTTAAA
ATC

Tyr LeuArg LeuLeuPro LysAsn GlnAla TCAAGCGGTT
AAAAATAAGA
A

(2) INFORMATION FOR SEQ ID NO:70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 370 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:
Met Leu Ile Ser Ile Ala Phe Leu Leu Val Leu Tyr Leu Leu Asn Tyr Ser Ser Phe Arg Met Leu Lys Ser Phe Leu Thr Leu Lys Lys Ile Ser Gln Tyr Ala Tyr Leu Trp Phe Phe Ile Leu Leu Ser Ile Gly Glu Ala Ala Phe Val Phe Tyr Arg Asn Ile Met Pro Ser His Leu Phe Val Leu Thr Ser Ala Cys Ser Phe Val Ser Phe Ile Ile Phe Ile Leu Ser Leu Ser Phe Tyr Gly Phe Ser Tyr Ser Ile Glu Lys Ile Asp Phe Leu His Ser Arg Arg Lys Ser Leu Lys Asn Phe Leu Lys Leu Gly Phe Tyr Leu Ala Leu Leu Gly Tyr Phe Trp Arg Gly Phe Tyr Glu Gly Leu Ala Arg Pro Lys Ile Lys Glu Thr Pro Ile Tyr Leu Asp Lys Leu Asp Lys Glu Leu Lys Ile Ile Leu Leu Thr Asp Met His Val Gly Ser Leu Leu Gln Lys Asp Phe Val Asp Tyr Ile Val Glu Glu Val Asn Gln Lys Glu Val Asp Met Val Leu Ile Gly Gly Asp Leu Val Asp Glu Ser Ile Glu Lys Val Lys Ser Phe Leu Leu Pro Leu Asn Asn Leu Lys Ser Thr His Gly Thr Phe Tyr Val Pro Gly Asn His Glu Tyr Tyr His Gly Ile Glu Pro Ile Leu Ser Phe Leu Asp Thr Leu Asn Leu Thr Ile Leu Gly Asn Glu Cys Val His Leu Gly Gly Ile Asn Leu Cys Gly Val Tyr Asp Tyr Phe Ala Arg Lys Arg Gln Asn Phe Ala Pro Asp Ile Asp Lys Ala Leu Lys Lys Arg Asn Glu Ser Lys Pro Thr Ile Leu Leu Ala His Gln Pro Lys Gln Ile Arg Ser Leu Lys Glu Ser His Ser Val Asp Leu Val Leu Ser Gly His Thr His Ala Gly Gln Ile Phe Pro Phe Ser Leu Leu Val Lys Leu Ala Gln Thr Tyr Leu His Gly Leu Tyr Lys His Ser Pro Thr Thr Gln Ile Tyr Val Ser Ser Gly Ala Gly Tyr Trp Gly Ile Pro Leu Arg Phe Leu Ala Pro Ser Glu Ile Ala Tyr Leu Arg Leu Leu Pro Lys Asn Gln Ala
(2) INFORMATION FOR SEQ ID NO:71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 431 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 103...381 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:

CGCTATCAAA ACCGAAGAAA
AAAGAGATTT
TTTGCAATAT

GCGAGTTTGT ATG
GGGTGAAACG ATT
TTT
GTC

Met Ile Phe Val ATT GTT

Asn LysTyrLeu TyrGly LysSer ValValProLeu Ala Gly Ile Val AAA TTT

Phe SerLysTyr ProLeu LysPhe LeuTrpLeuAsn Val Ser Lys Phe ATC GCG

Ser PheLeuTrp AlaLeu ValGly SerValSerPhe Gln Ser Ile Ala TAT TCG

Asp TrpValLys ThrLeu GluArg LeuSerHisTyr Thr Phe Tyr Ser CTT TTA

Phe IleIleSer PheVal IleAla LeuLeuIleTrp Phe Leu Leu Leu AAA CGATATTCG CGCAAA GGTTTT TAAGCAAGAT GTTTAATTAA

ATG ATGCGCT

Lys ArgTyrSer ArgLys GlyPhe Met CGC

(2) INFORMATION FOR SEQ ID NO:72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
Met Ile Phe Val Asn Lys Tyr Leu Tyr Gly Ile Lys Ser Val Val Pro 1 5 10 15 Leu Ala Val Gly Phe Ser Lys Tyr Pro Leu Lys Lys Phe Leu Trp Leu Asn Val Phe Ser Ser Phe Leu Trp Ala Leu Ile Val Gly Ser Val Ser Phe Gln Ala Ser Asp Trp Val Lys Thr Leu Tyr Glu Arg Leu Ser His 50 55 60 Tyr Thr Ser Phe Phe Ile Ile Ser Phe Val Leu Ile Ala Leu Leu Ile Trp Phe Leu Leu Lys Arg Tyr Ser Arg Lys Met Gly Phe
(2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1281 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 70...1227 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:

Met Leu Arg Lys Asn Ile Leu Ala Tyr Tyr Gly Ala Asn Phe Leu Leu Ile Ile Ala Gln Ser Leu Pro His Ala Ile Leu Thr Pro Leu Leu Leu Ser Lys Gly Leu Ser Leu Ser Glu Ile Leu Leu Val Gln Thr Phe Phe Ser Phe Cys Val Leu Val Ala Glu Tyr Pro Ser Gly Val Leu Ala Asp Leu Met Ser Arg Lys Asn Leu Phe Leu Val Ser Asn Ala Phe TTA ATC GCT AGT TTT TCG TTT GTG CTG TTT TTT GAT AGC TTT ATT TTC 351 Leu Ile Ala Ser Phe Ser Phe Val Leu Phe Phe Asp Ser Phe Ile Phe Met Leu Leu Ala Trp Gly Leu Tyr Gly Leu Tyr Ser Ala Cys Ser Ser Gly Thr Ile Glu Ala Ser Leu Ile Thr Asp Ile Lys Glu Asn Lys Lys Asp Leu Ser Lys Phe Leu Ala Lys Asn Asn Gln Ile Thr Tyr Leu Gly Met Ile Ile Gly Ser Ser Leu Gly Ser Phe Leu Tyr Leu Lys Val His Ala Met Leu Tyr Ile Val Gly Ile Phe Leu Ile Met Leu Cys Val Leu Thr Ile Ile Phe Tyr Phe Lys Glu Lys Glu Gly Asp Phe Lys Ser Gln Lys Ser Leu Lys Leu Leu Lys Glu Gln Val Lys Gly Ser Leu Lys Glu Leu Lys Asp Asn Pro Lys Leu Lys Ile Leu Leu Val Gly His Leu Ile Thr Pro Val Phe Phe Met Ser His Phe Gln Met Trp Gln Ala Tyr Phe Leu Lys Gln Gly Val Lys Glu Gln Tyr Leu Phe Val Phe Tyr Ile Ala Phe Gln Val Ile Ser Ile Leu Ile His Phe Leu Lys Ala Ser Ser Tyr Ser Gln Lys Ile Ala Leu Ser Ser Leu Val Val Leu Leu Gly Val Ser Pro Leu Leu Leu Ser Asn Ile Pro Tyr Cys Phe Ile Gly Val Tyr Ala ACT ATG AAC

LeuMetVal AlaPhe Phe Tyr SerTyr CysLeu TyrGln Thr Met Asn AAA AAC TCA

PheSerLys PheVal Ser Asn IleSer SerLeu SerLeu Lys Asn Ser GTG TCT TCG

LeuSerSer CysVal Arg Val ValLeu IleLeu LeuSer Val Ser Ser TTC CCC ACC

SerLeuGlu LeuArg Tyr Ser LeuThr IleIle MetHis Phe Pro Thr ATC TTT AAG

PheAlaLeu ThrLeu Ile Leu PhePhe LeuTyr AlaLys Ile Phe Lys TAAGAGTGCA
ACCTTTTAGC

ProPheAsp Glu
(2) INFORMATION FOR SEQ ID NO:74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 386 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
Met Leu Arg Lys Asn Ile Leu Ala Tyr Tyr Gly Ala Asn Phe Leu Leu Ile Ile Ala Gln Ser Leu Pro His Ala Ile Leu Thr Pro Leu Leu Leu Ser Lys Gly Leu Ser Leu Ser Glu Ile Leu Leu Val Gln Thr Phe Phe Ser Phe Cys Val Leu Val Ala Glu Tyr Pro Ser Gly Val Leu Ala Asp Leu Met Ser Arg Lys Asn Leu Phe Leu Val Ser Asn Ala Phe Leu Ile Ala Ser Phe Ser Phe Val Leu Phe Phe Asp Ser Phe Ile Phe Met Leu Leu Ala Trp Gly Leu Tyr Gly Leu Tyr Ser Ala Cys Ser Ser Gly Thr Ile Glu Ala Ser Leu Ile Thr Asp Ile Lys Glu Asn Lys Lys Asp Leu Ser Lys Phe Leu Ala Lys Asn Asn Gln Ile Thr Tyr Leu Gly Met Ile Ile Gly Ser Ser Leu Gly Ser Phe Leu Tyr Leu Lys Val His Ala Met Leu Tyr Ile Val Gly Ile Phe Leu Ile Met Leu Cys Val Leu Thr Ile Ile Phe Tyr Phe Lys Glu Lys Glu Gly Asp Phe Lys Ser Gln Lys Ser Leu Lys Leu Leu Lys Glu Gln Val Lys Gly Ser Leu Lys Glu Leu Lys Asp Asn Pro Lys Leu Lys Ile Leu Leu Val Gly His Leu Ile Thr Pro Val Phe Phe Met Ser His Phe Gln Met Trp Gln Ala Tyr Phe Leu Lys Gln Gly Val Lys Glu Gln Tyr Leu Phe Val Phe Tyr Ile Ala Phe Gln Val Ile Ser Ile Leu Ile His Phe Leu Lys Ala Ser Ser Tyr Ser Gln Lys Ile Ala Leu Ser Ser Leu Val Val Leu Leu Gly Val Ser Pro Leu Leu Leu Ser Asn Ile Pro Tyr Cys Phe Ile Gly Val Tyr Ala Leu Met Val Ala Phe Phe Thr Tyr Met Ser Tyr Cys Leu Asn Tyr Gln Phe Ser Lys Phe Val Ser Lys Asn Asn Ile Ser Ser Leu Ser Ser Leu Leu Ser Ser Cys Val Arg Val Val Ser Val Leu Ile Leu Ser Leu Ser Ser Leu Glu Leu Arg Tyr Phe Ser Pro Leu Thr Ile Ile Thr Met His Phe Ala Leu Thr Leu Ile Ile Leu Phe Phe Phe Leu Tyr Lys Ala Lys Pro Phe Asp Glu 385
(2) INFORMATION FOR SEQ ID NO:75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2218 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA

(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 77...2167 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:

AGCCGTTTTG
CCACTATTCA
AAGAAATTTT
GGATTATAAT
AAAAAAACTG

ACAACA ATG ATT AAA CAA TCA TTA AAT GGA GAG GAC ATG CAA

Met Ile Lys Gln Ser Leu Asn Gly Glu Asp Met Gln TTG TGG TTT ATT GGG

LysSerLeuVal SerLeuAla Val Val Ala Leu Ala Trp Phe Ile Gly TTA TTA AAG AGC AAC

IleCysLeuGly Val Ala His Gly Glu Ile Thr Leu Leu Lys Ser Asn GCG GCT ATT ATA TAT

LeuTrpLeuVal Val Ser Cys Tyr Ser Gly Arg Ala Ala Ile Ile Tyr ATC TAT GTG CTA GAT

PheTyrSerHis Phe Ala Lys Leu Lys Asp Ser Ile Tyr Val Leu Asp TGC AGG GAT GAT GTG

ArgAlaThrPro Ala Val Asn Gly Lys Phe Pro Cys Arg Asp Asp Val ACT GGG CAT GCT GCT

ThrAspLysAla Ile Phe His Phe Ala Ile Gly Thr Gly His Ala Ala s5 100 105 GGC ATA GCC ATG TAC

AlaGlyProLeu Val Pro Leu Ala Gln Gly Leu Gly Ile Ala Met Tyr ATT ATA TCG GGG TGC

ProSerIleLeu Trp Leu Gly Val Leu Gly Val Ile Ile Ser Gly Cys CTT GCT ATT GAT AAG

_ HisAspPheVal Val Phe Ser Arg Arg Gly Ser Leu Ala Ile Asp Lys AAA GAA GGC GTA ATG

LeuGlyGluMet Ile Leu Met Gln Phe Gly Ile Lys Glu Gly Val Met Ala Ser Leu Gly Ile Leu Gly Ile Met Leu Ile Ile Ile Ala Ile Leu Ala Met Val Val Val Lys Ala Leu Ala His Ser Pro Trp Gly Phe Phe Thr Ile Ala Met Thr Ile Pro Ile Ala Ile Leu Met Gly Leu Tyr Met . CGG TTT TTC AGG CCA CAC AAG ATT TTA GAG GTT TCT GTT ATT GGC TTT 784 Arg Phe Phe Arg Pro His Lys Ile Leu Glu Val Ser Val Ile Gly Phe Ile Leu Leu Ile Ile Ala Ile Tyr Ala Gly Lys Tyr Val Ser Leu Asp Pro Lys Leu Ala Ser Ile Phe Thr Phe Glu Ala Ser Ser Leu Ala Trp Met Ile Met Gly Tyr Gly Phe Val Ala Ser Ile Leu Pro Val Trp Phe AAA
ATT

LeuLeuAla ProArgAsp TyrLeuSer ThrPheLeu LysIleGly Val IleGlyVal LeuValVal AlaIleIle PheValAla ProProLeu Gln IleProLys IleThrPro PheValAsp GlySerGly ProValPhe Ala GlySerVal PheProPhe LeuPheIle ThrValAla CysGlyThr Ile ' AGC GGA TTC CAT GCT TTA ATT TCT TCA GGC ACG ACC CCT AAA ATG CTC 1168 Ser Gly Phe His Ala Leu Ile Ser Ser Gly Thr Thr Pro Lys Met Leu Ala Lys Glu Ser Asp Ala Arg Leu Val Gly Tyr Gly Ser Met Val Met Glu Ser Val Val Ala Leu Met Ala Leu Val Cys Ala Gly Ile Leu His 98!43478 GGG TCG ATC AAA
CTT CCA
GAA
GTG

ProGly TyrPhe AlaIleAsn ValSer Gly Leu Ser Ile Lys Pro Glu GCT TCA

AspIleAla AspAla AlaSerVal IleSerSer TrpGlyPhe AsnIle SerAlaGlu GluIle ArgGluMet ThrLysAsn IleGlyGlu SerSer ATTTTGAGC CGCACC GGTGGGGCG CCCACTTTT GCGATCGGT TTAGCG 1456 ' IleLeuSer ArgThr GlyGlyAla ProThrPhe AlaIleGly LeuAla ' MetIleVal TyrHis IleLeuGly AspProSer ValMetAla PheTrp TyrHisPhe AlaIle LeuPheGlu AlaLeuPhe IleLeuThr AlaVal AspAlaGly ThrArg ThrAlaArg PheMetIle GlnAspLeu LeuGly AsnValTyr LysPro LeuGlyAsp LeuSerSer TyrLysAla GlyIle PheAlaThr LeuLeu CysValAla GlyTrpGly TyrPheLeu TyrGln GlyThrIle AspPro LysGlyGly IleTyrThr LeuTrpPro LeuPhe GlyValSer AsnGln MetLeuAla GlyMetAla LeuLeuLeu ValThr ValValLeu PheLys MetGlyArg PheLysGly AlaMetIle SerAla LeuProAla ValLeu IleLeuSer IleThrPhe TyrSerGly IleLeu AAG AAC

LysValVal ProLys Ser Asn SerValLeu AsnAsnVal SerHis Asp Val Ala Gln Met Gln Ile Ile Lys Glu Lys Met Ala Thr Thr Thr Asp Glu Lys Ala Leu Lys Thr Leu Gln Lys Ser Phe Phe Asn His Ala Ile Asp Ala Ile Leu Cys Val Phe Phe Met Leu Val Ala Leu Leu Val Leu Ile Val Ser Val Arg Ile Cys Ser Asn Ala Tyr Phe Lys Asn Lys Ile Tyr Pro Pro Leu Ala Glu Thr Pro Tyr Ile Lys Ala Ser (2) INFORMATION FOR SEQ ID N0:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 697 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
Met Ile Lys Gln Ser Leu Asn Gly Glu Asp Met Gln Lys Ser Leu Val Ser Leu Ala Trp Val Phe Val Ala Ile Leu Gly Ala Ile Cys Leu Gly Val Leu Ala Leu His Lys Gly Glu Ser Ile Asn Thr Leu Trp Leu Val Val Ala Ser Ala Cys Ile Tyr Ser Ile Gly Tyr Arg Phe Tyr Ser His Phe Ile Ala Tyr Lys Val Leu Lys Leu Asp Asp Ser Arg Ala Thr Pro Ala Cys Val Arg Asn Asp Gly Lys Asp Phe Val Pro Thr Asp Lys Ala Ile Thr Phe Gly His His Phe Ala Ala Ile Ala Gly Ala Gly Pro Leu Val Gly Pro Ile Leu Ala Ala Gln Met Gly Tyr Leu Pro Ser Ile Leu Trp Ile Leu Ile Gly Ser Val Leu Gly Gly Cys Val His Asp Phe Val Val Leu Phe Ala Ser Ile Arg Arg Asp Gly Lys Ser Leu Gly Glu Met Ile Lys Leu Glu Met Gly Gln Phe Val Gly Met Ile Ala Ser Leu Gly Ile Leu Gly Ile Met Leu Ile Ile Ile Ala Ile Leu Ala Met Val Val Val Lys Ala Leu Ala His Ser Pro Trp Gly Phe Phe Thr Ile Ala Met Thr Ile Pro Ile Ala Ile Leu Met Gly Leu Tyr Met Arg Phe Phe Arg Pro His Lys Ile Leu Glu Val Ser Val Ile Gly Phe Ile Leu Leu Ile Ile Ala Ile Tyr Ala Gly Lys Tyr Val Ser Leu Asp Pro Lys Leu Ala Ser Ile Phe Thr Phe Glu Ala Ser Ser Leu Ala Trp Met Ile Met Gly Tyr Gly Phe Val Ala Ser Ile Leu Pro Val Trp Phe Leu Leu Ala Pro Arg Asp Tyr Leu Ser Thr Phe Leu Lys Ile Gly Val Ile Gly Val Leu Val Val Ala Ile Ile Phe Val Ala Pro Pro Leu Gln Ile Pro Lys Ile Thr Pro Phe Val Asp Gly Ser Gly Pro Val Phe Ala Gly Ser Val Phe Pro Phe Leu Phe Ile Thr Val Ala Cys Gly Thr Ile Ser Gly Phe His Ala Leu Ile Ser Ser Gly Thr Thr Pro Lys Met Leu Ala Lys Glu Ser Asp Ala Arg Leu Val Gly Tyr Gly Ser Met Val Met Glu Ser Val Val Ala Leu Met Ala Leu Val Cys Ala Gly Ile Leu His Pro Gly Leu Tyr Phe Ala Ile Asn Ser Pro Glu Val Ser Ile Gly Lys Asp Ile Ala Asp Ala Ala Ser Val Ile Ser Ser Trp Gly Phe Asn Ile Ser Ala Glu Glu Ile Arg Glu Met Thr Lys Asn Ile Gly Glu Ser Ser Ile Leu Ser Arg Thr Gly Gly Ala Pro Thr Phe Ala Ile Gly Leu Ala Met Ile Val Tyr His Ile Leu Gly Asp Pro Ser Val Met Ala Phe Trp Tyr His Phe Ala Ile Leu Phe Glu Ala Leu Phe Ile Leu Thr Ala Val Asp Ala Gly Thr Arg Thr Ala Arg Phe Met Ile Gln Asp Leu Leu Gly Asn Val Tyr Lys Pro Leu Gly Asp Leu Ser Ser Tyr Lys Ala Gly Ile Phe Ala Thr Leu Leu Cys Val Ala Gly Trp Gly Tyr Phe Leu Tyr Gln Gly Thr Ile Asp Pro Lys Gly Gly Ile Tyr Thr Leu Trp Pro Leu Phe Gly Val Ser Asn Gln Met Leu Ala Gly Met Ala Leu Leu Leu Val Thr Val Val Leu Phe Lys Met Gly Arg Phe Lys Gly Ala Met Ile Ser Ala Leu Pro Ala Val Leu Ile Leu Ser Ile Thr Phe Tyr Ser Gly Ile Leu Lys Val Val Pro Lys Ser Asp Asn Ser Val Leu Asn Asn Val Ser His Val Ala Gln Met Gln Ile Ile Lys Glu Lys Met Ala Thr Thr Thr Asp Glu Lys Ala Leu Lys Thr Leu Gln Lys Ser Phe Phe Asn His Ala Ile Asp Ala Ile Leu Cys Val Phe Phe Met Leu Val Ala Leu Leu Val Leu Ile Val Ser Val Arg Ile Cys Ser Asn Ala Tyr Phe Lys Asn Lys Ile Tyr Pro Pro Leu Ala Glu Thr Pro Tyr Ile Lys Ala Ser
(2) INFORMATION FOR SEQ ID NO:77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 911 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 121...861 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:

Met Gln Lys Thr Ser Asn Thr Leu Ala Leu Gly Ser Leu Thr Ala Leu Phe Phe Leu Met Gly Phe Ile Thr Val Leu Asn Asp Ile Leu Ile Pro His Leu Lys Pro Ile Phe Asp Leu Thr Tyr Phe Glu Ala Ser Leu Ile ' Gln Phe Cys Phe Phe Gly Ala Tyr Phe Ile Met Gly Gly Val Phe Gly Asn Val Ile Ser Lys Ile Gly Tyr Pro Phe Gly Val Val Leu Gly Phe ValIle ThrAlaThr GlyCys AlaLeuPhe TyrProAla AlaHisPhe GlySer TyrGlyPhe PheLeu GlyAlaLeu PheIleLeu AlaSerGly IleVal CysLeuGln ThrAla GlyAsnPro PheValThr LeuLeuSer LysGly LysGluAla RrgAsn LeuValLeu ValGlnAla PheAsnSer LeuGly ThrThrLeu GlyPro IlePheGly SerLeuLeu IlePheSer ThrThr LysMetGly AspAsn AlaSerLeu IleAspLys LeuAlaAsp AlaLys SerValGln MetPro TyrLeuGly LeuAlaVal PheSerLeu LeuLeu AlaLeuIle MetTyr LeuLeuLys LeuProAsp ValGluLys GluMet ProLysGlu ThrThr GlnLysSer LeuPheSer HisLysHis PheVal PheGlyAla TrpGly SerPhePhe MetTrpGly GluXaaTrp AAAGCTTTTG
AATTTAGACT

ArgLeu AlaHisSer TrpCys CATTAC

(2) INFORMATION FOR SEQ ID NO:78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 247 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
Met Gln Lys Thr Ser Asn Thr Leu Ala Leu Gly Ser Leu Thr Ala Leu Phe Phe Leu Met Gly Phe Ile Thr Val Leu Asn Asp Ile Leu Ile Pro His Leu Lys Pro Ile Phe Asp Leu Thr Tyr Phe Glu Ala Ser Leu Ile Gln Phe Cys Phe Phe Gly Ala Tyr Phe Ile Met Gly Gly Val Phe Gly Asn Val Ile Ser Lys Ile Gly Tyr Pro Phe Gly Val Val Leu Gly Phe Val Ile Thr Ala Thr Gly Cys Ala Leu Phe Tyr Pro Ala Ala His Phe Gly Ser Tyr Gly Phe Phe Leu Gly Ala Leu Phe Ile Leu Ala Ser Gly Ile Val Cys Leu Gln Thr Ala Gly Asn Pro Phe Val Thr Leu Leu Ser Lys Gly Lys Glu Ala Arg Asn Leu Val Leu Val Gln Ala Phe Asn Ser Leu Gly Thr Thr Leu Gly Pro Ile Phe Gly Ser Leu Leu Ile Phe Ser Thr Thr Lys Met Gly Asp Asn Ala Ser Leu Ile Asp Lys Leu Ala Asp Ala Lys Ser Val Gln Met Pro Tyr Leu Gly Leu Ala Val Phe Ser Leu Leu Leu Ala Leu Ile Met Tyr Leu Leu Lys Leu Pro Asp Val Glu Lys Glu Met Pro Lys Glu Thr Thr Gln Lys Ser Leu Phe Ser His Lys His Phe Val Phe Gly Ala Trp Gly Ser Phe Phe Met Trp Gly Glu Xaa Trp Arg Leu Ala His Ser Trp Cys
(2) INFORMATION FOR SEQ ID NO:79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3084 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 49...3027 (D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:

ATTAAATAAC
TCAAAATTTT
TGATCAAAGG
CTTGAAAT
ATG
TCA
AAA

Met Se r Lys TTG GCT TTT ACC

LysIlePro LeuLys AsnArg Arg Asp Thr Lys Pro Leu Ala Phe Thr _ ACAGATTTA GAAGTC CCTAAT TTA TTA CGA GAC TAT 153 .
TTA TTA CAA AGC

ThrAspLeu GluVal ProAsn Leu Leu Arg Asp Tyr Leu Leu Gln Ser 20 25 30 35 _ GAG AAA AGC GAA

AspSerPhe LeuTyr SerLys Gly Glu Gly Ile Lys Glu Lys Ser Glu ATC GAT CAT ATC

ValPheLys SerIle PhePro Gln Glu Asn Arg Thr Ile Asp His Ile TTT AAG AAA GTT

LeuGluTyr AlaGly CysGlu Gly Ser Tyr Thr Arg Phe Lys Lys Val ACC TCT CCT ATT

GluAlaMet GluArg GlyIle Tyr Ile Leu Lys Lys Thr Ser Pro Ile AAA ACC AGT AAG

ValArgLeu IleLeu TrpGlu Asp Lys Gly Glu Asn Lys Thr Ser Lys CAA ATT ATT ATC

GlyIleLys AspIle LysGlu Ser Phe Arg Glu Pro Gln Ile Ile Ile TTT ATT GGG CGC

LeuMetThr GluArg ThrSer Ile Asn Val Glu Val Phe Ile Gly Arg i35 140 145 AGC GGT ATT GAA

ValValAsn GlnLeu HisArg Pro Val Phe Lys Glu Ser Gly Ile Glu AAG ATT ACA ATC

GluSerSer ThrSer LeuAsn Leu Tyr Gly Gln Ile Lys Ile Thr Ile TAT GAA GAT GAT

ProAspArg GlySer TrpLeu Phe Tyr Ser Lys Val Tyr Glu Asp Asp CGC AAA CCT ATT

Leu Tyr Ala Arg Ile Asn Lys Arg Arg Lys Val Pro Val Thr Ile Leu Phe Arg Ala Met Asp Tyr Gln Lys Gln Asp Ile Ile Lys Met Phe Tyr Pro Leu Val Lys Val Arg Tyr Glu Asn Asp Lys Tyr Leu Ile Pro Phe Ala Ser Leu Asp Ala Asn Gln Arg Met Glu Phe Asp Leu Lys Asp Pro Gln Gly Lys Val Ile Leu Leu Ala Gly Lys Lys Leu Thr Ser Arg Lys Ile Lys Glu Leu Lys Glu Asn His Leu Glu Trp Val Glu Tyr Pro Met Asp Ile Leu Leu Asn Arg His Leu Ala Glu Pro Val Met Val Gly Lys Glu Val Leu Leu Asp Met Leu Thr Gln Leu Asp Lys Asn Lys Leu Glu Lye Ile His Asp Leu Gly Val Gln Glu Phe Val Ile Ile Asn Asp Leu Ala Leu Gly His Asp Ala Ser Ile Ile Gln Ser Phe Ser Ala Asp Ser Glu Ser Leu Lys Leu Leu Lys Gln Thr Glu Lys Ile Asp Asp Glu Asn ' Ala Leu Ala Ala Ile Arg Ile His Lys Val Met Lys Pro Gly Asp Pro Val Thr Thr Glu Val Ala Lys Gln Phe Val Lys Lys Leu Phe Phe Aap Pro Glu Arg Tyr Asp Leu Thr Met Val Giy Arg Met Lys Met Asn His WO

CAT ACG ACG GAA

LysLeu GlyLeuHis ValProAsp TyrIleThr ThrLeuThr HisGlu AspIle IleThrThr ValLysTyr LeuMetLys IleLysAsn AsnGln GlyLys IleAspAsp ArgAspHis LeuGlyAsn ArgArgIle ArgAla 455 460 465 ', GTAGGG GAATTGTTG GCCAATGAA TTGCATTCA GGTTTAGTG AAAATG 1497 ' ValGly GluLeuLeu AlaAsnGlu LeuHisSer GlyLeuVal LysMet GlnLys ThrIleLys AspLysLeu ThrThrMet SerGlyAla PheAsp SerLeu MetProHis AspLeuVal AsnSerLys MetIleThr SerThr ATCATG GAATTTTTC ATGGGCGGT CAGCTCTCG CAATTTATG GATCAA 1641' IleMet GluPhePhe MetGlyGly GlnLeuSer GlnPheMet AspGln ThrAsn ProLeuSer GluValThr HisLysArg ArgLeuSer AlaLeu GlyGlu GlyGlyLeu ValLysAsp ArgValGly PheGluAla ArgAsp ValHis ProThrHis TyrGlyArg IleCysPro IleGluThr ProGlu GlyGln AsnIleGly LeuIleAsn ThrLeuSer ThrPheThr ArgVal AAG

AsnAsp LeuGlyPhe IleGluAla ProTyrLys LysValVal AspGly AAG GTC GTG GGT GAG ACG ATT TAT TTG ACC GCT ATT CAA GAA GAC AGC 1929 _ Lys Val Val Gly Glu Thr Ile Tyr Leu Thr Ala Ile Gln Glu Asp Ser His Ile Ile Ala Pro Ala Ser Thr Pro Ile Asp Glu Glu Gly Asn Ile Leu Gly Asp Leu Ile Glu Thr Arg Val Glu Gly Glu Ile Val Leu Asn Glu Lys Ser Lys Val Thr Leu Met Asp Leu Ser Ser Ser Met Leu Val Gly Val Ala Ala Ser Leu Ile Pro Phe Leu Glu His Asp Asp Ala Asn Arg Ala Leu Met Gly Thr Asn Met Gln Arg Gln Ala Val Pro Leu Leu ' Arg Ser Asp Ala Pro Ile Val Gly Thr Gly Ile Glu Lys Ile Ile Ala Arg Asp Ser Trp Gly Ala Ile Lys Ala Asn Arg Ala Gly Val Val Glu Lys Ile Asp Ser Lys Asn Ile Tyr Ile Leu Gly Glu Ser Lys Glu Glu Ala Tyr Ile Asp Ala Tyr Ser Leu Gln Lys Asn Leu Arg Thr Asn Gln Asn Thr Ser Phe Asn Gln Val Pro Iie Val Lys Val Gly Asp Lys Val Gly Ala Gly Gln Ile Ile Ala Asp Gly Pro Ser Met Asp Arg Gly Glu Leu Ala Leu Gly Lys Asn Val Arg Val Ala Phe Met Pro Trp Asn Gly Tyr Asn Phe Glu Asp Ala Ile Val Val Ser Glu Cys Ile Thr Lys Asp ' Asp Ile Phe Thr Ser Thr His Ile Tyr Glu Lys Glu Val Asp Ala Arg Glu Leu Lys His Gly Val Glu Glu Phe Thr Ala Asp Ile Pro Asp Val CTC

LysGluGlu Ala Ala HisLeu AspGluSer GlyIleVal LysVal Leu AGC

GlyThrTyr Val Ala GlyMet IleLeuVal GlyLysThr SerPro Ser AAA

LysGlyGlu ile Ser ThrPro GluGluArg LeuLeuArg AlaIle Lys GCC

PheGlyAsp Lys Gly HisVal ValAsnLys SerLeuTyr CysPro Ala GGC

ProSerLeu Glu Thr ValIle AspValLys ValPheThr LysLys Gly GAC

GlyTyrGlu Lys Ala ArgVal LeuSerAla TyrGluGlu GluLys Asp ATG

AlaLysLeu Asp Glu HisPhe AspArgLeu ThrMetLeu AsnArg Met CGC

GluGluLeu Leu Val ThrArg SerPheLeu LysArgPhe Arg TAACGGCAAG
GATTATAAAG

(2) INFORMATION FOR SEQ ID NO:80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 993 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
Met Ser Lys Lys Ile Pro Leu Lys Asn Arg Leu Arg Ala Asp Phe Thr Lys Thr Pro Thr Asp Leu Glu Val Pro Asn Leu Leu Leu Leu Gln Arg Asp Ser Tyr Asp Ser Phe Leu Tyr Ser Lys Glu Gly Lys Glu Ser Gly Ile Glu Lys Val Phe Lys Ser Ile Phe Pro Ile Gln Asp Glu His Asn Arg Ile Thr Leu Glu Tyr Ala Gly Cys Glu Phe Gly Lys Ser Lys Tyr Thr Val Arg Glu Ala Met Glu Arg Gly Ile Thr Tyr Ser Ile Pro Leu Lys Ile Lys Val Arg Leu Ile Leu Trp Glu Lys Asp Thr Lys Ser Gly Glu Lys Asn Gly Ile Lys Asp Ile Lys Glu Gln Ser Ile Phe Ile Arg Glu Ile Pro Leu Met Thr Glu Arg Thr Ser Phe Ile Ile Asn Gly Val Glu Arg Val Val Val Asn Gln Leu His Arg Ser Pro Gly Val Ile Phe Lys Glu Glu Glu Ser Ser Thr Ser Leu Asn Lys Leu Ile Tyr Thr Gly Gln Ile Ile Pro Asp Arg Gly Ser Trp Leu Tyr Phe Glu Tyr Asp Ser Lys Asp Val Leu Tyr Ala Arg Ile Asn Lys Arg Arg Lys Val Pro Val Thr Ile Leu Phe Arg Ala Met Asp Tyr Gln Lys Gln Asp Ile Ile Lys Met Phe Tyr Pro Leu Val Lys Val Arg Tyr Glu Asn Asp Lys Tyr Leu Ile Pro Phe Ala Ser Leu Asp Ala Asn Gln Arg Met Glu Phe Asp Leu Lys Asp Pro Gln Gly Lys Val Ile Leu Leu Ala Gly Lys Lys Leu Thr Ser Arg Lys Ile Lys Glu Leu Lys Glu Asn His Leu Glu Trp Val Glu Tyr Pro Met Asp Ile Leu Leu Asn Arg His Leu Ala Glu Pro Val Met Val Gly Lys Glu Val Leu Leu Asp Met Leu Thr Gln Leu Asp Lys Asn Lys Leu Glu Lys Ile His Asp Leu Gly Val Gln Glu Phe Val Ile Ile Asn Asp Leu Ala Leu Gly His Asp Ala Ser Ile Ile Gln Ser Phe Ser Ala Asp Ser Glu Ser Leu Lys Leu Leu Lys Gln Thr Glu Lys Ile Asp Asp Glu Asn Ala Leu Ala Ala Ile Arg Ile His Lys Val Met Lys Pro Gly Asp Pro Val Thr Thr Glu Val Ala Lys Gln Phe Val Lys Lys Leu Phe Phe Asp Pro Glu Arg Tyr Asp Leu Thr Met Val Gly Arg Met Lys Met Asn His Lys Leu Gly Leu His Val Pro Asp Tyr Ile Thr Thr Leu Thr His Glu Asp Ile Ile Thr Thr Val Lys Tyr Leu Met Lys Ile Lys Asn Asn Gln Gly Lys Ile Asp Asp Arg Asp His Leu Gly Asn Arg Arg Ile Arg Ala Val Gly Glu Leu Leu Ala Asn Glu Leu His Ser Gly Leu Val Lys Met Gln Lys Thr Ile Lys Asp Lys Leu Thr Thr Met Ser Gly Ala Phe Asp Ser Leu Met Pro His Asp Leu Val Asn Ser Lys Met Ile Thr Ser Thr Ile Met Glu Phe Phe Met Gly Gly Gln Leu Ser Gln Phe Met Asp Gln Thr Asn Pro Leu Ser Glu Val Thr His Lys Arg Arg Leu Ser Ala Leu Gly Glu Gly Gly Leu Val Lys Asp Arg Val Gly Phe Glu Ala Arg Asp Val His Pro Thr His Tyr Gly Arg Ile Cys Pro Ile Glu Thr Pro Glu Gly Gln Asn Ile Gly Leu Ile Asn Thr Leu Ser Thr Phe Thr Arg Val Asn Asp Leu Gly Phe Ile Glu Ala Pro Tyr Lys Lys Val Val Asp Gly Lys Val Val Gly Glu Thr Ile Tyr Leu Thr Ala Ile Gln Glu Asp Ser His Ile Ile Ala Pro Ala Ser Thr Pro Ile Asp Glu Glu Gly Asn Ile Leu Gly Asp Leu Ile Glu Thr Arg Val Glu Gly Glu Ile Val Leu Asn Glu Lys Ser Lys Val Thr Leu Met Asp Leu Ser Ser Ser Met Leu Val Gly Val Ala Ala Ser Leu Ile Pro Phe Leu Glu His Asp Asp Ala Asn Arg Ala Leu Met Gly Thr Asn Met Gln Arg Gln Ala Val Pro Leu Leu Arg Ser Asp Ala Pro Ile Val Gly Thr Gly Ile Glu Lys Ile Ile Ala Arg Asp Ser Trp Gly Ala Ile Lys Ala Asn Arg Ala Gly Val Val Glu Lys Ile Asp Ser Lys Asn Ile Tyr Ile Leu Gly Glu Ser Lys Glu Glu Ala Tyr Ile Asp Ala Tyr Ser Leu Gln Lys Asn Leu Arg Thr Asn Gln Asn Thr Ser Phe Asn Gln Val Pro Ile Val Lys Val Gly Asp Lys Val Gly Ala Gly Gln Ile Ile Ala Asp Gly Pro Ser Met Asp Arg Gly Glu Leu Ala Leu Gly Lys Asn Val Arg Val Ala Phe Met Pro Trp Asn Gly Tyr Asn Phe Glu Asp Ala Ile Val Val Ser Glu Cys Ile Thr Lys Asp Asp Ile Phe Thr Ser Thr His Ile Tyr Glu Lys Glu Val Asp Ala Arg Glu Leu Lys His Gly Val Glu Glu Phe Thr Ala Asp Ile Pro Asp Val Lys Glu Glu Ala Leu Ala His Leu Asp Glu Ser Gly Ile Val Lys Val Gly Thr Tyr Val Ser Ala Gly Met Ile Leu Val Gly Lys Thr Ser Pro Lys Gly Glu Ile Lys Ser Thr Pro Glu Glu Arg Leu Leu Arg Ala Ile Phe Gly Asp Lys Ala Gly His Val Val Asn Lys Ser Leu Tyr Cys Pro Pro Ser Leu Glu Gly Thr Val Ile Asp Val Lys Val Phe Thr Lys Lys Gly Tyr Glu Lys Asp Ala Arg Val Leu Ser Ala Tyr Glu Glu Glu Lys Ala Lys Leu Asp Met Glu His Phe Asp Arg Leu Thr Met Leu Asn Arg Glu Glu Leu Leu Arg Val Thr Arg Ser Phe Leu Lys Arg Phe
(2) INFORMATION FOR SEQ ID NO:81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 581 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 49...525 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:

Met His Ser Pro Asn Leu Glu Lys Glu Glu Thr Glu Ile Ile Glu Thr Leu Leu Met Arg Glu Lys Met Arg Leu Cys Pro Leu Tyr Trp Arg Ile Leu Ala Phe Leu Thr Asp Gly Leu Leu Val Ala Phe Leu Leu Ser Asp Leu Leu Asp Ala Cys Asp Phe Leu His Ser Leu Tyr Trp Leu Ala Asn Pro Ile Tyr " CAC AGC GCA TTT GTT GCG ATG GGT TTT ATC ATC TTG TAT GGC GTT TAT 297 His Ser Ala Phe Val Ala Met Gly Phe Ile Ile Leu Tyr Gly Val Tyr Glu Ile Phe Phe Val Cys Leu Cys Lys Met Ser Leu Ala Lys Leu Val AGG GCA

Phe IleLys IleIleAsp IleTyrLeu AspCys Pro Ser Arg Arg Ala ATT ATC

Ala LeuLeu LysArgLeu GlyLeuLys ValVal Phe Leu Cys Ile Ile _ TTT CCC

Pro LeuTrp PheValAla PheLysAsn TyrHis Arg Ala Trp Phe Pro GAA TTG TTATTG

His GluLys SerLysSer LeuLeuVal Phe Glu Leu (2) INFORMATION FOR SEQ ID N0:82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 159 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
Met His Ser Pro Asn Leu Glu Lys Glu Glu Thr Glu Ile Ile Glu Thr Leu Leu Met Arg Glu Lys Met Arg Leu Cys Pro Leu Tyr Trp Arg Ile Leu Ala Phe Leu Thr Asp Gly Leu Leu Val Ala Phe Leu Leu Ser Asp Leu Leu Asp Ala Cys Asp Phe Leu His Ser Leu Tyr Trp Leu Ala Asn Pro Ile Tyr His Ser Ala Phe Val Ala Met Gly Phe Ile Ile Leu Tyr Gly Val Tyr Glu Ile Phe Phe Val Cys Leu Cys Lys Met Ser Leu Ala Lys Leu Val Phe Arg Ile Lys Ile Ile Asp Ile Tyr Leu Ala Asp Cys Pro Ser Arg Ala Ile Leu Leu Lys Arg Leu Gly Leu Lys Ile Val Val Phe Leu Cys Pro Phe Leu Trp Phe Val Ala Phe Lys Asn Pro Tyr His Arg Ala Trp His Glu Glu Lys Ser Lys Ser Leu Leu Val Leu Phe
(2) INFORMATION FOR SEQ ID NO:83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 901 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 67...852 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:

Met Arg Tyr Gln Asn Met Phe Glu Thr Leu Lys Lys His Glu Lys Met Ala Phe Ile Pro Phe Val Thr Leu Gly Asp Pro Asn Tyr Glu Leu Ser Phe Glu Ile Ile Lys Thr Leu Ile Ile Ser Gly Val Ser Ala Leu Glu Leu Gly Leu Ala Phe Ser Asp Pro Val Ala Asp Gly Ile Thr Ile Gln Ala Ser His Leu Arg Ala Leu Lys His Ala Ser Met Ala Lys Asn Phe Gln Leu Leu Lys Lys Ile Arg Asp Tyr Asn His Asn Ile Pro Ile Gly Leu Leu Ala Tyr Ala Asn Leu Ile Phe Ser Tyr Gly Val Asp Gly Phe Tyr Ala Gln Ala Lys Glu Cys Gly Ile Asp Ser Val Leu Ile ' 115 120 125 Ala Asp Met Pro Leu Ile Glu Lys Glu Leu Val Ile Lys Ser Ala Gln ATC GCC CCC

LysHis Gln Ile Lys Gln Ile Phe Ser Asn Ala Ser Ser Ile Ala Pro CAT TCG GGC

LysAsp Leu Glu Gln Val Ala Thr Gln Tyr Ile Tyr Ala His Ser Gly GCG AGC ATT

LeuAla Arg Ser Gly Val Thr Gly Arg Leu Glu Asn Asp Ala Ser Ile TCGAGT GCT ATT ATT AAA ACC TTA TTT CCT ACC CCA GCC 684 "
AAA GCT AGC

SerSer Ala Ile Ile Lys Thr Leu Phe Pro Thr Pro Ala Lys Ala Ser AAA GAA ATC

LeuLeu Gly Phe Gly Ile Ser Lys His Thr Asn Ala Lys Lys Glu Ile TGC GGA GCG

GlyMet Gly Ala Asp Gly Val Ile Ser Leu Val Lys Ile Cys Gly Ala AAC GCC CTG

IleGlu Glu Asn Leu Asn Asn Glu Met Glu Lys Ile Lys Asn Ala Leu TAAGGCTTTT AGGCTTTGTT GCGTTAAAAA

GlyPhe Ile Gly Gly Met Ile Phe CAGATTAAC

(2) INFORMATION FOR SEQ ID NO:84:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 262 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:
Met Arg Tyr Gln Asn Met Phe Glu Thr Leu Lys Lys His Glu Lys Met Ala Phe Ile Pro Phe Val Thr Leu Gly Asp Pro Asn Tyr Glu Leu Ser Phe Glu Ile Ile Lys Thr Leu Ile Ile Ser Gly Val Ser Ala Leu Glu Leu Gly Leu Ala Phe Ser Asp Pro Val Ala Asp Gly Ile Thr Ile Gln Ala Ser His Leu Arg Ala Leu Lys His Ala Ser Met Ala Lys Asn Phe Gln Leu Leu Lys Lys Ile Arg Asp Tyr Asn His Asn Ile Pro Ile Gly Leu Leu Ala Tyr Ala Asn Leu Ile Phe Ser Tyr Gly Val Asp Gly Phe Tyr Ala Gln Ala Lys Glu Cys Gly Ile Asp Ser Val Leu Ile Ala Asp Met Pro Leu Ile Glu Lys Glu Leu Val Ile Lys Ser Ala Gln Lys His Gln Ile Lys Gln Ile Phe Ile Ala Ser Pro Asn Ala Ser Ser Lys Asp Leu Glu Gln Val Ala Thr His Ser Gln Gly Tyr Ile Tyr Ala Leu Ala Arg Ser Gly Val Thr Gly Ala Ser Arg Ile Leu Glu Asn Asp Ser Ser Ala Ile Ile Lys Thr Leu Lys Ala Phe Ser Pro Thr Pro Ala Leu Leu Gly Phe Gly Ile Ser Lys Lys Glu His Ile Thr Asn Ala Lys Gly Met Gly Ala Asp Gly Val Ile Cys Gly Ser Ala Leu Val Lys Ile Ile Glu Glu Asn Leu Asn Asn Glu Asn Ala Met Leu Glu Lys Ile Lys Gly Phe Ile Gly Gly Met Ile Phe
(2) INFORMATION FOR SEQ ID NO:85:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1081 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 49...954 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:

Met Asn Lys Ala Ile Ala Ser Lys Ile Leu Ile Thr Leu Gly Phe Leu Phe Leu Tyr ArgValLeu AlaTyrIle ProIle ProGlyValAsp LeuAla AlaIle LysAlaPhe PheAspSer AsnSer AsnAsnAlaLeu GlyLeu PheAsn _ MetPheSer GlyAsnAla ValSer ArgLeuSerIle IleSer LeuGly IleMetPro TyrIleThr SerSer IleIleMetGlu LeuLeu SerAla -ThrPhePro AsnLeuAla LysMet LysLysGluArg AspGly MetGln LysTyrMet GlnIleVal ArgTyr LeuThrIleLeu IleThr LeuIle GlnAlaVal SerValSer ValGly LeuArgSerIle SerGly GlyAla AsnGlyAla IleMetIle AspMet GlnValPheMet IleVal SerAla PheSerMet LeuThrGly ThrMet LeuLeuMetTrp IleGly GluGln IleThrGln ArgGlyVal GlyAsn GlyIleSerLeu IleIle PheAla GlyIleVal SerGlyIle ProSer AlaIleSerGly ThrPhe AsnLeu ValAsnThr GlyValIle AsnIle LeuMetLeuIle GlyIle ValLeu IleValLeu AlaThrIle PheAla IleIleTyrVal GluLeu AlaGlu ArgArgIle ProLleSer TyrAla ArgLysValVal MetGln AsnGln Asn Lys Arg Ile Met Asn Tyr Ile Pro Ile Lys Leu Asn Leu Ser Gly Val Ile Pro Pro Ile Phe Ala Ser Ala Leu Leu Val Phe Pro Ser Thr ' ATT TTG CAG CAA GCC ACA AGC AAC AAA ACC TTG CAA GCG GTT GCG NAT 921 Ile Leu Gln Gln Ala Thr Ser Asn Lys Thr Leu Gln Ala Val Ala Xaa Phe Leu Ser Pro Gln Gly Met Arg Ile Ile Phe T'TTTTTGCTT ACTTTTATTC TTCTATTGTG TTCAATTCTA AGGATATTGC GGATAATTTG 1034 (2) INFORMATION FOR SEQ ID N0:86:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 302 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:
Met Asn Lys Ala Ile Ala Ser Lys Ile Leu Ile Thr Leu Gly Phe Leu Phe Leu Tyr Arg Val Leu Ala Tyr Ile Pro Ile Pro Gly Val Asp Leu Ala Ala Ile Lys Ala Phe Phe Asp Ser Asn Ser Asn Asn Ala Leu Gly Leu Phe Asn Met Phe Ser Gly Asn Ala Val Ser Arg Leu Ser Ile Ile Ser Leu Gly Ile Met Pro Tyr Ile Thr Ser Ser Ile Ile Met Glu Leu Leu Ser Ala Thr Phe Pro Asn Leu Ala Lys Met Lys Lys Glu Arg Asp Gly Met Gln Lys Tyr Met Gln Ile Val Arg Tyr Leu Thr Ile Leu Ile Thr Leu Ile Gln Ala Val Ser Val Ser Val Gly Leu Arg Ser Ile Ser Gly Gly Ala Asn Gly Ala Ile Met Ile Asp Met Gln Val Phe Met Ile Val Ser Ala Phe Ser Met Leu Thr Gly Thr Met Leu Leu Met Trp Ile Gly Glu Gln Ile Thr Gln Arg Gly Val Gly Asn Gly Ile Ser Leu Ile Ile Phe Ala Gly Ile Val Ser Gly Ile Pro Ser Ala Ile Ser Gly Thr Phe Asn Leu Val Asn Thr Gly Val Ile Asn Ile Leu Met Leu Ile Gly Ile Val Leu Ile Val Leu Ala Thr Ile Phe Ala Ile Ile Tyr Val Glu Leu Ala Glu Arg Arg Ile Pro Ile Ser Tyr Ala Arg Lys Val Val Met Gln Asn Gln Asn Lys Arg Ile Met Asn Tyr Ile Pro Ile Lys Leu Asn Leu Ser Gly Val Ile Pro Pro Ile Phe Ala Ser Ala Leu Leu Val Phe Pro Ser Thr Ile Leu Gln Gln Ala Thr Ser Asn Lys Thr Leu Gln Ala Val Ala Xaa Phe Leu Ser Pro Gln Gly Met Arg Ile Ile Phe
(2) INFORMATION FOR SEQ ID NO:87:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 423 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 109...363 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:

Met Arg Phe Leu Phe Ser Lys Thr Leu Leu Met Met Ser Cys Cys Asn Thr Glu Arg Met Leu Phe Val Val Gln Tyr Lys Thr Asn Pro Ala Gly Lys Val Ile Lys Lys Ile Val Asn Asn Arg Gly Lys Ser Leu Lys Ile Phe Ala Cys Met Gly Ser Val Met Val Phe Gly Val Thr Leu Trp Cys Gln Tyr Ile Asp Ala Pro Ile Arg Ser Gly Lys Ile Lys Tyr Gly Ser Met Met Asp Lys Ser
(2) INFORMATION FOR SEQ ID NO:88:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:
Met Arg Phe Leu Phe Ser Lys Thr Leu Leu Met Met Ser Cys Cys Asn Thr Glu Arg Met Leu Phe Val Val Gln Tyr Lys Thr Asn Pro Ala Gly Lys Val Ile Lys Lys Ile Val Asn Asn Arg Gly Lys Ser Leu Lys Ile Phe Ala Cys Met Gly Ser Val Met Val Phe Gly Val Thr Leu Trp Cys Gln Tyr Ile Asp Ala Pro Ile Arg Ser Gly Lys Ile Lys Tyr Gly Ser Met Met Asp Lys Ser
(2) INFORMATION FOR SEQ ID NO:89:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 740 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 59...688 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:

Met Leu Leu Lys Thr Lys Leu Lys Ile Ile Ser Ser Val Ile Leu Ser Ala Leu Leu Trp Val Gly Cys Ser Ser Glu Met Ala Thr Tyr Gln Asn Val Asn Asp Ala Thr Lys Asn Thr Thr Ala Ser Ile Asn Ser Thr Asp Leu Leu Leu Thr Ala Asn Ala Met Leu Asp Ser Met Phe Ser Asp Pro Asn Phe Glu Gln Leu Lys Gly Lys His Leu Ile Glu Val Ser Asp Val Ile Asn Asp Thr Thr Gln Pro Asn Leu Asp Met Asn Leu Leu Thr Thr Glu Ile Ala Arg Gln Leu Arg Leu Arg Ser Asn Gly Arg Phe Asn Ile Thr Arg Ala Ser Gly Gly Ser Gly Ile Ala Ala Asp Ser Arg Met Val Lys Gln Arg Glu Lys Glu Arg Glu Ser Glu Glu Tyr Asn Gln Asp Thr Thr Val Glu Lys Gly Thr Leu Lys Ala Ala Asp Leu Ser Leu Ser Gly Lys Val Ser Ser Ile Ala Ala Ser Ile Ser Ser Ser Arg Gln Arg Leu Asp Tyr Asp Phe Thr Leu Ser Leu Thr Asn Arg Lys Thr Gly Glu Glu Val Trp Ser Asp Val Lys Pro Ile Val Lys Asn Ala Ser Asn Lys Arg Met Phe
(2) INFORMATION FOR SEQ ID NO:90:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 210 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:
Met Leu Leu Lys Thr Lys Leu Lys Ile Ile Ser Ser Val Ile Leu Ser Ala Leu Leu Trp Val Gly Cys Ser Ser Glu Met Ala Thr Tyr Gln Asn Val Asn Asp Ala Thr Lys Asn Thr Thr Ala Ser Ile Asn Ser Thr Asp Leu Leu Leu Thr Ala Asn Ala Met Leu Asp Ser Met Phe Ser Asp Pro Asn Phe Glu Gln Leu Lys Gly Lys His Leu Ile Glu Val Ser Asp Val Ile Asn Asp Thr Thr Gln Pro Asn Leu Asp Met Asn Leu Leu Thr Thr Glu Ile Ala Arg Gln Leu Arg Leu Arg Ser Asn Gly Arg Phe Asn Ile Thr Arg Ala Ser Gly Gly Ser Gly Ile Ala Ala Asp Ser Arg Met Val Lys Gln Arg Glu Lys Glu Arg Glu Ser Glu Glu Tyr Asn Gln Asp Thr Thr Val Glu Lys Gly Thr Leu Lys Ala Ala Asp Leu Ser Leu Ser Gly Lys Val Ser Ser Ile Ala Ala Ser Ile Ser Ser Ser Arg Gln Arg Leu Asp Tyr Asp Phe Thr Leu Ser Leu Thr Asn Arg Lys Thr Gly Glu Glu Val Trp Ser Asp Val Lys Pro Ile Val Lys Asn Ala Ser Asn Lys Arg Met Phe
(2) INFORMATION FOR SEQ ID NO:91:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1269 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 84...1214 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:

TCGTTTTAAC ATTTTACCTA
AAACAAAATT

ATCTGAGAGA GAATTATATT '113 TTA
ATG
AAG
ACA
GAG
AAA
CAA
AAA
TTT
TTA
GAG

M et Lys Thr Glu Lys Gln Lys Phe Leu Glu MetArgLysAsp GlyAla AsnSerVal LeuIleLeu ArgGlyAsp Trp AspPheLysThr SerVal PheArgLeu AspGluLeu LysLysAsn Leu LeuAspHisGln GlyPro LeuLysMet AspPheSer GlyCysGln Lys ValAspPheVal PheGly MetPheLeu PheAspLeu ValLysGlu Arg 60 65 7p SerLeuAsnIle GluLeu CysAsnVal SerGluAsn AsnAlaCys Ala LeuLysValVal LysAep TrpLeuGlu LysGluGlu AspLeuGlu Ser LysLysAlaGly LysHis TyrGluLeu LeuIleThr LysLeuGly Lys SerIleValGlu ThrTyr AsnThrPhe LeuAsnAla PheAsnPhe Cys GlyMetIleLeu PheTyr PheIleLys SerValPhe AsnProLys Arg 140 145 ~ 150 PheCysIleThr ProLeu LeuTyrHis IleAsnGlu SerGlyPhe Lys Val Leu ProValSer IleLeu ThrVal PheIleValGly PheAlaVal Ala Leu GlnGlyAla LeuGln LeuGln AspMetGlyAla ProLeuMet Ser Val GluMetThr AlaLys LeuAla LeuArgGluIle GlyProPhe Ile Leu ThrLeuVal ValAla GlyArg SerAlaSerSer PheThrAla Gln Ile GlyValMet LysIle ThrGlu GluLeuAspAla MetLysThr Met Gly PheAsnPro PheGlu PheLeu ValLeuProArg ValLeuAla Leu Val IleValLeu ProLeu LeuVal PheIleAlaAsp AlaPheAla Ile Leu GlyGlyMet PheAla IleLys TyrGlnLeuAsp LeuGlyPhe Pro Ser TyrIleAsp ArgPhe HisAsp ThrValGlyTrp AsnHisPhe Leu Val GlyIleVal LysAla ProPhe TrpGlyPheAla IleAlaMet Val Gly CysMetArg GlyPhe GluVal LysGlyAspThr ~Glu5erIle ' 335 340 345 Gly Arg LeuThrThr IleSer ValVal AsnAlaLeuPhe TrpIleIle Phe Leu AspAlaIle PheSer IleIle PheSerLysLeu AsnIle ACAATCAAGT
CTTAATTGAA
GTGAAGGATC

(2) INFORMATION FOR SEQ ID NO:92:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 377 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:
Met Lys Thr Glu Lys Gln Lys Phe Leu Glu Met Arg Lys Asp Gly Ala Asn Ser Val Leu Ile Leu Arg Gly Asp Trp Asp Phe Lys Thr Ser Val Phe Arg Leu Asp Glu Leu Lys Lys Asn Leu Leu Asp His Gln Gly Pro Leu Lys Met Asp Phe Ser Gly Cys Gln Lys Val Asp Phe Val Phe Gly Met Phe Leu Phe Asp Leu Val Lys Glu Arg Ser Leu Asn Ile Glu Leu Cys Asn Val Ser Glu Asn Asn Ala Cys Ala Leu Lys Val Val Lys Asp Trp Leu Glu Lys Glu Glu Asp Leu Glu Ser Lys Lys Ala Gly Lys His Tyr Glu Leu Leu Ile Thr Lys Leu Gly Lys Ser Ile Val Glu Thr Tyr Asn Thr Phe Leu Asn Ala Phe Asn Phe Cys Gly Met Ile Leu Phe Tyr Phe Ile Lys Ser Val Phe Asn Pro Lys Arg Phe Cys Ile Thr Pro Leu Leu Tyr His Ile Asn Glu Ser Gly Phe Lys Val Leu Pro Val Ser Ile Leu Thr Val Phe Ile Val Gly Phe Ala Val Ala Leu Gln Gly Ala Leu Gln Leu Gln Asp Met Gly Ala Pro Leu Met Ser Val Glu Met Thr Ala Lys Leu Ala Leu Arg Glu Ile Gly Pro Phe Ile Leu Thr Leu Val Val Ala Gly Arg Ser Ala Ser Ser Phe Thr Ala Gln Ile Gly Val Met Lys Ile Thr Glu Glu Leu Asp Ala Met Lys Thr Met Gly Phe Asn Pro Phe Glu Phe Leu Val Leu Pro Arg Val Leu Ala Leu Val Ile Val Leu Pro Leu Leu Val Phe Ile Ala Asp Ala Phe Ala Ile Leu Gly Gly Met Phe Ala Ile Lys Tyr Gln Leu Asp Leu Gly Phe Pro Ser Tyr Ile Asp Arg Phe His Asp Thr Val Gly Trp Asn His Phe Leu Val Gly Ile Val Lys Ala Pro Phe Trp Gly Phe Ala Ile Ala Met Val Gly Cys Met Arg Gly Phe Glu Val Lys Gly Asp Thr Glu Ser Ile Gly Arg Leu Thr Thr Ile Ser Val Val Asn Ala Leu Phe Trp Ile Ile Phe Leu Asp Ala Ile Phe Ser Ile Ile Phe Ser Lys Leu Asn Ile
(2) INFORMATION FOR SEQ ID NO:93:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 557 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 60...503 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:

Met Gly Phe Leu Asn Gly Tyr Phe Leu Trp Val Lys Ala Phe His Val Ile Ala Val Ile Ser Trp Met Ala Ala Leu Phe Tyr Leu Pro Arg Leu Phe Val Tyr His Ala Glu Asn Ala His Lys Lys Glu Phe Val Gly Val Val Gln Ile Gln Glu Lys Lys Leu Tyr Ser Phe Ile Ala Ser Pro Ala ' 50 55 60 Met Gly Phe Thr Leu Ile Thr Gly Ile Leu Met Leu Leu Ile Glu Pro Thr Leu Phe Lys Ser Gly Gly Trp Leu His Ala Lys Leu Ala Leu Val Val Leu Leu Leu Ala Tyr His Phe Tyr Cys Lys Lys Cys Met Arg Glu Leu Glu Lys Asp Pro Thr Arg Arg Asn Ala Arg Phe Tyr Arg Val Phe Asn Glu Ala Pro Thr Ile Leu Met Ile Leu Ile Val Ile Leu Val Val GTC AAG CCT TTT TAAAGACAAG CCATGAAAAA AGAAAAGTCA TGAAAAAl~GA AAAGCA 549 Val Lys Pro Phe (2) INFORMATION FOR SEQ ID N0:94:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 148 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:
Met Gly Phe Leu Asn Gly Tyr Phe Leu Trp Val Lys Ala Phe His Val Ile Ala Val Ile Ser Trp Met Ala Ala Leu Phe Tyr Leu Pro Arg Leu Phe Val Tyr His Ala Glu Asn Ala His Lys Lys Glu Phe Val Gly Val Val Gln Ile Gln Glu Lys Lys Leu Tyr Ser Phe Ile Ala Ser Pro Ala Met Gly Phe Thr Leu Ile Thr Gly Ile Leu Met Leu Leu Ile Glu Pro Thr Leu Phe Lys Ser Gly Gly Trp Leu His Ala Lys Leu Ala Leu Val Val Leu Leu Leu Ala Tyr His Phe Tyr Cys Lys Lys Cys Met Arg Glu Leu Glu Lys Asp Pro Thr Arg Arg Asn Ala Arg Phe Tyr Arg Val Phe Asn Glu Ala Pro Thr Ile Leu Met Ile Leu Ile Val Ile Leu Val Val Val Lys Pro Phe
(2) INFORMATION FOR SEQ ID NO:95:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1671 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 50...1624 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:

TGGTGCTAAG ATG
ACTTTGAAAC AAA
AACGCCAAAT CTT

Met Lys Leu Phe Asn AlaArg LeuIleVal PheIle GlyAlaLeuLeu LeuGlyVal Gly Phe SerVal ProSerLeu LeuGlu ThrLysGlyPro LysIleThr Leu Gly LeuAsp LeuArgGly GlyLeu AsnMetLeuLeu GlyValGln Thr Asp GluAla LeuLysAsn LysTyr LeuSerLeuAla SerAlaLeu Glu Tyr AsnAla LysLysGln AsnIle LeuLeuLysAsp IleLysSer Asn Leu GluGly IleSerPhe GluLeu LeuAspGluAsp GluAlaLys ' AAA TTA GACGCG CTTTTATTG GAATTG CAAGGCCATAGC CAGTTTGAA 394 Lys Leu AspAla LeuLeuLeu GluLeu GlnGlyHisSer GlnPheGlu Ile Lys LysGlu AlaGlyPhe TyrSer ValAsnLeuThr ProLeuGlu Gln Glu GluLeu ArgLysAsn ThrIle LeuGlnValIle GlyIleIle GAT CCT GTC

Arg AsnArgLeu Gln PheGlyLeu AlaGlu Val IleGln Asp Pro Val GAA GGC AAG

Gln GlyLysGlu Ile SerValGln LeuPro Ile ThrLeu Glu Gly Lys CGC AGA GCT

Glu GluGluArg Ala LysAspLeu IleSer Ser HisLeu Arg Arg Ala GTG GAT ATG

Gln MetMetAla Asp GluGluHis AsnLys Ala LysMet Val Asp Met GCT TTG TCT

Thr AspLeuGlu Gln LysLeuGly SerVal Leu AspVal Ala Leu Ser AAA CCC TTA

Glu MetGlyGly Ile LeuLeuLys AlaIle Ile AspGly Lys Pro Leu GAT CAA AAC

Glu MetLeuThr Ala LysValVal TyrAsp Asn GlnPro Asp Gln Asn ACG AAG TTT

Val VaISerPhe Leu AspAlaGln GlyAla Ile GlyAsp Thr Lys Phe AAT ATT TTA

Phe SerGlyAla Val GlyLysArg MetAla Val AspAsn Asn Ile Leu GCC ATC GGG

Lys ValTyrSer Pro ValIleArg GluArg Gly GlySer Ala Ile Gly GGG GCG GAT

Gly GlnIleSer Asn PheSerVal AlaGln Ser LeuAla Gly Ala Asp AGT ATT GTT

Ile AlaLeuArg Gly AlaMetSer AlaPro Gln LeuGlu Ser Ile Val GGC AGC AAA

Lys ArgIleIle Pro SerLeuGly LysAsp Val ThrSer Gly Ser Lys GTT ATG TTT

Ile IleAlaLeu Gly GlyPheIle LeuVal Gly MetVal Val Met Phe Leu Tyr Tyr Ser Met Ala Gly Val Ile Ala Cys Leu Ala Leu Val Val Asn Leu Phe Leu Ile Val Ala Val Met Ala Ile Phe Gly Ala Thr Leu ' ACT TTA CCG GGA ATG GCG GGG ATT GTT TTA ACC GTG GGG ATT GCC GTG 1306 Thr Leu Pro Gly Met Ala Gly Ile Val Leu Thr Val Gly Ile Ala Val ' 405 410 415 Asp Ala Asn Ile Ile Ile Asn Glu Arg Ile Arg Glu Val Leu Arg Glu Asn Glu Gly Ile Ala Lys Ala Ile His Leu Gly Tyr Ile Asn Ala Ser Arg Ala Ile Phe Asp Ser Asn Ile Thr Ser Leu Ile Ala Ser Val Leu Leu Tyr Ala Tyr Gly Thr Gly Ala Ile Lys Gly Phe Ala Leu Thr Thr Gly Ile Gly Ile Leu Ala Ser Ile Ile Thr Ala Ile Val Gly Thr Gln Gly Ile Tyr Gln Ala Leu Leu Pro Lys Leu Thr Gln Thr Lys Ser Leu Tyr Phe Trp Phe Gly Val Asn Lys Arg Ala (2) INFORMATION FOR SEQ ID N0:96:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 525 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:96:

Met Lys Leu Phe Asn Ala Arg Leu Ile Val Phe Ile Gly Ala Leu Leu Leu Gly Val Gly Phe Ser Val Pro Ser Leu Leu Glu Thr Lys Gly Pro Lys Ile Thr Leu Gly Leu Asp Leu Arg Gly Gly Leu Asn Met Leu Leu Gly Val Gln Thr Asp Glu Ala Leu Lys Asn Lys Tyr Leu Ser Leu Ala Ser Ala Leu Glu Tyr Asn Ala Lys Lys Gln Asn Ile Leu Leu Lys Asp Ile Lys Ser Asn Leu Glu Gly Ile Ser Phe Glu Leu Leu Asp Glu Asp

Glu Ala Lys Lys Leu Asp Ala Leu Leu Leu Glu Leu Gln Gly His Ser Gln Phe Glu Ile Lys Lys Glu Ala Gly Phe Tyr Ser Val Asn Leu Thr Pro Leu Glu Gln Glu Glu Leu Arg Lys Asn Thr Ile Leu Gln Val Ile Gly Ile Ile Arg Asn Arg Leu Asp Gln Phe Gly Leu Ala Glu Pro Val Val Ile Gln Gln Gly Lys Glu Glu Ile Ser Val Gln Leu Pro Gly Ile Lys Thr Leu Glu Glu Glu Arg Arg Ala Lys Asp Leu Ile Ser Arg Ser Ala His Leu Gln Met Met Ala Val Asp Glu Glu His Asn Lys Asp Ala Met Lys Met Thr Asp Leu Glu Ala Gln Lys Leu Gly Ser Val Leu Leu Ser Asp Val Glu Met Gly Gly Lys Ile Leu Leu Lys Ala Ile Pro Ile Leu Asp Gly Glu Met Leu Thr Asp Ala Lys Val Val Tyr Asp Gln Asn Asn Gln Pro Val Val Ser Phe Thr Leu Asp Ala Gln Gly Ala Lys Ile Phe Gly Asp Phe Ser Gly Ala Asn Val Gly Lys Arg Met Ala Ile Val Leu Asp Asn Lys Val Tyr Ser Ala Pro Val Ile Arg Glu Arg Ile Gly Gly Gly Ser Gly Gln Ile Ser Gly Asn Phe Ser Val Ala Gln Ala Ser Asp Leu Ala Ile Ala Leu Arg Ser Gly Ala Met Ser Ala Pro Ile Gln Val Leu Glu Lys Arg Ile Ile Gly Pro Ser Leu Gly Lys Asp Ser Val Lys Thr Ser Ile Ile Ala Leu Val Gly Gly Phe Ile Leu Val Met Gly Phe Met Val Leu Tyr Tyr Ser Met Ala Gly Val Ile Ala Cys Leu Ala Leu Val Val Asn Leu Phe Leu Ile Val Ala Val Met Ala Ile Phe Gly Ala Thr Leu Thr Leu Pro Gly Met Ala Gly Ile Val Leu Thr Val Gly Ile Ala Val Asp Ala Asn Ile Ile Ile Asn Glu Arg Ile Arg Glu Val Leu Arg Glu Asn Glu Gly Ile Ala Lys Ala Ile His Leu Gly Tyr Ile Asn Ala Ser Arg Ala Ile Phe Asp Ser Asn Ile Thr Ser Leu Ile Ala Ser Val Leu Leu Tyr Ala Tyr Gly Thr Gly Ala Ile Lys Gly Phe Ala Leu Thr Thr Gly Ile Gly Ile Leu Ala Ser Ile Ile Thr Ala Ile Val Gly Thr Gln Gly Ile Tyr Gln Ala Leu Leu Pro Lys Leu Thr Gln Thr Lys Ser Leu Tyr Phe Trp Phe Gly Val Asn Lys Arg Ala
(2) INFORMATION FOR SEQ ID NO:97:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 706 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 64...654 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:97:

Met Lys Arg Ser Ser Phe Thr Ser Asn Ser Val Leu Asn Phe Phe Val Val Leu Ser Phe Ile Thr Ile Gly Leu Val Phe Phe Phe Leu Arg Ser Gln Pro Thr Ser Val Val Ser Lys Glu Asn Ile Pro Lys Ile Glu Leu Glu Asn Phe Lys Ala Phe Gln Ile Asn Asp Lys Ile Leu Asp Leu Ser Ile Glu Gly Lys Lys Ala Leu Gln Tyr Asp Asp His Glu Ile Phe Phe Asp Ser Lys Ile Lys Arg Tyr Asp Glu Asp Thr Ile Glu Ser Val CCT TTG TAT TTC

GluSer Lys Ala Lys Arg Gln Gln Asp Phe Pro Asn Pro Leu Tyr Phe ACT AGT TTT AGT

GlyVal Tyr Lys Arg Ser Asp Asp Ser Trp Glu Thr Thr Ser Phe Ser _ TAT AAA GGC GGC

GlyIle Asn His Lys Glu Gln Asn Phe Lys Arg Phe Tyr Lys Gly Gly 130 135 ~ 140 ACT GGG CTT ATT

IleLeu Ser Lys Asp Ser Lys Ile Glu Asp Ser Tyr Thr Gly Leu Ile GCA AGC ATT GCG

SerHis Leu Ala Ile ile Glu Ala Gln Gln His Leu Ala Ser Ile Ala GAT GAA AAG AAA

PheLeu Glu Ile Lys Gln Ser Gln Lys Lys Phe Pro Asp Glu Lys Lys AAA TTGGTGTGTT

ThrPhe Gly Gly Phe Lys TGAT

(2) INFORMATION FOR SEQ ID NO:98:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 197 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:98:

Met Lys Arg Ser Ser Phe Thr Ser Asn Ser Val Leu Asn Phe Phe Val Val Leu Ser Phe Ile Thr Ile Gly Leu Val Phe Phe Phe Leu Arg Ser Gln Pro Thr Ser Val Val Ser Lys Glu Asn Ile Pro Lys Ile Glu Leu Glu Asn Phe Lys Ala Phe Gln Ile Asn Asp Lys Ile Leu Asp Leu Ser Ile Glu Gly Lys Lys Ala Leu Gln Tyr Asp Asp His Glu Ile Phe Phe Asp Ser Lys Ile Lys Arg Tyr Asp Glu Asp Thr Ile Glu Ser Val Glu Ser Pro Lys Ala Lys Arg Gln Gln Asp Leu Tyr Phe Phe Pro Asn Gly Val Thr Tyr Lys Arg Ser Asp Asp Ser Ser Phe Trp Ser Glu Thr Gly Ile Tyr Asn His Lys Glu Gln Asn Phe Lys Gly Lys Gly Arg Phe Ile Leu Thr Ser Lys Asp Ser Lys Ile Glu Gly Leu Asp Ile Ser Tyr Ser His Ala Leu Ala Ile Ile Glu Ala Gln Ser Ile Gln Ala His Leu Phe Leu Asp Glu Ile Lys Gln Ser Gln Lys Glu Lys Lys Lys Phe Pro Thr Phe Lys Gly Gly Phe
(2) INFORMATION FOR SEQ ID NO:99:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1010 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 130...957 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99:

Met Lys Thr Ser Lys Thr Lys Thr Pro Lys Ser Val Leu Ile ' Ala Gly Pro Cys Val Ile Glu Ser Leu Glu Asn Leu Arg Ser Ile Ala Thr Lys Leu Gln Pro Leu Ala Asn Asn Glu Arg Leu Asp Phe Tyr Phe Lys Ala Ser Phe Asp Lys Ala Asn Arg Thr Ser Leu Glu Ser Tyr Arg AAA ATG ACG AAA

Gly ProGly Leu Glu Gly Leu Glu Leu Gln Ile Glu Lys Met Thr Lys ATC GTG AGT CAA

Glu PheGly Tyr Lys Leu Thr Asp His Glu Tyr Ala Ile Val Ser Gln GTG TTA CCG TTT

Ser ValAla Ala Lys Ala Asp Ile Gln Ile Ala Leu Val Leu Pro Phe CTG GTG ACT GCT

Cys ArgGln Thr Asp Ile Val Glu Ser Gln Asn Ile Leu Val Thr Ala GGG AAC GAC CAA

Val AsnIle Lys Lys Gln Phe Met Pro Lys Met Tyr Gly Asn Asp Gln CTT GAT ATT AGC

Ser ValLeu Lys Ala Lys Thr Arg Lys Ser Gln Pro Leu Asp Ile Ser TTA GTG TGT AGG

Thr TyrGlu Thr Ala Lys Asn Gly Trp Leu Glu Gly Leu Val Cys Arg GGG GTG CGC TTA

Ser SerPhe Gly Tyr Asn Leu Val Asp Met Ser Lys Gly Val Arg Leu GCC TTT ACC AGC

Ile MetArg Glu Phe Pro Val Ile Asp Ala His Val Ala Phe Thr Ser_ 195 200 20!:i GCG AGT GAC TCT

Gln MetPro Gly Gly Asn Gly Lys Ser Gly Ser Phe Ala Ser Asp Ser AGA GCG ATT GGG

Ala ProIle Leu Ala Ala Ala Ala Val Gly Asp Leu Arg Ala Ile Gly GTT AAC AGC GGA

Phe AlaGlu Thr His Asp Pro Lys Ala Leu Asp Ala Val Asn Ser Gly GAC CAA ACC ATG

_ Asn MetLeu Lys Pro Glu Leu Glu Leu Val Asp Leu Asp Gln Thr Met TTT TCATGCAAAT
CATAGAAGGG
AAATTGCA

Lys ileGln Asn Leu Phe (2) INFORMATION FOR SEQ ID NO:100:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 276 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:100:
Met Lys Thr Ser Lys Thr Lys Thr Pro Lys Ser Val Leu Ile Ala Gly Pro Cys Val Ile Glu Ser Leu Glu Asn Leu Arg Ser Ile Ala Thr Lys Leu Gln Pro Leu Ala Asn Asn Glu Arg Leu Asp Phe Tyr Phe Lys Ala Ser Phe Asp Lys Ala Asn Arg Thr Ser Leu Glu Ser Tyr Arg Gly Pro Gly Leu Glu Lys Gly Leu Glu Met Leu Gln Thr Ile Lys Glu Glu Phe Gly Tyr Lys Ile Leu Thr Asp Val His Glu Ser Tyr Gln Ala Ser Val Ala Ala Lys Val Ala Asp Ile Leu Gln Ile Pro Ala Phe Leu Cys Arg Gln Thr Asp Leu Ile Val Glu Val Ser Gln Thr Asn Ala Ile Val Asn Ile Lys Lys Gly Gln Phe Met Asn Pro Lys Asp Met Gln Tyr Ser Val Leu Lys Ala Leu Lys Thr Arg Asp Lys Ser Ile Gln Ser Pro Thr Tyr Glu Thr Ala Leu Lys Asn Gly Val Trp Leu Cys Glu Arg Gly Ser Ser Phe Gly Tyr Gly Asn Leu Val Val Asp Met Arg Ser Leu Lys Ile Met Arg Glu Phe Ala Pro Val Ile Phe Asp Ala Thr His Ser Val Gln Met Pro Gly Gly Ala Asn Gly Lys Ser Ser Gly Asp Ser Ser Phe Ala Pro Ile Leu Ala Arg Ala Ala Ala Ala Val Gly Ile Asp Gly Leu Phe Ala Glu Thr His Val Asp Pro Lys Asn Ala Leu Ser Asp Gly Ala Asn Met Leu Lys Pro Asp Glu Leu Glu Gln Leu Val Thr Asp Met Leu Lys Ile Gln Asn Leu Phe
(2) INFORMATION FOR SEQ ID NO:101:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 240 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 59...196 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:101:

ATG AAA GGT CGC GTA GCT CAG TTG GTA GAG CAC TAC CTT GAC ATG GTA 106 Met Lys Gly Arg Val Ala Gln Leu Val Glu His Tyr Leu Asp Met Val Val Ala Ala Gly Ser Ser Pro Val Val Ala Thr Ile Ile Thr Pro Ile Leu Ile Leu Ile Phe Leu Arg Val Phe Asp Leu Tyr Lys Phe
(2) INFORMATION FOR SEQ ID NO:102:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:102:
Met Lys Gly Arg Val Ala Gln Leu Val Glu His Tyr Leu Asp Met Val Val Ala Ala Gly Ser Ser Pro Val Val Ala Thr Ile Ile Thr Pro Ile Leu Ile Leu Ile Phe Leu Arg Val Phe Asp Leu Tyr Lys Phe
(2) INFORMATION FOR SEQ ID NO:103:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1382 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 91...1329 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:103:

Met Ala Gln Glu Lys Ala Val Pro Arg Asp Pro Lys Lys Leu Asn Ala Phe Asp Leu Arg Trp Met Val Ser Leu Phe Gly Thr Ala Val Gly Ala Gly Ile Leu Phe Leu Pro Ile Arg Ala Gly Gly His Gly Val Trp Ala Ile Val Val Met Ser Ala Ile Ile Phe Pro Leu Thr Tyr Leu Gly His Arg Ala Leu Ala Tyr Phe Ile Gly Ser Lys Asp Lys Glu Asp Ile Thr Met Val Val Arg Ser His Phe Gly Ala Gln Trp Gly Phe Leu Ile Thr Leu Leu Tyr Phe Leu Ala Ile Tyr ' 90 95 100 _ CCT ATT TGC TTG GTT TAT GGG GTG GGT ATC ACT AAC GTG TTT GAT CAT 450 Pro Ile Cys Leu Val Tyr Gly Val Gly Ile Thr Asn Val Phe Asp His Phe Phe Thr Asn Gln Leu His Leu Ala Pro Phe His Arg Gly Leu Leu WO

AlaVal AlaLeuVal Leu MetMet Leu MetValPhe Ala Ser Val Asn ATT GTG TGC

ThrIle ValThrArg Cys AsnAla Leu TyrProLeu Leu Ile Val Cys TCT CCT GGC

_ IleLeu LeuLeuPhe Leu TyrLeu Ile TyrTrpGln Ala Ser Pro Gly CCG TTT ATT

AsnLeu PheValVal Ser PheLys Glu ValLeuAla Trp Pro Phe Ile CTT GAC ATC

LeuThr LeuProVal Val PheAla Phe HisSerPro Ile Leu Asp Ile AAT TAC AAA

SerThr PheThrGln Val GlyLys Glu GlyValPhe Glu Asn Tyr Lys ATT TCG TTA

TyrLys LeuAsnGln Glu LeuGly Thr LeuMetLeu Gly Ile Ser Leu GTG ATG GCT

PheVal MetPhePhe Phe SerCys Val CysLeuAsn Asp Val Met Ala AGG CCC TAT

AspPhe ValLysAla Glu GlnAsn Ile IleLeuSer Leu Arg Pro Tyr AAC TAT GTG

AlaAsn ThrLeuAsn Pro LeuIle Asn AlaGlyPro Val Asn Tyr Val TTT GGG GGG

AlaPhe LeuAlaIle Ser SerPhe Phe HisTyrTyr Ala Phe GIy Gly GGC AGC AAA

LysGlu GlyLeuGlu Ile IleIle Gln LeuLysLeu Lys Gly Ser Lys AGC ATT CTG

AlaSer LysProLeu Val SerVal Thr PheLeuTrp Thr Ser Ile Leu TAT ATC ATT

IleThr LeuValAla Ile AsnPro Asn LeuAspPhe Glu Tyr Ile Ile Asn Leu Gly Gly Pro Ile Ile Ala Leu Ile Leu Phe Val Met Pro Met Ile Ala Phe Tyr Ser Val Ser Ser Leu Lys Arg Phe Arg Asn Phe Lys Val Asp Ile Phe Val Phe Val Phe Gly Ser Leu Thr Ala Leu Ser Val Phe Leu Gly Leu Phe (2) INFORMATION FOR SEQ ID N0:104:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 413 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:104:
Met Ala Gln Glu Lys Ala Val Pro Arg Asp Pro Lys Lys Leu Asn Ala Phe Asp Leu Arg Trp Met Val Ser Leu Phe Gly Thr Ala Val Gly Ala Gly Ile Leu Phe Leu Pro Ile Arg Ala Gly Gly His Gly Val Trp Ala Ile Val Val Met Ser Ala Ile Ile Phe Pro Leu Thr Tyr Leu Gly His Arg Ala Leu Ala Tyr Phe Ile Gly Ser Lys Asp Lys Glu Asp Ile Thr Met Val Val Arg Ser His Phe Gly Ala Gln Trp Gly Phe Leu Ile Thr Leu Leu Tyr Phe Leu Ala Ile Tyr Pro Ile Cys Leu Val Tyr Gly Val Gly Ile Thr Asn Val Phe Asp His Phe Phe Thr Asn Gln Leu His Leu Ala Pro Phe His Arg Gly Leu Leu Ala Val Ala Leu Val Ser Leu Met Met Leu Val Met Val Phe Asn Ala Thr Ile Val Thr Arg Ile Cys Asn Ala Leu Val Tyr Pro Leu Cys Leu Ile Leu Leu Leu Phe Ser Leu Tyr Leu Ile Pro Tyr Trp Gln Gly Ala Asn Leu Phe Val Val Pro Ser Phe Lys Glu Phe Val Leu Ala Ile Trp Leu Thr Leu Pro Val Leu Val Phe Ala Phe Asp His Ser Pro Ile Ile Ser Thr Phe Thr Gln Asn Val Gly Lys Glu Tyr Gly Val Phe Lys Glu Tyr Lys Leu Asn Gln Ile Glu Leu Gly Thr Ser Leu Met Leu Leu Gly Phe Val Met Phe Phe Val Phe Ser Cys Val Met Cys Leu Asn Ala Asp Asp Phe Val Lys Ala Arg Glu Gln Asn Ile Pro Ile Leu Ser Tyr Leu Ala Asn Thr Leu Asn Asn Pro Leu Ile Asn Tyr Ala Gly Pro Val Val Ala Phe Leu Ala Ile Phe Ser Ser Phe Phe Gly His Tyr Tyr Gly Ala Lys Glu Gly Leu Glu Gly Ile Ile Ile Gln Ser Leu Lys Leu Lys Lys Ala Ser Lys Pro Leu Ser Val Ser Val Thr Ile Phe Leu Trp Leu Thr Ile Thr Leu Val Ala Tyr Ile Asn Pro Asn Ile Leu Asp Phe Ile Glu Asn Leu Gly Gly Pro Ile Ile Ala Leu Ile Leu Phe Val Met Pro Met Ile Ala Phe Tyr Ser Val Ser Ser Leu Lys Arg Phe Arg Asn Phe Lys Val Asp Ile Phe Val Phe Val Phe Gly Ser Leu Thr Ala Leu Ser Val Phe Leu Gly Leu Phe
(2) INFORMATION FOR SEQ ID NO:105:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 875 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 63...827 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105:

Met Leu Asp Phe Ile Gln Glu Leu Ser Thr Pro His Val Arg Asp Phe Phe Leu Leu Phe Leu Arg Val Ser Gly Val Leu Ser Phe Phe Pro Phe Phe Glu Asn His Leu Val Pro Leu Ser Val Arg Gly Ala Leu Ser Leu Tyr Val Ser Ala Ile Phe Tyr Pro Thr Leu Glu Phe Ser Asn Ala Ala Tyr Thr Pro Glu Gly Phe Ile Ile Ala Cys Leu Cys Glu Leu Phe Leu Gly Val Cys Ala Ser Val Phe Leu Gln Ile Val Phe Ala Ser Leu Val Phe Ala Thr Asp Ser Ile Ser Phe Ser Met Gly Leu Thr Met Ala Ser Ala Tyr Asp Pro Ile Ser Gly Ser Gln Lys Pro Ile Val Gly Gln Ala Leu Leu Leu Leu Ala Ile Leu Ile Leu Leu Asp Leu Ser Phe His His Gln Ile Ile Leu Phe Val Asp His Ser Leu Lys Ala Val Pro Leu Gly Gln Phe Val Phe Glu Pro Ala Leu Ala Lys Asn Ile Val Lys Ala Phe Ser His Leu Phe Val Ile Gly Phe Ser Met Ala Phe Pro Ile Leu Cys Leu Val Leu Leu Ser Asp Ile Ile Phe Gly Met Ile Met Lys Thr His Pro Gln Phe Asn Leu Leu Ala Ile Gly Phe Pro Val Lys Ile Ala Ile Gly Phe Val Gly Ile Ile Leu Ile Ala Ser Ala Ile Met Gly Arg Phe Lys Glu Glu Ile Ser Leu Ala Phe Ser Ala Ile Ser Lys Ile Phe
(2) INFORMATION FOR SEQ ID NO:106:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:106:
Met Leu Asp Phe Ile Gln Glu Leu Ser Thr Pro His Val Arg Asp Phe Phe Leu Leu Phe Leu Arg Val Ser Gly Val Leu Ser Phe Phe Pro Phe Phe Glu Asn His Leu Val Pro Leu Ser Val Arg Gly Ala Leu Ser Leu Tyr Val Ser Ala Ile Phe Tyr Pro Thr Leu Glu Phe Ser Asn Ala Ala Tyr Thr Pro Glu Gly Phe Ile Ile Ala Cys Leu Cys Glu Leu Phe Leu Gly Val Cys Ala Ser Val Phe Leu Gln Ile Val Phe Ala Ser Leu Val Phe Ala Thr Asp Ser Ile Ser Phe Ser Met Gly Leu Thr Met Ala Ser Ala Tyr Asp Pro Ile Ser Gly Ser Gln Lys Pro Ile Val Gly Gln Ala Leu Leu Leu Leu Ala Ile Leu Ile Leu Leu Asp Leu Ser Phe His His Gln Ile Ile Leu Phe Val Asp His Ser Leu Lys Ala Val Pro Leu Gly Gln Phe Val Phe Glu Pro Ala Leu Ala Lys Asn Ile Val Lys Ala Phe Ser His Leu Phe Val Ile Gly Phe Ser Met Ala Phe Pro Ile Leu Cys Leu Val Leu Leu Ser Asp Ile Ile Phe Gly Met Ile Met Lys Thr His Pro Gln Phe Asn Leu Leu Ala Ile Gly Phe Pro Val Lys Ile Ala Ile Gly Phe Val Gly Ile Ile Leu Ile Ala Ser Ala Ile Met Gly Arg Phe Lys Glu Glu Ile Ser Leu Ala Phe Ser Ala Ile Ser Lys Ile Phe
(2) INFORMATION FOR SEQ ID NO:107:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1160 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 373...1110 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:107:

CCC ACC AAT
AAT GCT
AGA ATG
CCC AAA
ATA

Met Pro Val Lys Thr Asn Asn Ala Arg Met Pro Lys Ile Gly Ile His Pro Ile Lys Thr Gly Arg Ile Arg Tyr Arg Tyr Ile Thr Leu Ile Gly Pro Arg His Ser Phe Tyr Tyr Cys Asn Leu Leu Leu Leu Ser Arg Ile His Val Thr Leu Thr Ala His Asn Glu Phe Cys Pro Thr His Arg Ala Ile Ala Pro Asn Phe Arg Val Val Ser Ile Ile Ala Asn Asn Gln Arg Asn Phe Gln Ala Leu Arg Pro Ile Asn His Ile Ser Phe Ile Pro Arg Ile Pro Thr Phe Asn Arg Ala Pro Arg Gln Asp Phe Ala Val Phe Leu His Asp Leu Thr Leu Ile Ile Asp Lys Asn Gln Ser Val CTT TGT AGA

IleGlyIle PheGly LeuLeuVal PhePhePro Gln Glu Leu Cys Arg AAC AAA CGC

HisSerPro LeuVal PheLeuThr SerPheSer Asp Gly Asn Lys Arg AGG TTC TCT

PhePheSer AsnAla CysGlyCys IleLysHis Leu Val Arg Phe Ser CCC AAT ATC

IleHisAsn MetArg AlaValPhe ArgGluAsn Gln Gln Pro Asn Ile ACC AGC ATT

ProArgGln LeuPhe AspProThr AsnHisLeu Asp Ala Thr Ser Ile CAA AGG CTT

ThrIlePhe HisLeu IleLeuSer ValGluSer His Ile Gln Arg Leu CGC ACT ATC

IleAsnTyr TyrThr HisSerIle TrpAlaAla Asn Ser Arg Thr Ile ATT TCCTACAAAC
CCTCTAC

MetSerHis MetPhe LeuValPhe Ile TTGAATTTAT
AAAATAATTG
TGT

(2) INFORMATION FOR SEQ ID NO:108:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 246 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:108:
Met Pro Val Lys Thr Asn Asn Ala Arg Met Pro Lys Ile Gly Ile His Pro Ile Lys Thr Gly Arg Ile Arg Tyr Arg Tyr Ile Thr Leu Ile Gly Pro Arg His Ser Phe Tyr Tyr Cys Asn Leu Leu Leu Leu Ser Arg Ile His Val Thr Leu Thr Ala His Asn Glu Phe Cys Pro Thr His Arg Ala Ile Ala Pro Asn Phe Arg Val Val Ser Ile Ile Ala Asn Asn Gln Arg Asn Phe Gln Ala Leu Arg Pro Ile Asn His Ile Ser Phe Ile Pro Arg Ile Pro Thr Phe Asn Arg Ala Pro Arg Gln Asp Phe Ala Val Phe Leu His Asp Leu Thr Leu Ile Ile Asp Lys Asn Gln Ser Val Ile Gly Ile Leu Phe Gly Leu Leu Val Phe Phe Pro Cys Gln Arg Glu His Ser Pro Asn Leu Val Phe Leu Thr Ser Phe Ser Lys Asp Arg Gly Phe Phe Ser Arg Asn Ala Cys Gly Cys Ile Lys His Phe Leu Ser Val Ile His Asn Pro Met Arg Ala Val Phe Arg Glu Asn Asn Gln Ile Gln Pro Arg Gln Thr Leu Phe Asp Pro Thr Asn His Leu Ser Asp Ile Ala Thr Ile Phe Gln His Leu Ile Leu Ser Val Glu Ser Arg His Leu Ile Ile Asn Tyr Arg Tyr Thr His Ser Ile Trp Ala Ala Thr Asn Ile Ser Met Ser His Ile Met Phe Leu Val Phe (2) INFORMATION FOR SEQ ID N0:109:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1661 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 79...1611 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:109:

Met Lys Lys Leu Leu Tyr Thr Ile Leu Ala Leu Leu Leu Ile Gly Leu Leu Thr Ile Tyr Leu Ile Leu Phe Thr Glu Trp Gly Asn Lys Ile Ile Ala Ser Tyr Ile Glu Lys Lys Ile Asn Pro Asn AAA AAA AAC
ACC

GluHisTyrLeu SerVal LysThrPhe LeuArg PheAsnSer Leu Lys TCC

AspPheLysAla GlnAla AsnAspAsp ThrLeu IleLeuLys Gly Ser AAT

AspPheSerLeu LeuLys GlnSerVal LeuAsn TyrHisIle Asp Asn TGG

IleLysAspLeu ArgSer PheLysGlu IlePro TyrProLeu Arg Trp AAA

GlyAlaValIle ThrSer GlyAsnIle GlyHis ArgLysAla Leu Lys Met Ile Gln Gly Val Ser Asn Val Ala Gln Ser His Thr Ala Tyr Asn Ala Leu Leu Asp Asp Phe Lys Leu Ser Arg Leu Asn Leu Asn Ala Gln Asp Ala Asn Leu Glu Asp Leu Leu Tyr Leu Ile Asn Arg Pro Ala Tyr Ala Asn Ala Lys Val Ser Leu Gln Ala Asp Phe Asn Ser Leu Lys Pro Leu Glu Gly His Leu Ile Leu Thr Ala Asn Asn Ala Leu Ile Asn Asn Ala Leu Ile Asn Gln Ile Phe His Leu Asn Leu Lys Asp Thr Leu Val Phe Ser Leu Ser His Ser Ser Asp Phe Lys Gly Asn Lys Ala Ile Ser Asp Thr Thr Leu Thr Ser Pro Leu Ala Asn Phe Lys Ala Leu Lys Ser Glu Tyr Leu Phe Ser Ile Leu Lys Leu Asn Ala Pro Tyr Thr Leu Glu Ile Pro Asn Leu Ala Lys Leu Tyr Asn Ile Thr Asn His Pro Leu Lys Gly Ser Leu Thr Leu Lys Gly Ala Ile Glu Gln Ser Pro Lys Leu Leu Lys Val Ser Gly His Ser Asn Leu Leu Asp Gly Ala Leu Asp Phe Thr Leu Leu Asn Lys Asp Leu Lys Gly Arg Phe Ser Asn Ile Ser Thr Leu Lys Ala Leu Asp Leu Phe His Tyr Pro Lys Phe Phe Gln Ser Val Ala Asp Ala Asn Leu Asp Tyr Asp Leu Ile Ala Lys Gln Gly Val Leu Lys Ala Arg Leu Lys Asn Ala Arg Phe Leu Lys Asn Ala Phe Ser Asp Phe Leu Tyr Ser Ile Ser Lys Phe Asp Ile Thr Lys Glu Ile Tyr Asn Asp 380 385 ~ 390 395 Ala Asn Leu Val Ser Gln Ile Asn Gln Gln Arg Leu Leu Ser Asp Leu Ser Leu Lys Ser Pro Lys Thr Gln Leu Lys Ile His Asn Gly Leu Leu " GAT TTA AAC ACC AAA CAA ATG AAC ATG CTC ATG GAT GCG GAA ATT TTA 1407 Asp Leu Asn Thr Lys Gln Met Asn Met Leu Met Asp Ala Glu Ile Leu Lys Phe Ile Phe Lys Met Lys Leu Gln Gly Asn Met His Gln Pro Lys Phe Ser Leu Ile Leu Asn Glu Lys Ala Ile Gln Gln Asn Leu Gln Gln Gly Leu Lys Glu Ile Leu Lys Asn Asp Thr Leu Lys Lys Gly Leu Asp HisLeuLeuLys Asp Asp Lys Leu Lys Glu Lys Leu Glu Lys Gly Leu LysGlyLeuPhe (2) INFORMATION FOR SEQ ID NO:110:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 511 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:110:
Met Lys Lys Leu Leu Tyr Thr Ile Leu Ala Leu Leu Leu Ile Gly Leu Leu Thr Ile Tyr Leu Ile Leu Phe Thr Glu Trp Gly Asn Lys Ile Ile Ala Ser Tyr Ile Glu Lys Lys Ile Asn Pro Asn Glu His Tyr Leu Ser Val Lys Thr Phe Lys Leu Arg Phe Asn Ser Leu Asp Phe Lys Ala Gln Ala Asn Asp Asp Ser Thr Leu Ile Leu Lys Gly Asp Phe Ser Leu Leu Lys Gln Ser Val Asn Leu Asn Tyr His Ile Asp Ile Lys Asp Leu Arg Ser Phe Lys Glu Trp Ile Pro Tyr Pro Leu Arg Gly Ala Val Ile Thr Ser Gly Asn Ile Lys Gly His Arg Lys Ala Leu Met Ile Gln Gly Val Ser Asn Val Ala Gln Ser His Thr Ala Tyr Asn Ala Leu Leu Asp Asp Phe Lys Leu Ser Arg Leu Asn Leu Asn Ala Gln Asp Ala Asn Leu Glu Asp Leu Leu Tyr Leu Ile Asn Arg Pro Ala Tyr Ala Asn Ala Lys Val Ser Leu Gln Ala Asp Phe Asn Ser Leu Lys Pro Leu Glu Gly His Leu Ile Leu Thr Ala Asn Asn Ala Leu Ile Asn Asn Ala Leu Ile Asn Gln Ile Phe His Leu Asn Leu Lys Asp Thr Leu Val Phe Ser Leu Ser His Ser Ser Asp Phe Lys Gly Asn Lys Ala Ile Ser Asp Thr Thr Leu Thr Ser Pro Leu Ala Asn Phe Lys Ala Leu Lys Ser Glu Tyr Leu Phe Ser Ile Leu Lys Leu Asn Ala Pro Tyr Thr Leu Glu Ile Pro Asn Leu Ala Lys Leu Tyr Asn Ile Thr Asn His Pro Leu Lys Gly Ser Leu Thr Leu Lys Gly Ala Ile Glu Gln Ser Pro Lys Leu Leu Lys Val Ser Gly His Ser Asn Leu Leu Asp Gly Ala Leu Asp Phe Thr Leu Leu Asn Lys Asp Leu Lys Gly Arg Phe Ser Asn Ile Ser Thr Leu Lys Ala Leu Asp Leu Phe His Tyr Pro Lys Phe Phe Gln Ser Val Ala Asp Ala Asn Leu Asp Tyr Asp Leu Ile Ala Lys Gln Gly Val Leu Lys Ala Arg Leu Lys Asn Ala Arg Phe Leu Lys Asn Ala Phe Ser Asp Phe Leu Tyr Ser Ile Ser Lys Phe Asp Ile Thr Lys Glu Ile Tyr Asn Asp Ala Asn Leu Val Ser Gln Ile Asn Gln Gln Arg Leu Leu Ser Asp Leu Ser Leu Lys Ser Pro Lys Thr Gln Leu Lys Ile His Asn Gly Leu Leu Asp Leu Asn Thr Lys Gln Met Asn Met Leu Met Asp Ala Glu Ile Leu Lys Phe Ile Phe Lys Met Lys Leu Gln Gly Asn Met His Gln Pro Lys Phe Ser Leu Ile Leu Asn Glu Lys Ala Ile Gln Gln Asn Leu Gln Gln Gly Leu Lys Glu Ile Leu Lys Asn Asp Thr Leu Lys Lys Gly Leu Asp His Leu Leu Lys Asp
Asp Lys Leu Lys Glu Lys Leu Glu Lys Gly Leu Lys Gly Leu Phe
(2) INFORMATION FOR SEQ ID NO:111:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 397 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 53...352 (D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:111:

CTATTACACC
AACAATCAAT
CTCAAAACAA
AGGACATGAAAG
ATG AAA

Met Lys AAA ATG ATT

Thr Lys His GlyIle Arg Phe Lys Gln Arg ArgMet Met Lys Met Ile ATA AGT GCG

Ser Leu Ala LeuMet Pro Phe Leu Leu Ala ProAsp Tyr Ile Ser Ala TTC TTG AGC

Lys Gln Lys ThrGln Ile Asp Phe Ile Asn AspPhe Ile Phe Leu Ser GGT ATT TGC

Lys Ala Ile GlyLeu Ile Val Gly Thr Ile TyrAla Tyr Gly Ile Cys GAC GAA AAA

Lys Asn Trp ArgLeu Gly Ile Gly Trp Cys ValGly Ile Asp Glu Lys ACC TCT ACT

Ile Ile Ile AlaAla Ile Asn Ala Lys Leu SerGln Trp Thr Ser Thr TGCATATTGT
TTGTGTTGAA
AGTATCAACA

Leu Phe (2) INFORMATION FOR SEQ ID N0:112:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 100 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:112:
Met Lys Thr Lys His Lys Gly Ile Arg Met Phe Lys Gln Ile Arg Arg Met Met Ser Leu Ala Ile Leu Met Pro Ser Phe Leu Leu Ala Ala Pro Asp Tyr Lys Gln Lys Phe Thr Gln Ile Leu Asp Phe Ile Ser Asn Asp Phe Ile Lys Ala Ile Gly Gly Leu Ile Ile Val Gly Thr Cys Ile Tyr Ala Tyr Lys Asn Trp Asp Arg Leu Gly Glu Ile Gly Trp Lys Cys Val Gly Ile Ile Ile Ile Thr Ala Ala Ile Ser Asn Ala Lys Thr Leu Ser Gln Trp Leu Phe
(2) INFORMATION FOR SEQ ID NO:113:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 367 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 52...318 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:113:

Met Ile Gln Ser Asp Ala Val Phe Lys Ile Asn Phe Cys Leu Ala Leu Leu Val Phe Val Lys Arg Gly Leu Ser Asp Ile Asn Met Pro Leu Phe Asn Gln Arg Ala Gln Ile Thr Ile Glu Lys Ser His Gln Gln Gly Leu Asp Met ' GCT CCC ATC CAC ATC AGC ATC GGT CAT GAT AAT GAT TTT ATG ATA GCG 249 Ala Pro Ile His Ile Ser Ile Gly His Asp Asn Asp Phe Met Ile Ala Gln Ser Phe Tyr Ile Lys Thr Leu Leu Asn Ala Ala Pro Lys Ser Arg Asp His Val Phe Asn Phe Phe (2) INFORMATION FOR SEQ ID N0:114:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 89 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:114:
Met Ile Gln Ser Asp Ala Val Phe Lys Ile Asn Phe Cys Leu Ala Leu Leu Val Phe Val Lys Arg Gly Leu Ser Asp Ile Asn Met Pro Leu Phe Asn Gln Arg Ala Gln Ile Thr Ile Glu Lys Ser His Gln Gln Gly Leu Asp Met Ala Pro Ile His Ile Ser Ile Gly His Asp Asn Asp Phe Met Ile Ala Gln Ser Phe Tyr Ile Lys Thr Leu Leu Asn Ala Ala Pro Lys Ser Arg Asp His Val Phe Asn Phe Phe
(2) INFORMATION FOR SEQ ID NO:115:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 386 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 54...344 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:115:

Met Ile Ile Phe Gly Lys Asp Tyr Leu Ser Thr Asp Leu Gln Asn Ser Ala Lys Asp Ile Leu Leu Ile Ala Ser Gln Ile Leu Lys Glu Arg Leu Phe Ala His Lys Asn Glu Ile Phe Phe Cys Pro Arg Asn Ser Tyr Ile Gln Ala Phe Arg Ile Tyr Gln Glu Arg Lys Ile Thr Ile Ser Phe His Gly Gly Ile Asn Asn Asn Ile Cys Leu Leu Ala Leu Lys Gly Ile His Ser Val Tyr Phe Glu Leu Ile Lys Ile Leu Glu Ala Val Phe Phe His Phe
(2) INFORMATION FOR SEQ ID NO:116:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 97 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:116:
Met Ile Ile Phe Gly Lys Asp Tyr Leu Ser Thr Asp Leu Gln Asn Ser Ala Lys Asp Ile Leu Leu Ile Ala Ser Gln Ile Leu Lys Glu Arg Leu Phe Ala His Lys Asn Glu Ile Phe Phe Cys Pro Arg Asn Ser Tyr Ile Gln Ala Phe Arg Ile Tyr Gln Glu Arg Lys Ile Thr Ile Ser Phe His Gly Gly Ile Asn Asn Asn Ile Cys Leu Leu Ala Leu Lys Gly Ile His Ser Val Tyr Phe Glu Leu Ile Lys Ile Leu Glu Ala Val Phe Phe His Phe
(2) INFORMATION FOR SEQ ID NO:117:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 569 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 55...516 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:117:

AAATTCGTTG
CTCACCCCTG

Ala Ser Glu Val Ala Pro Ser Glu Val Leu Leu Asp Ser Ser Cys Leu Ser Phe Ser Leu Thr Ile Ser Leu Val Val Thr Cys Leu Gly Ala Leu Phe Ser Leu Ala Ser Ser Leu Ala Ser Ser Phe Leu Gly Ser Ser Leu Gly Ser Ser Phe Leu Thr Ser Ser Thr Leu Gly Ser Gly Leu Gly Ser Gly Phe Gly Ser Gly Leu Gly Ser Gly Leu Gly Phe Gly Phe Gly Phe Gly Leu Gly Leu Gly Leu Gly Leu Gly Phe Val Thr Ser Phe Leu Gly Ser Ser Phe Phe Gly Ser Ser Phe Leu Gly Phe Ser Leu Gly Ser Ser Leu Gly Leu Ala Asp Ser Ala Leu Val Phe Val Leu Glu Leu Val Leu Met Leu Ala Lys Leu Met Val Thr Leu Val Val Pro Ala Cys Ala Lys Gly Ser Gly Ala Ser Ser Arg Ser Lys Lys (2) INFORMATION FOR SEQ ID NO:118:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 154 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:118:
Ala Ser Glu Val Ala Pro Ser Glu Val Leu Leu Asp Ser Ser Cys Leu Ser Phe Ser Leu Thr Ile Ser Leu Val Val Thr Cys Leu Gly Ala Leu Phe Ser Leu Ala Ser Ser Leu Ala Ser Ser Phe Leu Gly Ser Ser Leu Gly Ser Ser Phe Leu Thr Ser Ser Thr Leu Gly Ser Gly Leu Gly Ser Gly Phe Gly Ser Gly Leu Gly Ser Gly Leu Gly Phe Gly Phe Gly Phe Gly Leu Gly Leu Gly Leu Gly Leu Gly Phe Val Thr Ser Phe Leu Gly Ser Ser Phe Phe Gly Ser Ser Phe Leu Gly Phe Ser Leu Gly Ser Ser Leu Gly Leu Ala Asp Ser Ala Leu Val Phe Val Leu Glu Leu Val Leu Met Leu Ala Lys Leu Met Val Thr Leu Val Val Pro Ala Cys Ala Lys Gly Ser Gly Ala Ser Ser Arg Ser Lys Lys (2) INFORMATION FOR SEQ ID NO:119:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 359 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 77...310 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:119:

Asn Glu His His Thr Pro Ala Gly Ser Leu Val Leu ATC AAA GCT
GGC ATA
GGG
GGC

GlySer Phe Ile GlySerPhe Lys ValGly Ile Gly Gly Ile Gly Ala GTG TTA TTT

ValGly Ala Val PheGlyIle Ser PheSer Gly Gly Phe Val Leu Phe TCT TTT TCC

CysHis Asn Val LysAlaAla Ala LeuGly Ile Leu Ala Ser Phe Ser CCG AGC AAA

LysIle Leu Ser SerTrpGly Phe AlaIle Ile Ser Asn Pro Ser Lys AATTTTCTAA
AATCAGAGCC

Ala Phe (2) INFORMATION FOR SEQ ID NO:120:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 78 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:120:
Asn Glu His His Thr Pro Ala Gly Ser Leu Val Leu Gly Ser Phe Ile Ile Gly Ser Phe Lys Gly Val Gly Ala Ile Gly Gly Val Gly Ala Val Val Phe Gly Ile Ser Leu Phe Ser Phe Gly Gly Phe Cys His Asn Ser Val Lys Ala Ala Ala Phe Leu Gly Ser Ile Leu Ala Lys Ile Leu Pro Ser Ser Trp Gly Phe Ser Ala Ile Lys Ile Ser Asn Ala Phe (2) INFORMATION FOR SEQ ID NO:121:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1051 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...998 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:121:

Met Thr Leu Asn Thr Phe Leu Asp Thr Cys Phe Leu Leu Phe Ile Ser Ile Leu Phe Tyr Leu Ser Ile Pro Ile Tyr Pro Asn Lys Val Val Val Val Pro Gln Gly Ser Leu Lys Lys Val Phe Phe Ser Leu Lys Glu Gln Gly Val GAT ATG AAC GCT TTG GAT TTG CTT TTT TTA CGC CTG ATG GGC ATG CCT 248 Asp Met Asn Ala Leu Asp Leu Leu Phe Leu Arg Leu Met Gly Met Pro Lys Lys Gly Tyr Ile Asp Met Gly Asp Gly Ala Leu Arg Lys Gly Asp Phe Leu Val Arg Leu Ile Lys Ala Lys Ala Ala Gln Lys Ser Ala Thr Leu Ile Pro Gly Glu Ser Arg Tyr Phe Phe Thr Gln Ile Leu Ser Glu Thr Tyr Gln Leu Glu Thr Ser Asp Leu Asn Gln Ala Tyr Glu Ser Ile Ala Pro Arg Leu Asn Gly Glu Val Ile Glu Asp Gly Val Ile Trp Pro Asp Thr Tyr His Leu Pro Leu Gly Glu Asp Ala Phe Lys Ile Met Gln Thr Leu Ile Gly Gln Ser Met Lys Lys His Glu Ala Leu Ser Lys Gln TAC TGG GAA
AAA
ATC

Trp LeuGlyTyr His Lys Glu Glu Phe Lys IleLeu Tyr Trp Glu Ile CAA AAT GAA ATG

Ala SerIleVal Lys Glu Ala Ala Val Glu ProLeu Gln Asn Glu Met ATT AAA GGC CCT

Ile AlaSerVal Phe Asn Arg Leu Lys Met LeuGln Ile Lys Gly Pro TTG TTT CAC AAA

Met AspGlyAla Asn Tyr Gln Glu Ser Ala ValThr Leu Phe His Lys AAA CCC AAT TAT

Lys GluArgIle Thr Asp Asn Thr Tyr Thr LysPhe Lys Pro Asn Tyr AAA AGC AGC GAA

Lys GlyLeuPro Asn Pro Val Gly Val Leu AlaIle Lys Ser Ser Glu TTC GAT TTG TTT

Arg AlaValIle Pro Lys Lys Thr Phe Tyr ValLys Phe Asp Leu Phe AAA GCG TAT GAG

Met ProAspLys His Ala Phe Ser Thr Lys HisLeu Lys Ala Tyr Glu CTT TTT GTAAATGGGG
CGT

Lys AsnIleAsn Ser Asn Asn His Leu Phe GAATTGAGTA
AAAAGTGTTT

(2) INFORMATION FOR SEQ ID NO:122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 316 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:122:
Met Thr Leu Asn Thr Phe Leu Asp Thr Cys Phe Leu Leu Phe Ile Ser Ile Leu Phe Tyr Leu Ser Ile Pro Ile Tyr Pro Asn Lys Val Val Val Val Pro Gln Gly Ser Leu Lys Lys Val Phe Phe Ser Leu Lys Glu Gln Gly Val Asp Met Asn Ala Leu Asp Leu Leu Phe Leu Arg Leu Met Gly Met Pro Lys Lys Gly Tyr Ile Asp Met Gly Asp Gly Ala Leu Arg Lys Gly Asp Phe Leu Val Arg Leu Ile Lys Ala Lys Ala Ala Gln Lys Ser Ala Thr Leu Ile Pro Gly Glu Ser Arg Tyr Phe Phe Thr Gln Ile Leu Ser Glu Thr Tyr Gln Leu Glu Thr Ser Asp Leu Asn Gln Ala Tyr Glu Ser Ile Ala Pro Arg Leu Asn Gly Glu Val Ile Glu Asp Gly Val Ile Trp Pro Asp Thr Tyr His Leu Pro Leu Gly Glu Asp Ala Phe Lys Ile Met Gln Thr Leu Ile Gly Gln Ser Met Lys Lys His Glu Ala Leu Ser Lys Gln Trp Leu Gly Tyr Tyr His Lys Glu Glu Trp Phe Glu Lys Ile Ile Leu Ala Ser Ile Val Gln Lys Glu Ala Ala Asn Val Glu Glu Met Pro Leu Ile Ala Ser Val Ile Phe Asn Arg Leu Lys Lys Gly Met Pro Leu Gln Met Asp Gly Ala Leu Asn Tyr Gln Glu Phe Ser His Ala Lys Val Thr Lys Glu Arg Ile Lys Thr Asp Asn Thr Pro Tyr Asn Thr Tyr Lys Phe Lys Gly Leu Pro Lys Asn Pro Val Gly Ser Val Ser Leu Glu Ala Ile Arg Ala Val Ile Phe Pro Lys Lys Thr Asp Phe Leu Tyr Phe Val Lys Met Pro Asp Lys Lys His Ala Phe Ser Ala Thr Tyr Lys Glu His Leu Lys Asn Ile Asn Leu Ser Asn Asn His Phe (2) INFORMATION FOR SEQ ID NO:123:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 637 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...584 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:123:

TATGATTAAT
GGCTATGGTT
ATACCAAAGA
ATG
AGT

Met Ser GGC CTT GAT

Gln LysIleLeu IleLeuGly Ile Asn Ile Phe Gly Glu Gly Leu Asp TAC AAA TCT

Gly IleGlyVal HisLeuAla His Leu Lys Asn Phe Phe Tyr Lys Ser GGG ATG CAG

Phe ProSerVal AspIleIle Asp Gly Thr Ala Gln Leu Gly Met Gln AAG ATT TGC

Ile ProLeuIle ThrSerTyr Glu Val Leu Leu Asp Val Lys Ile Cys TCA GCT TTT

Ser AlaGluGly ValGluIle Gly Val Tyr Phe Asp Lys Ser Ala Phe GCT GCT GTG

Asp AlaProLys GluIleThr Trp Gly Ser His Glu Glu Ala Ala Val GAG GGG CCT

Met LeuHisThr LeuArgLeu Thr Phe Leu Asp Leu Lys Glu Gly Pro TTT GGG ACC

Thr PheIleVal GlyLeuVal Pro Val Ile Ser Glu Thr Phe Gly Thr AAC GAA TTA

Phe LysLeuSer SerLysIle Leu Ala Leu Thr Ala Lys Asn Glu Leu TGG AAA CGC

Ala IleGluThr GlnLeuAsn Ala Gly Val Met Gln Thr Trp Lys Arg Asp His Ile Ala Leu Glu Cys Ile Ala Glu Leu Ser Tyr Lys Gly Phe (2) INFORMATION FOR SEQ ID NO:124:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 178 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:124:
Met Ser Gln Lys Ile Leu Ile Leu Gly Ile Gly Asn Ile Leu Phe Gly Asp Glu Gly Ile Gly Val His Leu Ala His Tyr Leu Lys Lys Asn Phe Ser Phe Phe Pro Ser Val Asp Ile Ile Asp Gly Gly Thr Met Ala Gln Gln Leu Ile Pro Leu Ile Thr Ser Tyr Glu Lys Val Leu Ile Leu Asp Cys Val Ser Ala Glu Gly Val Glu Ile Gly Ser Val Tyr Ala Phe Asp Phe Lys Asp Ala Pro Lys Glu Ile Thr Trp Ala Gly Ser Ala His Glu Val Glu Met Leu His Thr Leu Arg Leu Thr Glu Phe Leu Gly Asp Leu Pro Lys Thr Phe Ile Val Gly Leu Val Pro Phe Val Ile Gly Ser Glu Thr Thr Phe Lys Leu Ser Ser Lys Ile Leu Asn Ala Leu Glu Thr Ala Leu Lys Ala Ile Glu Thr Gln Leu Asn Ala Trp Gly Val Lys Met Gln Arg Thr Asp His Ile Ala Leu Glu Cys Ile Ala Glu Leu Ser Tyr Lys Gly Phe (2) INFORMATION FOR SEQ ID NO:125:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 214 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...161 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:125:
GCATTGCTAA TTTGGGAATA CTTGTTTATG CCAGTGAAAT AGGAGCGGCT ATG ATG 56 Met Met Trp Arg Ser Leu Xaa Val Ala Phe Thr Ile Thr Asp Ile Ser Lys Thr Phe Gln Ser Gln Pro Lys His His Gln Ile Gly Thr Leu Glu Leu Asn Phe Ala Phe (2) INFORMATION FOR SEQ ID NO:126:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:126:
Met Met Trp Arg Ser Leu Xaa Val Ala Phe Thr Ile Thr Asp Ile Ser Lys Thr Phe Gln Ser Gln Pro Lys His His Gln Ile Gly Thr Leu Glu Leu Asn Phe Ala Phe (2) INFORMATION FOR SEQ ID NO:127:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1576 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...1523 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:127:

Met Asn Thr Thr Ile Leu Glu Ala Tyr Ala Ala Glu Pro Ser Arg Gln Thr Leu Ser Lys Val Ser Asn Arg Phe Lys Glu His Gly Ala Lys Phe Asp Leu Arg Val Met Ala Thr His Gly Gly Thr Ile Ser Trp Lys Ala Lys Glu Leu Ala Arg Thr Ile Val Ser Gly Pro Ile Gly Gly Val Ile Gly Ser Lys Leu Leu Gly Glu Thr Leu Gly Tyr Asp Asn Ile Ala Cys Ser Asp Ile Xaa Gly Thr Ser Phe Asp Met Ala Leu Ile Val Lys Ser Asn Phe Asn Ile Ala Ser Asp Pro Asp Met Ala Arg Leu Val Leu Ser Leu Pro Leu Val Ala Met Asp Ser Val Gly Ala Gly Ala Gly Ser Phe Val Arg Ile Asp Pro His Ser Arg Ser Val Lys Leu Gly Pro Asp Ser Ala Gly Tyr Arg Val Gly Thr Cys Trp Lys Asp Ser Gly Leu Asp Thr Val Ser ACC

Val AspCys HisIleVal LeuGlyTyr LeuAsnPro AspAsnPhe Thr GGC

Leu GlyLeu IleLysLeu AspValAsp ArgAlaLys LysHisIle Gly GAA

Lys GlnIle AlaAspPro LeuGlyIle SerValGlu AspAlaAla Glu GGT

Ala ValIle GluLeuLeu AspLeuGlu LeuLysGlu TyrLeuArg Gly AAC

Ser IleSer AlaLysGly TyrSerPro SerAspPhe ValCysPhe Asn TAT

Ser GlyGly AlaGlyPro ValHisThr TyrGlyTyr ThrGluGly Tyr GGG

Leu PheLys AspValVal ValProAla TrpAlaAla GlyPheSer Gly TTT

Ala GlyCys AlaCysAla AspPheGlu TyrArgTyr AspLysSer Phe ATT CAG TCT GAC AAA
ATA
GAC

ValAsp IleAlaIle Pro Tyr Ser LysSer LysIleAsp Gln Ser Asp GAC TGG GAA

AlaCys LysIleIle Gln Ala Asp LeuThr LeuLysVal Asp Trp Glu AAT TTT CAA

IleGlu GluPheLys Ile Gly Ser LysAsp ValIleLeu Asn Phe Gln CAG ATG CAA

ArgPro GlyTyrArg Met Tyr Gly LeuAsn AspLeuGlu Gln Met Gln AAA GCA GTG

IleThr SerProVal Ser Ala Ser AlaAsp TrpGluGlu Lys Ala Val AAA TAC CGC

IleVal LysGluTyr Glu Thr Ala ValTyr SerGluSer Lys Tyr Arg Ala Cys Ser Pro Glu Leu Gly Phe Ser Val Thr Gly Val Ile Met Arg Gly Val Val Ala Thr Gln Lys Pro Val Ile Pro Val Glu Lys Glu His GGT GCT ACG CCC CCA AAA GAA GCC AAA ATA GGC GTT AGA AAA TTC TAT 1352 Gly Ala Thr Pro Pro Lys Glu Ala Lys Ile Gly Val Arg Lys Phe Tyr Arg His Lys Lys Trp Val Asp Ala Asp Val Trp Gln Met Glu Lys Leu Leu Pro Gly Asn Glu Val Ile Gly Pro Ala Ile Val Glu Ser Asp Ala Thr Thr Phe Val Ile Pro Lys Gly Phe Ala Thr Arg Leu Asp Lys His Arg Leu Phe His Leu Lys Glu Ile Lys (2) INFORMATION FOR SEQ ID NO:128:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 491 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:128:
Met Asn Thr Thr Ile Leu Glu Ala Tyr Ala Ala Glu Pro Ser Arg Gln Thr Leu Ser Lys Val Ser Asn Arg Phe Lys Glu His Gly Ala Lys Phe Asp Leu Arg Val Met Ala Thr His Gly Gly Thr Ile Ser Trp Lys Ala Lys Glu Leu Ala Arg Thr Ile Val Ser Gly Pro Ile Gly Gly Val Ile Gly Ser Lys Leu Leu Gly Glu Thr Leu Gly Tyr Asp Asn Ile Ala Cys Ser Asp Ile Xaa Gly Thr Ser Phe Asp Met Ala Leu Ile Val Lys Ser Asn Phe Asn Ile Ala Ser Asp Pro Asp Met Ala Arg Leu Val Leu Ser Leu Pro Leu Val Ala Met Asp Ser Val Gly Ala Gly Ala Gly Ser Phe Val Arg Ile Asp Pro His Ser Arg Ser Val Lys Leu Gly Pro Asp Ser Ala Gly Tyr Arg Val Gly Thr Cys Trp Lys Asp Ser Gly Leu Asp Thr Val Ser Val Thr Asp Cys His Ile Val Leu Gly Tyr Leu Asn Pro Asp Asn Phe Leu Gly Gly Leu Ile Lys Leu Asp Val Asp Arg Ala Lys Lys His Ile Lys Glu Gln Ile Ala Asp Pro Leu Gly Ile Ser Val Glu Asp Ala Ala Ala Gly Val Ile Glu Leu Leu Asp Leu Glu Leu Lys Glu Tyr Leu Arg Ser Asn Ile Ser Ala Lys Gly Tyr Ser Pro Ser Asp Phe Val Cys Phe Ser Tyr Gly Gly Ala Gly Pro Val His Thr Tyr Gly Tyr Thr Glu Gly Leu Gly Phe Lys Asp Val Val Val Pro Ala Trp Ala Ala Gly Phe Ser Ala Phe Gly Cys Ala Cys Ala Asp Phe Glu Tyr Arg Tyr Asp Lys Ser Val Asp Ile Ala Ile Pro Gln Tyr Ser Ser Asp Lys Ser Lys Ile Asp Ala Cys Lys Ile Ile Gln Asp Ala Trp Asp Glu Leu Thr Leu Lys Val Ile Glu Glu Phe Lys Ile Asn Gly Phe Ser Gln Lys Asp Val Ile Leu Arg Pro Gly Tyr Arg Met Gln Tyr Met Gly Gln Leu Asn Asp Leu Glu Ile Thr Ser Pro Val Ser Lys Ala Ala Ser Val Ala Asp Trp Glu Glu Ile Val Lys Glu Tyr Glu Lys Thr Tyr Ala Arg Val Tyr Ser Glu Ser Ala Cys Ser Pro Glu Leu Gly Phe Ser Val Thr Gly Val Ile Met Arg Gly Val Val Ala Thr Gln Lys Pro Val Ile Pro Val Glu Lys Glu His Gly Ala Thr Pro Pro Lys Glu Ala Lys Ile Gly Val Arg Lys Phe Tyr Arg His Lys Lys Trp Val Asp Ala Asp Val Trp Gln Met Glu Lys Leu Leu Pro Gly Asn Glu Val Ile Gly Pro Ala Ile Val Glu Ser Asp Ala Thr Thr Phe Val Ile Pro Lys Gly Phe Ala Thr Arg Leu Asp Lys His Arg Leu Phe His Leu Lys Glu Ile Lys (2) INFORMATION FOR SEQ ID NO:129:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 303 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 52...261 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:129:

Met Ile Gln Leu Lys Ser Asn Leu Asp Trp Tyr Ala Asp Tyr Leu Asn Phe Leu Asp Arg Phe Gly Glu Lys Met Glu Glu Ser Lys Glu Arg Lys Gln Leu Leu Ile Ala Ser Leu Ala Pro Leu Ala Gly Phe Ala Ala Arg Ile Ser Pro Gly Leu Leu Ser Leu Leu Gly Leu Met Leu Ala Met Gly Cys Ala Asn Phe Trp Ile (2) INFORMATION FOR SEQ ID NO:130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 70 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:130:
Met Ile Gln Leu Lys Ser Asn Leu Asp Trp Tyr Ala Asp Tyr Leu Asn Phe Leu Asp Arg Phe Gly Glu Lys Met Glu Glu Ser Lys Glu Arg Lys Gln Leu Leu Ile Ala Ser Leu Ala Pro Leu Ala Gly Phe Ala Ala Arg Ile Ser Pro Gly Leu Leu Ser Leu Leu Gly Leu Met Leu Ala Met Gly Cys Ala Asn Phe Trp Ile (2) INFORMATION FOR SEQ ID NO:131:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 826 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...773 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:131:

GTTAAGCCGG
CTAGAAAAAG
AGCGTTATTT

MetLeu GAG

GluAspValGly Glu GlyGlnLeuLys LeuLeuLys SerSerVal Glu GGG

LeuValIleGly Ala GlyLeuGlySer AlaValLeu MetTyrLeu Gly GGA

CysAlaAlaGly Ile LysIleGlyIle ValAspPhe AspValVal Gly Asp Met Ser Asn Leu Gln Arg Gln Ile Ile His Ser Gln Asp Phe Leu Asn Gln Ser Lys Ala Ser Ser Ala Lys Ala Arg Leu Lys Gln Leu Asn Ala Gly Ile Glu Ile Glu Ala Phe Glu Glu Arg Phe Lys Ala His Asn TTT

Ala LeuSer LeuIleGlu ProTyrAsp IleIle AspAlaThr Asp Phe GAC

Asn PheAsn AlaLysPhe LeuIleAsn AlaCys ValLeuAla Gln Asp GAA

Lys ProTyr SerHisAla GlyValLeu TyrArg GlyGlnSer Met Glu GCG

Ser ValLeu ProHisSer AlaCysLeu CysVal PheAspLys Pro Ala GGG

Pro LysLys GlyLeuAsn ProIleSer LeuPhe GlyValLeu Pro Gly GAA

Gly ValLeu GlyCysIle GlnAlaSer CysLeu LysTyrPhe Leu Glu TTA

Gly PheGlu ThrLeuLeu IleAsnThr LeuIle AlaAspIle Lys Leu CCC

Thr MetAsp PheLysLys IleGlnAla LysAsn ProGluCys Arg Pro TTA

Val CysGly ThrHisLys IleThrHis GlnAsp TyrGluIle Leu (2) INFORMATION FOR SEQ ID NO:132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 241 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:132:
Met Leu Glu Asp Val Gly Glu Glu Gly Gln Leu Lys Leu Leu Lys Ser Ser Val Leu Val Ile Gly Ala Gly Gly Leu Gly Ser Ala Val Leu Met Tyr Leu Cys Ala Ala Gly Ile Gly Lys Ile Gly Ile Val Asp Phe Asp Val Val Asp Met Ser Asn Leu Gln Arg Gln Ile Ile His Ser Gln Asp Phe Leu Asn Gln Ser Lys Ala Ser Ser Ala Lys Ala Arg Leu Lys Gln Leu Asn Ala Gly Ile Glu Ile Glu Ala Phe Glu Glu Arg Phe Lys Ala His Asn Ala Leu Ser Leu Ile Glu Pro Tyr Asp Phe Ile Ile Asp Ala Thr Asp Asn Phe Asn Ala Lys Phe Leu Ile Asn Asp Ala Cys Val Leu Ala Gln Lys Pro Tyr Ser His Ala Gly Val Leu Glu Tyr Arg Gly Gln Ser Met Ser Val Leu Pro His Ser Ala Cys Leu Ala Cys Val Phe Asp Lys Pro Pro Lys Lys Gly Leu Asn Pro Ile Ser Gly Leu Phe Gly Val Leu Pro Gly Val Leu Gly Cys Ile Gln Ala Ser Glu Cys Leu Lys Tyr Phe Leu Gly Phe Glu Thr Leu Leu Ile Asn Thr Leu Leu Ile Ala Asp Ile Lys Thr Met Asp Phe Lys Lys Ile Gln Ala Pro Lys Asn Pro Glu Cys Arg Val Cys Gly Thr His Lys Ile Thr His Leu Gln Asp Tyr Glu Ile (2) INFORMATION FOR SEQ ID NO:133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 547 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...494 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:133:

Met Asn Arg Met Asn Lys Asn Tyr Leu Leu Ile Phe Leu Leu Leu Ala Ser Leu Val Ala Arg Glu Lys Asp Ala Ser Ser Asn Leu Phe Asp Leu Ile Asp AAG GGG ATC AAC AGA GAA CAA GAA TTA AAA GAG CAG GAG CAA AAA ACG 200 Lys Gly Ile Asn Arg Glu Gln Glu Leu Lys Glu Gln Glu Gln Lys Thr Arg Leu Lys Leu Ala Gln Ser Pro Leu Val Ala Leu Glu Ile Val Pro Gln Glu Thr Pro Tyr Leu Glu Trp Gln Gly Ala Arg Glu Ser Tyr Tyr Leu Lys Val Ser Ala Val Val Glu Ser Val Val Ile Leu Lys Ile Asp Ile Asn Gln Gly Arg Ser Cys Ser Leu Tyr Pro Thr Pro Lys Ser Val Ser Leu Val Arg Asn Gln Ser Val Ala Tyr Glu Ile Leu Cys Glu Asn Gln Pro Leu Trp Ile Glu Val Ser Thr Asn Leu Gly Lys Arg Thr Phe Gln Phe (2) INFORMATION FOR SEQ ID NO:134:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 148 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:134:
Met Asn Arg Met Asn Lys Asn Tyr Leu Leu Ile Phe Leu Leu Leu Ala Ser Leu Val Ala Arg Glu Lys Asp Ala Ser Ser Asn Leu Phe Asp Leu Ile Asp Lys Gly Ile Asn Arg Glu Gln Glu Leu Lys Glu Gln Glu Gln Lys Thr Arg Leu Lys Leu Ala Gln Ser Pro Leu Val Ala Leu Glu Ile Val Pro Gln Glu Thr Pro Tyr Leu Glu Trp Gln Gly Ala Arg Glu Ser Tyr Tyr Leu Lys Val Ser Ala Val Val Glu Ser Val Val Ile Leu Lys Ile Asp Ile Asn Gln Gly Arg Ser Cys Ser Leu Tyr Pro Thr Pro Lys Ser Val Ser Leu Val Arg Asn Gln Ser Val Ala Tyr Glu Ile Leu Cys Glu Asn Gln Pro Leu Trp Ile Glu Val Ser Thr Asn Leu Gly Lys Arg Thr Phe Gln Phe (2) INFORMATION FOR SEQ ID NO:135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1684 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...1631 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:135:

TTCTCATTGC ATG
ATCTTGTGGG
ATCATTTTTT

Met Leu AGG AAC

AlaSerIleIle SerIleLeu ValPhe ValLeuLeu Phe Thr Arg Asn TTG TAT

ProLeuPheIle PheAlaPhe ProVal GlyPheLeu Gly Phe Leu Tyr AAT TGG

IleLeuGlnAla TyrAlaLys ProLeu PheProLys Leu Leu Asn Trp AGT

Val Leu Ala Ser Leu Phe Phe Tyr Ala Phe Trp Asn Val Lys Tyr Leu Pro Leu Leu Val Gly Ser Ile Val Phe Asn Tyr Phe Val Ala Leu Lys ATC CAT CAA ACC CAG CCA AAT GCA TAT AAA AGA TTA TGG CTT ATT TTG 344 Ile His Gln Thr Gln Pro Asn Ala Tyr Lys Arg Leu Trp Leu Ile Leu Gly Leu Ile Ala Asn Val Ser Leu Leu Gly Phe Phe Lys Tyr Thr Asp Phe Phe Leu Thr Asn Phe Asn Leu Ile Trp Lys Ser His Phe Glu Thr Leu His Leu Ile Leu Pro Leu Ala Ile Ser Phe Phe Thr Leu Gln Gln Ile Ala Tyr Leu Met Asp Thr Tyr Lys Gln Asn Gln Ile Met Gln Pro Lys Met Arg Glu Arg Val Ser Glu Asn Ala Pro Ile Leu Leu Asn Pro Pro Thr Ser Phe Phe Ser Leu Ser His Phe Leu Asp Tyr Ala Leu Phe Val Ser Phe Phe Pro Gln Leu Ile Ala Gly Pro Ile Val His His Ser Glu Met Met Pro Gln Phe Lys Asp Lys Asn Asn Gln Tyr Leu Asn Tyr Arg Asn Ile Ala Leu Gly Leu Phe Ile Phe Ser Ile Gly Leu Phe Lys Lys Val Val Ile Ala Asp Asn Thr Ala His Phe Ala Asp Phe Gly Phe Asp Lys Ala Thr Ser Leu Ser Phe Ile Gln Ala Trp Met Thr Ser Leu TCG GGT

SerTyr Phe GlnLeuTyr PheAspPhe Ser Tyr CysAspMet Ser Gly GGC CTC

AlaIle Ile GlyLeuPhe PheAsnIle Lys Pro IleAsnPhe Gly Leu CCC TTT

AsnSer Tyr LysAlaLeu AsnIleGln Asp Trp ArgArgTrp Pro Phe ACT TTG

HisIle Leu SerArgPhe LeuLysGlu Tyr Tyr IleProLeu Thr Leu AAT AGG

GlyGly Arg ValLysGlu LeuIleVal Tyr Asn LeuIleLeu Asn Arg TTG GGT

ValPhe Ile GlyGlyPhe TrpHisGly Ala Trp ThrPheIle Leu Gly GGG GTT

IleTrp Leu LeuHisGly IleAlaLeu Ser His ArgAlaTyr Gly Val GCC CCA

SerHis Thr ArgLysPhe HisPheThr Met Lys IleLeuAla Ala Pro ATC TGG

TrpLeu Thr PheAsnPhe IleAsnLeu Ala Val PhePheArg Ile Trp AAT AAG

AlaLys Leu GluSerAla LeuLysVal Leu Gly MetValGly Asn Lys GGT GAG

LeuAsn Val SerLeuCys HisLeuSer Lys Ala SerGluPhe Gly Glu CGT ACC

LeuAsn Val AsnAspAsn MetIleMet His Ile MetTyrAla Arg Thr ACA ATC

SerPro Phe LysMetCys ValLeuMet Ile Ile SerPheCys Thr Ile AAT CAA

LeuLys Ser SerHisLeu TyrGlnSer Asn Met AspTrpIle Asn Gln Lys Thr Thr Ser Ala Cys Leu Leu Leu Ser Ile Gly Phe Leu Phe Ile Phe Ala Ser Ser Gln Ser Val Phe Leu Tyr Phe Asn Phe (2) INFORMATION FOR SEQ ID NO:136:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 527 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:136:
Met Leu Ala Ser Ile Ile Ser Ile Leu Arg Val Phe Val Leu Leu Phe Asn Thr Pro Leu Phe Ile Phe Ala Phe Leu Pro Val Gly Phe Leu Gly Tyr Phe Ile Leu Gln Ala Tyr Ala Lys Asn Pro Leu Phe Pro Lys Leu Trp Leu Val Leu Ala Ser Leu Phe Phe Tyr Ala Phe Trp Asn Val Lys Tyr Leu Pro Leu Leu Val Gly Ser Ile Val Phe Asn Tyr Phe Val Ala Leu Lys Ile His Gln Thr Gln Pro Asn Ala Tyr Lys Arg Leu Trp Leu Ile Leu Gly Leu Ile Ala Asn Val Ser Leu Leu Gly Phe Phe Lys Tyr Thr Asp Phe Phe Leu Thr Asn Phe Asn Leu Ile Trp Lys Ser His Phe Glu Thr Leu His Leu Ile Leu Pro Leu Ala Ile Ser Phe Phe Thr Leu Gln Gln Ile Ala Tyr Leu Met Asp Thr Tyr Lys Gln Asn Gln Ile Met Gln Pro Lys Met Arg Glu Arg Val Ser Glu Asn Ala Pro Ile Leu Leu Asn Pro Pro Thr Ser Phe Phe Ser Leu Ser His Phe Leu Asp Tyr Ala Leu Phe Val Ser Phe Phe Pro Gln Leu Ile Ala Gly Pro Ile Val His His Ser Glu Met Met Pro Gln Phe Lys Asp Lys Asn Asn Gln Tyr Leu Asn Tyr Arg Asn Ile Ala Leu Gly Leu Phe Ile Phe Ser Ile Gly Leu Phe Lys Lys Val Val Ile Ala Asp Asn Thr Ala His Phe Ala Asp Phe Gly Phe Asp Lys Ala Thr Ser Leu Ser Phe Ile Gln Ala Trp Met Thr Ser Leu Ser Tyr Ser Phe Gln Leu Tyr Phe Asp Phe Ser Gly Tyr Cys Asp Met Ala Ile Gly Ile Gly Leu Phe Phe Asn Ile Lys Leu Pro Ile Asn Phe Asn Ser Pro Tyr Lys Ala Leu Asn Ile Gln Asp Phe Trp Arg Arg Trp His Ile Thr Leu Ser Arg Phe Leu Lys Glu Tyr Leu Tyr Ile Pro Leu Gly Gly Asn Arg Val Lys Glu Leu Ile Val Tyr Arg Asn Leu Ile Leu Val Phe Leu Ile Gly Gly Phe Trp His Gly Ala Gly Trp Thr Phe Ile Ile Trp Gly Leu Leu His Gly Ile Ala Leu Ser Val His Arg Ala Tyr Ser His Ala Thr Arg Lys Phe His Phe Thr Met Pro Lys Ile Leu Ala Trp Leu Ile Thr Phe Asn Phe Ile Asn Leu Ala Trp Val Phe Phe Arg Ala Lys Asn Leu Glu Ser Ala Leu Lys Val Leu Lys Gly Met Val Gly Leu Asn Gly Val Ser Leu Cys His Leu Ser Lys Glu Ala Ser Glu Phe Leu Asn Arg Val Asn Asp Asn Met Ile Met His Thr Ile Met Tyr Ala Ser Pro Thr Phe Lys Met Cys Val Leu Met Ile Ile Ile Ser Phe Cys Leu Lys Asn Ser Ser His Leu Tyr Gln Ser Asn Gln Met Asp Trp Ile Lys Thr Thr Ser Ala Cys Leu Leu Leu Ser Ile Gly Phe Leu Phe Ile Phe Ala Ser Ser Gln Ser Val Phe Leu Tyr Phe Asn Phe (2) INFORMATION FOR SEQ ID NO:137:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3973 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...3920 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:137:

Met Glu Ile Gln Gln Thr His Arg Lys Ile Asn Arg Pro Leu Val Ser Leu Ala TTA GTA GGA GCG TTA GTC AGC ATC ACA CCG CAA CAA AGT CAT GCC GCC 152 Leu Val Gly Ala Leu Val Ser Ile Thr Pro Gln Gln Ser His Ala Ala Phe Phe Thr Thr Val Ile Ile Pro Ala Ile Val Gly Gly Ile Ala Thr Gly Ala Ala Val Gly Thr Val Ser Gly Leu Leu Gly Trp Gly Leu Lys Gln Ala Glu Glu Ala Asn Lys Thr Pro Asp Lys Pro Asp Lys Val Trp Arg Ile Gln Ala Gly Lys Gly Phe Asn Glu Phe Pro Asn Lys Glu Tyr TAC ATT GGC
AGA GAT
TCC
CTA
CTA
TCT
AGT

AspLeuTyr Ser LeuSer SerLys AspGly TrpAsp Arg Leu Ile Gly GCC ACG AAA CAA

TrpGlyAsn Ala HisTyr TrpVal GlyGly TrpAsn Ala Thr Lys Gln GTG ATG GGG AAT

LysLeuGlu Asp LysAsp AlaVal ThrTyr LeuSer Val Met Gly Asn AAC ACT GAT ATG

GlyLeuArg Phe GlyGly AspLeu ValAsn GlnLys Asn Thr Asp Met Ala Thr Leu Arg Leu Gly Gln Phe Asn Gly Asn Ser Phe Thr Ser Tyr Lys Asp Ser Ala Asp Arg Thr Thr Arg Val Asp Phe Asn Ala Lys Asn Ile Leu Ile Asp Asn Phe Leu Glu Ile Asn Asn Arg Val Gly Ser Gly AGG GCC TTA TCA
AGC

AlaGly LysAla SerThrVal ThrLeu GlnAla Glu Arg Ser Leu Ser ACT AAA ATT GGC

GlyIle SerSer AsnAlaGlu SerLeu TyrAsp Ala Thr Lys Ile Gly AAT TCA AAA GTG

ThrLeu LeuAla AsnSerVal LeuMet GlyAsn Trp Asn Ser Lys Val _ ATGGGC TTGCAA GTGGGAGCG TTGGCC CCTTCA AGC 872 ' CGT TAT TAT TAC

MetGly LeuGln ValGlyAla LeuAla ProSer Ser Arg Tyr Tyr Tyr AAC AAA GAA CAT

ThrIle ThrSer ValThrGly ValAsn PheAsn Leu Asn Lys Glu His GGC AAC GCA AGT

ThrVal AspHis AlaAlaGln GlyIle IleAla Asn Gly Asn Ala Ser CAT ACA TGG CTA

LysThr IleGly LeuAspLeu GlnSer AlaGly Asn His Thr Trp Leu GCC GAA AAG GAT

IleIle ProPro GlyGlyTyr AspLys ProLys Lys Ala Glu Lys Asp AAC CAA AAC AAC

ProSer ThrThr AsnAsnAla AsnAsn GlnGln Ser Asn Gln Asn Asn AAC AAC ATT AGC

AlaGln AsnSer ThrGlnVal AsnPro ProAsn Ala Asn Asn Ile Ser ACA CAA GTC TTT

GlnLys GluIle ProThrGln IleAsp GlyPro Ala Thr Gln Val Phe AAA GTT GAT AAC

GlyGly AspThr ValAsnIle ArgIle AsnThr Ala Lys Val Asp Asn ACG GTG AAA ACC

AspGly IleLys GlyGlyTyr AlaSer LeuThr Asn Thr Val Lys Thr CAT ATC GGT AAT

AlaAla LeuHis GlyLysGly IleAsn LeuSer Gln His Ile Gly Asn Ala Ser Gly Arg Thr Leu Leu Val Glu Asn Leu Thr Gly Asn Ile Thr Val Asp Gly Pro Leu Arg Val Asn Asn Gln Val Gly Gly Tyr Ala Leu ' GCA GGA TCA AGC GCG AAT TTT GAG TTT AAG GCT GGT ACG GAT ACC AAA 1496 Ala Gly Ser Ser Ala Asn Phe Glu Phe Lys Ala Gly Thr Asp Thr Lys ~ 470 475 480 Asn Gly Thr Ala Thr Phe Asn Asn Asp Ile Ser Leu Gly Arg Phe Val Asn LeuLys ValAspAla HisThr AlaAsnPhe LysGlyIle AspThr Gly AsnGly GlyPheAsn ThrLeu AspPheSer GlyValThr GlyLys Val AsnIle AsnLysLeu IleThr AlaSerThr AsnValAla ValLys Asn PheAsn IleAsnGlu LeuVal ValLysThr AsnGlyVal SerVal Gly GluTyr ThrHisPhe SerGlu AspIleGly SerGlnSer ArgIle Asn ThrVal ArgLeuGlu ThrGly ThrArgSex IlePheSer GlyGly Val LysPhe LysSerGly GluLys LeuValIle AspGluPhe TyrTyr Ser ProTrp AsnTyrPhe AspAla ArgAsnIle LysAsnVal GluIle Thr ArgLys PheAlaSer SerThr ProGluAsn ProTrpGly ThrSer Lys LeuMet PheAsnAsn LeuThr LeuGlyGln AsnAlaVal MetAsp Tyr Ser Gln Phe Ser Asn Leu Thr Ile Gln Gly Asp Phe Ile Asn Asn Gln Gly Thr Ile Asn Tyr Leu Val Arg Gly Gly Gln Val Ala Thr Leu GCT AGT GTG
ATG AAT

AsnValGly AsnAlaAla AlaMet PhePheSer Asn Asp Ser Asn Val AAC GCT

AlaThrGly PheTyrGln ProLeu MetLysIle Ser Gln Asp Asn Ala GCG ATC

LeuIleLys AsnLysGlu HisVal LeuLeuLys Lys Ile Gly Ala Ile AGT GTT

TyrGlyAsn ValSerLeu GlyThr AsnSerIle Asn Asn Leu Ser Val AAC AAT

IleGluGln PheLysGlu ArgLeu AlaLeuTyr Asn Asn Arg Asn Asn ATT GCA

MetAspIle CysValVal ArgAsn ThrAspAsp Lys Cys Gly Ile Ala CCC AAT

ThrAlaIle GlyAsnGln SerMet ValAsnAsn Asp Tyr Lys Pro Asn ATC AAA

TyrLeuIle GlyLysAla TrpLys AsnIleGIy Ser Thr Ala Ile Lys AAT ACG

AsnGlySer LysIleSer ValTyr TyrLeuGly Ser Pro Thr Asn Thr AAC ACT

GluLysGly GlyAsnThr ThrAsn LeuProThr Thr Ser Asn Asn Thr GCT TTC

ValArgSer AlaAsnAsn AlaLeu AlaGlnAsn Pro Ala Gln Ala Phe CAG GAT

ProSerAla ThrProAsn LeuVal AlaIleAsn His Phe Gly Gln Asp Thr Ile Glu Ser Val Phe Glu Leu Ala Asn Arg Ser Lys Asp Ile Asp Thr Leu Tyr Ala Asn Ser Gly Ala Gln Gly Arg Asp Leu Leu Gln Thr Leu Leu Ile Asp Ser His Asp Ala Gly Tyr Ala Arg Gln Met Ile Asp Asn Thr Ser Thr Gly Glu Ile Thr Lys Gln Leu Asn Ala Ala Thr Thr Thr Leu Asn Asn Ile Ala Ser Leu Glu His Lys Thr Ser Ser Leu Gln Thr Leu Ser Leu Ser Asn Ala Met Ile Leu Asn Ser Arg Leu Val Asn Leu Ser Arg Arg His Thr Asn Asn Ile Asp Ser Phe Ala Gln Arg Leu Gln Ala Leu Lys Asp Gln Lys Phe Ala Ser Leu Glu Ser Ala Ala Glu GTG TTG TAT CAA TTT GCC CCT AAA TAT GAA AAA CCT ACC AAT GTT TGG 3128 Val Leu Tyr Gln Phe Ala Pro Lys Tyr Glu Lys Pro Thr Asn Val Trp Ala Asn Ala Ile Gly Gly Thr Ser Leu Asn Asn Gly Gly Asn Ala Ser Leu Tyr Gly Thr Ser Ala Gly Val Asp Ala Tyr Leu Asn Gly Glu Val Glu Ala Ile Val Gly Gly Phe Gly Ser Tyr Gly Tyr Ser Ser Phe Asn Asn Gln Ala Asn Ser Leu Asn Ser Gly Ala Asn Asn Thr Asn Phe Gly ValTyr ArgIle Ala Gln Glu AspPhe Glu Ala Ser Phe Asn His Phe GCG AGT CAA AGC AAT

GlnGly LeuGly Asp Ser Leu PheLys Ser Ala Ala Ser Gln Ser Asn CGA AAT AGC AAT TTA

LeuLeu AspLeu Gln Tyr Tyr AlaTyr Ser Ala Arg Asn Ser Asn Leu ACA AGC TTT TTT AGG
AGA AAC

Ala ThrArg Ala Tyr Gly Tyr Asp Ala Phe Asn Ala Ser Phe Phe Arg CCA AGC AAC TTA

Leu ValLeu Lys Ser Val Gly Val Tyr His Gly Ser Pro Ser Asn Leu AGC GTG TTG AAT

Thr AsnPhe Lys Asn Ser Asn Gln Ala Lys Gly Ser Ser Val Leu Asn TTA GCT GTG GCG

Ser SerGln His Phe Asn Ala Ser Asn Glu Arg Tyr Leu Ala Val Ala ACT ATG GCT GTT

Tyr TyrGly Asp Ser Tyr Phe Tyr Asn Gly Leu Gln Thr Met Ala Val TTT GCG TCT AAC

Glu PheAla Asn Gly Ser Ser Asn Val Leu Thr Phe Phe Ala Ser Asn GCA AGT CAT AGA

Lys ValAsn Ala His Asn Pro Leu Thr Ala Val Met Ala Ser His Arg TTA GAA TTT AAT

Met GlyGly Glu Lys Leu Ala Lys Val Leu Leu Gly Leu Glu Phe Asn CAC AAT GGC TTC

Phe ValTyr Leu Asn Leu Ile Ser Ile His Ala Ser His Asn Gly Phe Asn Leu Gly Met Arg Tyr Ser Phe (2) INFORMATION FOR SEQ ID NO:138:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1290 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID N0:138:
Met Glu Ile Gln Gln Thr His Arg Lys Ile Asn Arg Pro Leu Val Ser Leu Ala Leu Val Gly Ala Leu Val Ser Ile Thr Pro Gln Gln Ser His Ala Ala Phe Phe Thr Thr Val Ile Ile Pro Ala Ile Val Gly Gly Ile Ala Thr Gly Ala Ala Val Gly Thr Val Ser Gly Leu Leu Gly Trp Gly Leu Lys Gln Ala Glu Glu Ala Asn Lys Thr Pro Asp Lys Pro Asp Lys Val Trp Arg Ile Gln Ala Gly Lys Gly Phe Asn Glu Phe Pro Asn Lys Glu Tyr Asp Leu Tyr Arg Ser Leu Leu Ser Ser Lys Ile Asp Gly Gly Trp Asp Trp Gly Asn Ala Ala Thr His Tyr Trp Val Lys Gly Gly Gln Trp Asn Lys Leu Glu Val Asp Met Lys Asp Ala Val Gly Thr Tyr Asn Leu Ser Gly Leu Arg Asn Phe Thr Gly Gly Asp Leu Asp Val Asn Met Gln Lys Ala Thr Leu Arg Leu Gly Gln Phe Asn Gly Asn Ser Phe Thr Ser Tyr Lys Asp Ser Ala Asp Arg Thr Thr Arg Val Asp Phe Asn Ala 180 185 ~ 190 Lys Asn Ile Leu Ile Asp Asn Phe Leu Glu Ile Asn Asn Arg Val Gly Ser Gly Ala Gly Arg Lys Ala Ser Ser Thr Val Leu Thr Leu Gln Ala Ser Glu Gly Ile Thr Ser Ser Lys Asn Ala Glu Ile Ser Leu Tyr Asp Gly Ala Thr Leu Asn Leu Ala Ser Asn Ser Val Lys Leu Met Gly Asn Val Trp Met Gly Arg Leu Gln Tyr Val Gly Ala Tyr Leu Ala Pro Ser Tyr Ser Thr Ile Asn Thr Ser Lys Val Thr Gly Glu Val Asn Phe Asn . 
275 280 285 His Leu Thr Val Gly Asp His Asn Ala Ala Gln Ala Gly Ile Ile Ala Ser Asn Lys Thr His Ile Gly Thr Leu Asp Leu Trp Gln Ser Ala Gly Leu Asn Ile Ile Ala Pro Pro Glu Gly Gly Tyr Lys Asp Lys Pro Lys Asp Lys Pro Ser Asn Thr Thr Gln Asn Asn Ala Asn Asn Asn Gln Gln Asn Ser Ala Gln Asn Asn Ser Asn Thr Gln Val Ile Asn Pro Pro Asn Ser Ala Gln Lys Thr Glu Ile Gln Pro Thr Gln val Ile Asp Gly Pro Phe Ala Gly Gly Lys Asp Thr Val Val Asn Ile Asp Arg Ile Asn Thr Asn Ala Asp Gly Thr Ile Lys Val Gly Gly Tyr Lys Ala Ser Leu Thr Thr Asn Ala Ala His Leu His Ile Gly Lys Gly Gly Ile Asn Leu Ser Asn Gln Ala Ser Gly Arg Thr Leu Leu Val Glu Asn Leu Thr Gly Asn -Ile Thr Val Asp Gly Pro Leu Arg Val Asn Asn Gln Val Gly Gly Tyr Ala Leu Ala Gly Ser Ser Ala Asn Phe Glu Phe Lys Ala Gly Thr Asp Thr Lys Asn Gly Thr Ala Thr Phe Asn Asn Asp Ile Ser Leu Gly Arg Phe Val Asn Leu Lys Val Asp Ala His Thr Ala Asn Phe Lys Gly Ile Asp Thr Gly Asn Gly Gly Phe Asn Thr Leu Asp Phe Ser Gly Val Thr Gly Lys Val Asn Ile Asn Lys Leu Ile Thr Ala Ser Thr Asn Val Ala Val Lys Asn Phe Asn Ile Asn Glu Leu Val Val Lys Thr Asn Gly Val Ser Val Gly Glu Tyr Thr His Phe Ser Glu Asp Ile Gly Ser Gln Ser Arg Ile Asn Thr Val Arg Leu Glu Thr Gly Thr Arg Ser Ile Phe Ser Gly Gly Val Lys Phe Lys Ser Gly Glu Lys Leu Val Ile Asp Glu Phe Tyr Tyr Ser Pro Trp Asn Tyr Phe Asp Ala Arg Asn Ile Lys Asn Val Glu Ile Thr Arg Lys Phe Ala Ser Sex Thr Pro Glu Asn Pro Trp Gly Thr Ser Lys Leu Met Phe Asn Asn Leu Thr Leu Gly Gln Asn Ala Val Met Asp Tyr Ser Gln Phe Ser Asn Leu Thr Ile Gln Gly Asp Phe Ile Asn Asn Gln Gly Thr Ile Asn Tyr Leu Val Arg Gly Gly Gln Val Ala Thr Leu Asn Val Gly Asn Ala Ala Ala Met Phe Phe Ser Asn Asn Val Asp Ser Ala Thr Gly Phe Tyr Gln Pro Leu Met Lys Ile Asn Ser Ala Gln Asp Leu Ile Lys Asn Lys Glu His Val Leu Leu Lys Ala Lys Ile Ile Gly Tyr Gly Asn Val Ser Leu Gly Thr Asn Ser Ile Ser Asn Val Asn Leu Ile Glu Gln Phe Lys Glu Arg Leu Ala Leu Tyr Asn Asn Asn 755 760 . 
Asn Arg Met Asp Ile Cys Val Val Arg Asn Thr Asp Asp Ile Lys Ala Cys Gly Thr Ala Ile Gly Asn Gln Ser Met Val Asn Asn Pro Asp Asn Tyr Lys Tyr Leu Ile Gly Lys Ala Trp Lys Asn Ile Gly Ile Ser Lys Thr Ala Asn Gly Ser Lys Ile Ser Val Tyr Tyr Leu Gly Asn Ser Thr Pro Thr Glu Lys Gly Gly Asn Thr Thr Asn Leu Pro Thr Asn Thr Thr Ser Asn Val Arg Ser Ala Asn Asn Ala Leu Ala Gln Asn Ala Pro Phe Ala Gln Pro Ser Ala Thr Pro Asn Leu Val Ala Ile Asn Gln His Asp Phe Gly Thr Ile Glu Ser Val Phe Glu Leu Ala Asn Arg Ser Lys Asp Ile Asp Thr Leu Tyr Ala Asn Ser Gly Ala Gln Gly Arg Asp Leu Leu Gln Thr Leu Leu Ile Asp Ser His Asp Ala Gly Tyr Ala Arg Gln Met Ile Asp Asn Thr Ser Thr Gly Glu Ile Thr Lys Gln Leu Asn Ala Ala Thr Thr Thr Leu Asn Asn Ile Ala Ser Leu Glu His Lys Thr Ser Ser Leu Gln Thr Leu Ser Leu Ser Asn Ala Met Ile Leu Asn Ser Arg Leu Val Asn Leu Ser Arg Arg His Thr Asn Asn Ile Asp Ser Phe Ala Gln Arg Leu Gln Ala Leu Lys Asp Gln Lys Phe Ala Ser Leu Glu Ser Ala Ala Glu Val Leu Tyr Gln Phe Ala Pro Lys Tyr Glu Lys Pro Thr Asn Val Trp Ala Asn Ala Ile Gly Gly Thr Ser Leu Asn Asn Gly Gly Asn Ala Ser Leu Tyr Gly Thr Ser Ala Gly Val Asp Ala Tyr Leu Asn Gly Glu Val Glu Ala Ile Val Gly Gly Phe Gly Ser Tyr Gly Tyr Ser Ser Phe Asn Asn Gln Ala Asn Ser Leu Asn Ser Gly Ala Asn Asn Thr Asn Phe Gly Val Tyr Ser Arg Ile Phe Ala Asn Gln His Glu Phe Asp Phe Glu Ala Gln Gly Ala Leu Gly Ser Asp Gln Ser Ser Leu Asn Phe Lys Ser Ala Leu Leu Arg Asp Leu Asn Gln Ser Tyr Asn Tyr Leu Ala Tyr Ser Ala Ala Thr Arg Ala Ser Tyr Gly Tyr Asp Phe Ala Phe Phe Arg Asn Ala Leu Val Leu Lys Pro Ser Val Gly Val Ser Tyr Asn His Leu Gly Ser Thr Asn Phe Lys Ser Asn Ser Asn Gln Val Ala Leu Lys Asn Gly Ser Ser Ser Gln His Leu Phe Asn Ala Ser Ala Asn Val Glu Ala Arg Tyr Tyr Tyr Gly Asp Thr Ser Tyr Phe Tyr Met Asn Ala Gly Val Leu Gln Glu Phe Ala Asn Phe Gly Ser Ser Asn Ala Val Ser Leu Asn Thr Phe Lys Val Asn Ala Ala His Asn Pro Leu Ser Thr His Ala Arg Val Met Met Gly Gly Glu Leu Lys Leu Ala Lys Glu Val
Phe Leu Asn Leu Gly Phe Val Tyr Leu His Asn Leu Ile Ser Asn Ile Gly His Phe Ala Ser Asn Leu Gly Met Arg Tyr Ser Phe
(2) INFORMATION FOR SEQ ID NO:139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1335 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 55...1284 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:139:

Met Lys Lys Glu Val Val Val Ile Gly Gly Gly Ile Val Gly Leu Ser Cys Ala Tyr Ser Met His Lys Leu Gly His Lys Val Cys Val Ile Glu Lys Asn Asp Gly Ala Asn Gly Thr Ser Phe Gly Asn Ala Gly Leu Ile Ser Ala Phe Lys Lys Ala Pro Leu Ser Cys Pro Gly Val Val Leu Asp Thr Leu Lys Leu Met Leu Lys Asn Gln Ala Pro Leu Lys Phe His Phe Gly Leu Asn Leu Lys Leu Tyr Gln Trp Ile Leu Lys Phe Val Lys Ser Ala Asn Ala Lys Ser Thr His Arg Thr Met Ala Leu Phe Glu Arg Tyr Gly Trp Leu Ser Ile Asp Met Tyr His Gln Met Leu Lys Asp Gly Met Asp Phe Trp Tyr Lys Glu Asp Gly Leu Leu Met Ile Tyr Thr Leu Glu Glu AAA ACT AGC
AAG

Ser PheGlu LysLysLeu LysThrCys AspAsn GlyAlaTyr Lys Ser CCC

Ile LeuSer AlaLysGlu ThrLysGlu TyrMet ValValAsn Asp Pro GCG

Asn IleCys GlySerVal LeuLeuThr GluAsn HisValAsp Pro Ala CAA

Gly GluVal MetHisSer LeuGlnGlu TyrLeu AsnValGly Val Gln GAG

Glu PheLeu TyrAsnGlu GluValIle AspPhe PheLysAsn Asn Glu ATC

Leu IleGlu GlyValIle ThrHisLys GluLys GlnAlaGlu Thr Ile ATT

Ile IleLeu AlaThrGly AlaAsnPro ThrLeu LysLysThr Lys Ile AGC

Asn AspPhe LeuMetMet GlyAlaLys GlyTyr IleThrPhe Lys Ser TTA

Met ProGlu GluLeuLys ProLysThr SerSer PheAlaAsp Ile Leu AGG

Phe MetAla MetThrPro ArgArgAsp ThrVal IleThrSer Lys Arg AAA

Leu Glu Leu Asn Thr Asn Asn Ala Leu Ile Asp Lys Glu Gln Ile Ala Asn Met Lys Lys Asn Leu Ala Ala Phe Thr Gln Pro Phe Glu Met Lys GAC GCC ATA GAG TGG TGC GGT TTC AGA CCC TTA ACC CCT AAT GAT ATT 1113 _ Asp Ala Ile Glu Trp Cys Gly Phe Arg Pro Leu Thr Pro Asn Asp Ile AAA
CGC
TAT
AAA
AAC
TTA
ATC

ProTyrLeu GlyTyrAsp Arg LysAsn Leu His Ala Thr Lys Tyr Ile ATC TTT ATT

GlyLeuGly TrpLeuGly Thr GlyPro Ala Gly Lys Ile Ile Phe Ile ATC~GCCAAT TTGAGCCAA GGA AATGAA AAA GCC GAT ATT 1257 GAC GCG AAT

IleAlaAsn LeuSerGln Gly AsnGlu Lys Ala Asp Ile Asp Ala Asn TTT GAT CTTTTTTAAA
CCCTAGT

MetLeuPhe SerAlaPhe Arg Phe Asp
(2) INFORMATION FOR SEQ ID NO:140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 410 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:140:
Met Lys Lys Glu Val Val Val Ile Gly Gly Gly Ile Val Gly Leu Ser Cys Ala Tyr Ser Met His Lys Leu Gly His Lys Val Cys Val Ile Glu Lys Asn Asp Gly Ala Asn Gly Thr Ser Phe Gly Asn Ala Gly Leu Ile Ser Ala Phe Lys Lys Ala Pro Leu Ser Cys Pro Gly Val Val Leu Asp Thr Leu Lys Leu Met Leu Lys Asn Gln Ala Pro Leu Lys Phe His Phe Gly Leu Asn Leu Lys Leu Tyr Gln Trp Ile Leu Lys Phe Val Lys Ser Ala Asn Ala Lys Ser Thr His Arg Thr Met Ala Leu Phe Glu Arg Tyr Gly Trp Leu Ser Ile Asp Met Tyr His Gln Met Leu Lys Asp Gly Met Asp Phe Trp Tyr Lys Glu Asp Gly Leu Leu Met Ile Tyr Thr Leu Glu Glu Ser Phe Glu Lys Lys Leu Lys Thr Cys Asp Asn Ser Gly Ala Tyr Lys Ile Leu Ser Ala Lys Glu Thr Lys Glu Tyr Met Pro Val Val Asn Asp Asn Ile Cys Gly Ser Val Leu Leu Thr Glu Asn Ala His Val Asp Pro Gly Glu Val Met His Ser Leu Gln Glu Tyr Leu Gln Asn Val Gly Val Glu Phe Leu Tyr Asn Glu Glu Val Ile Asp Phe Glu Phe Lys Asn Asn Leu Ile Glu Gly Val Ile Thr His Lys Glu Lys Ile Gln Ala Glu Thr Ile Ile Leu Ala Thr Gly Ala Asn Pro Thr Leu Ile Lys Lys Thr Lys Asn Asp Phe Leu Met Met Gly Ala Lys Gly Tyr Ser Ile Thr Phe Lys Met Pro Glu Glu Leu Lys Pro Lys Thr Ser Ser Leu Phe Ala Asp Ile Phe Met Ala Met Thr Pro Arg Arg Asp Thr Val Arg Ile Thr Ser Lys Leu Glu Leu Asn Thr Asn Asn Ala Leu Ile Asp Lys Glu Gln Ile Ala Asn Met Lys Lys Asn Leu Ala Ala Phe Thr Gln Pro Phe Glu Met Lys Asp Ala Ile Glu Trp Cys Gly Phe Arg Pro Leu Thr Pro Asn Asp Ile Pro Tyr Leu Gly Tyr Asp Lys Arg Tyr Lys Asn Leu Ile His Ala Thr Gly Leu Gly Trp Leu Gly Ile Thr Phe Gly Pro Ala Ile Gly Lys Ile Ile Ala Asn Leu Ser Gln Asp Gly Ala Asn Glu Lys Asn Ala Asp Ile Met Leu Phe Ser Ala Phe Phe Arg Asp
(2) INFORMATION FOR SEQ ID NO:141:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1579 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...1526 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:141:

TAAGATAGTC AGGGGAAATC
AAATACATT ATG

Met Glu ATC ATA GTT CTA ATT

HisLysGlu Val Gly Asp Gly Ser Arg Lys Cys Ile Ile Val Leu Ile GCT TTT GAA ATT GGC

AlaIleVal Glu Lys Gly Leu Arg Ile Ile Thr Ala Phe Glu Ile Gly GAC AAA ATC TCA AGA

AlaHisGln Ser Glu Asn Lys Ala Ile Lys Gly Asp Lys Ile Ser Arg AGC GCT GCT AAC GTG

ArgIleAsn Leu His Ser Ala Ile Lys Glu Ile Ser Ala Ala Asn Val AAA ATG GGT AAC AGA

AsnSerAla Lys Ala Leu Ala Asp Glu Asp Asn Lys Met Gly Asn Arg CCC TTT GAA CAC GCG

AsnProMet His Gly Tyr Pro Lys Thr Lys Ile Pro Phe Glu His Ala TCT GCT ACT AGC ACC

ValSerPhe Gly Tyr Glu Ile Arg Asp Val Gly Ser Ala Thr Ser Thr GTAGCGAGC AAA AAT GTA ATT GAT GAA ATC CGC 440 .
ACC GAT GTG ACC AAT

ValAlaSer Lys Asn Val Ile Asp Glu Ile Arg Thr Asp Val Thr Asn AGT TGC AAA GGC AAA

AlaIleAsn Ala Ala Ala Leu Asp Asn Asp His Ser Cys Lys Gly Lys GCT CCC CGC ACT GAA

IleLeuHis Leu Tyr Phe Leu Asp Lys Gln Val Ala Pro Arg Thr Glu TTA ATG GGG CGC ATC

AsnAspPro Gly Ser Thr Leu Glu Val Phe His Leu Met Gly Arg Ile ACA AAA AAC GAA ATC

IleValTyr Glu Asn Ile Asn Leu Glu Lys Met Thr Lys Asn Glu Ile Ile Gln Ser Gly Val Glu Ile Glu Asn Ile Val Ile Asn Ser Tyr Ala Ala Ser Ile Ala Thr Leu Ser Asn Asp Glu Arg Glu Leu Gly Val Ala TGC GTG GAT ATG GGC GGA GAG ACA TGC AAC CTT ACG ATT TAT AGC GGC 776 Cys Val Asp Met Gly Gly Glu Thr Cys Asn Leu Thr Ile Tyr Ser Gly Asn Ser Ile Arg Tyr Asn Lys Tyr Leu Pro Val Gly Ser His His Leu Thr Thr Asp Leu Ser His Met Leu Asn Thr Pro Phe Pro Tyr Ala Glu Glu Val Lys Ile Lys Tyr Gly Asp Leu Ser Phe Glu Gly Gly Glu Glu Thr Pro Ser Gln Asn Val Gln Ile Pro Thr Thr Gly Ser Asp Gly His Glu Ser His Ile Val Pro Leu Ser Glu Ile Gln Thr Ile Met Arg Glu Arg Ala Leu Glu Thr Phe Lys Ile Ile His Arg Ser Ile Gln Asp Ser Gly Leu Glu Glu His Leu Gly Gly Gly Val Val Leu Thr Gly Gly Met Ala Leu Met Lys Gly Ile Lys Glu Leu Ala Arg Thr His Phe Thr Asn Tyr Pro Val Arg Leu Ala Ala Pro Val Glu Lys Tyr Asn Ile Met Gly Met Phe Glu Asp Leu Lys Asp Pro Arg Phe Ser Val Val Val Gly Leu Ile Leu Tyr Lys Ala Gly Gly His Thr Asn Tyr Glu Arg Asp Ser Lys ATC AGC GAT CAT

GlyVal ArgTyr HisGlu AspAspTyr Thr Arg Thr Ala Ile Ser His AGC ATC AAT

GlnSer ProThr ProHis HisSerSer Pro Thr Glu Arg Ser Ile Asn GAT AGT AAC

LeuSer LeuLys AlaPro AlaProLeu Asn Thr Ala Lys Asp Ser Asn TTT CCC AAA

AspAsp LeuPro IleLys ThrGluGln Lys Gly Phe Phe Phe Pro Lys CTT AAA ATG

SerPhe AspLys IleSer PhePhe Leu Lys
(2) INFORMATION FOR SEQ ID NO:142:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 492 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:142:
Met Glu His Lys Glu Ile Val Ile Gly Val Asp Leu Gly Ser Arg Lys Ile Cys Ala Ile Val Ala Glu Phe Lys Glu Gly Ile Leu Arg Ile Ile Gly Thr Ala His Gln Asp Ser Lys Glu Ile Asn Ser Lys Ala Ile Lys Arg Gly Arg Ile Asn Ser Leu Ala His Ala Ser Asn Ala Ile Lys Glu Val Ile Asn Ser Ala Lys Lys Met Ala Gly Leu Asn Ala Asp Glu Asp Arg Asn Asn Pro Met Pro His Phe Gly Glu Tyr His Pro Lys Thr Lys Ala Ile Val Ser Phe Ser Gly Ala Tyr Thr Glu Ser Ile Arg Asp Val Thr Gly Val Ala Ser Thr Lys Asp Asn Val Val Thr Ile Asp Glu Ile Asn Arg Ala Ile Asn Ser Ala Cys Ala Lys Ala Gly Leu Asp Asn Asp Lys His Ile Leu His Ala Leu Pro Tyr Arg Phe Thr Leu Asp Lys Gln Glu Val Asn Asp Pro Leu Gly Met Ser Gly Thr Arg Leu Glu Val Phe Ile His Ile Val Tyr Thr Glu Lys Asn Asn Ile Glu Asn Leu Glu Lys Ile Met Ile Gln Ser Gly Val Glu Ile Glu Asn Ile Val Ile Asn Ser Tyr Ala Ala Ser Ile Ala Thr Leu Ser Asn Asp Glu Arg Glu Leu Gly Val Ala Cys Val Asp Met Gly Gly Glu Thr Cys Asn Leu Thr Ile Tyr Ser Gly Asn Ser Ile Arg Tyr Asn Lys Tyr Leu Pro Val Gly Ser His His Leu Thr Thr Asp Leu Ser His Met Leu Asn Thr Pro Phe Pro Tyr Ala Glu Glu Val Lys Ile Lys Tyr Gly Asp Leu Ser Phe Glu Gly Gly Glu Glu Thr Pro Ser Gln Asn Val Gln Ile Pro Thr Thr Gly Ser Asp Gly His Glu Ser His Ile Val Pro Leu Ser Glu Ile Gln Thr Ile Met Arg Glu Arg Ala Leu Glu Thr Phe Lys Ile Ile His Arg Ser Ile Gln Asp Ser Gly Leu Glu Glu His Leu Gly Gly Gly Val Val Leu Thr Gly Gly Met Ala Leu Met Lys Gly Ile Lys Glu Leu Ala Arg Thr His Phe Thr Asn Tyr Pro Val Arg Leu Ala Ala Pro Val Glu Lys Tyr Asn Ile Met Gly Met Phe Glu Asp Leu Lys Asp Pro Arg Phe Ser Val Val Val Gly Leu Ile Leu Tyr Lys Ala Gly Gly His Thr Asn Tyr Glu Arg Asp Ser Lys Gly Val Ile Arg Tyr His Glu Ser Asp Asp Tyr Thr Arg Thr Ala His Gln Ser Ser Pro Thr Pro His Ile His Ser Ser Pro Thr Glu Arg Asn Leu Ser Asp Leu Lys Ala Pro Ser Ala Pro Leu Asn Thr Ala Lys Asn Asp Asp Phe Leu Pro Ile Lys Pro Thr Glu Gln Lys Gly Phe Phe Lys Ser Phe Leu Asp Lys Ile Ser Lys Phe Phe
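Where a row of the SEQ ID NO:141 listing survives with both its codons and its residues, the pairing can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code applies to these Helicobacter coding sequences; the names `CODON_TABLE`, `THREE_LETTER` and `translate_row` are illustrative and do not come from the application. The codon row and residue row are copied from the SEQ ID NO:141 listing above (the row ending at nucleotide 776).

```python
# Standard genetic code packed in TCAG order (first, second, third base).
# Assumption: the standard code is appropriate for these sequences.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue codes, matching the style of the listing.
# "Ter" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_row(codon_row):
    """Translate a whitespace-separated row of codons into three-letter residues."""
    return [THREE_LETTER[CODON_TABLE[codon.upper()]] for codon in codon_row.split()]


# Codon row and residue row as printed in the SEQ ID NO:141 listing.
ROW_DNA = "TGC GTG GAT ATG GGC GGA GAG ACA TGC AAC CTT ACG ATT TAT AGC GGC"
ROW_PROTEIN = "Cys Val Asp Met Gly Gly Glu Thr Cys Asn Leu Thr Ile Tyr Ser Gly"

if __name__ == "__main__":
    # Prints True when every codon agrees with the residue printed beneath it.
    print(translate_row(ROW_DNA) == ROW_PROTEIN.split())
```

A check of this kind is useful mainly for spotting OCR damage: a mistyped residue such as "5er" or "Aen" would fail the comparison against its codon.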
(2) INFORMATION FOR SEQ ID NO:143:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1987 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...1934 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:143:

ATTTTTAGCC ATG
ATTTATGACA ATA
CGAATTTAGA

Met Ile AGA GCA GGGCTTAAA CAACTC GAGCATAAA ATCGCC AAA 104 ' GTG TAT TCT

Arg Ala GlyLeuLys GlnLeu GluHisLys IleAla Lys Val Tyr Ser ATT GAA AAG

Gly Asp GlyAlaSer ProGlu GlnLeuGlu LysIle His Ile Glu Lys TTA AGG AAA

Tyr Ala HisGluIle GluGlu GluLeuGlu PheGln Ile Leu Arg Lys GCC CTT AAT

Gln Leu LeuPheLys LysGly CysIleThr ProTyr Glu Ala Leu Asn AAT GCT GAG

Leu Leu GluGlnLys AlaLys LysThrTyr PheLys Gln Asn Ala Glu TAC AAA ACT

Leu Ala LeuValLeu ProPhe LeuAspSer SerHis Phe Tyr Lys Thr CCT GCG AAA

Pro Leu AlaAsnLeu ThrPhe LeuPheAla ArgIle Asp Pro Ala Lys GAA GCG TCT

Lys Thr GlnIleIle SerTyr LeuIleLys LeuPro Phe Glu Ala Ser TTC AAA GCT

Ile Arg PheValGlu LeuGlu GlyLeuPhe ValLeu Glu Phe Lys Ala ATC GAA GAG

Glu Val GluAlaHis LeuGlu LeuPheLeu GluHis Ile Ile Glu Glu GAT ACT GCT

Leu Cys MetAlaPhe ArgVal CysAspAla AspIle Ile Asp Thr Ala Thr Glu Asp Glu Ala His Asp Tyr Ala Asp Leu Met Ser Lys Ser Leu Arg Lys Arg Asn Gln Gly Glu Ile Val Arg Leu Gln Thr Gln Lys Gly Ser Gln Glu Leu Leu Lys Thr Leu Leu Ala Ser Leu Arg Ser Phe Gln Thr His Ser Tyr Lys Lys His Lys Leu Thr Gly Met His Ile Tyr Lys Ser Ala Ile Met Leu Asn Leu Gly Asp Leu Trp Glu Leu Val Asn His Ser Asp Phe Lys Ala Leu Lys Ser Pro Asn Phe Thr Pro Lys Ile His Pro His Phe Asn Glu Asn Asp Leu Phe Lys Ser Ile Glu Lys Gln Asp Leu Leu Leu Phe His Pro Tyr Glu Ser Phe Glu Pro Val Ile Asp Leu Ile Glu Gln Ala Ala Ser Asp Pro Ala Thr Leu Ser Ile Lys Met Thr Leu Tyr Arg Val Gly Lys His Ser Pro Ile Val Lys Ala Leu Ile Glu Ala Ala Ser Lys Ile Gln Val Ser Val Leu Val Glu Leu Lys Ala Arg Phe Asp Glu Glu Ser Asn Leu His Trp Ala Lys Ala Leu Glu Arg Ala Gly Ala Leu Val Val Tyr Gly Val Phe Lys Leu Lys Val His Ala Lys Met Leu Leu Ile Thr Lys Lys Thr Asp Asn Gln Leu Arg His Phe Thr His Leu Ser Thr Gly Asn Tyr Asn Pro Leu Ser Ala Lys Val Tyr Thr Asp Val Ser Phe Phe Ser Ala Lys Asn Glu Ile Ala Asn Asp Ile Ile ACT ACT AGC
AGC
AGC

LysLeu PheHisSer LeuLeu Ser Ala ThrAsn SerAlaLeu Thr Ser AAA ATC

GluThr LeuPheMet AlaPro Gln Lys ProLys IleIleGlu Lys Ile CAC CAA

LeuIle GlnAsnGlu MetAsn Gln Glu GlyTyr IleIleLeu His Gln AGC ATC

LysAla AsnAlaLeu ValAsp Glu Ile GluTrp LeuTyrGln Ser Ile ATT CTC

AlaSer GlnLysGly ValLys Asp Ile IleArg GlyIleCys Ile Leu GGC AGC

CysLeu LysProGln ValLys Leu Glu AsnIle ArgValTyr Gly Ser GAA GCA

SerIle ValGlyLys TyrLeu His Arg IleTyr TyrPheLys Glu Ala AGC GAT

HisGlu AsnIleTyr PheSer Ala Leu MetPro ArgAsnLeu Ser Asp ATT GCC

GluArg ArgValGlu LeuLeu Pro Thr AsnPro LysIleAla Ile Ala GAA CAA

HisLys LeuLeuHis IleLeu Ile Leu LysAsp ThrLeuLys Glu Gln GGC TAC

ArgTyr GluLeuAsn SerLys Arg Ile LysVal SerAsnPro Gly Tyr GAT TTT

AsnAsp ProLeuAsn SerGln Tyr Glu LysGln AlaLeuLys Asp Phe Thr Phe
(2) INFORMATION FOR SEQ ID NO:144:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 628 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:144:
Met Ile Arg Val Ala Gly Leu Lys Gln Leu Tyr Glu His Lys Ile Ala Ser Lys Gly Ile Asp Gly Ala Ser Pro Glu Glu Gln Leu Glu Lys Ile Lys His Tyr Leu Ala His Glu Ile Glu Glu Arg Glu Leu Glu Phe Gln Lys Ile Gln Ala Leu Leu Phe Lys Lys Gly Leu Cys Ile Thr Pro Tyr Asn Glu Leu Asn Leu Glu Gln Lys Ala Lys Ala Lys Thr Tyr Phe Lys Glu Gln Leu Tyr Ala Leu Val Leu Pro Phe Lys Leu Asp Ser Ser His Thr Phe Pro Pro Leu Ala Asn Leu Thr Phe Ala Leu Phe Ala Arg Ile Lys Asp Lys Glu Thr Gln Ile Ile Ser Tyr Ala Leu Ile Lys Leu Pro Ser Phe Ile Phe Arg Phe Val Glu Leu Glu Lys Gly Leu Phe Val Leu Ala Glu Glu Ile Val Glu Ala His Leu Glu Glu Leu Phe Leu Glu His Glu Ile Leu Asp Cys Met Ala Phe Arg Val Thr Cys Asp Ala Asp Ile Ala Ile Thr Glu Asp Glu Ala His Asp Tyr Ala Asp Leu Met Ser Lys Ser Leu Arg Lys Arg Asn Gln Gly Glu Ile Val Arg Leu Gln Thr Gln Lys Gly Ser Gln Glu Leu Leu Lys Thr Leu Leu Ala Ser Leu Arg Ser
Phe Gln Thr His Ser Tyr Lys Lys His Lys Leu Thr Gly Met His Ile Tyr Lys Ser Ala Ile Met Leu Asn Leu Gly Asp Leu Trp Glu Leu Val Asn His Ser Asp Phe Lys Ala Leu Lys Ser Pro Asn Phe Thr Pro Lys Ile His Pro His Phe Asn Glu Asn Asp Leu Phe Lys Ser Ile Glu Lys Gln Asp Leu Leu Leu Phe His Pro Tyr Glu Ser Phe Glu Pro Val Ile Asp Leu Ile Glu Gln Ala Ala Ser Asp Pro Ala Thr Leu Ser Ile Lys Met Thr Leu Tyr Arg Val Gly Lys His Ser Pro Ile Val Lys Ala Leu Ile Glu Ala Ala Ser Lys Ile Gln Val Ser Val Leu Val Glu Leu Lys Ala Arg Phe Asp Glu Glu Ser Asn Leu His Trp Ala Lys Ala Leu Glu Arg Ala Gly Ala Leu Val Val Tyr Gly Val Phe Lys Leu Lys Val His Ala Lys Met Leu Leu Ile Thr Lys Lys Thr Asp Asn Gln Leu Arg His Phe Thr His Leu Ser Thr Gly Asn Tyr Asn Pro Leu Ser Ala Lys Val Tyr Thr Asp Val Ser Phe Phe Ser Ala Lys Asn Glu Ile Ala Asn Asp Ile Ile Lys Leu Phe His Ser Leu Leu Thr Ser Ser Ala Thr Asn Ser Ala Leu Glu Thr Leu Phe Met Ala Pro Lys Gln Ile Lys Pro Lys Ile Ile Glu Leu Ile Gln Asn Glu Met Asn His Gln Gln Glu Gly Tyr Ile Ile Leu Lys Ala Asn Ala Leu Val Asp Ser Glu Ile Ile Glu Trp Leu Tyr Gln Ala Ser Gln Lys Gly Val Lys Ile Asp Leu Ile Ile Arg Gly Ile Cys Cys Leu Lys Pro Gln Val Lys Gly Leu Ser Glu Asn Ile Arg Val Tyr Ser Ile Val Gly Lys Tyr Leu Glu His Ala Arg Ile Tyr Tyr Phe Lys His Glu Asn Ile Tyr Phe Ser Ser Ala Asp Leu Met Pro Arg Asn Leu Glu Arg Arg Val Glu Leu Leu Ile Pro Ala Thr Asn Pro Lys Ile Ala His Lys Leu Leu His Ile Leu Glu Ile Gln Leu Lys Asp Thr Leu Lys Arg Tyr Glu Leu Asn Ser Lys Gly Arg Tyr Ile Lys Val Ser Asn Pro Asn Asp Pro Leu Asn Ser Gln Asp Tyr Phe Glu Lys Gln Ala Leu Lys Thr Phe
(2) INFORMATION FOR SEQ ID NO:145:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 616 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...563 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:145:

Met Gly Leu Lys Asn Leu Ser Thr Leu Leu Val Phe Leu Phe Phe Cys Leu Gly Cys Val Ser Asn Phe Asn Glu Asp Thr Tyr Thr Leu Asp Leu Val Leu AAA AGC AAA ACC GAT
AAG GGT
ATC

GluLys LysIleGln AlaSerArg LysGlyGlu IleThr Gln Asn Asp GAT

ValPro IleIleThr AlaIleAla ThrHisLeu AsnAsp Val Ser Asp ACG

GlyThr TyrTyrAsp HisGluTyr PheLeuVal GluIle Phe Gln Thr TTT

AsnAsn AspTrpIle AspAspGly TyrIleSer TyrGlu Leu Gly Phe ACA

ThrLys ProIleGly SerGluPro LeuTrpVal ArgGlu Ile Lys Thr AGA

AspGlu PheAspGly IleLeuGlu ThrThrAsn ArgTrp Ser Ala Arg Phe Leu Leu Ala Phe Asn Lys Leu Asp Tyr Leu Ala Val Gln Glu Ala Lys Leu Glu Leu Asp Ala Tyr Ser Leu Gly Lys Ile Val Phe Asn Phe Ala Tyr Gln Val Pro Leu Pro Gln Phe
(2) INFORMATION FOR SEQ ID NO:146:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 171 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:146:
Met Gly Leu Lys Asn Leu Ser Thr Leu Leu Val Phe Leu Phe Phe Cys Leu Gly Cys Val Ser Asn Phe Asn Glu Asp Thr Tyr Thr Leu Asp Leu Val Leu Glu Lys Lys Ile Gln Ala Ser Arg Lys Gly Glu Ile Thr Gln Asp Asn Val Pro Ile Ile Thr Ala Ile Ala Thr His Leu Asn Asp Val Asp Ser Gly Thr Tyr Tyr Asp His Glu Tyr Phe Leu Val Glu Ile Phe Thr Gln Asn Asn Asp Trp Ile Asp Asp Gly Tyr Ile Ser Tyr Glu Leu Phe Gly Thr Lys Pro Ile Gly Ser Glu Pro Leu Trp Val Arg Glu Ile Thr Lys Asp Glu Phe Asp Gly Ile Leu Glu Thr Thr Asn Arg Trp Ser Arg Ala Phe Leu Leu Ala Phe Asn Lys Leu Asp Tyr Leu Ala Val Gln Glu Ala Lys Leu Glu Leu Asp Ala Tyr Ser Leu Gly Lys Ile Val Phe Asn Phe Ala Tyr Gln Val Pro Leu Pro Gln Phe
(2) INFORMATION FOR SEQ ID NO:147:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2341 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 966...2291 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:147:

AGGAGTGAGC
GTGAATAGTA
ATGGCAATAA
AGACAAACAA
CAGCAGAATG

GAAAATGGGG

AGCGGTAAGA

AAGTTTGCTA

AACAAGCAAT

GGTAAAAAAA

GAGAATGAAA

AAGGAAGAAA

GATTTTAAAG

AAGGAATTTG

GAAAAAATTG

TTAACAGATT

GGTGATGCAG

GGTAGAGAAT

GAAGAATTTA

ACGATCAAGG

ATG GAA
GAC TAC
GCA AGC
AGA ACC
GCT GGA
GCA CTG
GAG CGA
CTT

Met Glu Asp Tyr Ala Ser Arg Thr Ala Gly Ala Leu Glu Arg Leu Asp Lys Ile Val Glu Thr Glu Gln Lys Asn Gln Gln Thr Lys Leu Asp Thr Glu Asn Leu Lys Ile Ile Ile Glu Thr Leu Arg Ser Lys Ile Asn Gly Asn Gln Gln Lys Met Leu Asp Lys Ser Lys Glu Met Ser Arg Asn Phe Lys Leu Asp Ser Thr Lys Asn Glu Ile Asp Ala Ile Lys Asp Leu Ile Lys Lys Ala Asn Glu Gln Ile Ala Asn Tyr Asn Glu Met Ile Lys ' GAT ATT GAA AAA CAG AAA AAG AGT TGT AAG GAA CAA ACT TGG AAA TTT 1298 Asp Ile Glu Lys Gln Lys Lys Ser Cys Lys Glu Gln Thr Trp Lys Phe Leu Val Asn Glu Phe Lys Ser Asp Ile Gln Glu Tyr Asn Lys Lys Tyr Cys Gly Leu Glu Lys Gly Ile Asn Asn Leu Glu Lys Ala Ile Ser Glu CAA AAG GAA ATT AAG

Asn GluGlu Val Lys Leu Glu Asn IleLys GluLeu Glu Gln Lys Glu ACT ATA AAT ATC

Lys MetVal Ser Lys Pro Ile Val GluIle AsnThr Leu Thr Ile Asn AAA TTC TTG

Leu GlyTyr Gly Ala Asn Phe Ser AlaCys ThrGlu Asp Lys Phe Leu AAA ATT GGT

Glu PheTyr Arg Gln Arg Glu Asp GlnLeu ValGly Glu Lys Ile Gly CTG GAA ACT

Thr SerGlu Gly Val Thr Phe Ile PheLeu TyrTyr Tyr Leu Glu Thr TTA TCT GAT

His AlaLys Gly Leu Glu Glu Asn IleSer LysAsn Lys Leu Ser Asp TTA GAC TTG

Val ValIle Asp Pro Ile Ser Ser AspSer AsnIle Leu Leu Asp Leu ATA TTA ATG

Phe ValSer Val Val Lys Asp Leu LysGlu AlaMet Glu Ile Leu Met AAA AAG CTA

Glu ThrAsn Ile Gln Val Ile Ile ThrHis AsnThr Tyr Lys Lys Leu TAC ACA TTA

Phe LysGlu Ile Leu Glu Cys Asp LysArg TyrGln Gly Tyr Thr Leu TAT ATA AAT

Lys SerPhe Trp Ile Lys Lys Asp AsnVal SerLys Ile Tyr Ile Asn GAT AAT TCC

Lys TyrLys Glu Pro Ile Lys Asn TyrGlu LeuLeu Trp Asp Asn Ser GAA GCA GCT

Gln ValLys Gln Lys Glu Asn Asn SerTrp ValSer Leu Glu Ala Ala AAT AGA TAC

Gln ValMet Arg Ile Ile Glu Tyr PheArg IleLeu Gly Asn Arg Tyr Gly Phe Lys His Asn Asp Ser Leu Ser Glu Cys Phe Glu Asn Ile Glu Glu Lys Arg Val Cys Asn Ser Phe Ile Ser Trp Phe Asn Asp Gly Ser CAT GGG ATT TCA GAT GAT TTG TTT ATG CAA AGT CAA GAT ACA AGT ATT 2210 His Gly Ile Ser Asp Asp Leu Phe Met Gln Ser Gln Asp Thr Ser Ile Glu Thr Tyr Leu Lys Val Phe Glu Lys Ile Phe Lys Glu Thr Gly His Glu Ala His Tyr Lys Met Met Met Arg Met Lys
(2) INFORMATION FOR SEQ ID NO:148:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 442 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:148:
Met Glu Asp Tyr Ala Ser Arg Thr Ala Gly Ala Leu Glu Arg Leu Asp Lys Ile Val Glu Thr Glu Gln Lys Asn Gln Gln Thr Lys Leu Asp Thr Glu Asn Leu Lys Ile Ile Ile Glu Thr Leu Arg Ser Lys Ile Asn Gly Asn Gln Gln Lys Met Leu Asp Lys Ser Lys Glu Met Ser Arg Asn Phe Lys Leu Asp Ser Thr Lys Asn Glu Ile Asp Ala Ile Lys Asp Leu Ile Lys Lys Ala Asn Glu Gln Ile Ala Asn Tyr Asn Glu Met Ile Lys Asp Ile Glu Lys Gln Lys Lys Ser Cys Lys Glu Gln Thr Trp Lys Phe Leu Val Asn Glu Phe Lys Ser Asp Ile Gln Glu Tyr Asn Lys Lys Tyr Cys Gly Leu Glu Lys Gly Ile Asn Asn Leu Glu Lys Ala Ile Ser Glu Asn Gln Glu Glu Val Lys Lys Leu Glu Asn Glu Ile Lys Glu Leu Glu Lys Thr Met Val Ser Ile Lys Pro Ile Val Asn Glu Ile Asn Thr Leu Leu Lys Gly Tyr Gly Phe Ala Asn Phe Ser Leu Ala Cys Thr Glu Asp Glu Lys Phe Tyr Arg Ile Gln Arg Glu Asp Gly Gln Leu Val Gly Glu Thr Leu Ser Glu Gly Glu Val Thr Phe Ile Thr Phe Leu Tyr Tyr Tyr His Leu Ala Lys Gly Ser Leu Glu Glu Asn Asp Ile Ser Lys Asn Lys Val Leu Val Ile Asp Asp Pro Ile Ser Ser Leu Asp Ser Asn Ile Leu Phe Ile Val Ser Val Leu Val Lys Asp Leu Met Lys Glu Ala Met Glu Glu Lys Thr Asn Ile Lys Gln Val Ile Ile Leu Thr His Asn Thr Tyr Phe Tyr Lys Glu Ile Thr Leu Glu Cys Asp Leu Lys Arg Tyr Gln Gly Lys Tyr Ser Phe Trp Ile Ile Lys Lys Asp Asn Asn Val Ser Lys Ile Lys Asp Tyr Lys Glu Asn Pro Ile Lys Asn Ser Tyr Glu Leu Leu Trp Gln Glu Val Lys Gln Ala Lys Glu Asn Asn Ala Ser Trp Val Ser Leu Gln Asn Val Met Arg Arg Ile Ile Glu Tyr Tyr Phe Arg Ile Leu Gly Gly Phe Lys His Asn Asp Ser Leu Ser Glu Cys Phe Glu Asn Ile Glu Glu Lys Arg Val Cys Asn Ser Phe Ile Ser Trp Phe Asn Asp Gly Ser His Gly Ile Ser Asp Asp Leu Phe Met Gln Ser Gln Asp Thr Ser Ile Glu Thr Tyr Leu Lys Val Phe Glu Lys Ile Phe Lys Glu Thr Gly His Glu Ala His Tyr Lys Met Met Met Arg Met Lys
(2) INFORMATION FOR SEQ ID NO:149:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3793 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 51...3740 (D) OTHER INFORMATION:

JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.
THIS IS VOLUME _ OF _
NOTE: For additional volumes, please contact the Canadian Patent Office.
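The complete nucleotide/protein pairs in this part of the listing (SEQ ID NO:139 through NO:148) each state a coding-sequence LOCATION and a protein LENGTH, so the two can be cross-checked arithmetically. The sketch below is illustrative only: the tuple layout and the names `RECORDS`, `codon_count` and `check_records` are assumptions made for this example, while the numbers are copied from the FEATURE and SEQUENCE CHARACTERISTICS entries above.

```python
# (nucleotide SEQ ID NO, CDS start, CDS end, protein SEQ ID NO, stated protein length)
# Values copied from the listing; the tuple layout itself is an illustrative choice.
RECORDS = [
    (139, 55, 1284, 140, 410),
    (141, 51, 1526, 142, 492),
    (143, 51, 1934, 144, 628),
    (145, 51, 563, 146, 171),
    (147, 966, 2291, 148, 442),
]


def codon_count(start, end):
    """Return the number of codons in an inclusive, 1-based LOCATION range."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"range {start}...{end} is not a whole number of codons")
    return span // 3


def check_records(records):
    """Compare the codon count of each LOCATION with the stated protein length."""
    results = []
    for nuc_id, start, end, prot_id, stated_length in records:
        codons = codon_count(start, end)
        results.append((nuc_id, prot_id, codons, stated_length, codons == stated_length))
    return results


if __name__ == "__main__":
    for nuc_id, prot_id, codons, stated, ok in check_records(RECORDS):
        status = "consistent" if ok else "MISMATCH"
        print(f"SEQ ID NO:{nuc_id} -> NO:{prot_id}: {codons} codons, {stated} residues ({status})")
```

For these five records the codon count equals the stated protein length, which suggests that each LOCATION range ends at the last sense codon rather than at the stop codon; that reading is an inference from the numbers, not a statement made in the listing.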

Claims (34)

1. An isolated polynucleotide that encodes:
(i) a polypeptide comprising an amino acid sequence that is homologous to the amino acid sequence of a Helicobacter polypeptide selected from the group consisting of GHPO 35 (SEQ ID NO:2), GHPO 55 (SEQ ID NO:4), GHPO 78 (SEQ ID NO:6), GHPO 89 (SEQ ID NO:8), GHPO 129 (SEQ ID
NO:10), GHPO 541 (SEQ ID NO:12), GHPO 607 (SEQ ID NO:14), GHPO
635 (SEQ ID NO:16), GHPO 701 (SEQ ID NO:18), GHPO 712 (SEQ ID
NO:20), GHPO 761 (SEQ ID NO:22), GHPO 838 (SEQ ID NO:24), GHPO
1034 (SEQ ID NO:26), GHPO 1085 (SEQ ID NO:28), GHPO 1213 (SEQ ID
NO:30), GHPO 1255 (SEQ ID NO:32), GHPO 1308 (SEQ ID NO:34), GHPO
1389 (SEQ ID NO:36), GHPO 1706 (SEQ ID NO:38), GHPO 234 (SEQ ID
NO:40), GHPO 314 (SEQ ID NO:42), GHPO 510 (SEQ ID NO:44), GHPO
603 (SEQ ID NO:46), GHPO 937 (SEQ ID NO:48), GHPO 1027 (SEQ ID
NO:50), GHPO 1099 (SEQ ID NO:52), GHPO 1151 (SEQ ID NO:54), GHPO
1275 (SEQ ID NO:56), GHPO 1365 (SEQ ID NO:58), GHPO 1578 (SEQ ID
NO:60), GHPO 22 (SEQ ID NO:62), GHPO 58 (SEQ ID NO:64), GHPO 200 (SEQ ID NO:66), GHPO 558 (SEQ ID NO:68), GHPO 563 (SEQ ID NO:70), GHPO 695 (SEQ ID NO:72), GHPO 699 (SEQ ID NO:74), GHPO 702 (SEQ
ID NO:76), GHPO 709 (SEQ ID NO:78), GHPO 741 (SEQ ID NO:80), GHPO
762 (SEQ ID NO:82), GHPO 827 (SEQ ID NO:84), GHPO 852 (SEQ ID
NO:86), GHPO 1013 (SEQ ID NO:88), GHPO 1020 (SEQ ID NO:90), GHPO
1031 (SEQ ID NO:92), GHPO 1052 (SEQ ID NO:94), GHPO 1127 (SEQ ID
NO:96), GHPO 1149 (SEQ ID NO:98), GHPO 1176 (SEQ ID NO:100), GHPO
1250 (SEQ ID NO:102), GHPO 1312 (SEQ ID NO:104), GHPO 1358 (SEQ ID
NO:106), GHPO 1490 (SEQ ID NO:108), GHPO 1559 (SEQ ID NO:110), GHPO 1651 (SEQ ID NO:112), GHPO 1726 (SEQ ID NO:114), GHPO 1780 (SEQ ID NO:116), GHPO 895 (SEQ ID NO:118), GHPO 1447 (SEQ ID

NO:120), GHPO 28 (SEQ ID NO:122), GHPO 86 (SEQ ID NO:124), GHPO
155 (SEQ ID NO:126), GHPO 157 (SEQ ID NO:128), GHPO 237 (SEQ ID
NO:130), GHPO 290 (SEQ ID NO:132), GHPO 293 (SEQ ID NO:134), GHPO
335 (SEQ ID NO:136), GHPO 374 (SEQ ID NO:138), GHPO 442 (SEQ ID
NO:140), GHPO 480 (SEQ ID NO:142), GHPO 523 (SEQ ID NO:144), GHPO
610 (SEQ ID NO:146), GHPO 675 (SEQ ID NO:148), GHPO 690 (SEQ ID
NO:150), GHPO 829 (SEQ ID NO:152), GHPO 850 (SEQ ID NO:154), GHPO
876 (SEQ ID NO:156), GHPO 984 (SEQ ID NO:158), GHPO 989 (SEQ ID
NO:160), GHPO 1111 (SEQ ID NO:162), GHPO 1145 (SEQ ID NO:164), GHPO 1256 (SEQ ID NO:166), GHPO 1264 (SEQ ID NO:168), GHPO 1316 (SEQ ID NO:170), GHPO 1368 (SEQ ID NO:172), GHPO 1442 (SEQ ID
NO:174), GHPO 1506 (SEQ ID NO:176), GHPO 1543 (SEQ ID NO:178), GHPO 1574 (SEQ ID NO:180), GHPO 1627 (SEQ ID NO:182), GHPO 1657 (SEQ ID NO:184), GHPO 1664 (SEQ ID NO:186), GHPO 1694 (SEQ ID
NO:188), GHPO 1704 (SEQ ID NO:190), GHPO 1763 (SEQ ID NO:192), GHPO 616 (SEQ ID NO:194), GHPO 76 (SEQ ID NO:196), GHPO 109 (SEQ
ID NO:198), GHPO 163 (SEQ ID NO:200), GHPO 169 (SEQ ID NO:202), GHPO 208 (SEQ ID NO:204), GHPO 219 (SEQ ID NO:206), GHPO 445 (SEQ ID NO:208), GHPO 479 (SEQ ID NO:210), GHPO 525 (SEQ ID
NO:212), GHPO 535 (SEQ ID NO:214), GHPO 731 (SEQ ID NO:216), GHPO
836 (SEQ ID NO:218), GHPO 879 (SEQ ID NO:220), GHPO 881 (SEQ ID
NO:222), GHPO 886 (SEQ ID NO:224), GHPO 893 (SEQ ID NO:226), GHPO
894 (SEQ ID NO:228), GHPO 976 (SEQ ID NO:230), GHPO 1011 (SEQ ID
NO:232), GHPO 1024 (SEQ ID NO:234), GHPO 1084 (SEQ ID NO:236), GHPO 1329 (SEQ ID NO:238), GHPO 1330 (SEQ ID NO:240), GHPO 1346 (SEQ ID NO:242), GHPO 1360 (SEQ ID NO:244), GHPO 1388 (SEQ ID
NO:246), GHPO 1411 (SEQ ID NO:248), GHPO 1419 (SEQ ID NO:250), GHPO 1446 (SEQ ID NO:252), GHPO 1469 (SEQ ID NO:254), GHPO 1501 (SEQ ID NO:256), GHPO 1505 (SEQ ID NO:258), GHPO 1522 (SEQ ID
NO:260), GHPO 1525 (SEQ ID NO:262), GHPO 1615 (SEQ ID NO:264), GHPO 1689 (SEQ ID NO:266), GHPO 1733 (SEQ ID NO:268), GHPO 18 (SEQ ID NO:270), GHPO 139 (SEQ ID NO:272), GHPO 142 (SEQ ID
NO:274), GHPO 250 (SEQ ID NO:276), GHPO 257 (SEQ ID NO:278), GHPO
325 (SEQ ID NO:280), GHPO 355 (SEQ ID NO:282), GHPO 357 (SEQ ID
NO:284), GHPO 454 (SEQ ID NO:286), GHPO 475 (SEQ ID NO:288), GHPO
515 (SEQ ID NO:290), GHPO 527 (SEQ ID NO:292), GHPO 551 (SEQ ID
NO:294), GHPO 602 (SEQ ID NO:296), GHPO 626 (SEQ ID NO:298), GHPO
646 (SEQ ID NO:300), GHPO 653 (SEQ ID NO:302), GHPO 655 (SEQ ID
NO:304), GHPO 670 (SEQ ID NO:306), GHPO 739 (SEQ ID NO:308), GHPO
798 (SEQ ID NO:310), GHPO 1102 (SEQ ID NO:312), GHPO 1114 (SEQ ID
NO:314), GHPO 1152 (SEQ ID NO:316), GHPO 1272 (SEQ ID NO:318), GHPO 1345 (SEQ ID NO:320), GHPO 1377 (SEQ ID NO:322), GHPO 1424 (SEQ ID NO:324), GHPO 1430 (SEQ ID NO:326), GHPO 1502 (SEQ ID
NO:328), GHPO 1600 (SEQ ID NO:330), GHPO 1714 (SEQ ID NO:332), GHPO 359 (SEQ ID NO:334), GHPO 678 (SEQ ID NO:336), GHPO 708 (SEQ ID NO:338), GHPO 759 (SEQ ID NO:340), GHPO 847 (SEQ ID
NO:342), GHPO 1050 (SEQ ID NO:344), GHPO 1101 (SEQ ID NO:346), GHPO 1120 (SEQ ID NO:348), GHPO 1138 (SEQ ID NO:350), GHPO 1310 (SEQ ID NO:352), GHPO 1320 (SEQ ID NO:354), GHPO 1375 (SEQ ID
NO:356), GHPO 1432 (SEQ ID NO:358), GHPO 21 (SEQ ID NO:360), GHPO
282 (SEQ ID NO:362), GHPO 1089 (SEQ ID NO:364), GHPO 1141 (SEQ ID
NO:366), GHPO 1280 (SEQ ID NO:368), GHPO 1608 (SEQ ID NO:370), GHPO 15 (SEQ ID NO:372), GHPO 16 (SEQ ID NO:374), GHPO 36 (SEQ ID
NO:376), GHPO 38 (SEQ ID NO:378), GHPO 52 (SEQ ID NO:380), GHPO

57 (SEQ ID NO:382), GHPO 64 (SEQ ID NO:384), GHPO 79 (SEQ ID
NO:386), GHPO 84 (SEQ ID NO:388), GHPO 86 (SEQ ID NO:390), GHPO
99 (SEQ ID NO:392), GHPO 106 (SEQ ID NO:394), GHPO 118 (SEQ ID
NO:396), GHPO 122 (SEQ ID NO:398), GHPO 128 (SEQ ID NO:400), GHPO
138 (SEQ ID NO:402), GHPO 153 (SEQ ID NO:404), GHPO 160 (SEQ ID
NO:406), GHPO 168 (SEQ ID NO:408), GHPO 179 (SEQ ID NO:410), GHPO
189 (SEQ ID NO:412), GHPO 229 (SEQ ID NO:414), GHPO 243 (SEQ ID
NO:416), GHPO 244 (SEQ ID NO:418), GHPO 251 (SEQ ID NO:420), GHPO
267 (SEQ ID NO:422), GHPO 269 (SEQ ID NO:424), GHPO 279 (SEQ ID
NO:426), GHPO 284 (SEQ ID NO:428), GHPO 296 (SEQ ID NO:430), GHPO
300 (SEQ ID NO:432), GHPO 305 (SEQ ID NO:434), GHPO 319 (SEQ ID
NO:436), GHPO 330 (SEQ ID NO:438), GHPO 340 (SEQ ID NO:440), GHPO
342 (SEQ ID NO:442), GHPO 344 (SEQ ID NO:444), GHPO 358 (SEQ ID
NO:446), GHPO 373 (SEQ ID NO:448), GHPO 382 (SEQ ID NO:450), GHPO
384 (SEQ ID NO:452), GHPO 398 (SEQ ID NO:454), GHPO 409 (SEQ ID
NO:456), GHPO 422 (SEQ ID NO:458), GHPO 430 (SEQ ID NO:460), GHPO
446 (SEQ ID NO:462), GHPO 447 (SEQ ID NO:464), GHPO 450 (SEQ ID
NO:466), GHPO 451 (SEQ ID NO:468), GHPO 452 (SEQ ID NO:470), GHPO
456 (SEQ ID NO:472), GHPO 461 (SEQ ID NO:474), GHPO 476 (SEQ ID
NO:476), GHPO 478 (SEQ ID NO:478), GHPO 491 (SEQ ID NO:480), GHPO
511 (SEQ ID NO:482), GHPO 519 (SEQ ID NO:484), GHPO 526 (SEQ ID
NO:486), GHPO 534 (SEQ ID NO:488), GHPO 536 (SEQ ID NO:490), GHPO
542 (SEQ ID NO:492), GHPO 544 (SEQ ID NO:494), GHPO 576 (SEQ ID
NO:496), GHPO 578 (SEQ ID NO:498), GHPO 580 (SEQ ID NO:500), GHPO
585 (SEQ ID NO:502), GHPO 599 (SEQ ID NO:504), GHPO 639 (SEQ ID
NO:506), GHPO 642 (SEQ ID NO:508), GHPO 647 (SEQ ID NO:510), GHPO
654 (SEQ ID NO:512), GHPO 669 (SEQ ID NO:514), GHPO 710 (SEQ ID

NO:516), GHPO 713 (SEQ ID NO:518), GHPO 716 (SEQ ID NO:520), GHPO
718 (SEQ ID NO:522), GHPO 726 (SEQ ID NO:524), GHPO 734 (SEQ ID
NO:526), GHPO 740 (SEQ ID NO:528), GHPO 770 (SEQ ID NO:530), GHPO
782 (SEQ ID NO:532), GHPO 786 (SEQ ID NO:534), GHPO 792 (SEQ ID
NO:536), GHPO 797 (SEQ ID NO:538), GHPO 816 (SEQ ID NO:540), GHPO
828 (SEQ ID NO:542), GHPO 839 (SEQ ID NO:544), GHPO 840 (SEQ ID
NO:546), GHPO 842 (SEQ ID NO:548), GHPO 885 (SEQ ID NO:550), GHPO
889 (SEQ ID NO:552), GHPO 903 (SEQ ID NO:554), GHPO 912 (SEQ ID
NO:556), GHPO 946 (SEQ ID NO:558), GHPO 958 (SEQ ID NO:560), GHPO
968 (SEQ ID NO:562), GHPO 987 (SEQ ID NO:564), GHPO 992 (SEQ ID
NO:566), GHPO 996 (SEQ ID NO:568), GHPO 997 (SEQ ID NO:570), GHPO
1002 (SEQ ID NO:572), GHPO 1026 (SEQ ID NO:574), GHPO 1028 (SEQ ID
NO:576), GHPO 1034 (SEQ ID NO:578), GHPO 1038 (SEQ ID NO:580), GHPO 1059 (SEQ ID NO:582), GHPO 1065 (SEQ ID NO:584), GHPO 1072 (SEQ ID NO:586), GHPO 1073 (SEQ ID NO:588), GHPO 1088 (SEQ ID
NO:590), GHPO 1091 (SEQ ID NO:592), GHPO 1105 (SEQ ID NO:594), GHPO 1115 (SEQ ID NO:596), GHPO 1159 (SEQ ID NO:598), GHPO 1177 (SEQ ID NO:600), GHPO 1187 (SEQ ID NO:602), GHPO 1192 (SEQ ID
NO:604), GHPO 1195 (SEQ ID NO:606), GHPO 1224 (SEQ ID NO:608), GHPO 1225 (SEQ ID NO:610), GHPO 1228 (SEQ ID NO:612), GHPO 1229 (SEQ ID NO:614), GHPO 1231 (SEQ ID NO:616), GHPO 1236 (SEQ ID
NO:618), GHPO 1242 (SEQ ID NO:620), GHPO 1248 (SEQ ID NO:622), GHPO 1270 (SEQ ID NO:624), GHPO 1271 (SEQ ID NO:626), GHPO 1298 (SEQ ID NO:628), GHPO 1301 (SEQ ID NO:630), GHPO 1304 (SEQ ID
NO:632), GHPO 1315 (SEQ ID NO:634), GHPO 1319 (SEQ ID NO:636), GHPO 1323 (SEQ ID NO:638), GHPO 1331 (SEQ ID NO:640), GHPO 1332 (SEQ ID NO:642), GHPO 1347 (SEQ ID NO:644), GHPO 1373 (SEQ ID
NO:646), GHPO 1376 (SEQ ID NO:648), GHPO 1380 (SEQ ID NO:650), GHPO 1394 (SEQ ID NO:652), GHPO 1407 (SEQ ID NO:654), GHPO 1415 (SEQ ID NO:656), GHPO 1425 (SEQ ID NO:658), GHPO 1427 (SEQ ID
NO:660), GHPO 1444 (SEQ ID NO:662), GHPO 1449 (SEQ ID NO:664), GHPO 1465 (SEQ ID NO:666), GHPO 1475 (SEQ ID NO:668), GHPO 1479 (SEQ ID NO:670), GHPO 1483 (SEQ ID NO:672), GHPO 1488 (SEQ ID
NO:674), GHPO 1496 (SEQ ID NO:676), GHPO 1524 (SEQ ID NO:678), GHPO 1536 (SEQ ID NO:680), GHPO 1539 (SEQ ID NO:682), GHPO 1540 (SEQ ID NO:684), GHPO 1542 (SEQ ID NO:686), GHPO 1555 (SEQ ID
NO:688), GHPO 1560 (SEQ ID NO:690), GHPO 1564 (SEQ ID NO:692), GHPO 1570 (SEQ ID NO:694), GHPO 1588 (SEQ ID NO:696), GHPO 1604 (SEQ ID NO:698), GHPO 1605 (SEQ ID NO:700), GHPO 1619 (SEQ ID
NO:702), GHPO 1629 (SEQ ID NO:704), GHPO 1642 (SEQ ID NO:706), GHPO 1654 (SEQ ID NO:708), GHPO 1661 (SEQ ID NO:710), GHPO 1673 (SEQ ID NO:712), GHPO 1687 (SEQ ID NO:714), GHPO 1692 (SEQ ID
NO:716), GHPO 1693 (SEQ ID NO:718), GHPO 1699 (SEQ ID NO:720), GHPO 1738 (SEQ ID NO:722), GHPO 1745 (SEQ ID NO:724), GHPO 1746 (SEQ ID NO:726), GHPO 1754 (SEQ ID NO:728), GHPO 1792 (SEQ ID
NO:730), GHPO 1795 (SEQ ID NO:732), GHPO 1796 (SEQ ID NO:734), GHPO 7 (SEQ ID NO:736), GHPO 8 (SEQ ID NO:738), GHPO 9 (SEQ ID
NO:740), GHPO 10 (SEQ ID NO:742), GHPO 12 (SEQ ID NO:744), GHPO
25 (SEQ ID NO:746), GHPO 27 (SEQ ID NO:748), GHPO 29 (SEQ ID
NO:750), GHPO 30 (SEQ ID NO:752), GHPO 37 (SEQ ID NO:754), GHPO
49 (SEQ ID NO:756), GHPO 51 (SEQ ID NO:758), GHPO 54 (SEQ ID
NO:760), GHPO 65 (SEQ ID NO:762), GHPO 66 (SEQ ID NO:764), GHPO
68 (SEQ ID NO:766), GHPO 70 (SEQ ID NO:768), GHPO 77 (SEQ ID
NO:770), GHPO 83 (SEQ ID NO:772), GHPO 85 (SEQ ID NO:774), GHPO
87 (SEQ ID NO:776), GHPO 91 (SEQ ID NO:778), GHPO 92 (SEQ ID
NO:780), GHPO 96 (SEQ ID NO:782), GHPO 97 (SEQ ID NO:784), GHPO
111 (SEQ ID NO:786), GHPO 115 (SEQ ID NO:788), GHPO 117 (SEQ ID
NO:790), GHPO 123 (SEQ ID NO:792), GHPO 124 (SEQ ID NO:794), GHPO
126 (SEQ ID NO:796), GHPO 127 (SEQ ID NO:798), GHPO 128 (SEQ ID
NO:800), GHPO 131 (SEQ ID NO:802), GHPO 133 (SEQ ID NO:804), GHPO
140 (SEQ ID NO:806), GHPO 141 (SEQ ID NO:808), GHPO 145 (SEQ ID
NO:810), GHPO 147 (SEQ ID NO:812), GHPO 166 (SEQ ID NO:814), GHPO
181 (SEQ ID NO:816), GHPO 187 (SEQ ID NO:818), GHPO 188 (SEQ ID
NO:820), GHPO 192 (SEQ ID NO:822), GHPO 202 (SEQ ID NO:824), GHPO
204 (SEQ ID NO:826), GHPO 205 (SEQ ID NO:828), GHPO 212 (SEQ ID
NO:830), GHPO 218 (SEQ ID NO:832), GHPO 226 (SEQ ID NO:834), GHPO
231 (SEQ ID NO:836), GHPO 236 (SEQ ID NO:838), GHPO 239 (SEQ ID
NO:840), GHPO 245 (SEQ ID NO:842), GHPO 246 (SEQ ID NO:844), GHPO
248 (SEQ ID NO:846), GHPO 253 (SEQ ID NO:848), GHPO 265 (SEQ ID
NO:850), GHPO 266 (SEQ ID NO:852), GHPO 271 (SEQ ID NO:854), GHPO
272 (SEQ ID NO:856), GHPO 286 (SEQ ID NO:858), GHPO 291 (SEQ ID
NO:860), GHPO 292 (SEQ ID NO:862), GHPO 297 (SEQ ID NO:864), GHPO
304 (SEQ ID NO:866), GHPO 307 (SEQ ID NO:868), GHPO 324 (SEQ ID
NO:870), GHPO 326 (SEQ ID NO:872), GHPO 331 (SEQ ID NO:874), GHPO
343 (SEQ ID NO:876), GHPO 345 (SEQ ID NO:878), GHPO 346 (SEQ ID
NO:880), GHPO 352 (SEQ ID NO:882), GHPO 355 (SEQ ID NO:884), GHPO
363 (SEQ ID NO:886), GHPO 369 (SEQ ID NO:888), GHPO 376 (SEQ ID
NO:890), GHPO 378 (SEQ ID NO:892), GHPO 388 (SEQ ID NO:894), GHPO
396 (SEQ ID NO:896), GHPO 403 (SEQ ID NO:898), GHPO 410 (SEQ ID
NO:900), GHPO 415 (SEQ ID NO:902), GHPO 421 (SEQ ID NO:904), GHPO
439 (SEQ ID NO:906), GHPO 441 (SEQ ID NO:908), GHPO 443 (SEQ ID
NO:910), GHPO 453 (SEQ ID NO:912), GHPO 455 (SEQ ID NO:914), GHPO
464 (SEQ ID NO:916), GHPO 467 (SEQ ID NO:918), GHPO 468 (SEQ ID
NO:920), GHPO 470 (SEQ ID NO:922), GHPO 486 (SEQ ID NO:924), GHPO
487 (SEQ ID NO:926), GHPO 488 (SEQ ID NO:928), GHPO 489 (SEQ ID
NO:930), GHPO 498 (SEQ ID NO:932), GHPO 501 (SEQ ID NO:934), GHPO
504 (SEQ ID NO:936), GHPO 512 (SEQ ID NO:938), GHPO 517 (SEQ ID
NO:940), GHPO 520 (SEQ ID NO:942), GHPO 528 (SEQ ID NO:944), GHPO
530 (SEQ ID NO:946), GHPO 532 (SEQ ID NO:948), GHPO 548 (SEQ ID
NO:950), GHPO 561 (SEQ ID NO:952), GHPO 564 (SEQ ID NO:954), GHPO
572 (SEQ ID NO:956), GHPO 573 (SEQ ID NO:958), GHPO 574 (SEQ ID
NO:960), GHPO 577 (SEQ ID NO:962), GHPO 579 (SEQ ID NO:964), GHPO
583 (SEQ ID NO:966), GHPO 588 (SEQ ID NO:968), GHPO 593 (SEQ ID
NO:970), GHPO 597 (SEQ ID NO:972), GHPO 598 (SEQ ID NO:974), GHPO
604 (SEQ ID NO:976), GHPO 606 (SEQ ID NO:978), GHPO 611 (SEQ ID
NO:980), GHPO 612 (SEQ ID NO:982), GHPO 615 (SEQ ID NO:984), GHPO
632 (SEQ ID NO:986), GHPO 633 (SEQ ID NO:988), GHPO 637 (SEQ ID
NO:990), GHPO 651 (SEQ ID NO:992), GHPO 663 (SEQ ID NO:994), GHPO
686 (SEQ ID NO:996), GHPO 693 (SEQ ID NO:998), GHPO 698 (SEQ ID
NO:1000), GHPO 703 (SEQ ID NO:1002), GHPO 704 (SEQ ID NO:1004), GHPO 705 (SEQ ID NO:1006), GHPO 707 (SEQ ID NO:1008), GHPO 721 (SEQ ID NO:1010), GHPO 727 (SEQ ID NO:1012), GHPO 728 (SEQ ID
NO:1014), GHPO 733 (SEQ ID NO:1016), GHPO 758 (SEQ ID NO:1018), GHPO 763 (SEQ ID NO:1020), GHPO 771 (SEQ ID NO:1022), GHPO 774 (SEQ ID NO:1024), GHPO 776 (SEQ ID NO:1026), GHPO 783 (SEQ ID
NO:1028), GHPO 800 (SEQ ID NO:1030), GHPO 806 (SEQ ID NO:1032), GHPO 807 (SEQ ID NO:1034), GHPO 808 (SEQ ID NO:1036), GHPO 809 (SEQ ID NO:1038), GHPO 811 (SEQ ID NO:1040), GHPO 815 (SEQ ID
NO:1042), GHPO 819 (SEQ ID NO:1044), GHPO 841 (SEQ ID NO:1046), GHPO 843 (SEQ ID NO:1048), GHPO 846 (SEQ ID NO:1050), GHPO 875 (SEQ ID NO:1052), GHPO 892 (SEQ ID NO:1054), GHPO 902 (SEQ ID
NO:1056), GHPO 904 (SEQ ID NO:1058), GHPO 906 (SEQ ID NO:1060), GHPO 908 (SEQ ID NO:1062), GHPO 921 (SEQ ID NO:1064), GHPO 923 (SEQ ID NO:1066), GHPO 926 (SEQ ID NO:1068), GHPO 933 (SEQ ID
NO:1070), GHPO 939 (SEQ ID NO:1072), GHPO 940 (SEQ ID NO:1074), GHPO 943 (SEQ ID NO:1076), GHPO 951 (SEQ ID NO:1078), GHPO 961 (SEQ ID NO:1080), GHPO 965 (SEQ ID NO:1082), GHPO 990 (SEQ ID
NO:1084), GHPO 991 (SEQ ID NO:1086), GHPO 998 (SEQ ID NO:1088), GHPO 1001 (SEQ ID NO:1090), GHPO 1005 (SEQ ID NO:1092), GHPO
1033 (SEQ ID NO:1094), GHPO 1039 (SEQ ID NO:1096), GHPO 1041 (SEQ
ID NO:1098), GHPO 1043 (SEQ ID NO:1100), GHPO 1044 (SEQ ID
NO:1102), GHPO 1051 (SEQ ID NO:1104), GHPO 1058 (SEQ ID NO:1106), GHPO 1060 (SEQ ID NO:1108), GHPO 1075 (SEQ ID NO:1110), GHPO
1077 (SEQ ID NO:1112), GHPO 1082 (SEQ ID NO:1114), GHPO 1083 (SEQ
ID NO:1116), GHPO 1086 (SEQ ID NO:1118), GHPO 1087 (SEQ ID
NO:1120), GHPO 1090 (SEQ ID NO:1122), GHPO 1097 (SEQ ID NO:1124), GHPO 1098 (SEQ ID NO:1126), GHPO 1103 (SEQ ID NO:1128), GHPO
1113 (SEQ ID NO:1130), GHPO 1116 (SEQ ID NO:1132), GHPO 1123 (SEQ
ID NO:1134), GHPO 1125 (SEQ ID NO:1136), GHPO 1129 (SEQ ID
NO:1138), GHPO 1130 (SEQ ID NO:1140), GHPO 1134 (SEQ ID NO:1142), GHPO 1161 (SEQ ID NO:1144), GHPO 1166 (SEQ ID NO:1146), GHPO
1170 (SEQ ID NO:1148), GHPO 1175 (SEQ ID NO:1150), GHPO 1181 (SEQ
ID NO:1152), GHPO 1186 (SEQ ID NO:1154), GHPO 1188 (SEQ ID
NO:1156), GHPO 1191 (SEQ ID NO:1158), GHPO 1193 (SEQ ID NO:1160), GHPO 1196 (SEQ ID NO:1162), GHPO 1204 (SEQ ID NO:1164), GHPO
1210 (SEQ ID NO:1166), GHPO 1211 (SEQ ID NO:1168), GHPO 1216 (SEQ
ID NO:1170), GHPO 1218 (SEQ ID NO:1172), GHPO 1220 (SEQ ID
NO:1174), GHPO 1223 (SEQ ID NO:1176), GHPO 1226 (SEQ ID NO:1178), GHPO 1240 (SEQ ID NO:1180), GHPO 1246 (SEQ ID NO:1182), GHPO
1251 (SEQ ID NO:1184), GHPO 1252 (SEQ ID NO:1186), GHPO 1261 (SEQ
ID NO:1188), GHPO 1265 (SEQ ID NO:1190), GHPO 1267 (SEQ ID
NO:1192), GHPO 1278 (SEQ ID NO:1194), GHPO 1282 (SEQ ID NO:1196), GHPO 1283 (SEQ ID NO:1198), GHPO 1287 (SEQ ID NO:1200), GHPO
1292 (SEQ ID NO:1202), GHPO 1293 (SEQ ID NO:1204), GHPO 1302 (SEQ
ID NO:1206), GHPO 1309 (SEQ ID NO:1208), GHPO 1317 (SEQ ID
NO:1210), GHPO 1318 (SEQ ID NO:1212), GHPO 1321 (SEQ ID NO:1214), GHPO 1325 (SEQ ID NO:1216), GHPO 1341 (SEQ ID NO:1218), GHPO
1351 (SEQ ID NO:1220), GHPO 1354 (SEQ ID NO:1222), GHPO 1363 (SEQ
ID NO:1224), GHPO 1371 (SEQ ID NO:1226), GHPO 1381 (SEQ ID
NO:1228), GHPO 1401 (SEQ ID NO:1230), GHPO 1402 (SEQ ID NO:1232), GHPO 1403 (SEQ ID NO:1234), GHPO 1408 (SEQ ID NO:1236), GHPO
1416 (SEQ ID NO:1238), GHPO 1420 (SEQ ID NO:1240), GHPO 1428 (SEQ
ID NO:1242), GHPO 1437 (SEQ ID NO:1244), GHPO 1439 (SEQ ID
NO:1246), GHPO 1460 (SEQ ID NO:1248), GHPO 1463 (SEQ ID NO:1250), GHPO 1472 (SEQ ID NO:1252), GHPO 1474 (SEQ ID NO:1254), GHPO
1484 (SEQ ID NO:1256), GHPO 1489 (SEQ ID NO:1258), GHPO 1494 (SEQ
ID NO:1260), GHPO 1495 (SEQ ID NO:1262), GHPO 1498 (SEQ ID
NO:1264), GHPO 1499 (SEQ ID NO:1266), GHPO 1500 (SEQ ID NO:1268), GHPO 1503 (SEQ ID NO:1270), GHPO 1504 (SEQ ID NO:1272), GHPO
1510 (SEQ ID NO:1274), GHPO 1518 (SEQ ID NO:1276), GHPO 1533 (SEQ
ID NO:1278), GHPO 1541 (SEQ ID NO:1280), GHPO 1544 (SEQ ID
NO:1282), GHPO 1548 (SEQ ID NO:1284), GHPO 1565 (SEQ ID NO:1286), GHPO 1575 (SEQ ID NO:1288), GHPO 1582 (SEQ ID NO:1290), GHPO
1595 (SEQ ID NO:1292), GHPO 1597 (SEQ ID NO:1294), GHPO 1599 (SEQ
ID NO:1296), GHPO 1601 (SEQ ID NO:1298), GHPO 1609 (SEQ ID
NO:1300), GHPO 1613 (SEQ ID NO:1302), GHPO 1614 (SEQ ID NO:1304), GHPO 1626 (SEQ ID NO:1306), GHPO 1628 (SEQ ID NO:1308), GHPO
1639 (SEQ ID NO:1310), GHPO 1640 (SEQ ID NO:1312), GHPO 1641 (SEQ
ID NO:1314), GHPO 1646 (SEQ ID NO:1316), GHPO 1662 (SEQ ID
NO:1318), GHPO 1667 (SEQ ID NO:1320), GHPO 1668 (SEQ ID NO:1322), GHPO 1670 (SEQ ID NO:1324), GHPO 1671 (SEQ ID NO:1326), GHPO
1672 (SEQ ID NO:1328), GHPO 1678 (SEQ ID NO:1330), GHPO 1684 (SEQ
ID NO:1332), GHPO 1695 (SEQ ID NO:1334), GHPO 1697 (SEQ ID
NO:1336), GHPO 1701 (SEQ ID NO:1338), GHPO 1719 (SEQ ID NO:1340), GHPO 1723 (SEQ ID NO:1342), GHPO 1732 (SEQ ID NO:1344), GHPO
1739 (SEQ ID NO:1346), GHPO 1741 (SEQ ID NO:1348), GHPO 1747 (SEQ
ID NO:1350), GHPO 1749 (SEQ ID NO:1352), GHPO 1750 (SEQ ID
NO:1354), GHPO 1751 (SEQ ID NO:1356), GHPO 1755 (SEQ ID NO:1358), GHPO 1771 (SEQ ID NO:1360), GHPO 1786 (SEQ ID NO:1362), and GHPO
1789 (SEQ ID NO:1364); or (ii) a derivative of said Helicobacter polypeptide.
2. The isolated polynucleotide of claim 1, which encodes a mature form of said Helicobacter polypeptide.
3. The isolated polynucleotide of claim 1 or 2, wherein the polynucleotide is a DNA molecule.
4. The isolated polynucleotide of claim 1, which is a DNA molecule that can be amplified by polymerase chain reaction from a Helicobacter genome.
5. The isolated DNA molecule of claim 4, which can be amplified by the polymerase chain reaction from a Helicobacter pylori genome.
6. The isolated polynucleotide of claim 1, which is a DNA molecule that encodes the mature form or a derivative of a polypeptide encoded by the DNA molecule of claim 4.
7. The isolated polynucleotide of claim 1, which is a DNA molecule that encodes the mature form or a derivative of a polypeptide encoded by the DNA molecule of claim 5.
8. A compound, in a substantially purified form, that is the mature form or a derivative of a polypeptide comprising an amino acid sequence that is homologous to a Helicobacter polypeptide selected from the group consisting of GHPO 35 (SEQ ID NO:2), GHPO 55 (SEQ ID NO:4), GHPO 78 (SEQ ID
NO:6), GHPO 89 (SEQ ID NO:8), GHPO 129 (SEQ ID NO:10), GHPO 541 (SEQ ID NO:12), GHPO 607 (SEQ ID NO:14), GHPO 635 (SEQ ID NO:16), GHPO 701 (SEQ ID NO:18), GHPO 712 (SEQ ID NO:20), GHPO 761 (SEQ
ID NO:22), GHPO 838 (SEQ ID NO:24), GHPO 1034 (SEQ ID NO:26), GHPO 1085 (SEQ ID NO:28), GHPO 1213 (SEQ ID NO:30), GHPO 1255 (SEQ ID NO:32), GHPO 1308 (SEQ ID NO:34), GHPO 1389 (SEQ ID
NO:36), GHPO 1706 (SEQ ID NO:38), GHPO 234 (SEQ ID NO:40), GHPO
314 (SEQ ID NO:42), GHPO 510 (SEQ ID NO:44), GHPO 603 (SEQ ID
NO:46), GHPO 937 (SEQ ID NO:48), GHPO 1027 (SEQ ID NO:50), GHPO
1099 (SEQ ID NO:52), GHPO 1151 (SEQ ID NO:54), GHPO 1275 (SEQ ID
NO:56), GHPO 1365 (SEQ ID NO:58), GHPO 1578 (SEQ ID NO:60), GHPO
22 (SEQ ID NO:62), GHPO 58 (SEQ ID NO:64), GHPO 200 (SEQ ID NO:66), GHPO 558 (SEQ ID NO:68), GHPO 563 (SEQ ID NO:70), GHPO 695 (SEQ
ID NO:72), GHPO 699 (SEQ ID NO:74), GHPO 702 (SEQ ID NO:76), GHPO
709 (SEQ ID NO:78), GHPO 741 (SEQ ID NO:80), GHPO 762 (SEQ ID
NO:82), GHPO 827 (SEQ ID NO:84), GHPO 852 (SEQ ID NO:86), GHPO
1013 (SEQ ID NO:88), GHPO 1020 (SEQ ID NO:90), GHPO 1031 (SEQ ID
NO:92), GHPO 1052 (SEQ ID NO:94), GHPO 1127 (SEQ ID NO:96), GHPO
1149 (SEQ ID NO:98), GHPO 1176 (SEQ ID NO:100), GHPO 1250 (SEQ ID
NO:102), GHPO 1312 (SEQ ID NO:104), GHPO 1358 (SEQ ID NO:106), GHPO 1490 (SEQ ID NO:108), GHPO 1559 (SEQ ID NO:110), GHPO 1651 (SEQ ID NO:112), GHPO 1726 (SEQ ID NO:114), GHPO 1780 (SEQ ID
NO:116), GHPO 895 (SEQ ID NO:118), GHPO 1447 (SEQ ID NO:120), GHPO 28 (SEQ ID NO:122), GHPO 86 (SEQ ID NO:124), GHPO 155 (SEQ
ID NO:126), GHPO 157 (SEQ ID NO:128), GHPO 237 (SEQ ID NO:130), GHPO 290 (SEQ ID NO:132), GHPO 293 (SEQ ID NO:134), GHPO 335 (SEQ ID NO:136), GHPO 374 (SEQ ID NO:138), GHPO 442 (SEQ ID
NO:140), GHPO 480 (SEQ ID NO:142), GHPO 523 (SEQ ID NO:144), GHPO
610 (SEQ ID NO:146), GHPO 675 (SEQ ID NO:148), GHPO 690 (SEQ ID
NO:150), GHPO 829 (SEQ ID NO:152), GHPO 850 (SEQ ID NO:154), GHPO
876 (SEQ ID NO:156), GHPO 984 (SEQ ID NO:158), GHPO 989 (SEQ ID
NO:160), GHPO 1111 (SEQ ID NO:162), GHPO 1145 (SEQ ID NO:164), GHPO 1256 (SEQ ID NO:166), GHPO 1264 (SEQ ID NO:168), GHPO 1316 (SEQ ID NO:170), GHPO 1368 (SEQ ID NO:172), GHPO 1442 (SEQ ID
NO:174), GHPO 1506 (SEQ ID NO:176), GHPO 1543 (SEQ ID NO:178), GHPO 1574 (SEQ ID NO:180), GHPO 1627 (SEQ ID NO:182), GHPO 1657 (SEQ ID NO:184), GHPO 1664 (SEQ ID NO:186), GHPO 1694 (SEQ ID
NO:188), GHPO 1704 (SEQ ID NO:190), GHPO 1763 (SEQ ID NO:192), GHPO 616 (SEQ ID NO:194), GHPO 76 (SEQ ID NO:196), GHPO 109 (SEQ
ID NO:198), GHPO 163 (SEQ ID NO:200), GHPO 169 (SEQ ID NO:202), GHPO 208 (SEQ ID NO:204), GHPO 219 (SEQ ID NO:206), GHPO 445 (SEQ ID NO:208), GHPO 479 (SEQ ID NO:210), GHPO 525 (SEQ ID
NO:212), GHPO 535 (SEQ ID NO:214), GHPO 731 (SEQ ID NO:216), GHPO
836 (SEQ ID NO:218), GHPO 879 (SEQ ID NO:220), GHPO 881 (SEQ ID
NO:222), GHPO 886 (SEQ ID NO:224), GHPO 893 (SEQ ID NO:226), GHPO
894 (SEQ ID NO:228), GHPO 976 (SEQ ID NO:230), GHPO 1011 (SEQ ID
NO:232), GHPO 1024 (SEQ ID NO:234), GHPO 1084 (SEQ ID NO:236), GHPO 1329 (SEQ ID NO:238), GHPO 1330 (SEQ ID NO:240), GHPO 1346 (SEQ ID NO:242), GHPO 1360 (SEQ ID NO:244), GHPO 1388 (SEQ ID
NO:246), GHPO 1411 (SEQ ID NO:248), GHPO 1419 (SEQ ID NO:250), GHPO 1446 (SEQ ID NO:252), GHPO 1469 (SEQ ID NO:254), GHPO 1501 (SEQ ID NO:256), GHPO 1505 (SEQ ID NO:258), GHPO 1522 (SEQ ID
NO:260), GHPO 1525 (SEQ ID NO:262), GHPO 1615 (SEQ ID NO:264), GHPO 1689 (SEQ ID NO:266), GHPO 1733 (SEQ ID NO:268), GHPO 18 (SEQ ID NO:270), GHPO 139 (SEQ ID NO:272), GHPO 142 (SEQ ID
NO:274), GHPO 250 (SEQ ID NO:276), GHPO 257 (SEQ ID NO:278), GHPO
325 (SEQ ID NO:280), GHPO 355 (SEQ ID NO:282), GHPO 357 (SEQ ID
NO:284), GHPO 454 (SEQ ID NO:286), GHPO 475 (SEQ ID NO:288), GHPO
515 (SEQ ID NO:290), GHPO 527 (SEQ ID NO:292), GHPO 551 (SEQ ID
NO:294), GHPO 602 (SEQ ID NO:296), GHPO 626 (SEQ ID NO:298), GHPO
646 (SEQ ID NO:300), GHPO 653 (SEQ ID NO:302), GHPO 655 (SEQ ID
NO:304), GHPO 670 (SEQ ID NO:306), GHPO 739 (SEQ ID NO:308), GHPO
798 (SEQ ID NO:310), GHPO 1102 (SEQ ID NO:312), GHPO 1114 (SEQ ID
NO:314), GHPO 1152 (SEQ ID NO:316), GHPO 1272 (SEQ ID NO:318), GHPO 1345 (SEQ ID NO:320), GHPO 1377 (SEQ ID NO:322), GHPO 1424 (SEQ ID NO:324), GHPO 1430 (SEQ ID NO:326), GHPO 1502 (SEQ ID
NO:328), GHPO 1600 (SEQ ID NO:330), GHPO 1714 (SEQ ID NO:332), GHPO 359 (SEQ ID NO:334), GHPO 678 (SEQ ID NO:336), GHPO 708 (SEQ ID NO:338), GHPO 759 (SEQ ID NO:340), GHPO 847 (SEQ ID
NO:342), GHPO 1050 (SEQ ID NO:344), GHPO 1101 (SEQ ID NO:346), GHPO 1120 (SEQ ID NO:348), GHPO 1138 (SEQ ID NO:350), GHPO 1310 (SEQ ID NO:352), GHPO 1320 (SEQ ID NO:354), GHPO 1375 (SEQ ID
NO:356), GHPO 1432 (SEQ ID NO:358), GHPO 21 (SEQ ID NO:360), GHPO
282 (SEQ ID NO:362), GHPO 1089 (SEQ ID NO:364), GHPO 1141 (SEQ ID
NO:366), GHPO 1280 (SEQ ID NO:368), GHPO 1608 (SEQ ID NO:370), GHPO 15 (SEQ ID NO:372), GHPO 16 (SEQ ID NO:374), GHPO 36 (SEQ ID
NO:376), GHPO 38 (SEQ ID NO:378), GHPO 52 (SEQ ID NO:380), GHPO
57 (SEQ ID NO:382), GHPO 64 (SEQ ID NO:384), GHPO 79 (SEQ ID
NO:386), GHPO 84 (SEQ ID NO:388), GHPO 86 (SEQ ID NO:390), GHPO
99 (SEQ ID NO:392), GHPO 106 (SEQ ID NO:394), GHPO 118 (SEQ ID
NO:396), GHPO 122 (SEQ ID NO:398), GHPO 128 (SEQ ID NO:400), GHPO
138 (SEQ ID NO:402), GHPO 153 (SEQ ID NO:404), GHPO 160 (SEQ ID
NO:406), GHPO 168 (SEQ ID NO:408), GHPO 179 (SEQ ID NO:410), GHPO
189 (SEQ ID NO:412), GHPO 229 (SEQ ID NO:414), GHPO 243 (SEQ ID
NO:416), GHPO 244 (SEQ ID NO:418), GHPO 251 (SEQ ID NO:420), GHPO
267 (SEQ ID NO:422), GHPO 269 (SEQ ID NO:424), GHPO 279 (SEQ ID
NO:426), GHPO 284 (SEQ ID NO:428), GHPO 296 (SEQ ID NO:430), GHPO
300 (SEQ ID NO:432), GHPO 305 (SEQ ID NO:434), GHPO 319 (SEQ ID
NO:436), GHPO 330 (SEQ ID NO:438), GHPO 340 (SEQ ID NO:440), GHPO
342 (SEQ ID NO:442), GHPO 344 (SEQ ID NO:444), GHPO 358 (SEQ ID
NO:446), GHPO 373 (SEQ ID NO:448), GHPO 382 (SEQ ID NO:450), GHPO
384 (SEQ ID NO:452), GHPO 398 (SEQ ID NO:454), GHPO 409 (SEQ ID
NO:456), GHPO 422 (SEQ ID NO:458), GHPO 430 (SEQ ID NO:460), GHPO
446 (SEQ ID NO:462), GHPO 447 (SEQ ID NO:464), GHPO 450 (SEQ ID
NO:466), GHPO 451 (SEQ ID NO:468), GHPO 452 (SEQ ID NO:470), GHPO
456 (SEQ ID NO:472), GHPO 461 (SEQ ID NO:474), GHPO 476 (SEQ ID
NO:476), GHPO 478 (SEQ ID NO:478), GHPO 491 (SEQ ID NO:480), GHPO
511 (SEQ ID NO:482), GHPO 519 (SEQ ID NO:484), GHPO 526 (SEQ ID
NO:486), GHPO 534 (SEQ ID NO:488), GHPO 536 (SEQ ID NO:490), GHPO
542 (SEQ ID NO:492), GHPO 544 (SEQ ID NO:494), GHPO 576 (SEQ ID
NO:496), GHPO 578 (SEQ ID NO:498), GHPO 580 (SEQ ID NO:500), GHPO
585 (SEQ ID NO:502), GHPO 599 (SEQ ID NO:504), GHPO 639 (SEQ ID
NO:506), GHPO 642 (SEQ ID NO:508), GHPO 647 (SEQ ID NO:510), GHPO
654 (SEQ ID NO:512), GHPO 669 (SEQ ID NO:514), GHPO 710 (SEQ ID
NO:516), GHPO 713 (SEQ ID NO:518), GHPO 716 (SEQ ID NO:520), GHPO
718 (SEQ ID NO:522), GHPO 726 (SEQ ID NO:524), GHPO 734 (SEQ ID
NO:526), GHPO 740 (SEQ ID NO:528), GHPO 770 (SEQ ID NO:530), GHPO
782 (SEQ ID NO:532), GHPO 786 (SEQ ID NO:534), GHPO 792 (SEQ ID
NO:536), GHPO 797 (SEQ ID NO:538), GHPO 816 (SEQ ID NO:540), GHPO
828 (SEQ ID NO:542), GHPO 839 (SEQ ID NO:544), GHPO 840 (SEQ ID
NO:546), GHPO 842 (SEQ ID NO:548), GHPO 885 (SEQ ID NO:550), GHPO
889 (SEQ ID NO:552), GHPO 903 (SEQ ID NO:554), GHPO 912 (SEQ ID
NO:556), GHPO 946 (SEQ ID NO:558), GHPO 958 (SEQ ID NO:560), GHPO
968 (SEQ ID NO:562), GHPO 987 (SEQ ID NO:564), GHPO 992 (SEQ ID
NO:566), GHPO 996 (SEQ ID NO:568), GHPO 997 (SEQ ID NO:570), GHPO
1002 (SEQ ID NO:572), GHPO 1026 (SEQ ID NO:574), GHPO 1028 (SEQ ID
NO:576), GHPO 1034 (SEQ ID NO:578), GHPO 1038 (SEQ ID NO:580), GHPO 1059 (SEQ ID NO:582), GHPO 1065 (SEQ ID NO:584), GHPO 1072 (SEQ ID NO:586), GHPO 1073 (SEQ ID NO:588), GHPO 1088 (SEQ ID
NO:590), GHPO 1091 (SEQ ID NO:592), GHPO 1105 (SEQ ID NO:594), GHPO 1115 (SEQ ID NO:596), GHPO 1159 (SEQ ID NO:598), GHPO 1177 (SEQ ID NO:600), GHPO 1187 (SEQ ID NO:602), GHPO 1192 (SEQ ID
NO:604), GHPO 1195 (SEQ ID NO:606), GHPO 1224 (SEQ ID NO:608), GHPO 1225 (SEQ ID NO:610), GHPO 1228 (SEQ ID NO:612), GHPO 1229 (SEQ ID NO:614), GHPO 1231 (SEQ ID NO:616), GHPO 1236 (SEQ ID
NO:618), GHPO 1242 (SEQ ID NO:620), GHPO 1248 (SEQ ID NO:622), GHPO 1270 (SEQ ID NO:624), GHPO 1271 (SEQ ID NO:626), GHPO 1298 (SEQ ID NO:628), GHPO 1301 (SEQ ID NO:630), GHPO 1304 (SEQ ID
NO:632), GHPO 1315 (SEQ ID NO:634), GHPO 1319 (SEQ ID NO:636), GHPO 1323 (SEQ ID NO:638), GHPO 1331 (SEQ ID NO:640), GHPO 1332 (SEQ ID NO:642), GHPO 1347 (SEQ ID NO:644), GHPO 1373 (SEQ ID
NO:646), GHPO 1376 (SEQ ID NO:648), GHPO 1380 (SEQ ID NO:650), GHPO 1394 (SEQ ID NO:652), GHPO 1407 (SEQ ID NO:654), GHPO 1415 (SEQ ID NO:656), GHPO 1425 (SEQ ID NO:658), GHPO 1427 (SEQ ID
NO:660), GHPO 1444 (SEQ ID NO:662), GHPO 1449 (SEQ ID NO:664), GHPO 1465 (SEQ ID NO:666), GHPO 1475 (SEQ ID NO:668), GHPO 1479 (SEQ ID NO:670), GHPO 1483 (SEQ ID NO:672), GHPO 1488 (SEQ ID
NO:674), GHPO 1496 (SEQ ID NO:676), GHPO 1524 (SEQ ID NO:678), GHPO 1536 (SEQ ID NO:680), GHPO 1539 (SEQ ID NO:682), GHPO 1540 (SEQ ID NO:684), GHPO 1542 (SEQ ID NO:686), GHPO 1555 (SEQ ID
NO:688), GHPO 1560 (SEQ ID NO:690), GHPO 1564 (SEQ ID NO:692), GHPO 1570 (SEQ ID NO:694), GHPO 1588 (SEQ ID NO:696), GHPO 1604 (SEQ ID NO:698), GHPO 1605 (SEQ ID NO:700), GHPO 1619 (SEQ ID
NO:702), GHPO 1629 (SEQ ID NO:704), GHPO 1642 (SEQ ID NO:706), GHPO 1654 (SEQ ID NO:708), GHPO 1661 (SEQ ID NO:710), GHPO 1673 (SEQ ID NO:712), GHPO 1687 (SEQ ID NO:714), GHPO 1692 (SEQ ID
NO:716), GHPO 1693 (SEQ ID NO:718), GHPO 1699 (SEQ ID NO:720), GHPO 1738 (SEQ ID NO:722), GHPO 1745 (SEQ ID NO:724), GHPO 1746 (SEQ ID NO:726), GHPO 1754 (SEQ ID NO:728), GHPO 1792 (SEQ ID
NO:730), GHPO 1795 (SEQ ID NO:732), GHPO 1796 (SEQ ID NO:734), GHPO 7 (SEQ ID NO:736), GHPO 8 (SEQ ID NO:738), GHPO 9 (SEQ ID
NO:740), GHPO 10 (SEQ ID NO:742), GHPO 12 (SEQ ID NO:744), GHPO
25 (SEQ ID NO:746), GHPO 27 (SEQ ID NO:748), GHPO 29 (SEQ ID
NO:750), GHPO 30 (SEQ ID NO:752), GHPO 37 (SEQ ID NO:754), GHPO
49 (SEQ ID NO:756), GHPO 51 (SEQ ID NO:758), GHPO 54 (SEQ ID
NO:760), GHPO 65 (SEQ ID NO:762), GHPO 66 (SEQ ID NO:764), GHPO
68 (SEQ ID NO:766), GHPO 70 (SEQ ID NO:768), GHPO 77 (SEQ ID
NO:770), GHPO 83 (SEQ ID NO:772), GHPO 85 (SEQ ID NO:774), GHPO
87 (SEQ ID NO:776), GHPO 91 (SEQ ID NO:778), GHPO 92 (SEQ ID
NO:780), GHPO 96 (SEQ ID NO:782), GHPO 97 (SEQ ID NO:784), GHPO
111 (SEQ ID NO:786), GHPO 115 (SEQ ID NO:788), GHPO 117 (SEQ ID
NO:790), GHPO 123 (SEQ ID NO:792), GHPO 124 (SEQ ID NO:794), GHPO
126 (SEQ ID NO:796), GHPO 127 (SEQ ID NO:798), GHPO 128 (SEQ ID
NO:800), GHPO 131 (SEQ ID NO:802), GHPO 133 (SEQ ID NO:804), GHPO
140 (SEQ ID NO:806), GHPO 141 (SEQ ID NO:808), GHPO 145 (SEQ ID
NO:810), GHPO 147 (SEQ ID NO:812), GHPO 166 (SEQ ID NO:814), GHPO
181 (SEQ ID NO:816), GHPO 187 (SEQ ID NO:818), GHPO 188 (SEQ ID
NO:820), GHPO 192 (SEQ ID NO:822), GHPO 202 (SEQ ID NO:824), GHPO
204 (SEQ ID NO:826), GHPO 205 (SEQ ID NO:828), GHPO 212 (SEQ ID
NO:830), GHPO 218 (SEQ ID NO:832), GHPO 226 (SEQ ID NO:834), GHPO
231 (SEQ ID NO:836), GHPO 236 (SEQ ID NO:838), GHPO 239 (SEQ ID
NO:840), GHPO 245 (SEQ ID NO:842), GHPO 246 (SEQ ID NO:844), GHPO
248 (SEQ ID NO:846), GHPO 253 (SEQ ID NO:848), GHPO 265 (SEQ ID
NO:850), GHPO 266 (SEQ ID NO:852), GHPO 271 (SEQ ID NO:854), GHPO
272 (SEQ ID NO:856), GHPO 286 (SEQ ID NO:858), GHPO 291 (SEQ ID
NO:860), GHPO 292 (SEQ ID NO:862), GHPO 297 (SEQ ID NO:864), GHPO
304 (SEQ ID NO:866), GHPO 307 (SEQ ID NO:868), GHPO 324 (SEQ ID
NO:870), GHPO 326 (SEQ ID NO:872), GHPO 331 (SEQ ID NO:874), GHPO
343 (SEQ ID NO:876), GHPO 345 (SEQ ID NO:878), GHPO 346 (SEQ ID
NO:880), GHPO 352 (SEQ ID NO:882), GHPO 355 (SEQ ID NO:884), GHPO
363 (SEQ ID NO:886), GHPO 369 (SEQ ID NO:888), GHPO 376 (SEQ ID
NO:890), GHPO 378 (SEQ ID NO:892), GHPO 388 (SEQ ID NO:894), GHPO
396 (SEQ ID NO:896), GHPO 403 (SEQ ID NO:898), GHPO 410 (SEQ ID
NO:900), GHPO 415 (SEQ ID NO:902), GHPO 421 (SEQ ID NO:904), GHPO
439 (SEQ ID NO:906), GHPO 441 (SEQ ID NO:908), GHPO 443 (SEQ ID
NO:910), GHPO 453 (SEQ ID NO:912), GHPO 455 (SEQ ID NO:914), GHPO
464 (SEQ ID NO:916), GHPO 467 (SEQ ID NO:918), GHPO 468 (SEQ ID
NO:920), GHPO 470 (SEQ ID NO:922), GHPO 486 (SEQ ID NO:924), GHPO
487 (SEQ ID NO:926), GHPO 488 (SEQ ID NO:928), GHPO 489 (SEQ ID
NO:930), GHPO 498 (SEQ ID NO:932), GHPO 501 (SEQ ID NO:934), GHPO
504 (SEQ ID NO:936), GHPO 512 (SEQ ID NO:938), GHPO 517 (SEQ ID
NO:940), GHPO 520 (SEQ ID NO:942), GHPO 528 (SEQ ID NO:944), GHPO
530 (SEQ ID NO:946), GHPO 532 (SEQ ID NO:948), GHPO 548 (SEQ ID
NO:950), GHPO 561 (SEQ ID NO:952), GHPO 564 (SEQ ID NO:954), GHPO
572 (SEQ ID NO:956), GHPO 573 (SEQ ID NO:958), GHPO 574 (SEQ ID
NO:960), GHPO 577 (SEQ ID NO:962), GHPO 579 (SEQ ID NO:964), GHPO
583 (SEQ ID NO:966), GHPO 588 (SEQ ID NO:968), GHPO 593 (SEQ ID
NO:970), GHPO 597 (SEQ ID NO:972), GHPO 598 (SEQ ID NO:974), GHPO
604 (SEQ ID NO:976), GHPO 606 (SEQ ID NO:978), GHPO 611 (SEQ ID
NO:980), GHPO 612 (SEQ ID NO:982), GHPO 615 (SEQ ID NO:984), GHPO
632 (SEQ ID NO:986), GHPO 633 (SEQ ID NO:988), GHPO 637 (SEQ ID
NO:990), GHPO 651 (SEQ ID NO:992), GHPO 663 (SEQ ID NO:994), GHPO
686 (SEQ ID NO:996), GHPO 693 (SEQ ID NO:998), GHPO 698 (SEQ ID
NO:1000), GHPO 703 (SEQ ID NO:1002), GHPO 704 (SEQ ID NO:1004), GHPO 705 (SEQ ID NO:1006), GHPO 707 (SEQ ID NO:1008), GHPO 721 (SEQ ID NO:1010), GHPO 727 (SEQ ID NO:1012), GHPO 728 (SEQ ID
NO:1014), GHPO 733 (SEQ ID NO:1016), GHPO 758 (SEQ ID NO:1018), GHPO 763 (SEQ ID NO:1020), GHPO 771 (SEQ ID NO:1022), GHPO 774 (SEQ ID NO:1024), GHPO 776 (SEQ ID NO:1026), GHPO 783 (SEQ ID
NO:1028), GHPO 800 (SEQ ID NO:1030), GHPO 806 (SEQ ID NO:1032), GHPO 807 (SEQ ID NO:1034), GHPO 808 (SEQ ID NO:1036), GHPO 809 (SEQ ID NO:1038), GHPO 811 (SEQ ID NO:1040), GHPO 815 (SEQ ID
NO:1042), GHPO 819 (SEQ ID NO:1044), GHPO 841 (SEQ ID NO:1046), GHPO 843 (SEQ ID NO:1048), GHPO 846 (SEQ ID NO:1050), GHPO 875 (SEQ ID NO:1052), GHPO 892 (SEQ ID NO:1054), GHPO 902 (SEQ ID
NO:1056), GHPO 904 (SEQ ID NO:1058), GHPO 906 (SEQ ID NO:1060), GHPO 908 (SEQ ID NO:1062), GHPO 921 (SEQ ID NO:1064), GHPO 923 (SEQ ID NO:1066), GHPO 926 (SEQ ID NO:1068), GHPO 933 (SEQ ID
NO:1070), GHPO 939 (SEQ ID NO:1072), GHPO 940 (SEQ ID NO:1074), GHPO 943 (SEQ ID NO:1076), GHPO 951 (SEQ ID NO:1078), GHPO 961 (SEQ ID NO:1080), GHPO 965 (SEQ ID NO:1082), GHPO 990 (SEQ ID
NO:1084), GHPO 991 (SEQ ID NO:1086), GHPO 998 (SEQ ID NO:1088), GHPO 1001 (SEQ ID NO:1090), GHPO 1005 (SEQ ID NO:1092), GHPO
1033 (SEQ ID NO:1094), GHPO 1039 (SEQ ID NO:1096), GHPO 1041 (SEQ
ID NO:1098), GHPO 1043 (SEQ ID NO:1100), GHPO 1044 (SEQ ID
NO:1102), GHPO 1051 (SEQ ID NO:1104), GHPO 1058 (SEQ ID NO:1106), GHPO 1060 (SEQ ID NO:1108), GHPO 1075 (SEQ ID NO:1110), GHPO
1077 (SEQ ID NO:1112), GHPO 1082 (SEQ ID NO:1114), GHPO 1083 (SEQ
ID NO:1116), GHPO 1086 (SEQ ID NO:1118), GHPO 1087 (SEQ ID
NO:1120), GHPO 1090 (SEQ ID NO:1122), GHPO 1097 (SEQ ID NO:1124), GHPO 1098 (SEQ ID NO:1126), GHPO 1103 (SEQ ID NO:1128), GHPO
1113 (SEQ ID NO:1130), GHPO 1116 (SEQ ID NO:1132), GHPO 1123 (SEQ
ID NO:1134), GHPO 1125 (SEQ ID NO:1136), GHPO 1129 (SEQ ID
NO:1138), GHPO 1130 (SEQ ID NO:1140), GHPO 1134 (SEQ ID NO:1142), GHPO 1161 (SEQ ID NO:1144), GHPO 1166 (SEQ ID NO:1146), GHPO
1170 (SEQ ID NO:1148), GHPO 1175 (SEQ ID NO:1150), GHPO 1181 (SEQ
ID NO:1152), GHPO 1186 (SEQ ID NO:1154), GHPO 1188 (SEQ ID
NO:1156), GHPO 1191 (SEQ ID NO:1158), GHPO 1193 (SEQ ID NO:1160), GHPO 1196 (SEQ ID NO:1162), GHPO 1204 (SEQ ID NO:1164), GHPO
1210 (SEQ ID NO:1166), GHPO 1211 (SEQ ID NO:1168), GHPO 1216 (SEQ
ID NO:1170), GHPO 1218 (SEQ ID NO:1172), GHPO 1220 (SEQ ID
NO:1174), GHPO 1223 (SEQ ID NO:1176), GHPO 1226 (SEQ ID NO:1178), GHPO 1240 (SEQ ID NO:1180), GHPO 1246 (SEQ ID NO:1182), GHPO
1251 (SEQ ID NO:1184), GHPO 1252 (SEQ ID NO:1186), GHPO 1261 (SEQ
ID NO:1188), GHPO 1265 (SEQ ID NO:1190), GHPO 1267 (SEQ ID
NO:1192), GHPO 1278 (SEQ ID NO:1194), GHPO 1282 (SEQ ID NO:1196), GHPO 1283 (SEQ ID NO:1198), GHPO 1287 (SEQ ID NO:1200), GHPO
1292 (SEQ ID NO:1202), GHPO 1293 (SEQ ID NO:1204), GHPO 1302 (SEQ
ID NO:1206), GHPO 1309 (SEQ ID NO:1208), GHPO 1317 (SEQ ID
NO:1210), GHPO 1318 (SEQ ID NO:1212), GHPO 1321 (SEQ ID NO:1214), GHPO 1325 (SEQ ID NO:1216), GHPO 1341 (SEQ ID NO:1218), GHPO
1351 (SEQ ID NO:1220), GHPO 1354 (SEQ ID NO:1222), GHPO 1363 (SEQ
ID NO:1224), GHPO 1371 (SEQ ID NO:1226), GHPO 1381 (SEQ ID
NO:1228), GHPO 1401 (SEQ ID NO:1230), GHPO 1402 (SEQ ID NO:1232), GHPO 1403 (SEQ ID NO:1234), GHPO 1408 (SEQ ID NO:1236), GHPO
1416 (SEQ ID NO:1238), GHPO 1420 (SEQ ID NO:1240), GHPO 1428 (SEQ
ID NO:1242), GHPO 1437 (SEQ ID NO:1244), GHPO 1439 (SEQ ID
NO:1246), GHPO 1460 (SEQ ID NO:1248), GHPO 1463 (SEQ ID NO:1250), GHPO 1472 (SEQ ID NO:1252), GHPO 1474 (SEQ ID NO:1254), GHPO
1484 (SEQ ID NO:1256), GHPO 1489 (SEQ ID NO:1258), GHPO 1494 (SEQ
ID NO:1260), GHPO 1495 (SEQ ID NO:1262), GHPO 1498 (SEQ ID
NO:1264), GHPO 1499 (SEQ ID NO:1266), GHPO 1500 (SEQ ID NO:1268), GHPO 1503 (SEQ ID NO:1270), GHPO 1504 (SEQ ID NO:1272), GHPO
1510 (SEQ ID NO:1274), GHPO 1518 (SEQ ID NO:1276), GHPO 1533 (SEQ
ID NO:1278), GHPO 1541 (SEQ ID NO:1280), GHPO 1544 (SEQ ID
NO:1282), GHPO 1548 (SEQ ID NO:1284), GHPO 1565 (SEQ ID NO:1286), GHPO 1575 (SEQ ID NO:1288), GHPO 1582 (SEQ ID NO:1290), GHPO
1595 (SEQ ID NO:1292), GHPO 1597 (SEQ ID NO:1294), GHPO 1599 (SEQ
ID NO:1296), GHPO 1601 (SEQ ID NO:1298), GHPO 1609 (SEQ ID
NO:1300), GHPO 1613 (SEQ ID NO:1302), GHPO 1614 (SEQ ID NO:1304), GHPO 1626 (SEQ ID NO:1306), GHPO 1628 (SEQ ID NO:1308), GHPO
1639 (SEQ ID NO:1310), GHPO 1640 (SEQ ID NO:1312), GHPO 1641 (SEQ
ID NO:1314), GHPO 1646 (SEQ ID NO:1316), GHPO 1662 (SEQ ID
NO:1318), GHPO 1667 (SEQ ID NO:1320), GHPO 1668 (SEQ ID NO:1322), GHPO 1670 (SEQ ID NO:1324), GHPO 1671 (SEQ ID NO:1326), GHPO
1672 (SEQ ID NO:1328), GHPO 1678 (SEQ ID NO:1330), GHPO 1684 (SEQ
ID NO:1332), GHPO 1695 (SEQ ID NO:1334), GHPO 1697 (SEQ ID
NO:1336), GHPO 1701 (SEQ ID NO:1338), GHPO 1719 (SEQ ID NO:1340), GHPO 1723 (SEQ ID NO:1342), GHPO 1732 (SEQ ID NO:1344), GHPO
1739 (SEQ ID NO:1346), GHPO 1741 (SEQ ID NO:1348), GHPO 1747 (SEQ
ID NO:1350), GHPO 1749 (SEQ ID NO:1352), GHPO 1750 (SEQ ID
NO:1354), GHPO 1751 (SEQ ID NO:1356), GHPO 1755 (SEQ ID NO:1358), GHPO 1771 (SEQ ID NO:1360), GHPO 1786 (SEQ ID NO:1362), and GHPO
1789 (SEQ ID NO:1364); or (ii) a derivative of said Helicobacter polypeptide.
9. The compound of claim 8, which is the mature form or a derivative of a polypeptide encoded by a DNA molecule of claim 4.
10. The compound of claim 8, which is the mature form or a derivative of a polypeptide encoded by a DNA molecule of claim 5.
11. A pharmaceutical composition for preventing or treating Helicobacter infection in a mammal, said composition comprising a prophylactically or therapeutically effective amount of a compound of claim 8, 9, or 10 admixed with a physiologically acceptable diluent or carrier.
12. The composition of claim 11, further comprising an antibiotic, an antisecretory agent, a bismuth salt, or a combination thereof.
13. The composition of claim 12, wherein said antibiotic is selected from the group consisting of amoxicillin, clarithromycin, tetracycline, metronidazole, and erythromycin.
14. The composition of claim 12, wherein said bismuth salt is selected from the group consisting of bismuth subcitrate and bismuth subsalicylate.
15. The composition of claim 12, wherein said antisecretory agent is a proton pump inhibitor.
16. The composition of claim 15, wherein said proton pump inhibitor is selected from the group consisting of omeprazole, lansoprazole, and pantoprazole.
17. The composition of claim 12, wherein said antisecretory agent is an H2-receptor antagonist.
18. The composition of claim 17, wherein said H2-receptor antagonist is selected from the group consisting of ranitidine, cimetidine, famotidine, nizatidine, and roxatidine.
19. The composition of claim 12, wherein said antisecretory agent is a prostaglandin analog.
20. The composition of claim 19, wherein said prostaglandin analog is misoprostol or enprostil.
21. The composition of claim 11, further comprising a prophylactically or therapeutically effective amount of a second Helicobacter polypeptide or a derivative thereof.
22. The composition of claim 21, wherein the second Helicobacter polypeptide is a Helicobacter urease, or a subunit or a derivative thereof.
23. The composition of claim 11, further comprising an adjuvant.
24. A pharmaceutical composition for preventing or treating Helicobacter infection in a mammal, said composition comprising a prophylactically or therapeutically effective amount of a polynucleotide of claim 1 or 2 admixed with a physiologically acceptable diluent or carrier.
25. A pharmaceutical composition for preventing or treating Helicobacter infection in a mammal, said composition comprising a prophylactically or therapeutically effective amount of a polynucleotide of claim 4, 5, or 6 admixed with a physiologically acceptable diluent or carrier.
26. A pharmaceutical composition for preventing or treating Helicobacter infection in a mammal, said composition comprising a prophylactically or therapeutically effective amount of a polynucleotide of claim 7 admixed with a physiologically acceptable diluent or carrier.
27. A composition comprising a viral vector, in the genome of which is inserted a DNA molecule of claim 3, said DNA molecule being placed under conditions for expression in a mammalian cell and said viral vector being admixed with a physiologically acceptable diluent or carrier.
28. The composition of claim 27, wherein said viral vector is a poxvirus.
29. A composition that comprises a bacterial vector comprising a DNA molecule of claim 3, said DNA molecule being placed under conditions for expression and said bacterial vector being admixed with a physiologically acceptable diluent or carrier.
30. The composition of claim 29, wherein said vector is selected from the group consisting of Shigella, Salmonella, Vibrio cholerae, Lactobacillus, Bacille bilié de Calmette-Guérin, and Streptococcus.
31. The composition of claim 24, wherein said polynucleotide is a DNA molecule that is inserted in a plasmid that is unable to replicate and to substantially integrate in a mammalian genome and is placed under conditions for expression in a mammalian cell.
32. An expression cassette comprising a DNA molecule of claim 3, said DNA molecule being placed under conditions for expression in a procaryotic or eucaryotic cell.
33. A process for producing a compound of claim 8, which comprises culturing a procaryotic or eucaryotic cell transformed or transfected with an expression cassette of claim 32, and recovering said compound from the cell culture.
34. A pharmaceutical composition for preventing or treating Helicobacter infection in a mammal, said composition comprising a prophylactically or therapeutically effective amount of an antibody that binds to the compound of claim 8, 9, or 10 admixed with a physiologically acceptable diluent or carrier.
CA002286306A 1997-04-01 1998-04-01 Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome Abandoned CA2286306A1 (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
US83345797A 1997-04-01 1997-04-01
US08/833,457 1997-04-01
US88122797A 1997-06-24 1997-06-24
US08/881,227 1997-06-24
US90261597A 1997-07-29 1997-07-29
US08/902,615 1997-07-29
PCT/US1998/006371 WO1998043478A1 (en) 1997-04-01 1998-04-01 Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome

Publications (1)

Publication Number Publication Date
CA2286306A1 (en) 1998-10-08

Family

ID=27420239

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002286306A Abandoned CA2286306A1 (en) 1997-04-01 1998-04-01 Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome

Country Status (8)

Country Link
EP (1) EP0977482A4 (en)
JP (1) JP2001527393A (en)
KR (1) KR20010005893A (en)
CN (1) CN1263436A (en)
AU (1) AU756010B2 (en)
CA (1) CA2286306A1 (en)
NZ (1) NZ338039A (en)
WO (1) WO1998043478A1 (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19847628C2 (en) * 1998-10-15 2000-10-19 Chiron Behring Gmbh & Co Helicobacter pylori vaccine
US6238894B1 (en) 1998-11-04 2001-05-29 Diane Taylor α1,2 fucosyltransferase
EP1259639A2 (en) * 1999-05-31 2002-11-27 Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. Essential gene and gene products for identifying, developing and optimising immunological and pharmacological active ingredients for the treatment of microbial infections
WO2001008486A1 (en) * 1999-08-03 2001-02-08 Smithkline Beecham Corporation CoaA POLYPEPTIDES AND POLYNUCLEOTIDES AND METHODS THEREOF
AUPQ347199A0 (en) 1999-10-15 1999-11-11 Csl Limited Novel polypeptide fragments
EP1278768B1 (en) * 2000-04-27 2006-11-15 Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. Method for identifying helicobacter antigens
SE0001988D0 (en) * 2000-05-29 2000-05-29 A & Science Invest Ab Novel polypeptides and use thereof
AU2001268886A1 (en) * 2000-06-28 2002-01-08 National Research Council Of Canada Helicobacter pylori heptosyl transferase polypeptides
EP1301204A1 (en) * 2000-07-05 2003-04-16 Merieux OraVax SNC Immunological combinations for prophylaxis and therapy of helicobacter pylori infection
WO2003018794A1 (en) * 2001-08-24 2003-03-06 Kyowa Hakko Kogyo Co., Ltd. α-1,2-FUCOSYL TRANSFERASE AND DNA ENCODING THE SAME
KR20030024129A (en) * 2001-09-17 2003-03-26 국윤호 Detection and identification of helicobacter pylori
WO2003080654A2 (en) * 2002-03-26 2003-10-02 National Research Council Of Canada Helicobacter flagellar, motility polypeptides
JP3823268B2 (en) * 2002-05-30 2006-09-20 独立行政法人科学技術振興機構 Helicobacter pylori cell death inducer
WO2005023851A1 (en) * 2003-09-05 2005-03-17 Karolinska Innovations Ab Plasminogen/plasmin binding polypeptides and nucleic acids therefore
CN1906210A (en) * 2003-11-21 2007-01-31 Ace生物科学公司 Surface-located campylobacter jejuni polypeptides
CN108495650A (en) * 2015-12-14 2018-09-04 慕尼黑工业大学 Helicobacter pylori vaccine
KR101985335B1 (en) * 2017-11-01 2019-06-03 연세대학교 원주산학협력단 A method for simultaneously detecting and identifying Helicobacter pylori, and an antibiotics-resistance gene thereof, and a use therefor based PCR-Reverse blot hybridization assay
CN113754727B (en) * 2020-09-30 2022-07-12 广州派真生物技术有限公司 Adeno-associated virus mutant and application thereof
WO2024117798A1 (en) * 2022-11-29 2024-06-06 제일약품 주식회사 Composition for removing helicobacter pylori comprising zastaprazan or pharmaceutically acceptable salt thereof
CN118063569A (en) * 2024-04-24 2024-05-24 上海金翌生物科技有限公司 Helicobacter pylori secretory protein and application thereof in helicobacter pylori detection

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB8928625D0 (en) * 1989-12-19 1990-02-21 3I Res Expl Ltd H.pylori dna probes

Also Published As

Publication number Publication date
AU756010B2 (en) 2003-01-02
AU7099598A (en) 1998-10-22
EP0977482A1 (en) 2000-02-09
KR20010005893A (en) 2001-01-15
JP2001527393A (en) 2001-12-25
NZ338039A (en) 2001-04-27
WO1998043478A1 (en) 1998-10-08
CN1263436A (en) 2000-08-16
EP0977482A4 (en) 2002-03-06

Similar Documents

Publication Publication Date Title
CA2286306A1 (en) Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome
US7700755B2 (en) Helicobacter pylori CAI antigen
EP0643770B1 (en) Helicobacter pylori cytotoxin associated immunodominant antigen useful for vaccines and diagnostics
CN100390283C (en) Chlamydia pneumoniae genomic sequence and polypeptides, fragments thereof and uses thereof, in particular for the diagnosis, prevention and treatment of infection
JP2004346083A (en) Conjugate formed from heat shock protein and oligo- or polysaccharide
CA2373021A1 (en) Chlamydia antigens and corresponding dna fragments and uses thereof
CA2321106A1 (en) Group b streptococcus antigens
EA006232B1 (en) Streptococcus antigens
CA2390344A1 (en) 85kda neisserial antigen
WO2002018595A2 (en) Moraxella polypeptides and corresponding dna fragments and uses thereof
CA2353107A1 (en) Chlamydia antigens and corresponding dna fragments and uses thereof
JPH11506905A (en) Helicobacter pylori CagI region
CA2273199A1 (en) Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof
CA2286893A1 (en) 76 kda, 32 kda, and 50 kda helicobacter polypeptides and corresponding polynucleotide molecules
US20070243204A1 (en) Helicobacter pylori proteins useful for vaccines and diagnostics
US20030158396A1 (en) Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome
US20020115078A1 (en) Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome
JP2002528082A (en) Chlamydia antigens and corresponding DNA fragments and uses thereof
US20030124141A1 (en) Helicobacter polypeptides and corresponding polynucleotide molecules
CA2271774A1 (en) Helicobacter polypeptides and corresponding polynucleotide molecules
US20020160456A1 (en) Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome
CA2414125A1 (en) Bartonella henselae virb operon and proteins encoded thereby
US20030023066A1 (en) Helicobacter polypeptides and corresponding polynucleotide molecules
CA2356764A1 (en) Type iii secretion system antigens from bordetella pertussis
US20020026035A1 (en) Helicobacter ghpo 1360 and ghpo 750 polypeptides and corresponding polynucleotide molecules

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued