AU775377B2 - Novel pesticidal toxins and nucleotide sequences which encode these toxins - Google Patents

Novel pesticidal toxins and nucleotide sequences which encode these toxins

Info

Publication number
AU775377B2
AU775377B2 AU10114/02A AU1011402A AU775377B2 AU 775377 B2 AU775377 B2 AU 775377B2 AU 10114/02 A AU10114/02 A AU 10114/02A AU 1011402 A AU1011402 A AU 1011402A AU 775377 B2 AU775377 B2 AU 775377B2
Authority
AU
Australia
Prior art keywords
lys
ser
leu
thr
seq
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU10114/02A
Other versions
AU1011402A (en
Inventor
Charles Joseph Dullum
Jerald S. Feitelson
David Loewer
Judy Muller-Cohn
Kenneth E. Narva
James L. Schmeits
H. Ernest Schnepf
George Schwab
Lisa Stamp
Brian A. Stockhoff
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mycogen Corp
Original Assignee
Mycogen Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU50983/98A external-priority patent/AU5098398A/en
Application filed by Mycogen Corp filed Critical Mycogen Corp
Publication of AU1011402A publication Critical patent/AU1011402A/en
Application granted granted Critical
Publication of AU775377B2 publication Critical patent/AU775377B2/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current


Description

S&F Ref: 459116D1
AUSTRALIA
PATENTS ACT 1990 COMPLETE SPECIFICATION FOR A STANDARD PATENT
ORIGINAL
Name and Address of Applicant: Mycogen Corporation, 5501 Oberlin Drive, San Diego, California 92121, United States of America
Actual Inventor(s): Jerald S. Feitelson, H. Ernest Schnepf, Kenneth E. Narva, Brian A. Stockhoff, James L. Schmeits, David Loewer, George Schwab, Charles Joseph Dullum, Judy Muller-Cohn, Lisa Stamp
Address for Service: Spruson & Ferguson, St Martins Tower, Level 31 Market Street, Sydney NSW 2000 (CCN 3710000177)
Invention Title: Novel Pesticidal Toxins and Nucleotide Sequences Which Encode These Toxins
The following statement is a full description of this invention, including the best method of performing it known to me/us:-
DESCRIPTION
NOVEL PESTICIDAL TOXINS AND NUCLEOTIDE SEQUENCES WHICH ENCODE THESE TOXINS
Background of the Invention
The soil microbe Bacillus thuringiensis is a Gram-positive, spore-forming bacterium characterized by parasporal crystalline protein inclusions. These inclusions often appear microscopically as distinctively shaped crystals. The proteins can be highly toxic to pests and specific in their toxic activity. Certain B.t. toxin genes have been isolated and sequenced, and recombinant DNA-based B.t. products have been produced and approved for use. In addition, with the use of genetic engineering techniques, new approaches for delivering these B.t. endotoxins to agricultural environments are under development, including the use of plants genetically engineered with endotoxin genes for insect resistance and the use of stabilized intact microbial cells as B.t. endotoxin delivery vehicles (Gaertner, F.H., L. Kim [1988] TIBTECH 6:S4-S7). Thus, isolated B.t. endotoxin genes are becoming commercially valuable.
Until the last fifteen years, commercial use of B.t. pesticides has been largely restricted to a narrow range of lepidopteran (caterpillar) pests. Preparations of the spores and crystals of B. thuringiensis subsp. kurstaki have been used for many years as commercial insecticides for lepidopteran pests. For example, B. thuringiensis var. kurstaki HD-1 produces a crystalline δ-endotoxin which is toxic to the larvae of a number of lepidopteran insects.
In recent years, however, investigators have discovered B.t. pesticides with specificities for a much broader range of pests. For example, other species of B.t., namely israelensis and morrisoni (a.k.a. tenebrionis, a.k.a. B.t. M-7, a.k.a. B.t. san diego), have been used commercially to control insects of the orders Diptera and Coleoptera, respectively (Gaertner, F.H. [1989] "Cellular Delivery Systems for Insecticidal Proteins: Living and Non-Living Microorganisms," in Controlled Delivery of Crop Protection Agents, R.M. Wilkins, ed., Taylor and Francis, New York and London, 1990, pp. 245-255). See also Couch, T.L. (1980) "Mosquito Pathogenicity of Bacillus thuringiensis var. israelensis," Developments in Industrial Microbiology 22:61-76; and Beegle, C.C. (1978) "Use of Entomogenous Bacteria in Agroecosystems," Developments in Industrial Microbiology 20:97-104. Krieg, A., A.M. Huger, G.A. Langenbruch, W. Schnetter (1983) Z. ang. Ent. 96:500-508 describe Bacillus thuringiensis var. tenebrionis, which is reportedly active against two beetles in the order Coleoptera. These are the Colorado potato beetle, Leptinotarsa decemlineata, and Agelastica alni.
More recently, new subspecies of B.t. have been identified, and genes responsible for active δ-endotoxin proteins have been isolated (Höfte, H., H.R. Whiteley [1989] Microbiological Reviews 52(2):242-255). Höfte and Whiteley classified B.t. crystal protein genes into four major classes. The classes were CryI (Lepidoptera-specific), CryII (Lepidoptera- and Diptera-specific), CryIII (Coleoptera-specific), and CryIV (Diptera-specific). The discovery of strains specifically toxic to other pests has been reported (Feitelson, J.S., J. Payne, L. Kim [1992] Bio/Technology 10:271-275). CryV has been proposed to designate a class of toxin genes that are nematode-specific. Lambert et al. (Lambert, B., L. Buysse, C. Decock, S. Jansens, C. Piens, B. Saey, J. Seurinck, K. van Audenhove, J. Van Rie, A. Van Vliet, M. Peferoen [1996] Appl. Environ. Microbiol. 62(1):80-86) describe the characterization of a Cry9 toxin active against lepidopterans. Published PCT applications WO 94/05771 and WO 94/24264 also describe B.t. isolates active against lepidopteran pests. Gleave et al. ([1991] JGM 138:55-62), Shevelev et al. ([1993] FEBS Lett. 336:79-82), and Smulevitch et al. ([1991] FEBS Lett. 293:25-26) also describe B.t. toxins. Many other classes of B.t. genes have now been identified.
The cloning and expression of a B.t. crystal protein gene in Escherichia coli has been described in the published literature (Schnepf, H.E., H.R. Whiteley [1981] Proc. Natl. Acad. Sci. USA 78:2893-2897). U.S. Patent 4,448,885 and U.S. Patent 4,467,036 both disclose the expression of B.t. crystal protein in E. coli. U.S. Patents 4,990,332; 5,039,523; 5,126,133; 5,164,180; and 5,169,629 are among those which disclose B.t. toxins having activity against lepidopterans. PCT application WO 96/05314 discloses PS86W1, PS86V1, and other B.t. isolates active against lepidopteran pests. The PCT patent applications published as WO 94/24264 and WO 94/05771 describe B.t. isolates and toxins active against lepidopteran pests. B.t. proteins with activity against members of the family Noctuidae are described by Lambert et al., supra. U.S. Patents 4,797,276 and 4,853,331 disclose B. thuringiensis strain tenebrionis which can be used to control coleopteran pests in various environments. U.S. Patent No. 4,918,006 discloses B.t. toxins having activity against dipterans. U.S. Patent No. 5,151,363 and U.S. Patent No. 4,948,734 disclose certain isolates of B.t. which have activity against nematodes. Other U.S. patents which disclose activity against nematodes include 5,093,120; 5,236,843; 5,262,399; 5,270,448; 5,281,530; 5,322,932; 5,350,577; 5,426,049; 5,439,881; 5,667,993; and 5,670,365. As a result of extensive research and investment of resources, other patents have issued for new B.t. isolates and new uses of B.t. isolates. See Feitelson et al., supra, for a review. However, the discovery of new B.t. isolates and new uses of known B.t. isolates remains an empirical, unpredictable art.
Isolating responsible toxin genes has been a slow empirical process. Carozzi et al. (Carozzi, N.B., V.C. Kramer, G.W. Warren, S. Evola, G. Koziel [1991] Appl. Env. Microbiol. 57(11):3057-3061) describe methods for identifying toxin genes. U.S. Patent No. 5,204,237 describes specific and universal probes for the isolation of B.t. toxin genes. That patent, however, does not describe the probes and primers of the subject invention.
WO 94/21795, WO 96/10083, and Estruch, J.J. et al. (1996) PNAS 93:5389-5394 describe toxins obtained from Bacillus microbes. These toxins are reported to be produced during vegetative cell growth and were thus termed vegetative insecticidal proteins (VIP). These toxins were reported to be distinct from crystal-forming δ-endotoxins. Activity of these toxins against lepidopteran and coleopteran pests was reported. These applications make specific reference to toxins designated Vip1A(a), Vip1A(b), Vip2A(a), Vip2A(b), Vip3A(a), and Vip3A(b). The toxins and genes of the current invention are distinct from those disclosed in the '795 and '083 applications and the Estruch article.
Brief Summary of the Invention
The subject invention concerns materials and methods useful in the control of non-mammalian pests and, particularly, plant pests. In one embodiment, the subject invention provides novel B.t. isolates having advantageous activity against non-mammalian pests. In a further embodiment, the subject invention provides new toxins useful for the control of non-mammalian pests. In a preferred embodiment, these pests are lepidopterans and/or coleopterans.
The toxins of the subject invention include δ-endotoxins as well as soluble toxins which can be obtained from the supernatant of Bacillus cultures.
The subject invention further provides nucleotide sequences which encode the toxins of the subject invention. The subject invention further provides nucleotide sequences and methods useful in the identification and characterization of genes which encode pesticidal toxins.
In one embodiment, the subject invention concerns unique nucleotide sequences which are useful as hybridization probes and/or primers in PCR techniques. The primers produce characteristic gene fragments which can be used in the identification, characterization, and/or isolation of specific toxin genes. The nucleotide sequences of the subject invention encode toxins which are distinct from previously-described toxins.
In a specific embodiment, the subject invention provides new classes of toxins having advantageous pesticidal activities. These classes of toxins can be encoded by polynucleotide sequences which are characterised by their ability to hybridise with certain exemplified sequences and/or by their ability to be amplified by PCR using certain exemplified primers.
Herein described is the identification and characterisation of entirely new families of Bacillus thuringiensis toxins having advantageous pesticidal properties. Specific new toxin families disclosed herein include MIS-1, MIS-2, MIS-3, MIS-4, MIS-5, MIS-6, WAR-1, and SUP-1. These families of toxins, and the genes which encode them, can be characterised in terms of, for example, the size of the toxin or gene, the DNA or amino acid sequence, pesticidal activity, and/or antibody reactivity. With regard to the genes encoding the novel toxin families of the subject invention, the current disclosure provides unique hybridisation probes and PCR primers which can be used to identify and characterise DNA within each of the exemplified families.
According to a first embodiment of the invention, there is provided an isolated pesticidal toxin belonging to the SUP-1 family, wherein said toxin is obtainable from a Bacillus thuringiensis isolate selected from the group consisting of PS49C (NRRL B-21532) and PS158C2 (NRRL B-18872).
According to another embodiment of the invention, there is provided an isolated pesticidal toxin belonging to the SUP-1 family, wherein a polynucleotide sharing greater than 90% identity with a nucleotide sequence selected from the group consisting of SEQ ID NOS. 10, 12 and 15 encodes at least a portion of said toxin.
According to another embodiment of the invention, there is provided an isolated pesticidal toxin belonging to the SUP-1 family, wherein a polynucleotide selected from the group consisting of SEQ ID NOS. 10, 12 and 15 encodes at least a portion of said toxin.
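The "greater than 90% identity" criterion above can be illustrated with a minimal sketch. The specification does not state how identity is to be computed, so the following assumes two sequences that have already been aligned to equal length (gaps written as `-`) and counts identical columns; the function names `percent_identity` and `exceeds_identity_threshold` are hypothetical and are not taken from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: both inputs are already aligned and use '-' for gaps.
    Columns where both sequences have a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def exceeds_identity_threshold(aligned_a: str, aligned_b: str,
                               threshold: float = 90.0) -> bool:
    """True only if identity is strictly greater than the threshold."""
    return percent_identity(aligned_a, aligned_b) > threshold
```

Under this convention a pair that differs at one position in ten is exactly 90% identical and therefore does not satisfy a strict "greater than 90%" test. Other conventions (for example, normalising by the shorter sequence) would give different values.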
According to one aspect, Bacillus isolates can be cultivated under conditions resulting in high multiplication of the microbe. After treating the microbe to provide single-stranded genomic nucleic acid, the DNA can be contacted with the primers of the invention and subjected to PCR amplification. Characteristic fragments of toxin-encoding genes will be amplified by the procedure, thus identifying the presence of the toxin-encoding gene(s).
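The screening logic described above can be sketched in silico. The following is a simplified illustration only: it assumes exact, non-degenerate primer matches on a single template strand, and the sequences used in the usage note are toy strings, not the primers or genes of the sequence listing. The names `reverse_complement` and `in_silico_pcr` are hypothetical helpers.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the fragment bounded by a primer pair, or None if no product.

    The forward primer must match the given (top) strand; the reverse
    primer anneals to the top strand, so its reverse complement is
    searched for downstream of the forward primer site.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]
```

For example, with the toy template `GGGATGCCCTTTAAACGTACGGG`, forward primer `ATGCCC`, and reverse primer `ACGTTT`, the function returns the 15-base fragment `ATGCCCTTTAAACGT`; a template lacking either primer site yields `None`, which corresponds to the absence of a characteristic fragment.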
Thus, according to another embodiment of the invention, there is provided an isolated polynucleotide useful as a PCR primer or a hybridisation probe for the detection and/or isolation of a nucleotide sequence encoding a SUP-1 family toxin, wherein said polynucleotide is selected from the group consisting of SEQ ID NOS. 9 to 15, 53 and 54, or complements thereof.
According to another embodiment of the invention, there is provided a PCR primer pair for the detection and/or isolation of a nucleotide sequence encoding a SUP-1 family toxin, wherein said primer pair is selected from the group consisting of SEQ ID NOS. 53 and 54, and SEQ ID NOS. 53 and 14.
According to another embodiment of the invention, there is provided a method for detecting and/or isolating a nucleotide sequence encoding a SUP-1 family toxin in or from a sample, said method comprising submitting said sample to a PCR primer pair or polynucleotide of the invention under hybridising conditions, and amplifying any hybridised sequences by PCR. SUP-1 family toxins detected and/or isolated by the method are also provided.
A further aspect of the subject invention is the use of the disclosed polynucleotides as probes to detect genes encoding Bacillus toxins which are active against pests.
Further aspects of the subject invention include the genes and isolates identified using the methods and nucleotide sequences disclosed herein. The genes thus identified encode toxins active against pests. Similarly, the isolates will have activity against these pests. In a preferred embodiment, these pests are lepidopteran or coleopteran pests.
Thus, according to another embodiment of the invention, there is provided an isolated polynucleotide which encodes a pesticidal toxin according to the invention.
In a preferred embodiment, the subject invention concerns plant cells transformed with at least one polynucleotide sequence of the subject invention such that the transformed plant cells express pesticidal toxins in tissues consumed by target pests. As described herein, the toxins useful according to the subject invention may be chimeric toxins produced by combining portions of multiple toxins. In addition, mixtures and/or combinations of toxins can be used according to the subject invention.
Thus, according to other embodiments, the invention provides recombinant DNA molecules, vectors and plasmids comprising the SUP-1 toxin-encoding polynucleotides of the invention.
According to another embodiment of the invention, there is provided a method for transforming a host cell with a polynucleotide encoding a SUP-1 family toxin, said method comprising contacting said host cell with a SUP-1 toxin-encoding polynucleotide according to the invention, or a recombinant DNA molecule, plasmid or a vector comprising said polynucleotide. Host cells transformed by the method are also provided.
According to another embodiment of the invention, there is provided a transformed host comprising a heterologous nucleotide sequence encoding a pesticidal toxin according to the invention. According to a preferred aspect, the host is a plant.
According to another embodiment of the invention, there is provided a transformed host comprising a heterologous nucleotide sequence encoding a pesticidal toxin belonging to the SUP-1 family, wherein said toxin can be encoded by a polynucleotide sequence wherein a portion of said polynucleotide sequence can be amplified by the primer pair SEQ ID NOS. 53 and 54. According to a preferred aspect, the host is a plant.
Transformation of plants with the genetic constructs disclosed herein can be accomplished using techniques well known to those skilled in the art and would typically involve modification of the gene to optimise expression of the toxin in plants.
Thus, according to another embodiment of the invention, there is provided a method of producing a transgenic plant expressing a Bacillus thuringiensis SUP-1 family toxin, said method comprising re-generation of a whole plant from a transformed plant cell according to the invention.
According to other embodiments of the invention, there are provided pesticidal compositions comprising the SUP-1 toxins of the invention, and methods for their preparation.
According to another embodiment of the invention, there is provided a method for controlling a non-mammalian pest, wherein said method comprises contacting said pest with a toxin of the invention.
Alternatively, the Bacillus isolates of the subject invention, or recombinant microbes expressing the toxins described herein, can be used to control pests. In this regard, the invention includes the treatment of substantially intact Bacillus cells, and/or recombinant cells containing the expressed toxins of the invention, treated to prolong the pesticidal activity when the substantially intact cells are applied to the environment of a target pest. The treated cell acts as a protective coating for the pesticidal toxin. The toxin becomes active upon ingestion by a target insect.
Thus, according to another embodiment of the invention, there is provided a method for controlling a non-mammalian pest comprising applying to the situs or environment of said pest a pesticidally effective amount of a transgenic cell, transformed host, transgenic plant or pesticidal composition according to the invention.
Brief Description of the Sequences
SEQ ID NO. 1 is a forward primer, designated "the 339 forward primer", used according to the subject invention.
SEQ ID NO. 2 is a reverse primer, designated "the 339 reverse primer", used according to the subject invention.
SEQ ID NO. 3 is a nucleotide sequence encoding a toxin from B.t. strain PS36A.
SEQ ID NO. 4 is an amino acid sequence for the 36A toxin.
SEQ ID NO. 5 is a nucleotide sequence encoding a toxin from B.t. strain PS81F.
SEQ ID NO. 6 is an amino acid sequence for the 81F toxin.
SEQ ID NO. 7 is a nucleotide sequence encoding a toxin from B.t. strain Javelin 1990.
SEQ ID NO. 8 is an amino acid sequence for the Javelin 1990 toxin.
SEQ ID NO. 9 is a forward primer, designated "158C2 PRIMER used according to the subject invention.
SEQ ID NO. 10 is a nucleotide sequence encoding a portion of a soluble toxin from B.t. PS158C2.
SEQ ID NO. 11 is a forward primer, designated "49C PRIMER used according to the subject invention.
SEQ ID NO. 12 is a nucleotide sequence of a portion of a toxin gene from B.t. strain PS49C.
SEQ ID NO. 13 is a forward primer, designated "49C PRIMER used according to the subject invention.
SEQ ID NO. 14 is a reverse primer, designated "49C PRIMER used according to the subject invention.
SEQ ID NO. 15 is an additional nucleotide sequence of a portion of a toxin gene from PS49C.
SEQ ID NO. 16 is a forward primer used according to the subject invention.
SEQ ID NO. 17 is a reverse primer used according to the subject invention.
SEQ ID NO. 18 is a nucleotide sequence of a toxin gene from B.t. strain PS10E1.
SEQ ID NO. 19 is an amino acid sequence from the 10E1 toxin.
SEQ ID NO. 20 is a nucleotide sequence of a toxin gene from B.t. strain PS31J2.
SEQ ID NO. 21 is an amino acid sequence from the 31J2 toxin.
SEQ ID NO. 22 is a nucleotide sequence of a toxin gene from B.t. strain PS33D2.
SEQ ID NO. 23 is an amino acid sequence from the 33D2 toxin.
SEQ ID NO. 24 is a nucleotide sequence of a toxin gene from B.t. strain PS66D3.
SEQ ID NO. 25 is an amino acid sequence from the 66D3 toxin.
SEQ ID NO. 26 is a nucleotide sequence of a toxin gene from B.t. strain PS68F.
SEQ ID NO. 27 is an amino acid sequence from the 68F toxin.
SEQ ID NO. 28 is a nucleotide sequence of a toxin gene from B.t. strain PS69AA2.
SEQ ID NO. 29 is an amino acid sequence from the 69AA2 toxin.
SEQ ID NO. 30 is a nucleotide sequence of a toxin gene from B.t. strain PS168G1.
SEQ ID NO. 31 is a nucleotide sequence of a MIS toxin gene from B.t. strain PS177C8.
SEQ ID NO. 32 is an amino acid sequence from the 177C8-MIS toxin.
SEQ ID NO. 33 is a nucleotide sequence of a toxin gene from B.t. strain PS177I8.
SEQ ID NO. 34 is an amino acid sequence from the 177I8 toxin.
SEQ ID NO. 35 is a nucleotide sequence of a toxin gene from B.t. strain PS185AA2.
SEQ ID NO. 36 is an amino acid sequence from the 185AA2 toxin.
SEQ ID NO. 37 is a nucleotide sequence of a toxin gene from B.t. strain PS196F3.
SEQ ID NO. 38 is an amino acid sequence from the 196F3 toxin.
SEQ ID NO. 39 is a nucleotide sequence of a toxin gene from B.t. strain PS196J4.
SEQ ID NO. 40 is an amino acid sequence from the 196J4 toxin.
SEQ ID NO. 41 is a nucleotide sequence of a toxin gene from B.t. strain PS197T1.
SEQ ID NO. 42 is an amino acid sequence from the 197T1 toxin.
SEQ ID NO. 43 is a nucleotide sequence of a toxin gene from B.t. strain PS197U2.
SEQ ID NO. 44 is an amino acid sequence from the 197U2 toxin.
SEQ ID NO. 45 is a nucleotide sequence of a toxin gene from B.t. strain PS202E1.
SEQ ID NO. 46 is an amino acid sequence from the 202E1 toxin.
SEQ ID NO. 47 is a nucleotide sequence of a toxin gene from B.t. strain KB33.
SEQ ID NO. 48 is a nucleotide sequence of a toxin gene from B.t. strain KB38.
SEQ ID NO. 49 is a forward primer, designated "ICON-forward," used according to the subject invention.
SEQ ID NO. 50 is a reverse primer, designated "ICON-reverse," used according to the subject invention.
SEQ ID NO. 51 is a nucleotide sequence encoding a 177C8-WAR toxin gene from B.t. strain PS177C8.
SEQ ID NO. 52 is an amino acid sequence of a 177C8-WAR toxin from B.t. strain PS177C8.
SEQ ID NO. 53 is a forward primer, designated "SUP-1A," used according to the subject invention.
SEQ ID NO. 54 is a reverse primer, designated "SUP-1B," used according to the subject invention.
SEQ ID NOS. 55-110 are primers used according to the subject invention.
SEQ ID NO. 111 is the reverse complement of the primer of SEQ ID NO. 58.
SEQ ID NO. 112 is the reverse complement of the primer of SEQ ID NO.
SEQ ID NO. 113 is the reverse complement of the primer of SEQ ID NO. 64.
SEQ ID NO. 114 is the reverse complement of the primer of SEQ ID NO. 66.
SEQ ID NO. 115 is the reverse complement of the primer of SEQ ID NO. 68.
SEQ ID NO. 116 is the reverse complement of the primer of SEQ ID NO.
SEQ ID NO. 117 is the reverse complement of the primer of SEQ ID NO. 72.
SEQ ID NO. 118 is the reverse complement of the primer of SEQ ID NO. 76.
SEQ ID NO. 119 is the reverse complement of the primer of SEQ ID NO. 78.
SEQ ID NO. 120 is the reverse complement of the primer of SEQ ID NO.
SEQ ID NO. 121 is the reverse complement of the primer of SEQ ID NO. 82.
SEQ ID NO. 122 is the reverse complement of the primer of SEQ ID NO. 84.
SEQ ID NO. 123 is the reverse complement of the primer of SEQ ID NO. 86.
SEQ ID NO. 124 is the reverse complement of the primer of SEQ ID NO. 88.
SEQ ID NO. 125 is the reverse complement of the primer of SEQ ID NO. 92.
SEQ ID NO. 12 is the reverse complement of the primer of SEQ ID NO. 9
SEQ ID NO. 126 is the reverse complement of the primer of SEQ ID NO. 94.
SEQ ID NO. 127 is the reverse complement of the primer of SEQ ID NO. 96.
SEQ ID NO. 128 is the reverse complement of the primer of SEQ ID NO. 98.
SEQ ID NO. 129 is the reverse complement of the primer of SEQ ID NO. 99.
SEQ ID NO. 130 is the reverse complement of the primer of SEQ ID NO. 100.
SEQ ID NO. 131 is the reverse complement of the primer of SEQ ID NO. 104.
SEQ ID NO. 132 is the reverse complement of the primer of SEQ ID NO. 106.
SEQ ID NO. 133 is the reverse complement of the primer of SEQ ID NO. 108.
SEQ ID NO. 134 is the reverse complement of the primer of SEQ ID NO. 110.
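The relationship between a primer and its reverse complement, as listed for SEQ ID NOS. 111-134, can be sketched as follows. This is an illustrative helper only, not part of the specification; it assumes primers are written 5' to 3' in the standard IUPAC nucleotide alphabet (including ambiguity codes such as R, Y and N, which degenerate primers may use). The function name `reverse_complement` and the example strings in the usage note are hypothetical.

```python
# IUPAC complements: A<->T, C<->G, R<->Y, K<->M, B<->V, D<->H; S, W, N are self-complementary.
_IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")
_ALLOWED = set("ACGTRYSWKMBDHVN")


def reverse_complement(primer: str) -> str:
    """Return the reverse complement of a 5'->3' primer (IUPAC codes allowed)."""
    primer = primer.upper()
    unknown = set(primer) - _ALLOWED
    if unknown:
        raise ValueError(f"unsupported symbols: {sorted(unknown)}")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return primer.translate(_IUPAC_COMPLEMENT)[::-1]
```

For instance, `reverse_complement("ATGC")` gives `GCAT`, and the degenerate toy primer `AARGT` gives `ACYTT`. Applying the function twice returns the original primer, which is a convenient sanity check when preparing a sequence listing.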
Detailed Disclosure of the Invention
The subject invention concerns materials and methods for the control of non-mammalian pests. In specific embodiments, the subject invention pertains to new Bacillus thuringiensis isolates and toxins which have activity against lepidopterans and/or coleopterans. The subject invention further concerns novel genes which encode pesticidal toxins and novel methods for identifying and characterising Bacillus genes which encode toxins with useful properties. The subject invention concerns not only the polynucleotide sequences which encode these toxins, but also the use of these polynucleotide sequences to produce recombinant hosts which express the toxins. The proteins of the subject invention are distinct from protein toxins which have previously been isolated from Bacillus thuringiensis.
B.t. isolates useful according to the subject invention have been deposited in the permanent collection of the Agricultural Research Service Patent Culture Collection (NRRL), Northern Regional Research Centre, 1815 North University Street, Peoria, Illinois 61604, USA. The culture repository numbers of the B.t. strains are as follows:

Culture               Repository No.  Deposit Date        Patent No.
B.t. PS11B (MT274)    NRRL B-21556    April 18, 1996
B.t. PS24J            NRRL B-18881    August 30, 1991
B.t. PS31G1 (MT278)   NRRL B-21560    April 18, 1996
B.t. PS36A            NRRL B-18929    December 27, 1991
B.t. PS33F2           NRRL B-18244    July 28, 1987       4,861,595
B.t. PS40D1           NRRL B-18300    February 3, 1988    5,098,705
B.t. PS43F            NRRL B-18298    February 2, 1988    4,996,155
B.t. PS45B1           NRRL B-18396    August 16, 1988     5,427,786
B.t. PS49C            NRRL B-21532    March 14, 1996
B.t. PS52A1           NRRL B-18245    July 28, 1987       4,861,595
B.t. PS62B1           NRRL B-18398    August 16, 1988     4,849,217
B.t. PS81A2           NRRL B-18457    March 7, 1989       5,164,180
B.t. PS81F            NRRL B-18424    October 7, 1988     5,045,469
B.t. PS81GG           NRRL B-18425    October 11, 1988    5,169,629
B.t. PS81I            NRRL B-18484    April 19, 1989      5,126,133
B.t. PS85A1           NRRL B-18426    October 11, 1988
B.t. PS86A1           NRRL B-18400    August 16, 1988     4,849,217
B.t. PS86B1           NRRL B-18299    February 2, 1988    4,966,765
B.t. PS86BB1 (MT275)  NRRL B-21557    April 18, 1996
B.t. PS86Q3           NRRL B-18765    February 6, 1991    5,208,017
PS86V1 (MT276)        NRRL B-21558    April 18, 1996
B.t. PS86W1 (MT277)   NRRL B-21559    April 18, 1996
B.t. PS89J3 (MT279)   NRRL B-21561    April 18, 1996
B.t. PS91C2           NRRL B-18931    February 6, 1991
PS92B                 NRRL B-18889    September 23, 1991  5,427,786
B.t. PS101Z2          NRRL B-18890    October 1, 1991     5,427,786
PS122D3               NRRL B-18376    June 9, 1988        5,006,336
B.t. PS123D1          NRRL B-21011    October 13, 1992    5,508,032
PS157C1 (MT104)       NRRL B-18240    July 17, 1987       5,262,159
B.t. PS158C2          NRRL B-18872    August 27, 1991     5,268,172
B.t. PS169E           NRRL B-18682    July 17, 1990       5,151,363
B.t. PS177F1          NRRL B-18683    July 17, 1990       5,151,363
PS177G                NRRL B-18684    July 17, 1990       5,151,363
PS185L2               NRRL B-21535    March 14, 1996
PS185U2               NRRL B-21562    April 18, 1996
B.t. PS192M4          NRRL B-18932    December 27, 1991   5,273,746
B.t. PS201L1          NRRL B-18749    January 9, 1991     5,298,245
PS204C3               NRRL B-21008    October 6, 1992
B.t. PS204G4          NRRL B-18685    July 17, 1990       5,262,399
B.t. PS242H10         NRRL B-21539    March 14, 1996
B.t. PS242K17         NRRL B-21540    March 14, 1996
PS244A2               NRRL B-21541    March 14, 1996
PS244D1               NRRL B-21542    March 14, 1996
PS10E1                NRRL B-21862    October 24, 1997
B.t. PS31F2           NRRL B-21876    October 24, 1997
B.t. PS31J2           NRRL B-21009    October 13, 1992
PS33D2                NRRL B-21870    October 24, 1997
PS66D3                NRRL B-21858    October 24, 1997
PS68F                 NRRL B-21857    October 24, 1997
PS69AA2               NRRL B-21859    October 24, 1997
PS146D                NRRL B-21866    October 24, 1997
PS168G1               NRRL B-21873    October 24, 1997
PS175I4               NRRL B-21865    October 24, 1997
B.t. PS177C8a         NRRL B-21867    October 24, 1997
B.t. PS177I8          NRRL B-21868    October 24, 1997
B.t. PS185AA2         NRRL B-21861    October 24, 1997
B.t. PS196J4          NRRL B-21860    October 24, 1997
PS196F3               NRRL B-21872    October 24, 1997
B.t. PS197T1          NRRL B-21869    October 24, 1997
B.t. PS197U2          NRRL B-21871    October 24, 1997
B.t. PS202E1          NRRL B-21874    October 24, 1997
B.t. PS217U2          NRRL B-21864    October 24, 1997
KB33                  NRRL B-21875    October 24, 1997
KB38                  NRRL B-21863    October 24, 1997
KB53A49-4             NRRL B-21879    October 24, 1997
KB68B46-2             NRRL B-21877    October 24, 1997
KB68B51-2             NRRL B-21880    October 24, 1997
KB68B55-2             NRRL B-21878    October 24, 1997
PS80JJ1               NRRL B-18679    July 17, 1990       5,151,363
PS94R1                NRRL B-21801    July 1, 1997
PS101DD               NRRL B-21802    July 1, 1997
PS202S                NRRL B-21803    July 1, 1997
PS213E5               NRRL B-21804    July 1, 1997
PS218G2               NRRL B-21805    July 1, 1997

Cultures which have been deposited for the purposes of this patent application were deposited under conditions that assure that access to the cultures is available during the pendency of this patent application to one determined by the Commissioner of Patents and Trademarks to be entitled thereto under 37 CFR 1.14 and 35 U.S.C. 122. The deposits will be available as required by foreign patent laws in countries wherein counterparts of the subject application, or its progeny, are filed. However, it should be understood that the availability of a deposit does not constitute a license to practice the subject invention in derogation of patent rights granted by governmental action.
Further, the subject culture deposits will be stored and made available to the public in accord with the provisions of the Budapest Treaty for the Deposit of Microorganisms, i.e., they will be stored with all the care necessary to keep them viable and uncontaminated for a period of at least five years after the most recent request for the furnishing of a sample of the deposit, and in any case, for a period of at least thirty (30) years after the date of deposit or for the enforceable life of any patent which may issue disclosing the culture(s). The depositor acknowledges the duty to replace the deposit(s) should the depository be unable to furnish a sample when requested, due to the condition of a deposit. All restrictions on the availability to the public of the subject culture deposits will be irrevocably removed upon the granting of a patent disclosing them.
Many of the strains useful according to the subject invention are readily available by virtue of the issuance of patents disclosing these strains or by their deposit in public collections or by their inclusion in commercial products. For example, the B.t. strain used in the commercial product, Javelin, and the HD isolates are all publicly available.
Mutants of the isolates referred to herein can be made by procedures well known in the art. For example, an asporogenous mutant can be obtained through ethylmethane sulfonate (EMS) mutagenesis of an isolate. The mutants can be made using ultraviolet light and nitrosoguanidine by procedures well known in the art.
In one embodiment, the subject invention concerns materials and methods including nucleotide primers and probes for isolating, characterizing, and identifying Bacillus genes encoding protein toxins which are active against non-mammalian pests. The nucleotide sequences described herein can also be used to identify new pesticidal Bacillus isolates. The invention further concerns the genes, isolates, and toxins identified using the methods and materials disclosed herein.
The new toxins and polynucleotide sequences provided here are defined according to several parameters. One characteristic of the toxins described herein is pesticidal activity. In a specific embodiment, these toxins have activity against coleopteran and/or lepidopteran pests.
The toxins and genes of the subject invention can be further defined by their amino acid and nucleotide sequences. The sequences of the molecules can be defined in terms of homology to certain exemplified sequences as well as in terms of the ability to hybridize with, or be amplified by, certain exemplified probes and primers. The toxins provided herein can also be identified based on their immunoreactivity with certain antibodies.
An important aspect of the subject invention is the identification and characterization of new families of Bacillus toxins, and genes which encode these toxins. These families have been designated MIS-1, MIS-2, MIS-3, MIS-4, MIS-5, MIS-6, WAR-1, and SUP-1. Toxins within these families, as well as genes encoding toxins within these families, can readily be identified as described herein by, for example, size, amino acid or DNA sequence, and antibody reactivity. Amino acid and DNA sequence characteristics include homology with exemplified sequences, ability to hybridize with DNA probes, and ability to be amplified with specific primers.
The MIS-1 family of toxins includes toxins from isolate PS68F. Also provided are hybridization probes and PCR primers which specifically identify genes falling in the MIS-1 family.
A second family of toxins identified herein is the MIS-2 family. This family includes toxins which can be obtained from isolates PS66D3, PS197T1, and PS31J2. The subject invention further provides probes and primers for the identification of MIS-2 toxins and genes.
A third family of toxins identified herein is the MIS-3 family. This family includes toxins which can be obtained from B.t. isolates PS69AA2 and PS33D2. The subject invention further provides probes and primers for identification of the MIS-3 genes and toxins.
Polynucleotide sequences encoding MIS-4 toxins can be obtained from the B.t. isolate designated PS197U2. The subject invention further provides probes and primers for the identification of genes and toxins in this family.
A fifth family of toxins identified herein is the MIS-5 family. This family includes toxins which can be obtained from B.t. isolates KB33 and KB38. The subject invention further provides probes and primers for identification of the MIS-5 genes and toxins.
A sixth family of toxins identified herein is the MIS-6 family. This family includes toxins which can be obtained from B.t. isolates PS196F3, PS168G1, PS196J4, PS202E1, PS10E1, and PS185AA2. The subject invention further provides probes and primers for identification of the MIS-6 genes and toxins.
In a preferred embodiment, the genes of the MIS family encode toxins having a molecular weight of about 70 to about 100 kDa and, most preferably, the toxins have a size of about 80 kDa. Typically, these toxins are soluble and can be obtained from the supernatant of Bacillus cultures as described herein. These toxins have toxicity against non-mammalian pests.
In a preferred embodiment, these toxins have activity against coleopteran pests. The MIS proteins are further useful due to their ability to form pores in cells. These proteins can be used with second entities including, for example, other proteins. When used with a second entity, the MIS protein will facilitate entry of the second agent into a target cell. In a preferred embodiment, the MIS protein interacts with MIS receptors in a target cell and causes pore formation in the target cell. The second entity may be a toxin or another molecule whose entry into the cell is desired.
The subject invention further concerns a family of toxins designated WAR-1. The WAR-1 toxins typically have a size of about 30-50 kDa and, most typically, have a size of about kDa. Typically, these toxins are soluble and can be obtained from the supernatant of Bacillus cultures as described herein. The WAR-1 toxins can be identified with primers described herein as well as with antibodies. In a specific embodiment, the antibodies can be raised to, for example, toxin from isolate PS177C8.
An additional family of toxins provided according to the subject invention are the toxins designated SUP-1. Typically, these toxins are soluble and can be obtained from the supernatant of Bacillus cultures as described herein. In a preferred embodiment, the SUP-1 toxins are active against lepidopteran pests. The SUP-1 toxins typically have a size of about 70-100 kDa and, preferably, about 80 kDa. The SUP-1 family is exemplified herein by toxins from isolates PS49C and PS158C2. The subject invention provides probes and primers useful for the identification of toxins and genes in the SUP-1 family. The subject invention further provides specific Bacillus toxins and genes which did not fall into any of the new families disclosed herein. These specific toxins and genes include toxins and genes which can be obtained from PS177C8 and PS177I8.
Toxins in the MIS, WAR, and SUP families are all soluble and can be obtained as described herein from the supernatant of Bacillus cultures. These toxins can be used alone or in combination with other toxins to control pests. For example, toxins from the MIS families may be used in conjunction with WAR-type toxins to achieve control of pests, particularly coleopteran pests. These toxins may be used, for example, with δ-endotoxins which are obtained from Bacillus isolates.
Table 1 provides a summary of the novel families of toxins and genes of the subject invention. Each of the six MIS families is specifically exemplified herein by toxins which can be obtained from particular B.t. isolates as shown in Table 1. Genes encoding toxins in each of these families can be identified by a variety of highly specific parameters, including the ability to hybridize with the particular probes set forth in Table 1. Sequence identity in excess of about with the probes set forth in Table 1 can also be used to identify the genes of the various families. Also exemplified are particular primer pairs which can be used to amplify the genes of the subject invention. A portion of a gene within the indicated families would typically be amplifiable with at least one of the enumerated primer pairs. In a preferred embodiment, the amplified portion would be of approximately the indicated fragment size. Primers shown in Table 1 consist of polynucleotide sequences which encode peptides as shown in the sequence listing attached hereto. Additional primers and probes can readily be constructed by those skilled in the art such that alternate polynucleotide sequences encoding the same amino acid sequences can be used to identify and/or characterize additional genes encoding pesticidal toxins. In a preferred embodiment, these additional toxins, and their genes, could be obtained from Bacillus isolates.
Table 1.
Family  Isolates                  Probes            Primer pairs    Fragment size
                                  (SEQ ID NO.)      (SEQ ID NOS.)   (nt)
MIS-1   PS68F                     26                56 and 111      69
                                                    56 and 112      506
                                                    58 and 112      458
MIS-2   PS66D3, PS197T1, PS31J2   24, 41, 20        62 and 113      160
                                                    62 and 114      239
                                                    62 and 115      400
                                                    62 and 116      509
                                                    62 and 117      703
                                                    64 and 114      102
                                                    64 and 115      263
                                                    64 and 116      372
                                                    64 and 117      566
                                                    66 and 115      191
                                                    66 and 116      300
                                                    66 and 117      494
                                                    68 and 116      131
                                                    68 and 117      325
                                                    70 and 117      213
MIS-3   PS69AA2, PS33D2           28, 22            74 and 118      141
                                                    74 and 119      376
                                                    74 and 120      389
                                                    74 and 121      483
                                                    74 and 122      715
                                                    74 and 123      743
                                                    74 and 124      902
                                                    76 and 119      253
                                                    76 and 120      266
                                                    76 and 121      360
                                                    76 and 122      592
                                                    76 and 123      620
                                                    76 and 124      779
                                                    78 and 120      31
                                                    78 and 121      125
                                                    78 and 122      357
                                                    78 and 123      385
                                                    78 and 124      544
                                                    80 and 121      116
                                                    80 and 122      348
                                                    80 and 123      376
                                                    80 and 124      535
                                                    82 and 122      252
                                                    82 and 123      280
                                                    82 and 124      439
                                                    84 and 123      46
                                                    84 and 124      205
                                                    86 and 124      177
MIS-4   PS197U2                   43                90 and 125      517
                                                    90 and 126      751
                                                    90 and 127      821
                                                    92 and 126      258
                                                    92 and 127      328
                                                    94 and 127      92
MIS-5   KB33, KB38                47, 48            97 and 128      109
                                                    97 and 129      379
                                                    97 and 130      504
                                                    98 and 129      291
                                                    98 and 130      416
                                                    99 and 130      144
MIS-6   PS196F3, PS168G1,         18, 30, 35, 37,   102 and 131     66
        PS196J4, PS202E1,         39, 45            102 and 132     259
        PS10E1, PS185AA2                            102 and 133     245
                                                    102 and 134     754
                                                    104 and 132     213
                                                    104 and 133     199
                                                    104 and 134     708
                                                    106 and 133     31
                                                    106 and 134     518
                                                    108 and 134     526
SUP-1   PS49C, PS158C2            10, 12, 15        53 and 54       370

Furthermore, chimeric toxins may be used according to the subject invention. Methods have been developed for making useful chimeric toxins by combining portions of B.t. crystal proteins. The portions which are combined need not, themselves, be pesticidal so long as the combination of portions creates a chimeric protein which is pesticidal. This can be done using restriction enzymes, as described in, for example, European Patent 0 228 838; Ge, N.L.
Shivarova, D.H. Dean (1989) Proc. Natl. Acad. Sci. USA 86:4037-4041; Ge, D. Rivers, R. Milne, D.H. Dean (1991) J. Biol. Chem. 266:17954-17958; Schnepf, K. Tomczak, J.P. Ortega, H.R. Whiteley (1990) J. Biol. Chem. 265:20923-20930; Honee, D. Convents, J. Van Rie, S. Jansens, M. Peferoen, B. Visser (1991) Mol. Microbiol. 5:2799-2806. Alternatively, recombination using cellular recombination mechanisms can be used to achieve similar results. See, for example, Caramori, A.M. Albertini, A. Galizzi (1991) Gene 98:37-44; Widner, H.R. Whiteley (1990) J. Bacteriol. 172:2826-2832; Bosch, B. Schipper, H. van der Kleij, R.A. de Maagd, W.J. Stiekema (1994) Bio/Technology 12:915-918. A number of other methods are known in the art by which such chimeric DNAs can be made. The subject invention is meant to include chimeric proteins that utilize the novel sequences identified in the subject application.
With the teachings provided herein, one skilled in the art could readily produce and use the various toxins and polynucleotide sequences described herein.
Genes and toxins. The genes and toxins useful according to the subject invention include not only the full length sequences but also fragments of these sequences, variants, mutants, and fusion proteins which retain the characteristic pesticidal activity of the toxins specifically exemplified herein. Chimeric genes and toxins, produced by combining portions from more than one Bacillus toxin or gene, may also be utilized according to the teachings of the subject invention. As used herein, the terms "variants" or "variations" of genes refer to nucleotide sequences which encode the same toxins or which encode equivalent toxins having pesticidal activity. As used herein, the term "equivalent toxins" refers to toxins having the same or essentially the same biological activity against the target pests as the exemplified toxins.
It is apparent to a person skilled in this art that genes encoding active toxins can be identified and obtained through several means. The specific genes exemplified herein may be obtained from the isolates deposited at a culture depository as described above. These genes, or portions or variants thereof, may also be constructed synthetically, for example, by use of a gene synthesizer. Variations of genes may be readily constructed using standard techniques for making point mutations. Also, fragments of these genes can be made using commercially available exonucleases or endonucleases according to standard procedures. For example, enzymes such as Bal31 or site-directed mutagenesis can be used to systematically cut off nucleotides from the ends of these genes. Also, genes which encode active fragments may be obtained using a variety of restriction enzymes. Proteases may be used to directly obtain active fragments of these toxins.
Equivalent toxins and/or genes encoding these equivalent toxins can be derived from Bacillus isolates and/or DNA libraries using the teachings provided herein. There are a number of methods for obtaining the pesticidal toxins of the instant invention. For example, antibodies to the pesticidal toxins disclosed and claimed herein can be used to identify and isolate toxins from a mixture of proteins. Specifically, antibodies may be raised to the portions of the toxins which are most constant and most distinct from other Bacillus toxins. These antibodies can then be used to specifically identify equivalent toxins with the characteristic activity by immunoprecipitation, enzyme linked immunosorbent assay (ELISA), or Western blotting.
Antibodies to the toxins disclosed herein, or to equivalent toxins, or fragments of these toxins, can readily be prepared using standard procedures in this art. The genes which encode these toxins can then be obtained from the microorganism.
Fragments and equivalents which retain the pesticidal activity of the exemplified toxins are within the scope of the subject invention. Also, because of the redundancy of the genetic code, a variety of different DNA sequences can encode the amino acid sequences disclosed herein. It is well within the skill of a person trained in the art to create these alternative DNA sequences encoding the same, or essentially the same, toxins. These variant DNA sequences are within the scope of the subject invention. As used herein, reference to "essentially the same" sequence refers to sequences which have amino acid substitutions, deletions, additions, or insertions which do not materially affect pesticidal activity. Fragments retaining pesticidal activity are also included in this definition.
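The redundancy of the genetic code mentioned above can be illustrated with a short calculation. The following is a minimal sketch, assuming the standard genetic code; the function name and the four-residue example peptide are hypothetical and are not taken from the sequence listing.

```python
from math import prod

# Number of codons that specify each amino acid in the standard genetic
# code, keyed by three-letter abbreviation (61 sense codons in total).
CODONS_PER_RESIDUE = {
    "Leu": 6, "Ser": 6, "Arg": 6,
    "Ala": 4, "Gly": 4, "Pro": 4, "Thr": 4, "Val": 4,
    "Ile": 3,
    "Asn": 2, "Asp": 2, "Cys": 2, "Gln": 2, "Glu": 2,
    "His": 2, "Lys": 2, "Phe": 2, "Tyr": 2,
    "Met": 1, "Trp": 1,
}


def count_encoding_sequences(peptide):
    """Return how many distinct DNA sequences encode the given peptide.

    `peptide` is a list of three-letter amino acid abbreviations.
    """
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


# Hypothetical peptide used only for illustration.
example_peptide = ["Lys", "Ser", "Leu", "Thr"]
print(count_encoding_sequences(example_peptide))  # 2 * 6 * 6 * 4 = 288
```

Even a four-residue peptide can be encoded by hundreds of distinct DNA sequences, which is why alternative primers and probes encoding the same amino acid sequence are straightforward to design.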
A further method for identifying the toxins and genes of the subject invention is through the use of oligonucleotide probes. These probes are detectable nucleotide sequences. Probes provide a rapid method for identifying toxin-encoding genes of the subject invention. The nucleotide segments which are used as probes according to the invention can be synthesized using a DNA synthesizer and standard procedures.
Certain toxins of the subject invention have been specifically exemplified herein. Since these toxins are merely exemplary of the toxins of the subject invention, it should be readily apparent that the subject invention comprises variant or equivalent toxins (and nucleotide sequences coding for equivalent toxins) having the same or similar pesticidal activity of the exemplified toxin. Equivalent toxins will have amino acid homology with an exemplified toxin.
This amino acid identity will typically be greater than 69%, preferably be greater than more preferably greater than 80%, more preferably greater than 90%, and can be greater than These identities are as determined using standard alignment techniques. The amino acid homology will be highest in critical regions of the toxin which account for biological activity or are involved in the determination of three-dimensional configuration which ultimately is responsible for the biological activity. In this regard, certain amino acid substitutions are acceptable and can be expected if these substitutions are in regions which are not critical to activity or are conservative amino acid substitutions which do not affect the three-dimensional configuration of the molecule. For example, amino acids may be placed in the following classes: non-polar, uncharged polar, basic, and acidic. Conservative substitutions whereby an amino acid of one class is replaced with another amino acid of the same type fall within the scope of the subject invention so long as the substitution does not materially alter the biological activity of the compound. Table 2 provides a listing of examples of amino acids belonging to each class.
Table 2.
Class of Amino Acid    Examples of Amino Acids
Nonpolar               Ala, Val, Leu, Ile, Pro, Met, Phe, Trp
Uncharged Polar        Gly, Ser, Thr, Cys, Tyr, Asn, Gln
Acidic                 Asp, Glu
Basic                  Lys, Arg, His

In some instances, non-conservative substitutions can also be made. The critical factor is that these substitutions must not significantly detract from the biological activity of the toxin.
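The class-based notion of a conservative substitution can be sketched in code. This is only an illustration built on the four classes of Table 2; the function name is hypothetical, and whether a given substitution actually preserves pesticidal activity must still be determined by bioassay.

```python
# Amino acid classes as listed in Table 2 (three-letter abbreviations).
CLASSES = {
    "nonpolar": ["Ala", "Val", "Leu", "Ile", "Pro", "Met", "Phe", "Trp"],
    "uncharged polar": ["Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"],
    "acidic": ["Asp", "Glu"],
    "basic": ["Lys", "Arg", "His"],
}

# Invert the table so each residue maps to its class name.
CLASS_OF = {
    residue: name for name, members in CLASSES.items() for residue in members
}


def is_conservative(original, replacement):
    """Return True if both residues belong to the same Table 2 class."""
    return CLASS_OF[original] == CLASS_OF[replacement]


print(is_conservative("Lys", "Arg"))  # both basic
print(is_conservative("Leu", "Asp"))  # nonpolar versus acidic
```

The first call reports a conservative substitution and the second a non-conservative one; the second kind is not excluded by the text above, provided activity is retained.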
The δ-endotoxins of the subject invention can also be characterized in terms of the shape and location of toxin inclusions, which are described above.
As used herein, reference to "isolated" polynucleotides and/or "purified" toxins refers to these molecules when they are not associated with the other molecules with which they would be found in nature. Thus, reference to "isolated and purified" signifies the involvement of the "hand of man" as described herein. Chimeric toxins and genes also involve the "hand of man."
Recombinant hosts. The toxin-encoding genes of the subject invention can be introduced into a wide variety of microbial or plant hosts. Expression of the toxin gene results, directly or indirectly, in the production and maintenance of the pesticide. With suitable microbial hosts, e.g., Pseudomonas, the microbes can be applied to the situs of the pest, where they will proliferate and be ingested. The result is a control of the pest. Alternatively, the microbe hosting the toxin gene can be killed and treated under conditions that prolong the activity of the toxin and stabilize the cell. The treated cell, which retains the toxic activity, then can be applied to the environment of the target pest.
Where the Bacillus toxin gene is introduced via a suitable vector into a microbial host, and said host is applied to the environment in a living state, it is essential that certain host microbes be used. Microorganism hosts are selected which are known to occupy the "phytosphere" (phylloplane, phyllosphere, rhizosphere, and/or rhizoplane) of one or more crops of interest. These microorganisms are selected so as to be capable of successfully competing in the particular environment (crop and other insect habitats) with the wild-type microorganisms, provide for stable maintenance and expression of the gene expressing the polypeptide pesticide, and, desirably, provide for improved protection of the pesticide from environmental degradation and inactivation.
A large number of microorganisms are known to inhabit the phylloplane (the surface of the plant leaves) and/or the rhizosphere (the soil surrounding plant roots) of a wide variety of important crops. These microorganisms include bacteria, algae, and fungi. Of particular interest are microorganisms, such as bacteria, e.g., genera Pseudomonas, Erwinia, Serratia, Klebsiella, Xanthomonas, Streptomyces, Rhizobium, Rhodopseudomonas, Methylophilus, Agrobacterium, Acetobacter, Lactobacillus, Arthrobacter, Azotobacter, Leuconostoc, and Alcaligenes; fungi, particularly yeast, e.g., genera Saccharomyces, Cryptococcus, Kluyveromyces, Sporobolomyces, Rhodotorula, and Aureobasidium. Of particular interest are such phytosphere bacterial species as Pseudomonas syringae, Pseudomonas fluorescens, Serratia marcescens, Acetobacter xylinum, Agrobacterium tumefaciens, Rhodopseudomonas spheroides, Xanthomonas campestris, Rhizobium meliloti, Alcaligenes eutrophus, and Azotobacter vinelandii; and phytosphere yeast species such as Rhodotorula rubra, R. glutinis, R. marina, R. aurantiaca, Cryptococcus albidus, C. diffluens, C. laurentii, Saccharomyces rosei, S. pretoriensis, S. cerevisiae, Sporobolomyces roseus, S. odorus, Kluyveromyces veronae, and Aureobasidium pullulans. Of particular interest are the pigmented microorganisms.
A wide variety of ways are available for introducing a Bacillus gene encoding a toxin into a microorganism host under conditions which allow for stable maintenance and expression of the gene. These methods are well known to those skilled in the art and are described, for example, in United States Patent No. 5,135,867, which is incorporated herein by reference.
Synthetic genes which are functionally equivalent to the toxins of the subject invention can also be used to transform hosts. Methods for the production of synthetic genes can be found in, for example, U.S. Patent No. 5,380,831.
Treatment of cells. As mentioned above, Bacillus or recombinant cells expressing a Bacillus toxin can be treated to prolong the toxin activity and stabilize the cell. The pesticide microcapsule that is formed comprises the Bacillus toxin within a cellular structure that has been stabilized and will protect the toxin when the microcapsule is applied to the environment of the target pest. Suitable host cells may include either prokaryotes or eukaryotes. As hosts, of particular interest will be the prokaryotes and the lower eukaryotes, such as fungi. The cell will usually be intact and be substantially in the proliferative form when treated, rather than in a spore form.
Treatment of the microbial cell, e.g., a microbe containing the Bacillus toxin gene, can be by chemical or physical means, or by a combination of chemical and/or physical means, so long as the technique does not deleteriously affect the properties of the toxin, nor diminish the cellular capability of protecting the toxin. Methods for treatment of microbial cells are disclosed in United States Patent Nos. 4,695,455 and 4,695,462, which are incorporated herein by reference.
Methods and formulations for control of pests. Control of pests using the isolates, toxins, and genes of the subject invention can be accomplished by a variety of methods known to those skilled in the art. These methods include, for example, the application of Bacillus isolates to the pests (or their location), the application of recombinant microbes to the pests (or their locations), and the transformation of plants with genes which encode the pesticidal toxins of the subject invention. Transformations can be made by those skilled in the art using standard techniques.
Materials necessary for these transformations are disclosed herein or are otherwise readily available to the skilled artisan.
Formulated bait granules containing an attractant and the toxins of the Bacillus isolates, or recombinant microbes comprising the genes obtainable from the Bacillus isolates disclosed herein, can be applied to the soil. Formulated product can also be applied as a seed-coating or root treatment or total plant treatment at later stages of the crop cycle. Plant and soil treatments of Bacillus cells may be employed as wettable powders, granules or dusts, by mixing with various inert materials, such as inorganic minerals (phyllosilicates, carbonates, sulfates, phosphates, and the like) or botanical materials (powdered corncobs, rice hulls, walnut shells, and the like). The formulations may include spreader-sticker adjuvants, stabilizing agents, other pesticidal additives, or surfactants. Liquid formulations may be aqueous-based or non-aqueous and employed as foams, gels, suspensions, emulsifiable concentrates, or the like. The ingredients may include rheological agents, surfactants, emulsifiers, dispersants, or polymers.
As would be appreciated by a person skilled in the art, the pesticidal concentration will vary widely depending upon the nature of the particular formulation, particularly whether it is a concentrate or to be used directly. The pesticide will be present in at least 1% by weight and may be 100% by weight. The dry formulations will have from about 1-95% by weight of the pesticide while the liquid formulations will generally be from about 1-60% by weight of the solids in the liquid phase. The formulations that contain cells will generally have from about 10² to about 10⁴ cells/mg. These formulations will be administered at about 50 mg (liquid or dry) to 1 kg or more per hectare.
The formulations can be applied to the environment of the pest, e.g., soil and foliage, by spraying, dusting, sprinkling, or the like.
Polynucleotide probes. It is well known that DNA possesses a fundamental property called base complementarity. In nature, DNA ordinarily exists in the form of pairs of antiparallel strands, the bases on each strand projecting from that strand toward the opposite strand. The base adenine on one strand will always be opposed to the base thymine on the other strand, and the base guanine will be opposed to the base cytosine. The bases are held in apposition by their ability to hydrogen bond in this specific way. Though each individual bond is relatively weak, the net effect of many adjacent hydrogen bonded bases, together with base stacking effects, is a stable joining of the two complementary strands. These bonds can be broken by treatments such as high pH or high temperature, and these conditions result in the dissociation, or "denaturation," of the two strands. If the DNA is then placed in conditions which make hydrogen bonding of the bases thermodynamically favorable, the DNA strands will anneal, or "hybridize," and reform the original double stranded DNA. If carried out under appropriate conditions, this hybridization can be highly specific. That is, only strands with a high degree of base complementarity will be able to form stable double stranded structures. The relationship of the specificity of hybridization to reaction conditions is well known. Thus, hybridization may be used to test whether two pieces of DNA are complementary in their base sequences. It is this hybridization mechanism which facilitates the use of probes of the subject invention to readily detect and characterize DNA sequences of interest.
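The pairing rule described above (adenine with thymine, guanine with cytosine, on antiparallel strands) can be expressed compactly. The following is a minimal sketch; the function names and the short example sequence are hypothetical and serve only to illustrate the rule.

```python
# Watson-Crick pairing: A<->T and G<->C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the antiparallel complementary strand, written 5' to 3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]


def is_perfect_complement(strand_a, strand_b):
    """Return True if the two strands can pair at every position."""
    return reverse_complement(strand_a) == strand_b.upper()


probe = "ATGC"  # hypothetical four-base example
print(reverse_complement(probe))              # GCAT
print(is_perfect_complement(probe, "GCAT"))   # True
```

Real probes need not pair perfectly with their targets; this sketch only covers the limiting case of complete complementarity.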
The probes may be RNA or DNA. The probe will normally have at least about 10 bases, more usually at least about 17 bases, and may have up to about 100 bases or more. Longer probes can readily be utilized, and such probes can be, for example, several kilobases in length.
The probe sequence is designed to be at least substantially complementary to a portion of a gene encoding a toxin of interest. The probe need not have perfect complementarity to the sequence to which it hybridizes. The probes may be labelled utilizing techniques which are well known to those skilled in this art.
One approach for the use of the subject invention as probes entails first identifying by Southern blot analysis of a gene bank of the Bacillus isolate all DNA segments homologous with the disclosed nucleotide sequences. Thus, it is possible, without the aid of biological analysis, to know in advance the probable activity of many new Bacillus isolates, and of the individual gene products expressed by a given Bacillus isolate. Such a probe analysis provides a rapid method for identifying potentially commercially valuable insecticidal toxin genes within the multifarious subspecies of B.t.
One hybridization procedure useful according to the subject invention typically includes the initial steps of isolating the DNA sample of interest and purifying it chemically. Either lysed bacteria or total fractionated nucleic acid isolated from bacteria can be used. Cells can be treated using known techniques to liberate their DNA (and/or RNA). The DNA sample can be cut into pieces with an appropriate restriction enzyme. The pieces can be separated by size through electrophoresis in a gel, usually agarose or acrylamide. The pieces of interest can be transferred to an immobilizing membrane.
The particular hybridization technique is not essential to the subject invention. As improvements are made in hybridization techniques, they can be readily applied.
The probe and sample can then be combined in a hybridization buffer solution and held at an appropriate temperature until annealing occurs. Thereafter, the membrane is washed free of extraneous materials, leaving the sample and bound probe molecules typically detected and quantified by autoradiography and/or liquid scintillation counting. As is well known in the art, if the probe molecule and nucleic acid sample hybridize by forming a strong non-covalent bond between the two molecules, it can be reasonably assumed that the probe and sample are essentially identical. The probe's detectable label provides a means for determining in a known manner whether hybridization has occurred.
In the use of the nucleotide segments as probes, the particular probe is labeled with any suitable label known to those skilled in the art, including radioactive and non-radioactive labels.
Typical radioactive labels include or the like. Non-radioactive labels include, for example, ligands such as biotin or thyroxine, as well as enzymes such as hydrolases or peroxidases, or the various chemiluminescers such as luciferin, or fluorescent compounds like fluorescein and its derivatives. The probes may be made inherently fluorescent as described in International Application No. WO 93/16094.
Various degrees of stringency of hybridization can be employed. The more severe the conditions, the greater the complementarity that is required for duplex formation. Severity can be controlled by temperature, probe concentration, probe length, ionic strength, time, and the like. Preferably, hybridization is conducted under moderate to high stringency conditions by techniques well known in the art, as described, for example, in Keller, M.M. Manak (1987) DNA Probes, Stockton Press, New York, NY., pp. 169-170.
As used herein "moderate to high stringency" conditions for hybridization refers to conditions which achieve the same, or about the same, degree of specificity of hybridization as the conditions employed by the current applicants. Examples of moderate and high stringency conditions are provided herein. Specifically, hybridization of immobilized DNA on Southern blots with 32P-labeled gene-specific probes was performed by standard methods (Maniatis et al.). In general, hybridization and subsequent washes were carried out under moderate to high stringency conditions that allowed for detection of target sequences with homology to the exemplified toxin genes. For double-stranded DNA gene probes, hybridization was carried out overnight at 20-25°C below the melting temperature (Tm) of the DNA hybrid in 6X SSPE, Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The melting temperature is described by the following formula (Beltz, K.A. Jacobs, T.H. Eickbush, P.T. Cherbas, and F.C. Kafatos [1983] Methods of Enzymology, R. Wu, L. Grossman and K. Moldave [eds.] Academic Press, New York 100:266-285).
Tm = 81.5°C + 16.6 Log[Na+] + 0.41(%G+C) - 0.61(%formamide) - 600/length of duplex in base pairs.
Washes are typically carried out as follows:
(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low stringency wash).
(2) Once at Tm-20°C for 15 minutes in 0.2X SSPE, 0.1% SDS (moderate stringency wash).
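The melting-temperature relation for long duplexes can be evaluated numerically. The sketch below assumes the commonly cited form of the relation, with a 0.41(%G+C) term and a -0.61(%formamide) term; the function names and all input values (1.0 M sodium, 50% G+C, no formamide, 600 base pairs) are hypothetical illustrations, not conditions reported in this specification.

```python
import math


def duplex_tm_celsius(sodium_molar, percent_gc, percent_formamide, length_bp):
    """Estimate the Tm of a long DNA duplex in degrees Celsius.

    Assumed form: 81.5 + 16.6*log10[Na+] + 0.41*(%G+C)
                  - 0.61*(%formamide) - 600/length.
    """
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * percent_gc
        - 0.61 * percent_formamide
        - 600.0 / length_bp
    )


def hybridization_window(tm):
    """Return the (low, high) hybridization range, 20-25 degrees below Tm."""
    return tm - 25.0, tm - 20.0


# Hypothetical inputs chosen so the arithmetic is easy to follow.
tm = duplex_tm_celsius(sodium_molar=1.0, percent_gc=50.0,
                       percent_formamide=0.0, length_bp=600)
low, high = hybridization_window(tm)
print(round(tm, 1), round(low, 1), round(high, 1))  # about 101.0, 76.0, 81.0
```

With these inputs the moderate stringency wash at Tm-20°C would fall at the upper end of the hybridization window. Lower sodium or added formamide reduces the estimate, which is how salt and formamide are used to adjust stringency.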
For oligonucleotide probes, hybridization was carried out overnight at 10-20°C below the melting temperature (Tm) of the hybrid in 6X SSPE, 5X Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. Tm for oligonucleotide probes was determined by the following formula:
Tm (°C) = 2(number T/A base pairs) + 4(number G/C base pairs)
(Suggs, T. Miyake, E.H. Kawashime, M.J. Johnson, K. Itakura, and R.B. Wallace [1981] ICN-UCLA Symp. Dev. Biol. Using Purified Genes, D.D. Brown [ed.], Academic Press, New York, 23:683-693).
Washes were typically carried out as follows:
(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low stringency wash).
(2) Once at the hybridization temperature for 15 minutes in 1X SSPE, 0.1% SDS (moderate stringency wash).
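The oligonucleotide rule above (2°C per T/A pair plus 4°C per G/C pair) is simple enough to compute directly. The following is a minimal sketch; the function name and the 17-base example sequence are hypothetical and are not primers from Table 1.

```python
def oligo_tm_celsius(oligo):
    """Estimate oligonucleotide Tm as 2*(A+T count) + 4*(G+C count)."""
    seq = oligo.upper()
    at_pairs = seq.count("A") + seq.count("T")
    gc_pairs = seq.count("G") + seq.count("C")
    if at_pairs + gc_pairs != len(seq):
        raise ValueError("sequence must contain only A, C, G, and T")
    return 2 * at_pairs + 4 * gc_pairs


# Hypothetical 17-mer: 9 A/T bases and 8 G/C bases.
probe = "ATGCATGCATGCATGCA"
tm = oligo_tm_celsius(probe)
print(tm)                # 2*9 + 4*8 = 50
print(tm - 20, tm - 10)  # hybridization range, 10-20 degrees below Tm
```

For this example the overnight hybridization would be run between 30 and 40°C. The rule is an approximation intended for short probes and ignores salt concentration.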
In general, salt and/or temperature can be altered to change stringency. With a labeled DNA fragment >70 or so bases in length, the following conditions can be used:
Low: 1 or 2X SSPE, room temperature
Low: 1 or 2X SSPE, 42°C
Moderate: 0.2X or 1X SSPE,
High: 0.1X SSPE,
Duplex formation and stability depend on substantial complementarity between the two strands of a hybrid, and, as noted above, a certain degree of mismatch can be tolerated. Therefore, the probe sequences of the subject invention include mutations (both single and multiple), deletions, insertions of the described sequences, and combinations thereof, wherein said mutations, insertions and deletions permit formation of stable hybrids with the target polynucleotide of interest. Mutations, insertions, and deletions can be produced in a given polynucleotide sequence in many ways, and these methods are known to an ordinarily skilled artisan. Other methods may become known in the future.
Thus, mutational, insertional, and deletional variants of the disclosed nucleotide sequences can be readily prepared by methods which are well known to those skilled in the art.
These variants can be used in the same manner as the exemplified primer sequences so long as the variants have substantial sequence homology with the original sequence. As used herein, substantial sequence homology refers to homology which is sufficient to enable the variant probe to function in the same capacity as the original probe. Preferably, this homology is greater than 50%; more preferably, this homology is greater than 75%; and most preferably, this homology is greater than 90%. The degree of homology needed for the variant to function in its intended capacity will depend upon the intended use of the sequence. It is well within the skill of a person trained in this art to make mutational, insertional, and deletional mutations which are designed to improve the function of the sequence or otherwise provide a methodological advantage.
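As a rough illustration of the homology thresholds stated above, the following Python sketch scores a variant against an original sequence. It assumes, purely for illustration, that "homology" is measured as percent identity over two pre-aligned sequences of equal length; the specification does not fix a particular algorithm, and the names `percent_identity` and `homology_tier` are hypothetical.

```python
def percent_identity(original: str, variant: str) -> float:
    """Percent of positions that match between two equal-length, pre-aligned sequences."""
    if not original or len(original) != len(variant):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(original.upper(), variant.upper()))
    return 100.0 * matches / len(original)


def homology_tier(pct: float) -> str:
    """Map a percentage onto the >50%, >75%, >90% preference tiers in the text."""
    if pct > 90:
        return "most preferably"
    if pct > 75:
        return "more preferably"
    if pct > 50:
        return "preferably"
    return "below the stated range"


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, homology_tier(pct))
```

Sequences that differ in length (insertions or deletions) would first need an alignment step, which this sketch deliberately leaves out.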
PCR technology. Polymerase Chain Reaction (PCR) is a repetitive, enzymatic, primed synthesis of a nucleic acid sequence. This procedure is well known and commonly used by those skilled in this art (see Mullis, U.S. Patent Nos. 4,683,195, 4,683,202, and 4,800,159; Saiki, Randall K., Stephen Scharf, Fred Faloona, Kary B. Mullis, Glenn T. Horn, Henry A. Erlich, Norman Arnheim [1985] "Enzymatic Amplification of β-Globin Genomic Sequences and Restriction Site Analysis for Diagnosis of Sickle Cell Anemia," Science 230:1350-1354). PCR is based on the enzymatic amplification of a DNA fragment of interest that is flanked by two oligonucleotide primers that hybridize to opposite strands of the target sequence. The primers are oriented with the 3' ends pointing towards each other. Repeated cycles of heat denaturation of the template, annealing of the primers to their complementary sequences, and extension of the annealed primers with a DNA polymerase result in the amplification of the segment defined by the 5' ends of the PCR primers. Since the extension product of each primer can serve as a template for the other primer, each cycle essentially doubles the amount of DNA fragment produced in the previous cycle. This results in the exponential accumulation of the specific target fragment, up to several million-fold in a few hours. By using a thermostable DNA polymerase such as Taq polymerase, which is isolated from the thermophilic bacterium Thermus aquaticus, the amplification process can be completely automated. Other enzymes which can be used are known to those skilled in the art.
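The doubling argument above can be checked with a short Python sketch. The `efficiency` parameter is an assumption added for illustration (1.0 means perfect doubling each cycle); it does not appear in the specification, and real reactions plateau well before the ideal figure.

```python
def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Idealized fold increase after a number of cycles: (1 + efficiency) ** cycles."""
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 22, 30):
        print(n, fold_amplification(n))
```

Under perfect doubling, 22 cycles already exceed a four-million-fold increase, which is consistent with the "several million-fold" figure in the text.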
The DNA sequences of the subject invention can be used as primers for PCR amplification. In performing PCR amplification, a certain degree of mismatch can be tolerated between primer and template. Therefore, mutations, deletions, and insertions (especially additions of nucleotides to the 5' end) of the exemplified primers fall within the scope of the subject invention. Mutations, insertions and deletions can be produced in a given primer by methods known to an ordinarily skilled artisan.
All of the U.S. patents cited herein are hereby incorporated by reference.
Following are examples which illustrate procedures for practicing the invention. These examples should not be construed as limiting. All percentages are by weight and all solvent mixture proportions are by volume unless otherwise noted.
Example 1 - Culturing of Bacillus Isolates Useful According to the Invention
Growth of cells. The cellular host containing the Bacillus insecticidal gene may be grown in any convenient nutrient medium. These cells may then be harvested in accordance with conventional ways. Alternatively, the cells can be treated prior to harvesting.
The Bacillus cells of the invention can be cultured using standard art media and fermentation techniques. During the fermentation cycle, the bacteria can be harvested by first separating the Bacillus vegetative cells, spores, crystals, and lysed cellular debris from the fermentation broth by means well known in the art. Any Bacillus spores or crystal δ-endotoxins formed can be recovered employing well-known techniques and used as a conventional δ-endotoxin B.t. preparation. The supernatant from the fermentation process contains the toxins of the present invention. The toxins are isolated and purified employing well-known techniques.
A subculture of Bacillus isolates, or mutants thereof, can be used to inoculate the following medium, known as TB broth:
Tryptone        12 g/l
Yeast Extract   24 g/l
Glycerol        4 g/l
KH2PO4          2.1 g/l
K2HPO4          14.7 g/l
pH 7.4
The potassium phosphate was added to the autoclaved broth after cooling. Flasks were incubated at 30°C on a rotary shaker at 250 rpm for 24-36 hours.
The above procedure can be readily scaled up to large fermentors by procedures well known in the art.
The Bacillus obtained in the above fermentation can be isolated by procedures well known in the art. A frequently-used procedure is to subject the harvested fermentation broth to separation techniques, e.g., centrifugation. In a specific embodiment, Bacillus proteins useful according to the present invention can be obtained from the supernatant. The culture supernatant containing the active protein(s) can be used in bioassays.
Alternatively, a subculture of Bacillus isolates, or mutants thereof, can be used to inoculate the following peptone, glucose, salts medium:
Bacto Peptone    7.5 g/l
Glucose          1.0 g/l
KH2PO4           3.4 g/l
K2HPO4           4.35 g/l
Salt Solution    5.0 ml/l
CaCl2 Solution   5.0 ml/l
pH 7.2
Salts Solution (100 ml)
MgSO4·7H2O       2.46 g
MnSO4            0.04 g
ZnSO4·7H2O       0.28 g
FeSO4·7H2O       0.40 g
CaCl2 Solution (100 ml)
CaCl2·2H2O       3.66 g
The salts solution and CaCl2 solution are filter-sterilized and added to the autoclaved and cooked broth at the time of inoculation. Flasks are incubated at 30°C on a rotary shaker at 200 rpm for 64 hr.
The above procedure can be readily scaled up to large fermentors by procedures well known in the art.
The Bacillus spores and/or crystals, obtained in the above fermentation, can be isolated by procedures well known in the art. A frequently-used procedure is to subject the harvested fermentation broth to separation techniques, e.g., centrifugation.
Example 2 - Isolation and Preparation of Cellular DNA for PCR
DNA can be prepared from cells grown on Spizizen's agar, or other minimal or enriched agar known to those skilled in the art, for approximately 16 hours. Spizizen's casamino acid agar comprises 23.2 g/l Spizizen's minimal salts [(NH4)2SO4, 120 g; K2HPO4, 840 g; KH2PO4, 360 g; sodium citrate, 60 g; MgSO4·7H2O, 12 g. Total: 1392 g]; 1.0 g/l vitamin-free casamino acids; 15.0 g/l Difco agar. In preparing the agar, the mixture was autoclaved for 30 minutes, then a sterile, 50% glucose solution can be added to a final concentration of 0.5% (1/100 vol). Once the cells are grown for about 16 hours, an approximately 1 cm² patch of cells can be scraped from the agar into 300 µl of 10 mM Tris-HCl (pH 8.0)-1 mM EDTA. Proteinase K was added to 50 µg/ml and incubated at 55°C for 15 minutes. Other suitable proteases lacking nuclease activity can be used. The samples were then placed in a boiling water bath for 15 minutes to inactivate the proteinase and denature the DNA. This also precipitates unwanted components.
The samples are then centrifuged at 14,000 x g in an Eppendorf microfuge at room temperature for 5 minutes to remove cellular debris. The supernatants containing crude DNA were transferred to fresh tubes and frozen at -20°C until used in PCR reactions.
Alternatively, total cellular DNA may be prepared from plate-grown cells using the QIAamp Tissue Kit from Qiagen (Santa Clarita, CA), following instructions from the manufacturer.
Example 3 - Use of PCR Primers to Characterize and/or Identify Toxin Genes
Two primers useful in PCR procedures were designed to identify genes that encode pesticidal toxins. Preferably, these toxins are active against lepidopteran insects. The DNA from B.t. strains was subjected to PCR using these primers. Two clearly distinguishable molecular weight bands were visible in "positive" strains, as outlined below. The frequency of strains yielding a 339 bp fragment was 29/95. This fragment is referred to herein as the "339 bp fragment" even though some small deviation in the exact number of base pairs may be observed.
GARCCRTGGA AAGCAAATAA TAARAATGC (SEQ ID NO. )
AAARTTATCT CCCCAWGCTT CATCTCCATT TTG (SEQ ID NO. 2)
The strains which were positive for the 339 bp fragment (29 strains) were: PS11B, PS31G1, PS36A, PS49C, PS81A2, PS81F, PS81GG, PS81I, PS85A1, PS86BB1, PS86V1, PS86W1, PS89J3, PS91C2, PS94R1, PS101DD, PS158C2, PS185U2, PS192M4, PS202S, PS213E5, PS218G2, PS244A2, HD29, HD110, HD129, HD525, HD573a, and Javelin 1990.
The 24 strains which gave a larger (approximately 1.2 kb) fragment were: PS24J, PS33F2, PS45B1, PS52A1, PS62B1, PS80PP3, PS86A1, PS86Q3, PS88F16, PS92B, PS101Z2, PS123D1, PS157C1, PS169E, PS177F1, PS177G, PS185L2, PS201L1, PS204C3, PS204G4, PS242H10, PS242K17, PS244A2, PS244D1.
It was found that Bacillus strains producing lepidopteran-active proteins yielded only the 339 bp fragment. Few, if any, of the strains amplifying the approximately 1.2 kb fragment had known lepidopteran activity, but rather were coleopteran-, mite-, and/or nematode-active B.t. crystal protein producing strains.
Example 4 - DNA Sequencing of Toxin Genes Producing the 339 Fragment
PCR-amplified segments of toxin genes present in Bacillus strains can be readily sequenced. To accomplish this, amplified DNA fragments can be first cloned into the PCR DNA TA-cloning plasmid vector, pCRII, as described by the supplier (Invitrogen, San Diego, CA). Individual pCR clones from the mixture of amplified DNA fragments from each Bacillus strain are chosen for sequencing. Colonies are lysed by boiling to release crude plasmid DNA.
DNA templates for automated sequencing are amplified by PCR using vector-specific primers flanking the plasmid multiple cloning sites. These DNA templates are sequenced using Applied Biosystems (Foster City, CA) automated sequencing methodologies. The polypeptide sequences can be deduced from these nucleotide sequences.
DNA from three of the 29 B.t. strains which amplified the 339 bp fragment was sequenced. A DNA sequence encoding a toxin from strain PS36A is shown in SEQ ID NO. 3.
An amino acid sequence for the 36A toxin is shown in SEQ ID NO. 4. A DNA sequence encoding a toxin from strain PS81F is shown in SEQ ID NO. 5. An amino acid sequence for the 81F toxin is shown in SEQ ID NO. 6. A DNA sequence encoding a toxin from strain Javelin 1990 is shown in SEQ ID NO. 7. An amino acid sequence for the Javelin 1990 toxin is shown in SEQ ID NO. 8.
Example 5 - Determination of DNA Sequences from Additional Genes Encoding Toxins from Strains PS158C2 and PS49C
Genes encoding novel toxins were identified from isolates PS158C2 and PS49C as follows: Total cellular DNA was extracted from B.t. strains using Qiagen (Santa Clarita, CA) Genomic-tip 500/G DNA extraction kits according to the supplier and was subjected to PCR using the oligonucleotide primer pairs listed below. Amplified DNA fragments were purified on Qiagen PCR purification columns and were used as templates for sequencing.
For PS158C2, the primers used were as follows.
158C2 PRIMER A: GCTCTAGAAGGAGGTAACTTATGAACAAGAATAATACTAAATTAAGC (SEQ ID NO. 9)
339 reverse: AAARTTATCT CCCCAWGCTT CATCTCCATT TTG (SEQ ID NO. 2)
The resulting PCR-amplified DNA fragment was approximately 2 kbp in size. This DNA was partially sequenced by dideoxy chain termination using automated DNA sequencing technology (Perkin Elmer/Applied Biosystems, Foster City, CA). A DNA sequence encoding a portion of a soluble toxin from PS158C2 is shown in SEQ ID NO.
For PS49C, two separate DNA fragments encoding parts of a novel toxin gene were amplified and sequenced. The first fragment was amplified using the following primer pair:
49C PRIMER A: CATCCTCCCTACACTTTCTAA (SEQ ID NO. 11)
339 reverse: AAARTTATCT CCCCAWGCTT CATCTCCATT TTG (SEQ ID NO. 2)
The resulting approximately 1 kbp DNA fragment was used as a template for automated DNA sequencing. A sequence of a portion of a toxin gene from strain PS49C is shown in SEQ ID NO.
The second fragment was amplified using the following primer pair:
49C PRIMER B: AAATTATGCGCTAAGTCTGC (SEQ ID NO. 13)
49C PRIMER C: TTGATCCGGACATAATAAT (SEQ ID NO. 14)
The resulting approximately 0.57 kbp DNA fragment was used as a template for automated DNA sequencing. An additional sequence of a portion of the toxin gene from PS49C is shown in SEQ ID NO.
Example 6 - Additional Primers Useful for Characterizing and/or Identifying Toxin Genes
The following primer pair can be used to identify and/or characterize genes of the SUP-1 family:
SUP-1A: GGATTCGTTATCAGAAA (SEQ ID NO. 53)
SUP-1B: CTGTYGCTAACAATGTC (SEQ ID NO. 54)
These primers can be used in PCR procedures to amplify a fragment having a predicted size of approximately 370 bp. A band of the predicted size was amplified from strains PS158C2 and PS49C.
Example 7 - Additional Primers Useful for Characterizing and/or Identifying Toxin Genes
Another set of PCR primers can be used to identify and/or characterize additional genes encoding pesticidal toxins. The sequences of these primers were as follows:
GGRTTAMTTGGRTAYTATTT (SEQ ID NO. 16)
ATATCKWAYATTKGCATTTA (SEQ ID NO. 17)
Redundant nucleotide codes used throughout the subject disclosure are in accordance with the IUPAC convention and include:
R = A or G
M = A or C
Y = C or T
K = G or T
W = A or T
Example 8 - Identification and Sequencing of Genes Encoding Novel Soluble Protein Toxins from Bacillus Strains
PCR using primers SEQ ID NO. 16 and SEQ ID NO. 17 was performed on total cellular genomic DNA isolated from a broad range of B.t. strains. Those samples yielding an approximately 1 kb band were selected for characterization by DNA sequencing. Amplified DNA fragments were first cloned into the PCR DNA TA-cloning plasmid vector, pCR2.1, as described by the supplier (Invitrogen, San Diego, CA). Plasmids were isolated from recombinant clones and tested for the presence of an approximately 1 kbp insert by PCR using the plasmid vector primers, T3 and T7.
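The redundant nucleotide codes listed in Example 7 can be expanded mechanically. The Python sketch below enumerates the unambiguous oligonucleotides represented by a degenerate primer; the names `IUPAC`, `expand_primer`, and `degeneracy` are hypothetical, and the lookup table covers only the five redundant codes listed in Example 7 plus the four plain bases.

```python
from itertools import product

# Only the codes listed in Example 7, plus the unambiguous bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "M": "AC", "Y": "CT", "K": "GT", "W": "AT",
}


def expand_primer(primer: str) -> list[str]:
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]  # KeyError on unknown codes
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(primer: str) -> int:
    """Number of distinct sequences in the primer pool."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


if __name__ == "__main__":
    for name, seq in (("SEQ ID NO. 16", "GGRTTAMTTGGRTAYTATTT"),
                      ("SEQ ID NO. 17", "ATATCKWAYATTKGCATTTA")):
        print(name, degeneracy(seq))
```

Each of the two primers carries four two-fold redundant positions, so each corresponds to a pool of 16 distinct sequences.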
The following strains yielded the expected band of approximately 1000 bp, thus indicating the presence of a MIS-type toxin gene: PS10E1, PS31J2, PS33D2, PS66D3, PS68F, PS69AA2, PS168G1, PS177C8, PS177I8, PS185AA2, PS196F3, PS196J4, PS197T1, PS197U2, PS202E1, KB33, and KB38.
Plasmids were then isolated for use as sequencing templates using QIAGEN (Santa Clarita, CA) miniprep kits as described by the supplier. Sequencing reactions were performed using the Dye Terminator Cycle Sequencing Ready Reaction Kit from PE Applied Biosystems.
Sequencing reactions were run on an ABI PRISM 377 Automated Sequencer. Sequence data was collected, edited, and assembled using the ABI PRISM 377 Collection, Factura, and AutoAssembler software from PE ABI.
DNA sequences were determined for portions of novel toxin genes from the following isolates: PS10E1, PS31J2, PS33D2, PS66D3, PS68F, PS69AA2, PS168G1, PS177C8, PS177I8, PS185AA2, PS196F3, PS196J4, PS197T1, PS197U2, PS202E1, KB33, and KB38. Polypeptide sequences were deduced for portions of the encoded, novel soluble toxins from the following isolates: PS10E1, PS31J2, PS33D2, PS66D3, PS68F, PS69AA2, PS177C8, PS177I8, PS185AA2, PS196F3, PS196J4, PS197T1, PS197U2, and PS202E1. These nucleotide sequences and amino acid sequences are shown in SEQ ID NOS. 18 to 48.
Example 9 - Restriction Fragment Length Polymorphism (RFLP) of Toxins from Bacillus thuringiensis Strains
Total cellular DNA was prepared from various Bacillus thuringiensis strains grown to an optical density of 0.5-0.8 at 600 nm visible light. DNA was extracted using the Qiagen Genomic-tip 500/G kit and Genomic DNA Buffer Set according to protocol for Gram positive bacteria (Qiagen Inc.; Valencia, CA).
Standard Southern hybridizations using 32P-labeled probes were used to identify and characterize novel toxin genes within the total genomic DNA preparations. Prepared total genomic DNA was digested with various restriction enzymes, electrophoresed on a 1% agarose gel, and immobilized on a supported nylon membrane using standard methods (Maniatis et al.). PCR-amplified DNA fragments 1.0-1.1 kb in length were gel purified for use as probes.
Approximately 25 ng of each DNA fragment was used as a template for priming nascent DNA synthesis using DNA polymerase I Klenow fragment (New England Biolabs), random hexanucleotide primers (Boehringer Mannheim) and 32P-dCTP.
Each 32P-labeled fragment served as a specific probe to its corresponding genomic DNA blot. Hybridizations of immobilized DNA with randomly labeled 32P probes were performed in standard aqueous buffer consisting of 5X SSPE, 5X Denhardt's solution, 0.5% SDS, 0.1 mg/ml denatured DNA at 65°C overnight. Blots were washed under moderate stringency in 0.2X SSC, 0.1% SDS at 0 C and exposed to film. RFLP data showing specific hybridization bands containing all or part of the novel gene of interest was obtained for each strain.
(Strain)/Gene Name   Probe Seq I.D. Number   RFLP Data (approximate band sizes)
(PS)10E1     18   EcoRI: 4 and 9 kbp, EcoRV: 4.5 and 6 kbp, KpnI: 12 and 24 kbp, SacI: 13 and 24 kbp, SalI: >23 kbp, 5 and 15 kbp
(PS)31J2     20   ApaI: >23 kbp, BglII: 6.5 kbp, PstI: >23 kbp, SacI: kbp, SalI: >23 kbp, XbaI: 5 kbp
(PS)33D2     22   EcoRI: 10 kbp, EcoRV: 15 kbp, HindIII: 18 kbp, 9.5 kbp, PstI: 8 kbp
(PS)66D3     24   BamHI: 4.5 kbp, HindIII: >23 kbp, KpnI: 23 kbp, 15 kbp, XbaI: >23 kbp
(PS)68F      26   EcoRI: 8.5 and 15 kbp, EcoRV: 7 and 18 kbp, HindIII: 2.1 and 9.5 kbp, PstI: 3 and 18 kbp, XbaI: 15 kbp
(PS)69AA2    28   EcoRV: 9.5 kbp, HindIII: 18 kbp, KpnI: 23 kbp, >23 kbp, PstI: 10 kbp, SalI: >23 kbp
(PS)168G1    30   EcoRI: 10 kbp, EcoRV: 3.5 kbp, NheI: 20 kbp, 20 kbp, SalI: >23 kbp, XbaI: 1s kbp
(PS)177C8    31   HindIII: 2 kbp, XbaI: 1, 9 and 131kbp
(PS)177I8    33   BamHI: >23 kbp, EcoRI: 10 kbp, HindIII: 2 kbp, >23 kbp, XbaI: 3.5 kbp
(PS)185AA2   35   EcoRI: 7 kbp, EcoRV: 10 kbp, NheI: 4, PstI: 3 kbp, SalI: >23 kbp, XbaI: 4 kbp
(PS)196F3    37   EcoRI: 8 kbp, EcoRV: 9 kbp, NheI: 18 kbp, PstI: 18 kbp, SalI: 20 kbp, XbaI: 7 kbp
(PS)196J4    39   BamHI: >23 kbp, EcoRI: 3.5 and 4.5 kbp, PstI: 9 and kbp, SalI: >23 kbp, XbaI: 2.4 and 12 kbp
(PS)197T1    41   HindIII: 10 kbp, KpnI: 20 kbp, PstI: 20 kbp, SacI: , SpeI: 15 kbp, XbaI: 5 kbp
(PS)197U2    43   EcoRI: 5 kbp, EcoRV: 1.9 kbp, NheI: 20 kbp, PstI: kbp, SalI: >23 kbp, XbaI: 7 kbp
(PS)202E1    45   EcoRV: 7 kbp, KpnI: 12 kbp, NheI: 10 kbp, PstI: , SalI: 23 kbp, XbaI: 1.8 kbp
KB33         47   EcoRI: 9 kbp, EcoRV: 6 kbp, HindIII: 8 kbp, KpnI: kbp, NheI: 22 kbp, SalI: >23 kbp
KB38         48   BamHI: 5.5 kbp, EcoRV: 22 kbp, HindIII: 2.2 kbp, kbp, PstI: >23 kbp
*Enzymes used in genomic DNA digests were chosen on the basis of lacking recognition sites within the sequence of the PCR fragments used as probes for each sample (except 177C8, for which the entire operon containing >1 XbaI site within the sequence was used).
Strains indicated by asterisk contain more than one gene with high homology to the probe used, as indicated by the presence of multiple hybridizing bands.
Example 10 - Use of Additional PCR Primers for Characterizing and/or Identifying Novel Genes
Another set of PCR primers can be used to identify additional novel genes encoding pesticidal toxins. The sequences of these primers were as follows:
ICON-forward: C'IGAYTIAAARATGATRTA (SEQ ID NO. 49)
ICON-reverse: AATRGCSWATAAATAMGCACC (SEQ ID NO. 50)
These primers can be used in PCR procedures to amplify a fragment having a predicted size of about 450 bp.
Strains PS177C8, PS177I8, and PS66D3 were screened and were found to have genes amplifiable with these ICON primers. A sequence of a toxin gene from PS177C8 is shown in SEQ ID NO. 51. An amino acid sequence of the 177C8-ICON toxin is shown in SEQ ID NO. 52.
Example 11 - Use of Mixed Primer Pairs to Characterize and/or Identify Toxin Genes
Various combinations of the primers described herein can be used to identify and/or characterize toxin genes. PCR conditions can be used as indicated below:
                SEQ ID NO. 16/17         SEQ ID NO. 49/50         SEQ ID NO. 49/17
Pre-denature    94°C 1 min.              94°C 1 min.              94°C 1 min.
Program Cycle   94°C 1 min.              94°C 1 min.              94°C 1 min.
                42°C 2 min.              42°C 2 min.              42°C 2 min.
                72°C 3 min.              72°C 3 min.              72°C 3 min. 5 sec/cycle
                Repeat cycle 29 times    Repeat cycle 29 times    Repeat cycle 29 times
                Hold 4°C                 Hold 4°C                 Hold 4°C
Using the above protocol, a strain harboring a MIS-type of toxin would be expected to yield a 1000 bp fragment with the SEQ ID NO. 16/17 primer pair. A strain harboring a WAR-type of toxin would be expected to amplify a fragment of about 475 bp with the SEQ ID NO. 49/50 primer pair, or a fragment of about 1800 bp with the SEQ ID NO. 49/17 primer pair. The amplified fragments of the expected size were found in four strains. The results are reported in Table 3.
Table 3. Approximate Amplified Fragment Sizes (bp)
Strain     SEQ ID NO. 16/17   SEQ ID NO. 49/50        SEQ ID NO. 49/17
PS66D3     1000               900, 475                1800
PS177C8    1000               475                     1800
PS177I8    1000               900, 550, 475           1800
PS217U2    1000               2500, 1500, 900, 475    no band detected
Example 12 - Characterization and/or Identification of WAR Toxins
In a further embodiment of the subject invention, pesticidal toxins can be characterized and/or identified by their level of reactivity with antibodies to pesticidal toxins exemplified herein. In a specific embodiment, antibodies can be raised to WAR toxins such as the toxin obtainable from PS177C8a. Other WAR toxins can then be identified and/or characterized by their reactivity with the antibodies. In a preferred embodiment, the antibodies are polyclonal antibodies. In this example, toxins with the greatest similarity to the 177C8a-WAR toxin would have the greatest reactivity with the polyclonal antibodies. WAR toxins with greater diversity react with the 177C8a polyclonal antibodies, but to a lesser extent. Toxins which immunoreact with polyclonal antibodies raised to the 177C8a WAR toxin can be obtained from, for example, the isolates designated PS177C8a, PS177I8, PS66D3, KB68B55-2, PS185Y2, PS146F, KB53A49-4, PS175I4, KB68B51-2, PS28K1, PS31F2, KB58B46-2, and PS146D. Such diverse WAR toxins can be further characterized by, for example, whether or not their genes can be amplified with ICON primers. For example, the following isolates do not have polynucleotide sequences which are amplified by ICON primers: PS177C8a, PS177I8, PS66D3, KB68B55-2, PS185Y2, PS146F, KB53A49-4, PS175I4, KB68B51-2, PS28K1, PS31F2, KB58B46-2, and PS146D. Of these, isolates PS28K1, PS31F2, KB68B46-2, and PS146D show the weakest antibody reactivity, suggesting advantageous diversity.
Example 13 - Bioassays for Activity Against Lepidopterans and Coleopterans
Biological activity of the toxins and isolates of the subject invention can be confirmed using standard bioassay procedures. One such assay is the budworm-bollworm (Heliothis virescens [Fabricius] and Helicoverpa zea [Boddie]) assay. Lepidoptera bioassays were conducted with either surface application to artificial insect diet or diet incorporation of samples.
All Lepidopteran insects were tested from the neonate stage to the second instar. All assays were conducted with either toasted soy flour artificial diet or black cutworm artificial diet (BioServ, Frenchtown, NJ).
Diet incorporation can be conducted by mixing the samples with artificial diet at a rate of 6 mL suspension plus 54 mL diet. After vortexing, this mixture is poured into plastic trays with compartmentalized 3-ml wells (Nutrend Container Corporation, Jacksonville, FL). A water blank containing no B.t. serves as the control. First instar larvae (USDA-ARS, Stoneville, MS) are placed onto the diet mixture. Wells are then sealed with Mylar sheeting (ClearLam Packaging, IL) using a tacking iron, and several pinholes are made in each well to provide gas exchange. Larvae were held at 25°C for 6 days in a 14:10 (light:dark) holding room. Mortality and stunting are recorded after six days.
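The diet-incorporation ratio above (6 mL suspension plus 54 mL diet) amounts to a ten-fold dilution of the sample. The short Python sketch below makes that arithmetic explicit; the function names and the example stock concentration of 100 (in arbitrary units) are assumptions for illustration only and are not values from the specification.

```python
def dilution_factor(sample_ml: float, diet_ml: float) -> float:
    """Fold dilution of the sample once it is mixed into the diet."""
    if sample_ml <= 0 or diet_ml < 0:
        raise ValueError("sample volume must be > 0 and diet volume >= 0")
    return (sample_ml + diet_ml) / sample_ml


def final_concentration(stock: float, sample_ml: float, diet_ml: float) -> float:
    """Concentration in the finished diet, in the same units as the stock."""
    return stock / dilution_factor(sample_ml, diet_ml)


if __name__ == "__main__":
    print(dilution_factor(6.0, 54.0))             # 6 mL sample + 54 mL diet
    print(final_concentration(100.0, 6.0, 54.0))  # hypothetical stock of 100 units
```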
Bioassay by the top load method utilizes the same sample and diet preparations as listed above. The samples are applied to the surface of the insect diet. In a specific embodiment, surface area ranged from 0.3 to approximately 0.8 cm² depending on the tray size; 96 well tissue culture plates were used in addition to the format listed above. Following application, samples are allowed to air dry before insect infestation. A water blank containing no B.t. can serve as the control. Eggs are applied to each treated well, and the wells are then sealed with Mylar sheeting (ClearLam Packaging, IL) using a tacking iron, and pinholes are made in each well to provide gas exchange. Bioassays are held at 25°C for 7 days in a 14:10 (light:dark) or 28°C for 4 days in a 14:10 (light:dark) holding room. Mortality and insect stunting are recorded at the end of each bioassay.
Another assay useful according to the subject invention is the Western corn rootworm assay. Samples can be bioassayed against neonate western corn rootworm larvae (Diabrotica virgifera virgifera) via top-loading of sample onto an agar-based artificial diet at a rate of 160 µl/cm². Artificial diet can be dispensed into 0.78 cm² wells in 48-well tissue culture or similar plates and allowed to harden. After the diet solidifies, samples are dispensed by pipette onto the diet surface. Excess liquid is then evaporated from the surface prior to transferring approximately three neonate larvae per well onto the diet surface by camel's hair brush. To prevent insect escape while allowing gas exchange, wells are heat-sealed with 2-mil punched polyester film with 27HT adhesive (Oliver Products Company, Grand Rapids, Michigan).
Bioassays are held in darkness at 25°C, and mortality scored after four days.
Analogous bioassays can be performed by those skilled in the art to assess activity against other pests, such as the black cutworm (Agrotis ipsilon).
Results are shown in Table 4.
Table 4. Genetics and function of concentrated B.t. supernatants screened for lepidopteran and coleopteran activity
Columns: Strain; Approx. 339 bp PCR fragment; Total Protein (µg/cm²); ca. 80-100 kDa protein (µg/cm²); H. virescens % mortality and Stunting; H. zea % mortality and Stunting; Diabrotica % mortality
PS31G1         8.3    2.1    70   yes   39   yes   NT
PS49C          13.6   1.5    8    yes   8    no    NT
               8.0    NT     18   no    13   no    NT
               35     NT     43
PS8A2          30.3   2.3    100  yes   38   yes   NT
PS81A2         18.8   1.6    38   yes   13   no    NT
PS81F          26     5.2    100  yes   92   yes   NT
PS81I          10.7   1.7    48   yes   13   no    NT
PS86B1         23.2   4.5    17   no    13   no
PS86B1(#2)     90     17.5
PS86B1         35     6.8
PS122D3        33.2   1.8    21   no    21   no
PS122D3        124    6.7
PS122D3        35     1.9    16
PS123D1        10.7   NT     0    no    0    no
PS123D1        69     NT     54
PS123D1        35     NT     21
PS123D1        17.8   NT     5    no    4    no    NT
PS149B1        NT     9      NT   0     no   0     yes   NT
PS149B1        NT     35     NT
PS157C1        24     2      43   yes   13   yes
PS157C1        93     8
PS157C1        35     3      18
PS185L2(#1)    2      NT     8    no    0    no    NT
PS185L2        3      NT     10   no    25   no    NT
PS185U2        23.4   2.9    100  yes   100  yes   NT
PS192M4        10.7   2.0    9    no    4    yes   NT
HD129          44.4   4.9    100  yes   50   yes   NT
Javelin 1990   43.2   3.6    100  yes   96   yes   NT
water          0-8, 0-4, -12
NT = not tested
Example 14 - Results of Western Corn Rootworm Bioassays
Concentrated liquid supernatant solutions, obtained according to the subject invention, were tested for activity against Western corn rootworm (WCRW). Supernatants from the following isolates were found to cause mortality against WCRW: PS10E1, PS31F2, PS31J2, PS33D2, PS66D3, PS68F, PS80JJ1, PS146D, PS175I4, PS177I8, PS196J4, PS197T1, PS197U2, KB33, KB53A49-4, KB68B46-2, KB68B51-2, KB68B55-2, PS177C8, PS69AA2, KB38, PS196F3, PS168G1, PS202E1, PS217U2 and PS185AA2.
Example 15 - Results of Budworm/Bollworm Bioassays
Concentrated liquid supernatant solutions, obtained according to the subject invention, were tested for activity against Heliothis virescens and Helicoverpa zea. Supernatants from the following isolates were tested and were found to cause mortality against PS157C1, PS31G1, PS49C, PS81F, PS81I, Javelin 1990, PS158C2, PS202S, PS36A, HD110, and HD29. Supernatants from the following isolates were tested and were found to cause significant mortality against PS31G1, PS49C, PS81F, PS81I, PS157C1, PS158C2, PS36A, HD110, and Javelin 1990.
Example 16 - Target Pests
Toxins of the subject invention can be used, alone or in combination with other toxins, to control one or more non-mammalian pests. These pests may be, for example, those listed in Table 5. Activity can readily be confirmed using the bioassays provided herein, adaptations of these bioassays, and/or other bioassays well known to those skilled in the art.
Table 5. Target pest species
ORDER/Common Name                          Latin Name
LEPIDOPTERA
European Corn Borer                        Ostrinia nubilalis
European Corn Borer resistant to Cry1Ab    Ostrinia nubilalis
Black Cutworm                              Agrotis ipsilon
Fall Armyworm                              Spodoptera frugiperda
Southwestern Corn Borer                    Diatraea grandiosella
Corn Earworm/Bollworm                      Helicoverpa zea
Tobacco Budworm                            Heliothis virescens
Tobacco Budworm Rs                         Heliothis virescens
Sunflower Head Moth                        Homoeosoma electellum
Banded Sunflower Moth                      Cochylis hospes
Argentine Looper                           Rachiplusia nu
Spilosoma                                  Spilosoma virginica
Bertha Armyworm                            Mamestra configurata
Diamondback Moth                           Plutella xylostella
COLEOPTERA
Red Sunflower Seed Weevil                  Smicronyx fulvus
Sunflower Stem Weevil                      Cylindrocopturus adspersus
Sunflower Beetle                           Zygogramma exclamationis
Canola Flea Beetle                         Phyllotreta cruciferae
Western Corn Rootworm                      Diabrotica virgifera virgifera
DIPTERA
Hessian Fly                                Mayetiola destructor
HOMOPTERA
Greenbug                                   Schizaphis graminum
HEMIPTERA
Lygus Bug                                  Lygus lineolaris
NEMATODA
                                           Heterodera glycines
Example 17 - Insertion of Toxin Genes Into Plants
One aspect of the subject invention is the transformation of plants with genes encoding the insecticidal toxin of the present invention. The transformed plants are resistant to attack by the target pest.
Genes encoding pesticidal toxins, as disclosed herein, can be inserted into plant cells using a variety of techniques which are well known in the art. For example, a large number of cloning vectors comprising a replication system in E. coli and a marker that permits selection of the transformed cells are available for preparation for the insertion of foreign genes into higher plants. The vectors comprise, for example, pBR322, pUC series, M13mp series, pACYC184, etc. Accordingly, the sequence encoding the Bacillus toxin can be inserted into the vector at a suitable restriction site. The resulting plasmid is used for transformation into E. coli.
The E. coli cells are cultivated in a suitable nutrient medium, then harvested and lysed. The plasmid is recovered. Sequence analysis, restriction analysis, electrophoresis, and other biochemical-molecular biological methods are generally carried out as methods of analysis.
After each manipulation, the DNA sequence used can be cleaved and joined to the next DNA sequence. Each plasmid sequence can be cloned in the same or other plasmids. Depending on the method of inserting desired genes into the plant, other DNA sequences may be necessary.
If, for example, the Ti or Ri plasmid is used for the transformation of the plant cell, then at least the right border, but often the right and the left border of the Ti or Ri plasmid T-DNA, has to be joined as the flanking region of the genes to be inserted.
The use of T-DNA for the transformation of plant cells has been intensively researched and sufficiently described in EP 120 516; Hoekema (1985) In: The Binary Plant Vector System, Offset-drukkerij Kanters B.V., Alblasserdam, Chapter 5; Fraley et al., Crit. Rev. Plant Sci. 4:1-46; and An et al. (1985) EMBO J. 4:277-287.
Once the inserted DNA has been integrated in the genome, it is relatively stable there and, as a rule, does not come out again. It normally contains a selection marker that confers on the transformed plant cells resistance to a biocide or an antibiotic, such as kanamycin, G 418, bleomycin, hygromycin, or chloramphenicol, inter alia. The individually employed marker should accordingly permit the selection of transformed cells rather than cells that do not contain the inserted DNA.
A large number of techniques are available for inserting DNA into a plant host cell. Those techniques include transformation with T-DNA using Agrobacterium tumefaciens or Agrobacterium rhizogenes as transformation agent, fusion, injection, biolistics (microparticle bombardment), or electroporation as well as other possible methods. If Agrobacteria are used for the transformation, the DNA to be inserted has to be cloned into special plasmids, namely either into an intermediate vector or into a binary vector. The intermediate vectors can be integrated into the Ti or Ri plasmid by homologous recombination owing to sequences that are homologous to sequences in the T-DNA. The Ti or Ri plasmid also comprises the vir region necessary for the transfer of the T-DNA. Intermediate vectors cannot replicate themselves in Agrobacteria. The intermediate vector can be transferred into Agrobacterium tumefaciens by means of a helper plasmid (conjugation). Binary vectors can replicate themselves both in E. coli and in Agrobacteria. They comprise a selection marker gene and a linker or polylinker which are framed by the right and left T-DNA border regions. They can be transformed directly into Agrobacteria (Holsters et al. [1978] Mol. Gen. Genet. 163:181-187). The Agrobacterium used as host cell is to comprise a plasmid carrying a vir region. The vir region is necessary for the transfer of the T-DNA into the plant cell. Additional T-DNA may be contained. The bacterium so transformed is used for the transformation of plant cells. Plant explants can advantageously be cultivated with Agrobacterium tumefaciens or Agrobacterium rhizogenes for the transfer of the DNA into the plant cell. Whole plants can then be regenerated from the infected plant material (for example, pieces of leaf, segments of stalk, roots, but also protoplasts or suspension-cultivated cells) in a suitable medium, which may contain antibiotics or biocides for selection.
The plants so obtained can then be tested for the presence of the inserted DNA. No special demands are made of the plasmids in the case of injection and electroporation. It is possible to use ordinary plasmids, such as, for example, pUC derivatives. In biolistic transformation, plasmid DNA or linear DNA can be employed.
The transformed cells are regenerated into morphologically normal plants in the usual manner. If a transformation event involves a germ line cell, then the inserted DNA and corresponding phenotypic trait(s) will be transmitted to progeny plants. Such plants can be grown in the normal manner and crossed with plants that have the same transformed hereditary factors or other hereditary factors. The resulting hybrid individuals have the corresponding phenotypic properties.
In a preferred embodiment of the subject invention, plants will be transformed with genes wherein the codon usage has been optimized for plants. See, for example, U.S. Patent No. 5,380,831. Also, advantageously, plants encoding a truncated toxin will be used. The truncated toxin typically will comprise about 55% to about 80% of the full length toxin. Methods for creating synthetic Bacillus genes for use in plants are known in the art.
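By way of illustration only, the "about 55% to about 80%" range for a truncated toxin can be expressed as residue counts. The following minimal sketch assumes a full length toxin of 790 amino acids (the length given for SEQ ID NO:4 in the sequence listing); the function name, the rounding convention (round the lower bound up and the upper bound down), and the treatment of "about" as exact percentages are assumptions made for the example and are not part of the disclosure.

```python
def truncated_length_range(full_length_aa, low_pct=55, high_pct=80):
    """Return (min_residues, max_residues) for a truncated toxin.

    Integer arithmetic is used so the result does not depend on
    floating-point rounding. The lower bound is rounded up and the
    upper bound is rounded down (an assumed convention).
    """
    if full_length_aa <= 0:
        raise ValueError("full_length_aa must be positive")
    # Ceiling division for the lower bound
    min_residues = -(-full_length_aa * low_pct // 100)
    # Floor division for the upper bound
    max_residues = full_length_aa * high_pct // 100
    return min_residues, max_residues


def coding_length_bp(residues, include_stop=True):
    """Nucleotides needed to encode the given number of residues."""
    return residues * 3 + (3 if include_stop else 0)


if __name__ == "__main__":
    full_length = 790  # assumed: length stated for SEQ ID NO:4
    shortest, longest = truncated_length_range(full_length)
    print(f"Truncated toxin: {shortest} to {longest} residues")
    print(f"Coding region: {coding_length_bp(shortest)} to "
          f"{coding_length_bp(longest)} bp including a stop codon")
```

Under these assumptions a synthetic gene for the truncated toxin would encode between 435 and 632 residues. Where the truncation point actually falls within that range would be chosen on biological grounds that this arithmetic does not capture.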
It should be understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application.
SEQUENCE LISTING

GENERAL INFORMATION:

(i) APPLICANT:
Applicant Name(s): MYCOGEN CORPORATION
Street address: 5501 Oberlin Drive
City: San Diego
State/Province: California
Country: US
Postal code/Zip: 92121
Phone number: (619) 453-8030
Fax number: (619) 453-6991

(ii) TITLE OF INVENTION: Novel Pesticidal Toxins and Nucleotide Sequences Which Encode These Toxins

(iii) NUMBER OF SEQUENCES: 134

(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: Saliwanchik, Lloyd & Saliwanchik
STREET: 2421 N.W. 41st Street, Suite A-1
CITY: Gainesville
STATE: FL
COUNTRY: US
ZIP: 32606-6669

(v) COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30

(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER:
FILING DATE:
CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 60/029,848
FILING DATE: 30-OCT-1996

(viii) ATTORNEY/AGENT INFORMATION:
NAME: Saliwanchik, David R.
REGISTRATION NUMBER: 39,355 REFERENCE/DOCKET NUMBER: MA-708 (ix) TELECOMMUNICATION INFORMATION: TELEPHONE: 352-375-8100 TELEFAX: 352-372-5800 INFORMATION FOR SEQ ID NO:1: a a SEQUENCE CHARACTERISTICS: LENGTH: 29 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: GARCCRTGGA AAGCAAATAA TAA.RAATGC INFORMATION FOR SEQ ID NO:2: SEQUENCE CHARACTERISTICS: LENGTH: 33 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: AAARTTATCT CCCCAWGCTr CATCTCCATT TTG INFORMATION FOR SEQ ID NO:3: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 2375 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genornic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 36a (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC AATGGCTTT ATGGA=rc~C r~rT'GGTATC AAAGACATTA GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATTTCTGGTA AATTGGATGG GGTGAATGGA AG CTT AAATG TTAAATACAG AATTATCTAA GGAAATATTA AAAA'rrGCAA AATGATGTrA ATAACAAACT CGATGCGATA AATACGATGC CAAGTrTAT TGATTATrI TGAACATGAT 7 =TAAAACG ATCAGCAGTr ACTAAATGAT ATCTTATCGC ACAGGGAAAC ATGAACAAAA TCAAGT'FrTA TTCGGGTATA TCTACCTAAA 120 180 240 300 360
CAAAATTATG
ATTACCTCTA TGTTGAGTGA TGTAATGA.AA CGCTAAGTCT GCAAATAGAA
TACTTAAGTA
CTTATTAACT
GAAAAATTTG
TCTCCTGCAA
AAAAATGATG
AATAATTTAT
GTGAAAACAA
CTGCAAGCAA
ATTGATTATA
AACATCCTCC
AGTGATGAAG
GAAATTAGTA
TATCAAGTCG
TGCCCAGATC
GTAATTACTA
AATITrTATG
GAAGCGGAGT
ATCAGTGAAA
AGATTAATTA
AGCAATAAAG
AACGGGTCCA
GTAGATCATA
ATTCACAAT
GTTAAAGGAA
GATACAAATA
GAT~rAAAGG
AACTYTATTA
AACAATTGCA
CTACACTTAC
AGGAATTAAC
ATATT=GA
TGGATGGTT
TCGGGCGTTC
GTGGCAGTGA
AAG=~TTCT-
=~CTATTAT
CTACACTTTC
ATGCAAAGAT
ATGATTCAAT
ATAAGGATTC
AATCTGAACA
AAATTGATTT
A'rTTrCTAC
ATAAAACGTT
CATTTTTGAC
CTTTAACATG
AAACTAAATT
TAGAAGAGGA
CAGGCGGAGT
TrATTGGAGA
AACCTTCTAT
AGAGATTTC'r
TGAAATTACA
TTrGCTACA
TGAGTTAACT
TGAAT=-~AC
AGCTTTAAAA
GGTCGGAAAT
TACTI'TAACA
GAATGAACAT
TAATACTTT
GATTGTGGAA
TACAGTATTA
CTTATCGGAA
AATCTATTAT
CACTAAAAAA
AGGAGAAATT
AAGTGCTAAT
TCCGATTAAT
TAAATCATAT
GATCGTCCCG
CAA=~AGAG
GAATGGAACT
TAATTTAAAA
TCA=IAATA
GATAAGTTGG
CCTGCGTATC
GAAACTAGTT
GAGTTAACTG
CTTAATACAT
ACTGCATCGG
GTTTATAACT
ACATGCCGAA
TTAAATAAGG
TCTAATCCTA
GCTAAACCAG
AAAGTATATG
GTTATTATG
ACAAATAACA
ATGAAAACTT
GACTTAAATA
GATGATGGGG
GGGTTGGCC
TTAAGAGAAC
CCAAGTGGTT
CCGTGGAAAG
AAAGCTflAT
CCGAAAACTG
GATGAAAATA
P.CTATTAATA
AGTCAAAATG
GAAAAGT1'AT
ATATTATTAA
AAAGGATTAA
CAAAAGTAAA
AACTAGCGAA
TCCACGATGT
A-ATTAArrAC
TCTTAATTGT
AATTATTAGG
AAAAAGAGGA
ATTATGCAAA
GACATGCATT
AGGCTAAGCT
GTGATATGGA
TAGTATTTCC
TAAGATATGA
AGAAAAAAGT
TGTATATGCC
TCCAAGCTGA
TACTGCTAGC
TTATTAGCAA
CAAATAATAA
ATGTTC-ATAA
AGT.D.TGTAAT
CTGGATATAT
AACGTTTTAC
GAGATGAAGC
TAAGTCCAGA
TGTAAATGTA
ATATGTGAAC
AAAGGATGGC
AAGTGTAACA
AATGGTAGGA
TAAAGAAAAT
ATTAACAGCT
CTTAGCAGAT
ATTTAGAGTA
AGTTAAAGGA
GATTGGGTTT
AAAACAAAAT
TAAATT-ATTG
AAATGAATAT
GGTAACAGCG
AGAATCAAGT
GTTAGGTGTC
TGAAAATTCA
AACAGAC'rrA TArrGTAGAG
GAATGCGTAT
GGACGGAGGA
CCAATATACT
TCATTATGAA
rACAGGAACT rrGGGGAGAT kTTAATTAAT 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 ATAATTAGA AGATTATCA-A GAGTGTA=rr AATTTAAAA T =TGGAAAT TACTCCTTCT
ACAAATAATT
CAGGGAGGAC
GTGTAT'rT=
TTTGAAAAAA
GAGAAAGATA
GTACA-TTTT
GGACGAGTAC
GAGGGATTCT
CTGTGTCCGG
GATATATGAG
ACTTTTATAT
ACGATGTCTC
GGGATCAACT
AA.AACAAAAC
AGATGCTAAT
CGGTGCTAAA
AGAGCTTTCT
TATTAAGTAA
46 AATATTAGCG GTAATACACT CACTC TTT AT CTTCAATTAG ATAGTTTTC AACTTATAGA GTAAGGATTA GAAATTCTAG GGAAGTGTTA GATGTTTCTG AAATGTTCAC TACAAAATTT CAAGGGAATA ATTTATATGG TGGTCCTATT
CCCAA
2100 2160 2220 2280 2340 2375 INFORMATION FOR SEQ ID NO:4: SEQUENCE CHARACTERISTICS: LENGTH: 790 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 36a (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: Met Asn Lys Aen Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe Asp Tyr Phe Asn Giy Ile Ile Met Asn Met Ile Phe Lys Tyr Gly Phe Thr Asp Thr 40 Gin Leu Leu Ala Thr Gly Gly Gly Asp Asn Asp Ile Ile Lys Asp Leu Thr Leu Ser Gly Lys Asp Glu s0 Leu Asp Ile Leu Lys Asn Leu Asn Asp Leu Ile Gly Val Asn Ala Gin Gly Leu As!, Thr Giu Ser Lys Glu Ile Lys Ile Ala Leu Asp Ala Asn Gin Val Leu 100 Met Leu Arg Val 115 Asp Val Asn Asn Giu Gin Ile Asn Thr 110 Ser Asp Val Tyr Leu Pro Thr Ser Met Leu 125 Met Lys 130 Gin Asn Tyr Ala Ser Leu Gin Ile Glu Tyr Leu Ser Lys 140 Gin Leu Gin Giu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val 145 150 Leu Lys Ser Leu Asp 225 As n Thr Asn Leu Ser 305 Asn Lys Pro Vai Lys 385 Cys Ile Tyr Ser Thr 210 Giy Asn Lys Phe Thr 290 Ile Ile Val Giy Leu 370 Asp Pro Asn Val Lys 195 Glu Phe Leu Giu Leu 275 Thr Met Leu Lys His 355 Lys Se r Asp Ser Asn 180 Val Leu Giu Phe Asn 260 Ile Cys Asn Pro Gly 340 Ala Val Leu Gin Thr 165 Glu Lys Thr Phe Gly 245 Val Val Arg Giu Thr 325 Scr Leu Tyr Ser Ser Leu Lvs Lys Glu Tyr 230 Arg Lys Leu Lys His 310 Leu Asp Ile Glu Giu 390 Giu Thr Phe Asp Leu 215 Leu Ser Thr Thr Leu 295 Leu 5cr Glu Gly Ala 375 Val1 Gin Gil Gly 200 Al a Asn Ala Ser Ala 280 Leu Asn Asn Asp Phe 360 Lys Ile Ile Ile Glu 185 Ser Lys Thr Leu Gly 265 Leu Gly Lys Thr Ala 345 Giu Leu Tyr Tyr Thr 170 Leu Pro Ser Phe Lys 250 Ser Gin Leu Giu Phe 330 Lys Ile Lys
GA.Y
Tyr 410 155 Pro Thr Al a Val1 His 235 Thr Giu Ala Ala Lys 315 5cr Met Ser Gin Asp 395 rhr Al a Phe Asn Thr 220 Asp Al a Val Lys Asp 300 Giu Asn Ile Asn.
Asn 3 Ej Asn Tyr Al a Ile 205 Lys Val Ser Gly Ala 285 Ile Gi u Pro Val Asp 365 Tyr Asn Gir Thr 190 Leu Asn Met Giu Asn 270 Phe Asp Phe Asn Glu 350 Ser Gin Lys Ile Arg 175 *Glu Asp *Asp Val Leu 255 Vai Leu Tyr Arg Tyr 335 Ala Ile Val I<':u Val 415 160 Ile Thr Giu Val1 Gly 240 Ile Tyr Thr Thr Val 320 Al a Lys Thr Asp Lau 400 Phe Pro Asn Giu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys 420 425 4130 48 Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly 435 445 0 Glu Lys 465 lie Asp Glu Val Glu 545 Val Lys Thr Leu Asn 625 Asp Ala Leu Ser Ile 450 Thr Ser Glu Leu Pro 530 Glu Asp Asp Glu Ile 610 Leu Leu rp Leu rhr 690 Asp Leu Glu Asn Leu 515 Pro Asp His Gly Tyr 595 Asp Glu Lys Gly Ser Asn Leu Ser Thr Ser 500 Leu Ser Asn Th- Gly 580 Val Glu Asp Gly Asp 560 ?ro Ile Asn Ala Phe 485 Arg Ala Gly Leu Gly 565 Ile Ile Asn Tyr Val 645 Asn Giu I Ser C Lys Asr 470 Leu Leu Thr Phe Glu 550 Gly Ser Gin Th- Gin 630 Tyr Phe eu fly Lys 455 Asp Thr Ile Asp Ile 535 Pro Val Gin Tyr Gly 615 Thr Leu Ile Ile Asn 695 Lys Asp Pro Thr Leu 520 Ser Trp Asn Phe Thr 600 Tyr Ile Ile Ile ksn 580 rhr Vai Gly Ile Leu 505 Ser Asn Lys Gly Ile 585 Vai Ile Asn Leu Leu 665 Thr I Leu I Glu Val Asn 490 Th- Asn Ile Ala Thr 570 Gly Lys His Lys Lys 550 lu ksn rh- Ser Tyr 475 Gly Cys Lys Vai Asn 555 Lys Asp Gly Tyr Arg 635 Ser Ile Asn Leu Ser 460 Met Phe Lys Glu Glu 540 Asn Ala Asn Lys Glu 620 Phe Gin Ser Trp Tyr 700 Glu Pro Gly Ser Thr 525 Asn Lys Leu Leu Pro 605 Asp Thr Asn Pro Th- 685 Gin C Ala Leu Leu Tyr 510 Lys Gly Asn Tyr Lys 590 Ser Thr Thr Gly Ser 670 3er :ly Glu Gly Gin 495 Leu Leu Ser Ala Val 575 Pro Ile Asn Gly Asp 655 Glu I Thr Gly I Tyr Va1 480 Ala Arg Ile Ile Tyr 560 His Lys His Asn Thr 640 Glu Lys fly .rg Ile Leu Lys Gin Leu Gin Leu Asp Ser 715 Phe Ser Thr Tyr Val Tyr Phe Ser Val Ser Giy Asp Ala Asn Val 725 730 Arg Glu Val Leu Phe Giu Lys Arg Tyr Met Ser 740 745 Ser Giu Met Phe Thr Thr Lys Phe Giu Lys Asp 755 760 Leu Ser Gin Gly Asn Asn Leu Tyr Gly Gly Pro 770 775 Arg Ile Arg Asn 
Ser 735 Gly Asn Ala Lys Asp Val 750 Phe Tyr Ile Glu 765 Val His Phe Tyr Asp Val Ser Ile Lys Pro 785 790 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 2370 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 8lFd (xi) SEQUENCE DESCRIPTION: SEQ ID ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGC CTTAC
CAAGTTTTAT TGATTATTTT AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG GATACAGGTG GTGATCTAAC ATTTCTGGTA AATTGGATGG TTAAATACAG AATTATCTAA AATGATGTTG ATAACAAACT ATTACCTCTA TGTTGAGTGA TACTTAAGTA AACAATTGCA CTTATTAACT CTACACTTAC GAAAAATTTG AGGAATTAAC TCTCCTGCAG ATATI-CTTGA AAAAATGATG TGGATGGTTT CCTAGACGAA ATTTTAAAGA ATCAGCAGTT
GGTGAATGGA
AGAAATATTA
CGATGCGATA
TGTAATGAAA
AGAGAT'I7CT
TGAAATTACA
TTTTGCTACA
TGAGTTAACT
TGAA'TrrAC
AGCTI'AAATG
AAAATTGCAA
AATACGATGC
CAAAATTATG
GATAAGTTGG
CCTGCGTATC
GAAAC'IAGI-r
GAGTTAACTG
CTTAATACAT
ATCTTATCGC
ATGAACAAAA
TTCGGGTATA
CGCTAAGTCT
ATATTATTAA
AAAGGATTAA
CAAAAGTAAA
AACTAGCGAA
TCCACGATGT
ACTAAATGAT
ACAGGGAAAC
TCAAGTT=A
TCTACCTAAA
GCAAT~A.GAA
TGTA%?AT.3TA
ATATGTGAAC
AAAGGATGGC
AAGTGTAACA
AATGGTAGGA
120 180 240 300 360 420 540 600 660 720
AATAATTTAT
GTGAAAACAA
CTGCAAGCAA
ATTGATTATA
AACATCCTCC
AGTGATGAAG
GAAA7TAGTA
TATCAAGTTG
TGCCCAGATC
GTAATTACTA
AAT*I'ATG
GAAGCGGAGT
ATCAGTGAAA
AGATTAATTA
AGCAATAAAG
AACGGGTCCA
GTAGATCATA
ATTTCACAAT
GTTAAAGGAA
GATACAAATA
GATTAAAGG
AAC=DATTA
ACAAATAATT
CAGGGAGGAC
GTGTA'IrT~r
TTTGAAAAAA
GGGAAAGATA
GTACAGTTTC
*TCGGGCGTTC
GTGGCAGTG.P
AAG=TTC
CTTCTATTAI
CTACACTTTC
ATGCAAAGAT
ATGATTCAAT
ATAAGGATTC
AATCTGAACA
AAATTGATTT
ATTCTTCTAC
ATAGAACGTT
CATTTGAC
CTTTAACATG
AAACTAAATT
TAGAAGAGGA
CAGGCGGAGT
TTATTGGAGA
AAC=TCTAT
ATAATTTAGA
GAGTGTATTT
TIrTGGAAAT
GGACGAGTAC
GAGGAATTCT
CTGTGTCCGG
GATATATGAG
ACI=TATAT
CCGATGTCTC
AGCTTTAAAA
*GGTCGGAAAT
TACTTTAACA
*GAATGAACAT
TAATACTTTT
GATTGTGGAA
TACAGTATTA
CTTATCGGAA
AATCTATTAT
TACTAAAAAA
AGGAGAAATT
AAGTGCTAAT
TCCGATTAAT
TAAATCATAT
GATCGTCCCG
CAATTAGAG
GAATGGRACT
TAAGTTAAAA
TCATTTAAAA
AGATTATCAA
AATTTTAAAA
TAGTCCTTCI
GGGATCAACT
AAAACAAAAC
AGATGCTAAT
CGGTGCTAAA
AGAGCTI'rCT
TATTAAGTAA
ACTGCATCGG
*GTTTATAAC7
*ACATGCCGAA
TTAAATAAGG
TCTAATCCTA
GCTAAACCAG
AAAGTATATG
GTTATTTATG
ACAAATAACA
ATGAAAACTT
GACTTAAATA
GATGATGGAG
GGGTTGGCC
TTAAGAGAAC
CCCAGTGGTT
CCGTGGAAAG
AAAGCTTTAT
CCGAAAACTG
GATGAAAATA
ACTATTACTA
AGTCAAAATG
.GAAAAGTTAT
AAT~i-rAGCG C=rATTA3.
GTAAGGATTA
GATGTTCTG
CAAGGGAATA
AATTAATTAC
TCTTAATTGT
AATTATTAGG
AAAAAGAGGA
ATTATGCAAA
GACATGCATT
AGGCTAAGCT
GTGATATGGA
TAGTATTTCC
TAAGATATGA
AGAAAAAAGT
TGTATATGCC
TCCAAGCTGA
TACTGCTAGC
TTATTAAAAA
CAAATAATAA
ATGTTCATAA
AGTATGTAAT
CTGGATATAT
AACGTrTTAC
GAGATGAAGC
TAAGTCCAGA
(GT~lATACACT
A-TAGTTTTTC
GAAATTCTAG
AAATTTCAC
ATrI'AAATGG
TAAAGAAAAT
ATTAACAGC-T
CTTAGCAGAT
ATTTAGAGTA
AGTTAAAGGA
GGTTGGGTT
AAAACAAAAT
TAAATTATrG
AAATGAATAT
GGTAACAGCG
AGAATCAAGT
GTTAGGTGTC
TGAAAATrCA
AACAGACTA
TATTGTAGAG
GAATGAGTAT
GGACGGAGGA
CCAATATACT
TCATTATGAA
TACAGGAACT
TTGGGGAGAT
PTTAATTAAT
CACTCT=AT
~ACTIATAGA
3GAAGTGTTA
IACAAAATT
[GGCCCTATT
780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2370 0 INFORMATION FOR SEQ ID NO:6: SEQUENCE CHARACTERISTICS: LENGTH: 789 amino acids (R TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 8lFd (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe lie Asp Ile Met AsD Glu Leu Asp Leu Asn Asn Gin Met Leu Met Lys 130 Gin Leu 145 Leu Ile Lys Tyr Tyr Asn Ile Gly Thr Val Arg 115 Gin Gin As~i Val Phe Met Leu Val Giu Leu 100 Val Asn Giu Ser Asn 180 Asn Ile Lys Asn Leu Asn Ile Thr 165 Glu Gly Phe Asn Gly 70 Ser Asp Leu Al a Ser I5O Leu Lys Ile Lys Gin 55 Ser Lys Val Pro Leu 135 Asp Th~r Phe Tyr Thr 40 Gin Leu Giu Asp Lys 120 Ser Lys Giu Glu Gly 25 Asp Leu Asn Ile Asn 105 Ile Leu Leu Ile Glu Phe Thr Leu Asp Leu Lys Thr Gin Asp Thr 170 Leu Ala Gly Asn Leu 75 Lys Leu Ser Ile Ile 155 Pro Thr Thr Gly Asp Ile I le Asp) Met Glu 14-0 Ile Ala Phe Gly Asp Ile Al1a Al a Al a Leu 125 Asn Tyr Ala Ile Leu Ser Gin Asn Ile 110 Ser Leu Val Gin Thr 190 Lys Thr Gly Giy Glu Asn Asp Ser Asn Arg 175 Glu Asp Leu Lys Asn Gin Thr Val Lys Val 160 Ile Thr Ser Ser Lys Val Lys Lys Asp Gly 195 200 Ser Pro Ala Asp Ile Leu Asp Glu 205 Thr Clu 52 Leu Ala Lys Leu Thr Glu Leu 210 Ser Val Thr Lys Asn Asp Val 220 r o r Asp 225 Asn Thr Asn Leu Ser 305 Asn Lys Pro Val Lys 385 Cys Pro Thr Glu Arg Gly Asn Lvs Phe Thr 290 Ile Ile Val Gly Leu 370 Asp Pro Asn Leu Ile 450 hr Phe Leu Glu Leu 275 Thr Met Leu Lys His 355 Lys Ser Asp Glu Arg 435 Asp Leu Glu Phe Asn 260 Ile Cys Asn Pro Gly 340 Ala Val Leu Gin Tyr 420 Tyr Leu 2 Ser I Phe Gly 245 Val Val Arg Glu Thr 325 Ser Leu Tyr Ser Ser 405 lal lu ksn la ~Tyr 230 Arg Lys Leu Lys His 310 Leu Asp Val Glu Glu 390 Glu Ile Val Lys Asn 470 Leu Ser Thr Thr Leu 295 Leu Ser Glu Gly Ala 375 Val Gin Thr Thr Lys I 455 Asp I Asr Ala Ser Al a 280 Leu Asn Asn Asp Phe 360 Lys Ile Ile 
Lys Ala 140
-VS
ksp 1 Thr Leu Glv 265 Leu Gly Lys Thr Ala 345 Glu Leu Tyr Tyr Ile 2 425 Asn I V 3) Gly Phe Lys 250 Ser Gln Leu Glu Phe 330 Lys Ile Lys ly Tyr 110 ~sp he ;al His Asp 235 Thr Ala Glu Val Ala Lvs Ala Asp 300 Lys Glu 315 Ser Asn Met Ile Ser Asn Gin Asn 380 Asp Met 395 Thr Asn Phe Thr I Tyr Asp 4 r Ser C Tyr Met 1 475 Val Ser Gly Ala 285 le Glu Pro Val Asp 365 Tyr AsD Asn Lys 3er 145 lu aro Met Glu Asn 270 Phe Asp Phe Asn Glu 350 Ser Gin Lys Ile Lys 430 Ser Ala Leu Va Le 25! Va Let Ty: Arg Tyr 335 Ala Ile Val Leu Val 415 Met Thr Glu 3iy 1 Gly 240 i Ile 1 Tyr a Thr Thr Val 320 Ala Lys Thr Asp Leu 400 Phe Lys Gly Tyr Val 480 Ile Ser Giu Thr Leu Thr Pro Ile Gly Phe Giy Leu Gin Ala 495 53 Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg 500 505 510 Glu Vaal Glu 545 Val Lys Thr Leu Asn 625 Asp Ala Leu Ser Gly 705 Val Arg Ser Leu Pro 530 Glu Asp Asp Glu Lys 610 Leu Leu Trp Leu Thr 690 Ile Tyr Glu 3lu Leu 515 Pro Asp His Gly Tyr 595 Asp Glu Lys Gly.
Ser 675 Asn Leu Phe Val Ile I Leu Ser Asn Thr Gly 580 Val Glu Asp Gly Asp 660 Pro Ile Lys Ser jeu 740 ?he Ala Gly Leu Gly 565 Ile Ile Asn Tyr Val 645 Asn Glu Ser Gin Val 725 Phe Thr 1 Thr Phe Glu 550 Gly Ser Gin Thr Gin 630 Phe Leu Gly Asn 710 Ser flu rhr Asp Ile 535 Pro Vai Gin Tyr Gly 615 Thr Leu Ile Ile Asn 695 Leu 4 Gly Lys Lys Leu 520 Lvs Trp Asn Phe Thr 600 lyr Ile Ile Ile Asn 680 Thr Gln ksp 'he Ser Asn Lys Gly Ile 585 Val Ile Thr Leu Leu 665 Thr Leu Leu Ala Tyr 745 Gly Asr Ile Ala Thr 570 Gly Lys His Lys Lys 650 Glu Asn Thr Asp Asn 730 Met Lys 1 Lys Glu Val Glu 540 Asn Asn 555 Lys Ala Asp Lys Gly Lys Tyr Glu 620 Arg Phe 635 Ser Gin Ile Ser Asn Trp Leu Tyr 700 Ser Phe 715 Val Arg Ser Giy 2 Asp Asn I Thr 525 Asn Lys Leu Leu Pro 605 Asp Thr Asn Pro Thr 685 Gln Ser Tie Ua ?he Lys Gly Asn Tyr Lys 590 Ser Thr Thr Gly Ser 670 Ser Gly Thr Arg Lys 750 Tyr Leu Ser Glu Val 575 Pro Ile Asn Gly Asp 655 Glu Thr Gly Tyr Asn 735 Asc 1 Ile C Ile Ile Tyr 560 His Lvs His Asn Thr 640 Glu Lys Gly Arg Arg 720 Ser Ja) flu 755 760 765 Leu Ser 770 Gin Gly Asn Asn Asn Giy Gly Pro Ile Val Gin Phe Pro 780 Asp Val Ser Ile Lys 785 INFORMATION FOR SEQ ID NO:7: SEQUENCE CHARACTERISTICS: LENGTH: 2375 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genornic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: JaV9O (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGC CTTAC
AATGGCATTT
GATACAGGTG
ATTTCTGGTA
TTAAATACAG
AATGATGTTA
ATTACCTCTA
TACTTAAGTA
CTTATTAACT
GAAAAATTTG
TCTCCTGCAG
AAAAATGATG
AATAATTTAT
GTGAAAACAA
CTGCAAGCAA
ATTGATTATA
AACATCCTCC
ATGGATTTGC
GTGATCTAAC
AATTGGATGG
AATTATCTAA
ATAACAAACT
TGTTGAGTGA
AACAATTGCA
CTACACTTAC
AGGAATTAAC
ATATTCTTGA
TGGATGGTTT
TCGGGCGTTC
GTGCkGTGA
AAGCTTTTCT
CrrCTATrAT
CTACACTTTC
CACTGGTATC
CCTAGACGAA
GGTGAATGGA
GGAAATATTA
CGATGCGATA
TGTAATGAAA
AGAGATTTCT
TGAAATTACA
TTTTGCTACA
TGAGTTAACT
TGAATTTTAC
A:GCTTAAAA
TACTTTAACA
GAATGAACAT
TAATACTT'rr
AAAGACATTA
AI-FrAAAGA
AGCTTAAATG
AAAATTGCAA
AATACGATGC
CAAAATTATG
GATAAGTTGG
CCTGCGTATC
GAAACTAGTT
GAGTTAACTG
CTTAATACAT
ACTGCATCGG
Z'ITTATAACT
ACATGCCGAA
TTAAATAAGG
TCTAATCCTA
CAAGT-TTTAT
TGAACATGAT
ATCAGCAGTT
ATCTTATCGC
ATGAACAAAA
TCGGGTATA
CGCTAAGTCT
ATATTATTAA
AAAGGATTAA
CAAAAGTAAA
AACTAGCGAA
TCCACGATGT
AATTAATTAC
TCTTAATTGT
AATTATTAGG
TGATTA'rTT
TTTAAAACG
ACTAATGAT
ACAGGGAAAC
TCAAGTTrI'A
TCTACCTAAA
GCAAATAGAA
TGTAAATGTA
ATATGTGAAC
AAAGGATGGC
AAGTGTAACA
AATGGTAGGA
TAAAGAAAAT
ATTAACAGCT
CTTAGCA.GAT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 AAAAAGAGGA ATTAGAGTA ATTATGCAAA AGTTAAAGGA AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGG'Tr
AAAGTATATG
GAAATTAGTA ATGATI'CAAT TACAGTAT'rA AGGCTAAGCT AAAACAAAAT TATCAAGTCG ATAAGGATTC TGCCCAGATC AATCTGAACA GTAATTACTA AAATTGATTT AATTTTTATG ATTCTTCTAC GAAGCGGAGT ATAGAACGTT ATCAGTGAAA CATTrTTGAC AGATTAATTA CTTTAACATG AGCAATAAAG AAACTAAATT AACGGGTCCA TAGAAGAGGA.
GTAGATCATA CAGGCGGAGT A TTr CACAAT TTATTGGAGA GTTAAAGGAA AACCTTCTAT GATACAAATA ATAA TTT AGA GAITTTAAAGG GAGTGTA'rrT AACTTTATTA TTTGGAAAT ACAAATAATT GGACGAGTAC CAGGGAGGAC GAGGGATTCT GTGTATTTT CTGTGTCCGG TTTGAAAAAA GATATATGAG GAGAAAGATA ACT TITTATAT GTACATTTTT ACGATGTCTC
CTTATCGGAA
AATCTATTAT
CACTAAAAAA
AGGAGAAATT
AAGTGCTAAT
TCCGATTAAT
TAAATCATAT
GATYGTCCCG
CAATTTAGAG
GAATGGAACT
TAAGTTAAAA
TCAITTTAAAA
AGATTATCAA
AATTTAAAA
TAGTCCTTCT
GGGATCAACT
AAAACAAAAC
AGATGCTAAT
CGGTGCTAAA
AGAGCTI-rCT
TATTAAGTAA
GTTATTTATG
ACAAATAACA
ATGAAAACTT
GACTTAAATA
GATGATGGGG
GGGTTTGGCC
TTAAGAGAAC
CCAAGTGGTT
CCGTGGAAAG
AAAGCTTTAT
CCGAAAACTG
GATGAAAATA
ACTATTAATA
AGTCAAAATG
GAAAAGTTAT
AATATTAGCG
CTTCAATTAG
GTAAGGATTA
GATGTTTCTG
CAAGGGAATA.
CCCAA
GTGATATGGA
TAGTATTTCC
TAAGATATGA
AGAAAAAAGT
TGTATATGCC
TCAAG.CTGA
TACTGCTAGC
TTATTAGCAA
CAAATAATAA
ATGTTCATAA
AGTATGTAAT
CTGGATATAT
AACGTTTTAC
GAGATGAAGC
TAAGTCCAGA
GTAATACACT
ATAGITTrTC
GAAATTCTAG
AAATGTTCAC
ATTTATATGG
TAAATTATTG
AAATGAATAT
GGTAACAGCG
AGAATCAAGT
GTTAGGTGTC
TGAAAATTCA
AACAGACTTA
TATTGTAGAC-
GAATGCGTAT
GGACGGAGGA
CCAATATACT
TCATTATGAA
TACAGGAACT
TTGGGGAGAT
ATTAATTAAT
CACTCTI'TAT
AACTTATAGA
GGAAGTGTTA
TACAAAATT'
TGGTCCTATT
1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2375 INFORMATION FOR SEQ ID NO:8: SEQUENCE CHARACTERISTICS: LENGTH: 790 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein 56 (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: Jav9O (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: Met Asn Ls Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe r r le Ile Asp Leu Leu Asn Met Met Gin 145 Leu Lys Ser Leu Asp 225 Asp Met Glu Asp Asn Gin Leu Lys 130 Leu Ile Tyr Ser Thr 210 Gly Tyr Asn Ile Gly Thr Val Arg 115 Gin Gin Asn Val Lys 195 Glu Phe Phe Met Leu Val Glu Leu 100 Val Asn Glu Ser Asn 180 Val Leu 3lu Asn Gly Ile Ile Phe Lvs Lys Asn Gin 55 Asn Giy Ser 70 Leu Ser Lys Asn Asp Vai Tyr Leu Pro Tyr Ala Leu 3-3 Ile Ser Asp 150 Thr Leu Thr 165 Glu Lys Phe Lys Lys Asp Thr Giu Leu 21' Phe Tyr Leu 2 230 Gly Arg Ser 2 245 Tvr Thr 40 Gin Leu Glu Asn Lys 120 Ser Lys Glu Glu Gly 200 ksn kla Gly Phe 25 Asp Thr Leu Leu Asn Asp Ile Leu 90 Asn Lys 105 Ile Thr Leu Gin Leu Asm Ile Thr 170 Glu Leu 185 Ser Pro Lys Ser Thr Phe Leu Lys 250 Ala Gly Asn Leu 75 Lys Leu Ser Ile Ile 155 Pro Thr Ala Val His 235 Thr Thr Glv Asp Ile Ile Asp Met Glu 140 Ile Ala Phe Asp Thr Asp Ala Gly Asp Ile Ala Ala Ala Leu 125 Tyr Asn Tyr Ala Ile 205 Lys Val Ser Ile Leu Ser Gin Asn Ile 110 Ser Leu Vai Gin Thr 190 Leu Asn Met Glu Lys Thi Gly Gly Glu Asn Asp Ser Asn Arg 175 Glu Asp Asp V1al eu Asp Leu Lvs Asn Gin Thr Va1 Lys Vai 160 Ile Thr Glu Val Gly 240 Ile Asn Asn Leu Phe 255 Thr Lys Glu Asn Val Lys Thr Ser Gly 260 265 Ser Giu Vai Gly Asn Val Tyr 270 Asn Phe Leu Ile Val Leu Thr Ala 275 280 Gin Aia Lys Ala Phe Leu Thr 285 Leu Ser 305 Asn Lys Pro Vai Lys 385 Cys Pro Thr Glu Arg 465 Ile Asp C r11 I Val P Thi 290 lie Ile Val Gly Leu 370 Asp Pro Asn Leu Ile 450 Thr Ser flu euL 'ro .30 Thl Met Let Lys His 355 Lys Ser Asp Glu Arg 435 Asp Leu Glu Asn Leu 515 Pro rCy! 
Asr 1 Pro Gly 340 Ala Val Leu Gin Tyr 420 Tyr Leu Ser Thr Ser 500 Leu Ser Arg Lys Leu Le 295 1 Giu His Leu As 310 Thr Leu Ser Asl 325 Ser Asp Giu AsI Leu Ile Giy Pht 36C Tyr Giu Ala Lys 375 Ser Giu Val Ile 390 Ser Glu Gin Ile 405 Val Ile Thr Lys Glu Vai Thr Ala 440 Asn Lys Lys Lys 455 Ala Asn Asp Asp 470 Phe Leu Thr Pro 485 Arg Leu Ile Thr Ala Thr Asp Leu 520 Gly Phe Ile Ser 535 n Lys Glu n Thr Phe 330 Ala Lys 345 Glu Ile Leu Lys Tyr Gly Tyr Tyr 410 Ile Asp 425 Asn Phe Val Glu Gly Val Ile Asn C 490 Leu Thr C 505 Lyl 31! Se Met Sez Gir Asp 395 Thr Phe Tyr Ser ryr 175 fly .ys 00 s Glu r Asn Ile Asn Asn 380 Met Asn Thr Asp Ser 460 Met Phe C Lys S Glu T Glu Glu Pro Val Asp 365 Tvr Asp Asn Lys Ser 445 lu ?ro fly er :hr 6sn Phe p Asr Glu 350 Ser Gin Lys Ile Lys 430 Ser Ala Leu Leu Tyr 510 Lys Gly Arg Tvr 335 Ala Ile Vai Leu Val 415 Met Thr Glu Gly 1 Gin 495 Leu 2 Leu I Ser I u Gly Leu Ala Asp Ile Asp Tyr Thr Val 320 Ala Lvs Thr Asp Leu 400 Phe Lys Gly Tyr Val 480 la !.rg 3eP :le Ser Asn Lys Asn Ile Val 540 Glu Asp Asn Leu Pro Trp Lys Ala Asn 555 Asn Lys Asn Ala Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu *Tyr Val His 565 575 Lys Asp Gly Gly Ile Ser Gir 580 Thr Leu Asn 625 Asp Al a Ser Gly 705 Val Arg Ser *Giu *Lys 610 Leu Leu Trp Leu Thr 690 Ile Tyr Glu Giu Tyr 595 Asp Glu Lys Gly Ser 675 Asn Leu Phe Val Met Val1 Giu Asp Gly Asp 660 Pro Ile Lys Ser Leu 740 Phe Ile Asn Tyr Vai 645 Asn Giu Ser Gin Val 725 Phe Thr Gin Thr Gin 630 Tyr Phe Leu Gly Asn 710 Ser Giu Thr Gly 615 Thr Leu Ile Ile Asn 695 Leu Gly Lys Lys Phe Thr 600 Tyr Ile Ile Ile Asn 680 Thr Gin Asp Arg Phe 760 Ile 585 Val Ile Asn Leu Leu 665 Thr Leu Leu Ala Tyr 745 Glu Gly Lys His Lys Lys 650 Glu Asn Thr Asp Asn 730 Met Lys Gly Tyr Arg 635 Ser Ile Asn Leu Ser 715 Val Ser A*sp Lys Giu 620 Phe Gin Ser Trp, Tyr 700 Phe Arg Gly Asn Pro 605 Asp Thr Asn Pro Thr 685 Gin Ser Ile Ala Phe 765 590 Ser Thr Thr Gly 5cr 670 Ser Gly Thr Arg Lys 750 T'yr Asr Gly Asm 655 Glu Thr Gly Tyr Asn 735 Asp Ile *His Asn Thr 640 
Giu Lys Gly Arg Arg 720 Ser Val Giu Asp Lys Leu Lys Pro Lys 4 755 Leu Ser Gin Gly Asn Asn Leu Tyr Gly Gly Pro 770 775 Asp Vai Ser Ile Lys Pro 785 790 INFORf-UTION FOR SEQ ID NC:9: SEQUENCE CHARACTERISTICS: LENGTH: 47 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) Ile 780 Val His Phe Tyr 59 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: GCTCTAGAAG GAGGTAACTT ATGAACAAGA ATAATACTAA ATTAAGC INFORMATION FOR SEQ ID Wi SEQUENCE CHARACTERISTICS: LENGTH: 2035 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 158C2-ptl (xi) SEQUENCE DESCRIPTION: SEQ ID NO:l0: ATGAACAAGA ATAATACTAA ATTAAGCGCA AGGGCCTACC GAGTTTTATT GATTATTTTA
ATGGCATTTA
ATACAGGTGG
TTTCTGGTA.A
TAAATACAGA
ATGATGTTA.A
ATTCACATCT
TGGAATTACC
ATTAACGTAA
ATTAAATATG
GTAAAAAAGG
GCGAAAAGTG
GATGTAATGG
ATGCTAAAG
ATTGTATTAA
TTAGGCTTAG
TGGATTTGCC
TAATCTAACC
ATTGGATGGG
ATI'AGCTAAG
TAACAAACTA
ATGTTAAGTG
TTTAAGTAAC
ATGTGCTTAT
TGAATGAAAA
ATAGCTCTCC
TTACAAAAAA
TGGGAAATAA
AAAATGTGAA
CAGCTCTACA
CAGATATTGA
ACTGGTATCA
TTAGACGAAA
GTAAATGGGA
CAAATCTTAA
GACTGCGATA
ATGTACTGAA
ATCTGCACCT
TAACTCTACG
ATTTGACGAT
TGCTGATATT
TGACGTGGAT
TTTATTCGGT
AACAAGTGGC
AGCAAAAGCT
TTATACTTCT
AAGACATTAT
TCCTAAAGAA
GCTTAAATGA
A.AGTTGCAAA
A.ATACGATGC
GCCAAAATTA
TGGCAAGAAA
CTTACTGAAA
TI'AACTTTTG
CTITGACGAGT
GGTTTTGAAT
CGTTCAGCTI'
AGTGAAGTAG
TTTCTTACTT
ATCATGAATG
C1I ICTAATA
GAATATGATT
TCAGCAGTTA
TCTTATCGCA
TGAACAAAAT
TTAAAATATA
TGTGCTTAAG
TCTCCGACA.A
TTACACCTGC
CTACAGAAAA
TAACTGAATT
TTTACCTTAA
TAAAAACTGC
GAAA1G11TTA
TAACAACATG
AGCATAAA
CCTTTCTAA
TTTAAAACGG
CTAAATGAGA
CAGGGAAACT
CAAGTT=AA
TCTACCTAAA
TCTTGCAAAT
GCTAGATATT
GTATCAACGA
CACTTTAAAA
AACTGAACTA
TACATTCCAT
TTCGVCAATTA
TAAI-rT; ±I'A
CCGAAAATTA
TAAGGAAAAA
TCCTAATTAT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 GAGGAATTTA GAGTAAACAT CCTTCCCACA GCAAAAGCTA AGGGAAGTAA TGAAGATACA AAGATGATTG TGGAAGCTAA ACCAGGATAT 18 1080
GTTTTGGTTG
AAGCTAAAAA
TGATACGGAT
GCATTTCCAA
AGGTATGAGG
ACAAAAGTAG
TACATGCCGC
GCAGTCGATG
TTGTTAGCGA
ATTAGCAATA
AATAATGAGA
GTTCATAAGG
TACTTGATTC
GAAAATTACA
CGITTTACTA
AATGAAGCTT
GATTTGAAAT
AAGATTATCA
AAATTATTAT
ATGAATATGT
CGACAGCGAA
AATCAAGTGA
TAGGTCTTAT
AAAATTCCAG
CAGATTTAAA
TTGTAGAGAA
ATGCGAATGT
ATGGTGAATT
GATATATTOT
TTTACGAGGA
CAGGAACTGA
GGGGAGACAC
GAGCAATAAT
AATTGATAAG
GTCCGGATCA
TATTACTAAA
TTTTTATGAT
AGCGGAGTAT
CAGTGAAACA
ACTAGTAACT
TAATAAAGAA
TGGAAATATA
AGATTATTCA
CTCACATTTT
AAAAGGAAAA
TACAAATAAT
TTCGACAGGA
TTTTTT1'CTC
TCAATTACAG
GATTCGTTAT
ATCTGAACAA
ATTGCTTTTA
TCTTCTACAG
AGTATGCTAA
TTT'-TTAAATC
TTAACATGTA
ACTAAATTGA
GAAATGGACA
GGCGGAGTGA
ATIGGAGACA
GCTTCTATTT
AATTTAGAAG
TTTTATTTAT
TAGAAAGAGG
TATTAAAAGC
CAGAAATAAT
TATATTATAC
CTAAAAAAA.T
GGGATATTGA
AAGCTAGTGA
CAATTAATGG
GATCATATTT
TTGTCCCACC
CCTTAGAACC
ATGGAACTAG
AGTTGAAATC
TTTTAAAAGA
AI'ATCAAAC
TI'TTTACTAC
TAACTTATGA
ATATCAAGCT
ATATAGTACG
AAAGAACATA
GAACAGTTTA
TCTAAATAAG
TGATGAAGTT
ATTTAGGCTT
AAGAGAGACA
TAATGTTTT=
ATGGAAGGCA
AGCTTTATAT
TAAAACAGAA
TGAAAGAAAT
TATTACTAAA
TCAAGATGGA
ACAAG
1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2035
INFORMATION FOR SEQ ID NO:11: Wi SEQUENCE CHARACTERISTICS: LENGTH: 21 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:l1: CATC~CTCCCT .ZATl f-77CTA A INFORMATION FOR SEQ ID NO:12: SEQUENCE CHARACTERISTICS: LENGTH: 950 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 49C3-ptl (Xi4) SEQUENCE: DES
AAACTAGAGG
TAGTTGGATT
TAAAACACAA
ATAAATTATT
CAAATGAATA
AGGTCACAC
TAGAATCAAG
CGATAGGTAC
ATGAAAATTC
CAACAGACTI'
ATATTGTAGA
AAAAATGCGT
GAGGATGGTG
ATTCCATATA
TACATATCAT
GAGTGATAAG
TGAAATAAGT
CTATCAAATT
ATGTCCGGAT
TGTTATCACT
GAATTTTTAT
TGAAGCGGAG
TATAAGTGAA
AAGACTAGTA
AAGTAATAAA
AAATGGGAAC
ATGTAGATCA
AGTTCTCACA
TTGTAAAGGG
GAAGAAACAT
GATAAGGATT CGTTATCAGA
CAATCTGAAC
AAAATTGC-.T
GACTCTTCTA
TTTAGTATGC
ACATTTTTGA
ACTTTGACAT
GAAACTAAAC
TTAGAGGGAG
TACCGGAGGT
AT'rrATI-GGG
GAAAGCTGCT
AAATGTATTA
TTACTAAAAA
CAGGAGATAT
TAAATGCTAA
CTCCAATTAA
GTAAATCATA
TGATTGTCCC
AAAACTTAGA
GTAAATGGAA
GATAAATTGA
ATTTATTTAA
AATI'GTTTAT
TACAAATAAA
ACTGAACAGT
TGATCTAAAT
TAATGATGGT
TGGATTTGGC
TTTAAGAGAG
ACCTAATGGT
GCCGTGGGAA,
CTAAAGTTTr
AATTGAAAAC
AAGATGAAAA
CRIPTION: SEQ ID NO:12: GATGCGAAAA TCATTATGGA AGCTAAACCT AAGGATTCAA TTGCAGTATT AAAAGTI-rAT
GGATATGCTT
CAGGCAAAGC
GGTGATATAG
ATAGCAT=c
TTAAGATATG
AAGAAAAAAA
GTTTATATGC
CTCGTAGTCG
ACATTGTTAG
TTATTAGCA
AGCAAATAAC
ATATGTTCAT
AGAATATGTA
AAATGGGGAT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 950 CATAATGCAA TTGAAGATTT TTCCAGCTGT AACTTCAATA ATGAI-1-I-CG CATCCTTATC ATCCrCCTAG CTTTTTCATA ATAGGATAGA INFORMATION FOR SEQ ID NO:13: W) SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOCY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: AAA'ITATGCG CTAAGTCTGC INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 19 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: TTGATCCGGA CATAATAAT 19 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 176 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 49C8-ptl (xi) SEQUENCE DESCRIPTION: SEQ ID GTAAATTATG CGCTAAGTCT GCACCTTTTT TCACTGTTAC TAAACATCAC TTTTCCTATA STCCCCTTAGC TCTTATGGAT TATTGAGCAA ACTTATCTTG TTAATTACTA CTCCCCATCA 120 TATGCTAAAC AAAAACCAAA CAAACATTAT CTATTATATG TCCGGATCAA AATGTA 176 9 INFORMATION FOR SEQ ID NO:16: SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: GGRTTAMTTG GRTAYTATTT INFORMATION FOR SEQ ID NO:17: SEQUENCE CHARACTERISTICS: off* LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genornic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: ATATCKWAYA TTrKGCATTTA INFORMATION FOR SEQ ID NO:l8: Wi SEQUENCE CHARACTERISTICS: LENGTH: 1076 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 1OE1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: TGGGATTACT TGGATATTAT TTCCAGGATC AAAAGTTTCA GCAACTTGCT TTGATGGCAC
ATAGACAAGC
AGCAGCAACA
ACTATGTATT
TTGTCAATCA
TTAGAATTGA
TCAACTGGTC
CAAA'TrI-rTC
AACAAGGAGA
GTGATACAGA
AACAAATAGC
CAAACCCCTA
GCCGTATCGA
TTCTGATTTG
CATTCAATCT
GTCAACCTCA
AACTI'CTATG
ATATGTCCCA
GATTTCAGGA
TCGTAAACAA
TGAGAAAAAA
TGAITk'n
AGTGAAATGG
TAAGTCTCAT
TAACGGTGTC
GAAATCCCGA
GTTAGATGGC
TCCGACCAAC
ACAGAACCGA
GAAGATACAA
TCAGAGATAG
GATCAAGAGA
GTATCTCGCA
A''TTATGATG
GACGATTCTA
ACAGTAGGAG
AAAGCAGAAG
AAGATGACGT
TGGCTATAT
AGGTCGTGAT
ITCAACTCGA
AAGAACAAGA
AACCAATTCC
AAATCATCCC
GTAAGAGATC
AATGGGAAAC
TGAAGGATAG
ATCCATACAC
CCAGAAATCC
GAAACAGTTA
TCAGCCACCT
TGAACTCGAT
AAAAGATAAG
GAACCTCCTT
GGAGAATGCT
TGAAACCAGT
TTTAGCTACA
GGAAGGATAC
AGG'ITATACC
AGATTGGGAA
TTTAGTCGCG
CTATCCAAGG
CAAACAGGAG
GGAAAAACCA
CTCTATAAAA
GACTI'TCAGC
TTCCATTTAC
TTGTTTCAGG
AATCCTATCC
ACGATACGGG
AAATATGTGT
AAAGCGGCTG
GCCTATCCAA
120 180 240 300 360 420 480 540 600 660 720 780 CTGTTGGTGT ACATATGGAA AGATTAATTG TCTCCGAAAA ACAAAATATA TCAACAGGGC 64 TTGGAAAAAC TGTATCTGCG TCTATGTCCG CAAGCAATAC CGCAGCGATr ACGGCAGGTA 900 TTGATGCAAC AGCCGGTGCC TCTTTACTCG GGCCATCTGG AAGTGTCACG GCTCATTTTT 960 CTTATACAGG ATCTAGTACA TCCACCGTTG AAGATAGCTC CAGCCGGAAT TGGAGTCAAG 1020 ACCTTGGGAT CGATACGGGA CAATCTGCAT ATTTAAATGC CAAATGTACG ATATAA 1076 INFORMATION FOR SEQ ID NO:19: i) SEQUENCE CHARACTERISTICS: LENGTH: 357 amino acids TYPE: amino acid STRANEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: Gly Leu Leu Gly Tyr Tyr Phe Gin Asp Gin Lys Phe Gin Gin Leu Ala 1 5 10 Leu Met Ala His Arg Gin Ala Ser Asp Leu Giu Ile Pro Lys Asp Asp 25 Val Lys Gin Leu Leu Ser Lys Giu Gin Gin His Ile Gin Ser Val Arg 40 Trp Leu Gly Tyr Ile Gin Pro Pro Gin Thr Gly Asp Tyr Vai Leu Ser 55 Thr Ser Ser Asp Gin Gin Val Val Ile Giu Leu Asp Gly Lys Thr Ile 70 75 Val Asn Gin Thr Ser Met Thr Glu Pro Ile Gin Leu Giu Lys Asp Lys 90 Leu Tyr Lys Ile Arg Ile Giu Tyr Vai Pro Giu Asp Thr Lys Giu Gin 100 105 110 Glu Asn Leu Leu Asp Phe Gin Leu Asn Tp 3ez 12 Ser Giy Ser Glu 115 120 125 Ile Giu Pro Ile Pro Giu Asn Ala Phe His Leu Pro Asn Phe Ser Arg 130 135 140 Lys Gin Asp Gin Giu Lys Ile Ile Pro Giu Thr Ser Leu Phe Gin Glu 145 150 155 160 Gin Gly Asp Giu Lys Lys Vai Ser Arg Ser Lys Arg Ser Leu Ala Thr 165 170 175 Asn Pro Ile Arg Asp Thr Asp Asp Asp Ser Ile Tyr Asp Giu Trp Giu 180 185 190 Thr Giu Gly Tyr Thr Ile Arg Giu Gin Ile Ala Val Lys Trm Asp Asp 195 200 205 Ser Met Lys Asp Arg Gly Tyr Thr Lys Tyr Val Ser Asn Pro Tyr Lys 210 215 220 Ser His Thr Val Giy Asp Pro Tyr Thr Asp Trp Giu Lys Ala Ala Gly 225 230 235 240 Arg Ile Asp Asn Gly Val. 
Lys Ala Giu Ala Arg Asn Pro Leu Val Ala 9...245 250 255 Ala Tyr Pro Thr Val Gly Vai His Met Glu Arg Leu Ile Val Ser Giu *260 265 270 Lys Gin Asn Ile Ser Thr Gly Leu Gly Lys Thr Val Ser Ala Ser met 275 280 285 Ser Ala Ser Asn Thr Ala Ala Ile Tbr Ala Gly Ile Asp Ala Thr Ala 35290 295 30032 Gly Ala Ser Leu Leu Gly Pro Ser Giy Ser Val Thr Ala His Phe Ser Tyr Thn Gly Ser 5cr Thr Ser Thr Val. Glu Asp 5cr Ser Ser Aing Asn 32-5 330 335 Trp Ser Gin Asp Leu Gly Ile Asp Thr Gly Gin Ser Ala Tyr Leu Asn 340 345 350 Ala Lys Cys Thr Ile *99999355 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1045 base pairs TYPE: nucleic acid STRAN'DEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genoniic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 31J2 (xi) SEQUENCE DESCRIPTION: SEQ ID TGGGTTACTT GGGTATTATT 66 TTAAAGGAAA AGAT TTTAAT AATCTTACTA TTTATGATTT AGAAACAGCG AATTCTTTAT
AACACGTGAG
ACAACAAACC
TFI'TACCTTT
TCGCAAAAA
CAAAATTGAA
GAAATTATTI'
AAATCCTGAA
CTTGTITTAGC
AGATGGAGAT
AGCTGTTAAA
TTTTAGACAG
GGATTTATCT
TGTTAGCTTG
TACTTCATCG
TGGACCAGAA
GGCCAAAGAG
AATACTCTTA
TATCAATCTA
CAATTATCGG
GGCCAAAAGA
TATCAATCTG
AAAATAAATA
TTTGGTAAAG
AATAAAAGTA
GCCATTCCTG
TGGGACGAAG
CACACTGCTG
AATGCA.AAAG
GAAAATGTCA
AATAATTGGT
GGTTTGTTGT
TGGGGTACAA
TCGTTGGAT
ATGATGAGCA
AACAAGTTGT
ATAAAGCGTT
GTCAAAAACA
AAAAAACTCA
AACGAGATAT
ATGTATGGGA
GATTAGCTGA
GTGACCCCTA
AAACATT'AA
CCATATCAAA
CCTATACAAA
CTTTTGGAGT
CTAAGGGAGA
TCATTAGAA
AAACCCAGAT
ATCTCAGCAA
AACATATTTA
AGATGAAGAT
AGAAAATGGG
TAAGGGATAT
TAGTGACTAT
TCCATTGGTG
AGATGAAAAT
TACAGAGGGG
AAGTGCCAAT
CGCAACACAA
AAAGATAAAT
AGTCAAATGT-
GTGCAACAAG
AAGAAAGCAT
ATAGATGAGG
TATACCATCA
AAAAAGTTTG
GAAAAGGCAT
GCTGC'IrTTC
AAAACTGCTG
GCATCTATTG
TATCAACArr
TATAATACAG
CGGTTTAATA AAAAGCAAAA TGCTATTATA GAAATCGATG TAT1'TGCTCC
TAGATAAGCA
AAGCTGGAGA
GGAAAGTTAT
TAGTTCCCAT
TTAAAGAATT
ACGAATTGAG
CGAAAAGCAG
ATACAGATAC
AAGGAAGAGT
=IICCAATCC
CAAAAGATTT
CAAGTGTCAA
AAATTGCGTC
AAGCTGGAAT
CTGAAACAGT
CTTCAGCAGG
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1045 ATATCTAAAT GCCAATGTAC GATAT INFORMATION FOR SEQ ID NO:21: SEQUENCE CHARACTERISTICS: LENGTH: 348 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: ot.;tide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 31J2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: Gly Leu Leu Gly Tyr Tyr Phe Lys Gly Lys Asp Phe Asn Asn Leu Thr 1 5 10 Ile Phe Ala Pro Thr Arg Giu Asn Thr Leu Ile Tyr Asp Leu Giu Thr r r Ala Trp Leu Ser Leu Asp Lys Gly 145 Leu Asp Gly Ala Thr 225 Asp Pro Asn Asn Ile Ser Gin Val Ser Gin 130 Lys Phe Thr Tyr Asp 210 Ala Leu Ser Lys Ser Gly Asp Lys Pro Gin 115 Ser Glu Ser Asp Thr 195 Lys Gly Ser Vai Thr 2 275 Leu Leu Asp Gly Ile 100 Met Gin Lys Asn Thr 180 Ile ly Asp ksn ksn kla Leu Ile Glu Gin Lys Phe Gin Thr Lys 165 Asp Lys Tyr Pro Ala 245 Val Glu Asp Lys His 70 Lys Ile Lys Val Gin 150 Ser Gly Gly Lys Tyr 230 Lys Ser Ile Lys Gin 40 Ser Lys 55 Ala Ile Lys Gln Glu Tyr Glu Leu 120 Gin Gin 135 Thr Tyr Lys Arg Asp Ala Arg Val 200 Lys Phe 215 Ser Asp Glu Thr Leu Glu Ala Ser 280 Ser Ile 295 25 Gir Lys Ile Val Gin 105 Lys Asp Leu Asp Ile 185 Ala Vai Tyr Phe Asn 265 rhr 1 Gl Ali iGli Val 90 Ser Leu Glu Lys Ile 170 Pro Val Ser Glu Asn 250 Val Ser i Thr Tyr Gly Asp 1 Ile Asp 75 His Leu Asp Lys Phe Lvs Leu Arg 140 Lys Ala 155 Asp Glu Asp Vai Lys Trp Asn Pro 220 Lys Ala 235 Pro Leu Thr Ile Ser Asn Glr Phe Gly Glu Ala Ile 125 Asn Ser Asp Trp Asp 205 Phe Ser Val Ser ksn Ser Thr Lvs Lys Leu 110 Asn Pro Lys Ile Glu 190 Glu Arg Lys Ala Lys 2 270 Trp, Ile Phe Va1 Asp Asn Ser Glu Ser Asp 175 Glu Gly 3ml Asp kla 255 ksp er Arg Gin Ile Lys Pro Gin Phe Ser 160 Glu Asn Leu His Leu 240 Phe Glu Tyr Thr Asn 290 Thr Glu Gly Ala flu Ala Gly Ile Giy Pro Glu Gly 300 68 Leu Leu Ser Phe Gly Val Ser Ala Asn Tyr Gin His Ser Glu Thr Val 305 310 315 320 Ala Lys Glu Trp Gly Thr Thr Lys Gly Asp Ala Thr Gln Tyr Asn Thr 325 330 335 Al1a Ser Ala Gly Tyr Leu Asn Ala Asn Val Arg Tyr 340 345 INFORMATION FOR SEQ. 
ID NO:22: (i) SEQUENCE CHARACTERISTICS: LENGTH: 1641 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 33D2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
CCAAAGGGGG NTTAAACCNG GANGGTTNNN TNNTTNNTTN TNGAANCCCA NTTGGAAACC CNATNAAArI'
TACCNGGATT
TTI"TTTNGGN
GAAACGGAAC
TGCCTTTTT
TCTGCTTTTA
ACACTATCTA
AAATAGAAAG
CGCTTTTGGC
AAGAAC-AAAA
GATTAATTGG
GAGAAAAAAG
CCATTICGATG
CCTCTAATGA
CNTGGTTANT
TNAAGGCAGA
TTTGCCCAA.A
AACGTGAGTA
GAAAA2INTAA
GAGGGGCTGT
GTCTATACAT
GATGACCCCT
TCAAATCGCC
TACAGCTAAA
TCACTATTTT
TAAATTACTA
GGAAGGAAAT
AAATGTTACA
GGTNGTGAGT
AN'ITNTTNNT
AAACAAGGAT
TGATAAACAT
AAGGTTTCGT
TACCT'TGGCT
CTTATCTT7 r
ATGCAATTAA
GCCTTCCCGT
ACAGAACATC
ACTGATGATC
GATTCAAAAA
GTGAAACCTC
GTAAAAGTAG
GNNTNTTA
NGCTNN'rTAA GAATCCTG'rr
CTITTACAAA
GGCATTGCCA
GCTATTTCTC
CATCATGATT
AAAATGTATA
CTTCCTCTrT
AACAGAAAAA
AGTTTACTAA
TAGTAAAGCA
CTGAAACAGG
ATGGAGAAAC
NCNGAGNTTG CCCNTTTGNN
AGGTTNTGNT
ATTCCNCCCT
CTGCGACATC
CACGTTATAC
TGTGGTTGAA
CCAGTCGTAC
CAAATGTTTA
TGCGGAAGAC
AGAAACAAAA
CACAGCATTr
AGATATGTCC
AGAATATCTA
TGTTATTAAc
TNTNANTGAA
NGAAAAAATN
TTGTTGAAAA
AAAAACCACG
TCTCGTATAG
ATTTACTCAA
ACCATTACAG
GGGAAGAAAA
CCAGTTGTGG
ATTCAAGTAG
AATTTGAAAT
CTTTCCACGT
AAAGCTAACA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 69 GAAAAAGATA AACCACACTC TGGAAAAAGC AATGAAACTC TATTGAAATT GAATATCATG TTCCTGAGAA CGGGAAGGAA CTACAATTAT TTTGGCAAAT AAATCCCAGA AAAAAACATA CTATCACCAA
GTTCAACTCA
TAGAAGAAAA
CAGCACTAGG
CCTATACGGA
AAGATCCACT
CTAGAAATGA
CAAGCACAAC
TTTCTTTTTC
ATACTGAAAG
ATCTCAACAA
TGGCTATACA
CTATAAAAAA
CTTTGAAAAA
AGTAGCCGCT
AACGGTCACT
AACAAATAGC
AT1'CTCTCCC
TAGCACATGG
AATCAAAATG
r11'AAAGACG
TACATATCCA
GTAACAGGAC
TATCCCTCGG
GAAGGAGACT
ATCGATGTTG
AAATATACGC
TCTTCACAAT
ATCTTTCTGA
ATAGGGATOG
GTGCGATTGT
AITTCTAATAA
ACATGCCGGA
TAGGTGTTGC
CAGGTACTGT
GGGGATCCAT
ATCTTGGAG
TAGCGTATAA
AAATGACCAG
ACAGATACAA
GGATAAAATC
TGCCTGGAAC
GGCTAAAACA
GGCAACTAAA
TATGGAAAAA
I-rCAAAAACC
TGGATGGGGA
AAAGCTGTTrA
CCGCAACAGC
CCTGATAGTT
GATTCCTATG
GCTGCTGACC
GATGAAGTAA
TTTCATTTTT
GTAACCAATA
GAAA.AAGGAT
1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1641 TAATAGTACC GCTGTTGCTG TCCTCAGAA CGTGCTTTCT TAAATGCCAA TATACGATAT A INFORMATION FOR SEQ ID NO:23: Wi SEQUENCE CHARACTERISTICS: LENGTH: 327 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE:- INDIVIDUAL ISOLATE: 33D2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: Gly Leu Ile Gly His Tyr Phe Thr Asp Asp Gln 1 5 10 Phe Ile Gln Val Gly Glu L,s Ser Lvs Lev __e u 1 sp Thr Asn Thr Ala Ser Lys Ile Val Glu Gly Asn Val 45 Lys Gln Asp Met Ser Asn Leu Lys 40 Ile Arg Trp Lys Pro Pro Glu Thr Gly Glu s0 55 Tyr Leu Leu Ser Thr Ser Ser Asn Glu Asn Vai Thr Val Lys Val Asp Gly Glu Met Ile Gin Ser Ser 145 Leu Asn Asn Thr Val 225 Ser Thr Ser Tyr Ser 305 Leu 3 Glu Glu Ile Pro 130 Gin Glu Asp Lys Gly 210 Ala Arg Val Ile Thr 290 [hr !sn Lys Tyr Asn 115 Asn Gin Glu Ser Ala 195 His Ala Asn Thr Gly 275 His Trp Ala Ala His 100 Asp Leu Asn Asn Tyr 180 Lys Met Tyr Glu Asn 260 Trp Ser Ser I Asn I Mel Va Gl Sei Glr Gly 165 Ala Thr Pro Pro rhr 245 rhr Gly rrp er le 70 t Lys Leu I Pro Glu i Lys Ala Glu Gin 135 1 Asn Asp 150 Tyr Thr Ala Leu Ala Ala Glu Ala 215 Ser Val 230 Val Thr Ser Thr Glu Lys Ser Asn 295 Gin Leu 2 310 Arg Tyr Glu Asn Val 120 Ile Ara Phe Gly Asp 200 Thr Gly Glu rhr Gly 280 Ser Lys Gly 105 Lys Gin Asp Ls Tyr 185 Pro Lys Val Gly rhr 265 Phe rhr Thr Val 75 Asp Lys 90 Lys Glu Ile Pro Pro Gin Gly Asp 155 Asp Gly 170 Lys Lys Tyr Thr Asp Glu Ala Met 235 Asp Ser 250 Asn Ser Ser Phe Ala Val Ile Pro Leu Glu Gin 140 Lys Ala Tyr Asp Val 220 Glu Gly Ile Ser Ala Asn His Gin Lvs 125 Arg Ile Ile Ile Phe 205 Lys Lys Thr Asp Phe 285 Asp Lys Ser Leu 110 Asn Ser Pro Val Ser 190 Glu Asp Phe Val Jai 270 'er Chr Ala Ile Phe Ile Thr Asp Ala 175 Asn Lys Pro His Ser 255 Gly Pro I Glu E Asn Glu Trp Leu Gin Ser 160 Trp Ser Val Leu Phe 240 Lys fly .ys jer r r r 300 Ser Giu Arg Ala Phe kla Tyr Asn Pro 315 INFORMATION FOR SEQ ID NO:24: SEQUENCE CHARACTERISTICS: LENGTH.: 1042 base pairs TYPE: 
nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 66D3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: TTAATTGGGT ACTATTTTAA AGGAAAAGAT TTTAATAATC TTACTATATT TGCTCCAACA
CGTGAGAATA
CAAACCTATC
ACCTTTCAAT
CAAAAAGGCC
ATTGAATATC
TTATTTAAAA
CCTGAATTTG
TTTAGCAATA
GGAGATGCCA
GTTAAATGGG
AGACAGCACA
TATCTAATG
AGCTTG4GAAA
TCATCGAATA
CCAGAAGGTT
AAAGAGTGGG
CTAAATGCCA
CTCTTATTTA TC AATCTATTCG TT TATCGGATGA TG AAAAGAAACA AG AATCTGATAA AG TAAATAGTCA A GTAAAGAAAA AA AAAGTAAACG A TTCCTGATGT AT ACGAAGGATT AG CTGC'rGGTGA CC CAAAAGAAAC AT ATGTCACCAT AT' AT TGGTCCTA TA TGTTGTC=r TG GTACAACTAA. GG( ATGTAC3,rTA TA
LATTTAGAA
GGATCGGT
,AGCATGCT
TTGT-TCAT
CGTTAAAC
AACAATCT
CTCAAACA
ATATAGAT
GGGAAGAA
CTGATAAG
CCTATAGT
TAATCCA
CAAAAGAT
CAAATACA
3AGTAAGT
GAGACGCA
ACAGCGAATT
TTAATAAAAA
ATTATAGAAA
TTAGAAAAAG
CCAGATAGTC
CAGCAAGTGC
TATTAAAGA
GAAGATATAG
AATGGGTATA
CTTTATTAGA
GCAAAAAAGC
TCGATGGGA.A
ATAAATTAGT
AAATGT-TTAA
AACAAGACGA
AAGCATCGAA
ATGAGGATAC
CCATCAAAGG
TAAGCAACAA
TGGAGA=T'r
AGTTATTTCG
TCCCATCAAA
AGAATTGAAA
AT TGAGAAAT
AAGCAGCCTG
AGATACAGAT
AAGAGTAGCT
CAATCC7=T
AGATTTGGAT
TGTCAATGT T
TGCGTCTACT
TGGAATTGGA
AACAGTGGCC
AGCAGGATAT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1042 GGATATAAAA AGTTTGTTTC GACTATGAAA AGGCATCAAA TI'GGTGGCTG CTTTTCCAAG GAAAATAAAA CTGCTGAAA.T GAGGGGGCAT CTATTGAAGC GCCAATTATC AACATTCTGA ACACAATATA ATACAG C INFORMATION FOR SEQ ID Ci) SEQUENCE CHARACTERISTICS: LENGTH: 347 amino acids TYPE: amino acid STRANDEDNESS: single 72 TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 66D3 (xi) SEQUENCE DESCRIPTION: SEQ ID Leu Ile Gly Tyr Tyr Phe Lys Gly Lys Asp Phe Asn Asn Leu Thr Iie 1 5 10 -1 Phe Ala Pro Thr Arg Glu Asn Thr Leu Iie Tyr Asp Leu Glu Thr Ala 25 Asn Ser Leu Leu Asp Lys Gin Gin Gin Thr Tyr Gin Ser Ile Arg Trp 35 40 Ile Gly Leu Ile Lys Ser Lys Lys Ala Gly Asp Phe Thr Phe Gin Leu 55 er Asp Asp Glu His Ala Ile Ile Glu Ile Asp Gly Lys Val Ile Ser 70 75 Gin Lys Gly Gin Lys Lys Gin Val Val His Leu Glu Lys Asp Lys Leu 85 90 Val Pro Ile Lys Ile Glu Tyr Gin Ser Asp Lys Ala Leu Asn Pro Asp 100 105 110 S: er Gln Met Phe Lys Glu Leu Lys Leu Phe Lys Ile Asn Ser Gin Lys 115 120 125 Gin Ser Gin Gin Val Gin Gin Asp Glu Leu Arg Asn Pro Glu Phe Gly 130 135 140 Lys Glu Lys Thr Gin Thr Tyr Leu Lys Lys Ala Ser Lys Ser Ser Leu 145 150 155 160 Phe Ser Asn Lys Ser Lys Arg Asp Ile Asp Glu Asp Ile Asp Glu Asp 165 170 175 Thr Asp Thr Asp Gly Asp Ala Ile Pro Asp Val Trp Glu Glu Asn Gly 180 185 190 Tyr Thr Ile Lys Gly Arg Val Ala Val Lys Trp Asp 3l Gliy Leu Ala 195 200 205 Asp Lys Gly Tyr Lys Lys Phe Val Ser Asn Pro Phe Arg Gin His Thr 210 215 220 Ala Gly Asp Pro Tyr Ser Asp Tyr Glu Lys Ala Ser Lys Asp Leu Asp 225 230 235 oAn Leu Ser Asn Ala Ser Val Asn Val 260 Lys Thr Ala Glu 275 Asn Thr Glu Gly Lys Glu Thr Phe 245 Ser Leu Glu Asn Ile Ala Ser Thr 280 Ala Ser Ile Glu 295 Val Ser Ala Asn 73 Asn Pro Leu Val Ala Ala Phe Pro 250 255 Val Thr Ile Ser Lys Asp Glu Asn 265 270 Ser Ser Asn Asn Trp Ser Tyr Thr 285 Ala Gly Ile Gly Pro Giu Glv Leu 290 Leu Ser 305 Lys Glu Phe Gly 310 Trp Gly Thr Thr 325 Tvr- Gin His 315 Ser Glu Thr Val Ala 320 Gin Tvr 
Asn Thr Ala 33S Lys Giy Asz Ala Asn Val 345 Ala Thr 330 Arg Tyr Ser Ala Gly Tyr Leu Asn 340 INFORMATION FOR SEQ ID NO:26: Wi SEQUENCE CHARACTERISTICS: LENGTH: 1278 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDlIVIDUAL ISOLATE: 68F (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: TGGATTACTT GGGTACTATT TTAAAGGGAA AGATTTTAAT GATCTrACTG AACGCGTGGG AATACTCTTG TATATGATCA ACAAACAGCA AATACATTAC ACAACAAGAC TTTCAGTCTA TTCGTTGGGT TGGTTTAATT CAAAGTAAAG TT= ACATTT AA CTT ATCAG ATGATGAACA TACGATGATA GAAATCGATG TTCTAATAAA GGGAAAGAAA AACAAGTTGT CCATTTAGAA AAAGGACAGT CAAAATAGAA TATCAAGCTG ATGAACCATT TAATGCGGAT AGTCAAACCT GAAACTCTTT AAAGTAGATA CTAAGCAACA GTCCCAGCAA ATTCAACTAG AAACCCTGAA T=AATAAAA AAGAAACACA AGAA~rCTA ACAAAAGCAA CCTTATTACT CAAAAAGTGA AGAGTACTAG GGATGAAGAC ACGGATACAG
TATTTGCACC
TAAATCAAAA
AAGCAGGCGA
GGAAAGTTAT
TCGTTTCTAT
TTAAAAAT
ATGAATTAAG
CAAAAACAAA
ATGGAGATTC
120 180 240 300 360 420 480 540
TATTCCAGAC
GGATGATTCA
CACGGTTGGA
TGCAAAAGAA
AAAAGTGATA
GAATTGGTCG
CCT-ATCTT
AACATCTACG
TGTTCGCTAC
TGTATTAAAT
TATCTCACCA
ATTTGGGAAG
TTAGCAAGTA
GATCCTTATA
ACATTTAAcc
TTGTCTCCAG
TATACGAATA
GGTGTAAGTG
GGAAATACTT
AATAACGTGG
AAAGATACCA
GGACAAAGTT
AAAATGGGTA
AAGGATATAC
CAGATTATGA
CATTAGTTGC
ATGAGAACTT
CAGAAGGGGC
CAAACTATCA
CGCAATT1TAA
GAACGGGTGC
TCGCAACGAT
TACCATCCAA AATAAGATTG GAAA TTT GTT TCAAACCCAC AAAAGCAGCA AGGGA TTT AG GGCTT'rTCCA AGTGTGAATG ATCAAATAGT ATCGAGTCTC TTCTATTGAA GCTGGTGGGG ACATTCTGAA ACAGTTGGGT TACAG CTCA GCGGGGTATT AATCTATGAT GTAAAGCCAA AACAGCAAAA TCGAATACGA
CCGTCAAATG
TAGATACTCA
ATTTGTCAAA
TGAGTATGGA
ATTCATCTAC
GAGCAT'rAGG
ATGAATGGGG
TAAATGCGAA
CAACGAGT
CTGCATTAAG
TCACATCGAT
600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1278
ATCCGAAACA AGGTCAAAAT GGAATCGCGA GGATGATTTT AACTCACATC TAATACCCAA TTAATCCA CGATTACATT GAATAAGCAA CAGGTAGGTC AACTGTTAAA
INFORM4ATION FOR SEQ ID NO:27: SEQUENCE CHARACTERISTICS: LENGTH: 425 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE- INDIVIDUAL ISOLATE: 68F (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: Gly Leu Leu Gly Tyr Tyr Phe Lys Gly Lys 1 5 10 Val Phe Ala Pro Thr Ar7- Gly Asrn Thr Leu 25 Ala Asn Thr Leu Leu Asn Gin Lys Gin Gin 40 Trp Val Gly Leu Ile Gin Ser Lys Glu Ala s0 55 Asp Phe Asn Asp Leu Thr Val Tyr Asp Gin Gin Thr Asp Phe Gin Ser Ile Arg Gly Asp Phe Thr Phe Asn Leu Ser Asp Asp Giu His Thr Met Glu Ile Asp Gly Lys Val Ile Se Pht Asi Glr Asr 145 Let Asp Gin Tyr Pro 225 Ala Val Ser Gly Val 305 Thr Asn Vai Ser Gin 130 Lvs Ile Gly Asn Thr 210 Tyr Lys Ser Ile Ala 290 Ser I Ser I Ly Se GlI 11! Se2 Lys Thr Asp Lys 195 Lys Thr lu Met 'lu 275 ;er hr 'hr s Gly r Ile 100 1 Thr Gin Glu Gin Ser 180 Ile Phe Asp Thr Glu i 260 Ser I Ile C Asn *J Gly 3 Lys Lys Phe Gin Thr Lys 165 Ile Ala Vai Tyr Phe 245 Lys iis flu 7yr sn 25 Glu Lys Gin Ile Giu Tyr Lys Asn Leu 120 Ile Gin Leu 135 Gin Glu Phe 150 Val Lys Ser Pro Asp Ile Val Lys Trp 200 Ser Asn Pro 215 Glu Lys Ala 230 Asn Pro Leu Val Ile Leu Ser Ser Thr J 280 Ala Gly Giy C 295 Gin His Ser C 310 Thr Ser Gin P Va Gi1 105 Lys Asp Leu Thr Trp 185 Asp Leu Ala Val Ser 265 sn fly flu 'he 1 Val 90 i Ala Leu Glu Thr Arg 170 Glu Asp Asp Arg Ala 250 Pro Trp Ala I Thr Asn 'I His Asp Phe Leu Lys 155 Asp Glu Ser Thr Asp 235 iAl Asp Ser jeu 115 :hr Leu Glu Lys Arg 140 Ala Glu Asn Leu His 220 Leu Phe Glu Tyr Gly 300 Gly *J Glt Prc Val 125 Asn Thr Asp Gly Ala 205 Thr Asp Pro Asn Thr 285 .eu N'r 1Ly! 
Phc lc Ast Pro Lys Thr Tyr 190 Ser Vai Leu Ser Leu 270 Asn Ser G lu s Gly Gin Asn Ala Thr Lvs 1 Giu Phe Thz Asn 160 Asp Thr 175 Thr Ile Lys Gly Gly Asp Ser Asn 240 Val Asn 255 Ser Asn Thr Glu Phe Gly T= Cly 320 Ala Ser Ala Gly Tyr 335 Leu Asn Ala Asn Val 340 Arg Tyr Asn Val Gly Thr Gly Ala Ile Tyr 350 Asp Val Lys Pro Thr Thr Ser 355 Thr Ile Thr Ala Lys Ser Asn 370 375 Gin Ser Tyr Pro Lys Gin Gly 385 390 Asp Asp Phe Asn Ser His Pro 405 Val Leu Asn Lys Thr Ile Ala Thr Thr Ala Leu Ile Ser Pro Gly Gin Asn ly Ile Ala Ile Thr Ser 395 Ile Thr Leu Asn Lys Gin Gin Val 410 415 Gin Leu Leu Asn 420 Asn Thr Gin Leu *9 INFORMATION FOR SEQ ID NO:28: SEQUENCE CHARACTERISTICS: LENGTH: 983 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 69AA2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: TGGATTACTT GGGTACTATT TTACTGATGA TCAG TTTACT AGGAGAAAAA AGTAAATTAC ATCCATTCGA TGGGAAGGAA GTCCTCTAAT GAAAATGT1'A CATGGAAAAA GCAATGAAAC TGTTCCTGAG AACGGGAAGG TAAAATCCCA GAAAAAAACA GCGTTCAACT CAATCTCAAC TTTAGAAGAA AATGGCTATA TGCAGCACTA GGCTATAAAA CCCCTATACG GACTTTGAAA AAAAGATCCA CTAGTAGCCG
TAGATTCAAA
ATGTGAAACC
CAGTAAAAGT
TCGAAAAAGA
AACTACAATT
TACTATCACC
AAAATCAAAA
CATTTAAAGA
AATACATATC
AAGTAACAGG
CTTATCCCTC
AATAGTAAAA
TCCTGAAACA
AGATGGAGAA
TAAACCACAC
A'TT1TGGCAA
AAATCTTCT
TGATAGGGAT
CGGTGCGATT
CAATTCTAAT
ACACATGCCG
GGTAGGTGT1'
AACACAGCAT
CAAGATATGT
GGAGAATATC
ACTGTTATTA
TCTATTGAAA
ATAAATGACC
GAACAGATAC
GGGGATAAAA
GTTGCCTGGA
AAGGCTAAAA
GAGGCAACTA
GCTATGGA.AA
TTATTCAAGT
CCAATTTGAA
TACTTTCCAC
ACAAAGCTAA
TTGAATATCA
AGAAAGCTGT
AACCGCAACA
TCCCTGATAG
ACGATTCCTA
CAGCTGCTGA
AAGATGAAGT
AATTTCAI-rr 120 180 240 300 360 420 480 540 600 660 720 TTCTAGAAAT GAAACGGTCA CTGAAGGAGA CTCAGGTACT G TTCAAAAA CCGTAACCAA TACAAGCACA ACAACAAATA GCATCGATGT TGGGGGATCC ATTGGATGGG GAGAAAAAGG ATTTTCTITT TCATTCTCTC CCAAATATAC GCATTCTTGG AGTAATAGTA CCGCTGTTGC TGATACTGAA AGTAGCACAT GGTCTTCACA ATTAGCGTAT AATCCTrCAG AACGTGCTNT CTTAAATGCC AATAKACGAT NTA OS@9 *69@9* 9 9 INFORMATION FOR SEQ ID NO:29: SEQUENCE CHARACTERISTICS: LENGTH: 327 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi' ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 69AA2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: Gly Leu Leu Gly Tyr Tyr Phe Thr Asp Asp 1 5 10 Phe Ile Gin Val Gly Giu Lys Ser Lys Leu .9 9 9 9 9 Gin Phe Thr Asn Thr Ala Leu Asp Ser Lys Gin Asp Met Ser Asn Leu Lys Ser Ile Arg Trp Lys Ile Vai Gly Asn Val Ser Asn Glu Lys Pro Pro 50 Asn Val Glu Thr Gly Giu Tyr Val Lys Val Asp Gly Leu Leu Ser Glu Thr Vai Thr Asn Lys Ala Glu Lys Ala Lys Leu Giu Lys Lys Pro His Ser Ile Glu Ile Giu Tyr Gin Ile Asn 115 Ser Pro Asn 130 Pro Glu Acn Glu Leu Gin Gin Lys Ala Val 120 Ile Pro Giu Lys 125 Pro Gin Gin Arg 140 Leu Phe Trp 110 Asn Ile Leu Ser Thr Gin Leu Ser Glu Gin Ile Gin 135 Gin Gin Asn Gin Asn Asp Arg Asp Gly Asp 150 155 Lys Ile Pro Asp Ser 160 78 Leu Glu Glu Asn Gly Tyr Thr Phe Lys Asp Gly Ala Ile Val Ala Trp 165 170 175 Asn Asp Ser Tyr Ala Ala Leu Gly Tyr Lys Lys Tyr,Ile Ser Asn Ser 180 185 190 Asn Lys Ala Lys Thr Ala Ala Asp Pro Tyr Thr Asp Phe Glu Lys Val 195 200 205 Thr Gly His Met Pro Glu Ala Thr Lys Asp Glu Val Lys Asp Pro Leu 210 215 220 Val Ala Ala Tyr Pro Ser Val Gly Val Ala Met Glu Lys Phe His Phe 225 230 235 240 Ser Arg Asn Glu Thr Val Thr Glu Gly Asp Ser Gly Thr Val Ser Lys 245 250 255
Thr Val Thr Asn Thr Ser Thr Thr Thr Asn Ser Ile Asp Val Gly Gly 260 265 270
S
er Ile Gly Trp Gly Glu Lys Gly Phe Ser Phe Ser Phe Ser Pro Lys S* 275 280 285 Tyr Thr His Ser Trp Ser Asn Ser Thr Ala Val Ala Asp Thr Glu Ser 290 295 300 Ser Thr Trp Ser Ser Gin Leu Ala Tyr Asn Pro Ser Glu Arg Ala Xaa 305 310 315 320 Leu Asn Ala Asn Xaa Arg Xaa 325 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: o" LENGTH: 1075 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 168G1 (xi) SEQUENCE DESCRIPTION: SEQ ID TGGGTTAATT GGATATTATT TCCAGGATCA AAAATTTCAA CAACTCGCTT TAATGGTACA TAGGCAAGCT TCTGATTTAA AAATACTGAA AGATGACGTG AAACATTTAC TATCCGAAGA 120 TCAACAACAC ATTCAATCAG TAAGGTGGAT AGGCTATATT AAGCCACCTA AAACAGGAGA 180
CTACGTATTG
TCTCAATCAG
TCAACCTCAT CCGACCAACA GGT CATGATT GAACTAGATG GCTTCTATGA CAGAACCTGT TCAACTGA AAAGATAAAC TAAAATTGAA TATGTTCCGG AACAAACAGA AACACAAGAT ACGCTTCTTG
GAACTGGTCT-
AGACC-TCT
ACCTGGAGAC
ATGATACAAG
AGACAACTGG
7CTAATCCCT
GGACGTATTG
ACCGTY"GGTG
CTCGGAAAAA
ATTGATACGA
TCTGATACAG
T=-TCAGGCG
CGTAAACAAG
GAGAAAAAAA
ATGATGATGG
CAGTGAAATG
ATAATTCCCA
ATAAGGCGAT
TACATATGGA
CAATATCTGC
CGGCTGGTGC
GATCCAGTAC
ATCAAGAAAA
TATCTCGAAC-
GATT TCC-GAT GGACGATT CT
TACAGTACGG
CAAAGGAGAA
AAAACTGA
GTCAATGTCT-
TTCTTAC-T
ATCCACTGTT
GCTT7-CCT GAGGCA.AGTT
TAAACGGTCC
GCGTGGGAA.A
ATGA;AGGATC
GATCCATACA
GCTAGGAATC
GTCoTCCGALGA
GCAAC-TAATA
GGACCGTCTG
GAAAATAGCT
TA=A.AATG
7=-AACTACA
CAGAAGGATA
GAGGGTATAC
CAGATTGGGA
CTAGTCGC
AACAAAACAT
CCGCAGCGAT
GAAGCGTCAC
CAAGTAATAA
CCAATGTACG
GAAAAACAGA AACGA-.CCA GAAAATGCAT
GT-AAAGTCAT-
CGTATAAAAT
A;rrAA.ACT
TATTCAGAA
C-ATTCTCTA-T
CACGATACAA
CAATA.TGT-r
AAAAGCGGCT
GGCCTATCCA
ATCAACTGGA,
TACAGCGGGC
GGCTCATTTT
TGGAC-TCAA
ATATA
240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1075 GATCTTGGAA TCGATACGGG ACAATCTGCA
INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS: LENGTH: 2645 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 177C8
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
ATGAAGAAGA AGTTAGCAAG TGTTGTAACG TGTACCTTAT TAGCTCCTAT GTTTTTGAAT
r-GATGTGA ATG&-TG1 TTA CGCAGACAGC AAAACAAATC AAAT=CTAC AACACAGAAA
AATCAACAGA AAGAGATGGA CCGAAAAGGA TTACTTGGGT ATTA=-rCAA AGGAAAAGAT
TTTAGTAATC TTACTATGTT TGCACCGACA CGTGATAGTA CTCTATTA TGATCAACAA
ACAGCAAATA AACTATTAGA TAAAAAACAA CAAGAATATC AGTCTATTCG TTGGATTGGT
ACATTTAACT
TTGATTCAGA GTAAAGAAAC GGGAGATTTC TATCTGAGGA TGAACAGGCA
ATTATAGAAA
TTAGAAAAAG
ATTGACAGTA
CAGCAAGTCC
TTCTTAGCGA
GAAGACACGC-
ATTCAAAATA
=TGTTTCAA
GCAGCAAGAG
TTCCAAGTG
AATAGTGTAG
CGTTGAAGCGG
TCTGAAACAG
GCTTCAGCGG
TACGATGTAA
GCGAAATCTA
CAAAATGGAA
AAAAAACAAG
GATGGTGTTT
GGTGTCATAC
GTAGCAGAAA
TTAACTI-rAA
TI'ATTATATT
AATACAGCAA
TCAATGGGAJ
GAAAAT1'AG']
AAACATTTA;
AGCAAGATG;
AACCATCGAP
ATACGGATGC~
GAATCGCTG'I
ATCCGCTAGA
ACCTAGATT
TGAATGTTAG
AGTCTCATTC
GGATTGGACC
TTGCACAAGA
GATAT'rAAA
AACCTACAAC
ATTCTACAGC
TCGCAATAAC
TAGATAATCT
ATAAGATAAA
AACAAATCAA
AACGTGTAGC
AAGATGCCCT
ATAAAAACAA
AAGAAGTGAC
kAATTATTTC', *TCCAATCAA2 AGAACTTAA2
ACTGAGAAA
AATAAAT=fl
GGACTCTATI
AAAGTGGGAC
*AAGTCACACz
GTCAAATGCA
TATGGAAAAG-
ATCCACGAA'T
AAAAGGTATT
ATGGGGAACA
TGCAAATGTT
AAGTTTTGTA
CTI'AAATATA
ATCAATGGAT
GCTAAATAAT
AGAtACACAT
GGCTAAAACA
GGCAAAAGAT
GAAGCTTTCA
ACCGATATAC
CAAACAATTA
r' AATAAAGGG; k ATAGAGTATC kTTA=TAAAP 7 CCTGArT.
TTCACTCAAA
CCTGACCTTT
GATTCTYTAG
GTTGGTGATC
AAGGAAACGT
GTGATATTAT
TGGTC=ATA
TCGTTCGGAG
TCTACAGGAA
CGATATAACA
TI'AAATAACG
TCTCCTGGAG
GATrIAATT
AAACCTATGA
GGAAATATAG
GCGTCTATTA
TATGAAAATC
TATCCAGATCO
AATGATACCA
AAAATGAATG
TCAATTGGTA
TATTCTTCTA
LAAGAAAAGCA
'AATCAGATAC
LTAGATAGTCA
ACAAGAAAGA
*AAATGAAAAG
*GGGAAGAAAA
CAAGTAAAGG
CTTATACAGA
TTAACCCATT
CACCAAATGA
CAAATACAGA
TTAGCGTAALA
ATACTT"CGCA
ATGTAGGAAC
ATACTATCGC
AAAGTTACCC
CCCATCCGAT
TGTGGAAAC
TAACTGGCGG
TTGTGGATGA
CAGAAGATAA
AAATAAAA-GA2 TTATGACTrA
CTGGGAAATT
TTACAATCAA;
AATGGACAAA
ATAATCCGGA
AGTTOTCCAT
AAAATTTAAT
AAACCAACCC
ATCACAGGAA
GGAAATTGAT
TGGGTATACG
GTATACGAAA
TATGAAAA'
GGTAGCTGCr
AAATTTATCC
AGGTGCTTCT
CTATCAACAC
ATTCAATACG
TGGTGCCATC
AACTATTACG
GAAAAAAGGA
TACATTAAAT
AAACCAAACA
P.GAATGGAAT
TGGGGAACGT
k.ACACCGTCT k.ATAGAGGGA
-='AGATGAA
rAAAGATGTA kTTGTCTATA
:ACAAATATT
.'GCTAATTTG
420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 AGTCATTTAT ATGATGTAAA ACTGACTCCA CTrrATGATA ATGCTGAGTC TAATGATAAC GTTTC-AGGTG GAAATAACGG AAAAAAACAA ACATTAAATA CAGATGCTCA AGAAAAATTA
ATATGAAGTC
TCACTACAAA
A- J4ATI.
TATI'TTGGGA
ATTCAGAAAT
TTGATAAAAA
AACCATTGCA
ACGTGAGTGA
TTACAAAATA
TTAAAATTAA
AATAG
AGAAAAAAAC
AACAGTGAAT
AAGTAATCCA
TGATATTTCT
TAAACAGATT
AGGTGGGATT
AAATTATGTG
CACACTI'GAA
TAGTRAAAAT
ACACAATGTG
GTGAATAAAG
ATTTCTTCAA
ATAACAGATG
TATAGTAGGT
CATTATGGTG
ACAAAATATA
AGTGATAAAA
GAACAAGGAT
AATAAAAATC
AGATTACTAT
ACAATTACAA
TI'CATATTAA
TAGCATCAAT
ATGGTATTAA
AATTTATTAA
AAGTTACTTA
TTTACAAGGA
TATTTTATGA
GTACTATTAT ATAAGTTTAT
AGATGGGGAG
AAGATTAGAT
AACGAATGAT
AAAACCGGAA
GTTAGAAGAT
TGAAGCTAGT
TAGTAGTGAG
TGGGACAATT
CAGTGGATTA
ATTTATCCGA
ATTATAGCTC
GAAATAACTT
AATTTAACAG
GGAATCCTTA
TTTAATATTG
TTAGGACAAA
AAATTTGATT
AATTGGGACT
2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2645 TGCTATTACT TATGATGGTA AAGAGATGAA TGTTTTCAT AGATATAATA INFORMATION FOR SEQ ID NO:32: Wi SEQUENCE CHARACTERISTICS: LENGTH: 881 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 177C8 -vipi (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: Met Lys Lys Lys Leu Ala Ser Val Val Thr C 1 5 10 Met Phe Leu Asn Gly Asn Val Asn Ala Val T 25 Asn Gln Ile Ser Thr Thr Gln Lys Asn Gin G 40 Lys Gly Leu Leu Gly Tyr Tyr Phe Lys Gly L: Thr Met Phe Ala Pro Thr Arg Asp Ser Thr L4 70 7! ys Thr Leu Leu Ala Pro yr Ala Asp Ser Lys Thr In Lys Glu Met Aksp Arg ys Asp Phe Ser Asn Leu eu Ile ~ueTyr Asp Gln Thr Arg Asn Ile Lys 145 Ile Gin Phe Asn Thr 225 Ile Gly Asp Asn Asn 305 Asn Glu Ala Tr Teu Ser 130 Leu Asr Asn Asn Leu 210 Asp Gin Tyr Pro Ala 290 Vtal Ser ;ly Asn Ile Ser 115 Asn Val Ser Gin Lys 195 Phe Gly Asn Thr Tyr 275 Lys Ser I Val Ala Lys Gly 100 Glu Lys Pro Lys Pro 180 Lys Thr Asp Arg Lys 260 rhr Glu Met 3lu ;er Leu Leu Asp Gly Ile Thr 165 Gin Glu Gin Ser Ile 245 Phe Asp Thr Glu Ser I 325 Val Leu Ile uJu Lys Lys 150 Phe Gin Ser Lys Ile 230 Ala Val Tyr Phe Lys 310 His flu Asp Gin Gin Glu 135 Ile Lvs Va1 Gin Met 215 Pro Val Ser Glu Asn 295 Val Ser Ala Lys Ser Ala 120 Lvs Glu Glu Gin Glu 200 Lys Asp Lys Asn Lys 280 Pro Ile 3er fly 82 Lys Gin Lys Glu 105 Ile Ile Gin Val Tyr Gin Leu Lys 170 Gin Asp 185 Phe Leu Arg Glu Leu Trp Trp Asp 250 Pro Leu 265 Ala Ala Leu Val Leu Ser Thr Asn 330 lie Gly I r Thi Glu Val Ser 155 Leu Glu Ala Ile Glu 235 Asp Glu Arg Ala Pro 315 rrp ?ro G1 Ilf His 140 Asp Phe Leu Lys Asp 220 Glu Ser Ser Asp Ala 300 Asn Ser Lys i Asp Asn 125 Leu Thr Lys Arg Pro 205 Glu Asn Leu His Leu 285 Phe Glu Tyr Gly Phe 110 Gly Glu Lvs Ile Asn 190 Ser Asp Gly Ala Thr 270 Asp Pro Asn Thr Ile 350 Thr Lvs Lvs Phe Asp Pro Lys Thr Tyr Ser 255 Val Leu Ser Leu Asn 335 Ser Gin Giu Tyr Gin Ser Ile Phe Ile Gly Asn 160 Ser Glu le Asp Thr 240 Lys Gly Ser Val Ser 
320 rhr Phe 340 345 Gly Val Ser 355 Val Asn Tyr Gin Ser Glu Thr Val Ala Gin Glu Trp 365 83 Gly Thr Ser Thr Gly Asn Thr Ser Gin Phe Asn Thr A.la Ser Ala Gly 380 370 375 r Ty: 38! Ty Ali Gl Met Asr 465 Asp Gly Ile Lys Asp 545 Leu Tyr Thr Thr A-1a r Leu r Asp Thr Glu Asp 450 Asn Gly Glu Ile Asp 530 Ala I Leu *J Leu Thr G Pro L 610 Glu S As2 Va Ile Ser 435 Asp Leu Val Trp Val 515 yr eu Eyr ~sp ly 95 ys er n Ala 1 Lys Thr 420 Tyr Phe Leu Tyr Asn 500 Asp Glu Lys Tyr Glu 580 Lys I Met P Asn A As Prc Ale Prc Asr Asn Lys 485 Gly Asp Asn Leu ys 565 ksn ,he sn sp Val 390 Thr i Lys Lys Ser Asn 470 Ile Val Gly Pro Ser 550 Asn Thr 1 Lys 3 Val 1 s Asn S Th Sei Lys His 455 Lys Lys Ile Glu Glu 535 Tyr Lys kia ksp rhr el5 ;er r Ser Asn 3 Gly 440 Pro Pro Asp Gin Arg 520 Asp Pro Pro Lys Val 600 Ile I Ile C Phi Se 42! Gl Ile Met Thr Gin 505 Val Lys Asp Ile 3iu 585 Ser -ys fly Val 41C Thi 1 Asn Thr Met His 490 Ile Ala Thr Glu Tyr 570 Val His Leu Lys 395 Leu Asn Ala Leu Gly lie Leu Asn 460 Leu Glu 475 Gly Asn Lys Ala Glu Lys Pro Ser 540 Ile Lys 555 Glu Ser Thr Lys C Leu Tyr 6 S':r T1- L 620 Trp Thr A 635 Asn Asn Ala 445 Lys Thr Ile Lys Arg 525 Leu Glu 3er In Lsp ~eu n AsF Ile 430 lie Lys Asn Val Thr 510 Val Thr Ile Val Leu 590 Val Tyr Thr Th 41! 
Ser Thr Gir Gin Thr 495 Ala Ala Leu Glu Met 575 Asn Lys Asp Asn Arg Tyr Asn Asn Val Gly Thr Gly Ala Ile 400 r Ile Pro Ser Val Thr 480 Gly Ser Ala Lys Gly 560 Thr Asp Leu Asn Ile 640 625 630 Val Ser Gly Gly Asn 645 Asn Gly Lys Lys Tyr Ser Ser Asn Asn Pro 655 84 Thr Asp 665 Asp Ala Asn Leu Thr Leu Asn 660 Ala Gin Giu Lys Leu Asn Lys 670 Asn Gin Thr 705 His Asp Ser Ser Gly 785 Glu Glu Lys Gin Arg Cys 690 Val Asn Glu Ile Arg 770 Gly Pro Leu Asp ly 850 Asp 675 Glu Asn Ile Ile Lys 755 Tyr Ile Leu Gly Gly 835 Leu Ile Val Lys Thr 740 Pro Gly His Gin Gin 820 Thr Phe STyr Thr Asn Ser 725 Leu Glu Ile Tyr Asn 805 Asn Ile Tyr Ile Ile Lys 710 Asn Phe Asn Lys Gly 790 Tyr Val Lys ksp Ser Asn 695 AsD Pro Trp Leu Leu 775 Glu Val Ser Phe Ser Leu 680 *Gly Asn Ile Asp Thr 760 Glu Phe Thr Asp Asp 840 Gly Tyr Glu Tyr Ser Asp 745 Asp Asp Ile Lys Thr 825 Phe Leu Met Ile Lvs Ser 730 Ile Ser Gly Asn Tyr 810 Leu 4 Thr Asn Lys Tyr Arg 715 Ile Ser Glu Ile Glu 795 Lys 3iu Lys rrp Ser Pro 700 Leu His Ile Ile Leu 780 Ala Val Ser Tyr Asp 860 Glu 685 Ile Asn Ile Thr Lys 765 Ile Ser Thr Asp Ser 845 Phe Lys Thr Ile Lys Asp 750 Gin Asp Phe Tyr Lys 830 Xaa Lys Asr Thz Ile Thr 735 Vai Ile Lys Asn Ser 815 Ile Asn lie Thr Lys Ala 720 Asn Ala Tyr Lys Ile 800 Ser Tyr Glu Asn r 855 Ile Thr Tyr Asp Gly Lys Giu Met Asn Val 870 875 Phe His Arg Tyr INFORMATION FOR SEQ ID NO:33: SEQUENCE CHARACTERISTICS: LENGTH: 1022 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 17718 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: TGGATTAATT GGGTATTATT TCAAAGGAAA AGATITTTAAT AATCTTACTA TGTTTGCACC
GACACGTGAT
ACAACAAGAA
T'TCACATTT
TTCTAATAAA
CAAAATAGAG
TAAA'rrATTr
ATTTAACAAA
GCAAAAAATG
TCTTTGGGAA
GCTAGCAAGT
CGATCCCTAT
AACGTTCAAC
ATTATCACCA
TTATACGAAT
TGGAGTGAGT
AGGAAATACT
TA
AATACCCTTA
TATCAGTCCA
AACTTATCAA
GGGAAAGAAA
TATCAATCAG
AAAATAGATA
AAAGAATCAC
AAAAGAGATA
GAAAATGGGT
AAGGGATATA
ACTGATTATG
CCATTGGTAG
AATGAAAATT
ACAGAAGGAG
GTTAATTATC
TCACAATTCA
TGTATGACCA
TTCG'ITGGAT
AGGATGAACA
AGCAAGTTGT
ATACGAAATT
GTCAAAACCA
AGGAAT1TFT
TTGATGAAGA
ACACGATTCA
CAAAATTTGT
AAAAGGCCGC
CTGCTTTYCC
TATCCAATAG
CTCCATTGA
AACACTCTGA.
ATACGGCTTC
ACAAACAGCG
TGGTTTGATT
GGCAATTATA
CCATTTAGAA
TAATATTGAT
ATCTCAACAA
AGCAAAAGCA
TACGGATACA
AAATAAAGTT
T1'CGAATCCA
AAGGGATTTA
AAGTGTGAAT
TGTAGAGTCT
AGCTGGTGGC
AACAGTTGCA
AGCGGGATAT
AATGCATTAT
CAGAGTAAAG
GAAATCGATG
AAAGAAAAAT
AGTAAAACAT
GTTCAACTGA
TCAAAAACAA
GATGGAGACT
GCTGTCAAAT
TTAGACAGCC
GATTTATCAA
GTTAGTATGG
CATTCATCCA
GGTCCATTAG
CAAGAATGGG
TTAAATGCCA
TAGATAAAAA
AAACGGGCGA
GGAAAATCAT
TAGTTCCAAT
TTAAAGAACT*
GAAACCCTGA
ACCTTTTTAA
CCATTCCTGA
GGGATGATTC
ACACAGTTGG
ATGCAAAGGA
AAAAGGTGAT
CGAATTGGTC
GCCTTTCTT1'
GAACATCTAC
ATATACGATA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1022 INFORMATION FOR SEQ ID NO:34: Wi SEQUENCE CHARACTERISTICS: LENGTH: 340 am~ino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 17718 (xi) Gly 1 86 SEQUENCE DESCRIPTION: SEQ ID NO:34: Leu Ile Giy Tyr Tyr Phe Lys Gly Lys Asp Phe Asn Asn Leu Thr 5 10 is T I I e ro Ir Arg Asp Asn Ala Asn Ala Leu Leu Asp Lys Lys Thr 25 Gin Leu Met Tyr Asp Gin Gin Thr Gin Giu Tyr Gin Ser le Arg Thr Gly Asp Phe Thr Phe Asn Trp Ile Gly Leu Ile Gir r r Leu Ser Leu Asp Asn Glu 145 Gin Ser Val Phe Sei Asz Val Ser Gin 130 Ser Lys Ile Ala 7al r Lys a Lys Pro Lys 115 Ser Gin Met Pro Val 195 Ser Glu Asn Val Ser 275 Asp Gly Ile 100 Thr Gin Glu Lys Asp 180 Lys Asn Lys Pro Ile I 260 'er I Glu Lys Lys Phe Gin Phe Arg 165 Leu Trp Pro Gin 70 Glu Ile Lys Val Leu 150 Asp Trp Asp Leu 40 Ser Lys Glu 55 Ala Ile Ile Lys Gin Val Glu Tyr Gin 105 Glu Leu Lys 120 Gin Leu Arg 135 Ala Lys Ala Ile Asp Glu Glu Giu Asn 185 Asp Ser Leu 200 Asp Ser His 215 Arg Asp Leu Ala Ala Xaa G1i Val 90 Sex Leu Asn Ser Asp 170 Gly Ala Thr Asp Pro i Ile 75 His Asp Phe Pro Lys 155 Thr Tyr Ser Val Leu 2:5 Ser Asp Gly Leu Glu Thr Lys Lys Ile 125 Glu Phe 140 Thr Asn Asp Thr Thr Ile Lys Gly 205 Gly Asp 220 Ser- Asn Val Asn Ser Asn c Thr Giu C 285 Lys Lys Phe 110 Asp Asn Leu Asp Gin 190 Tyr Pro kla lal Glu Asn Ser Lys Phe Gly 175 Asn Thr Tyr Lys Ser Ile Lys Ile Gin Lys Lys 160 Asp Lys Lys rhr hlu let 210 Asp 225 Thr Glu Ser kla Ala 230 .eu Val 245 jeu Ser hr Asn 250 Asn Giu Asn 265 Ser Tyr Thr 280 255 Leu Asn Ile Giu Ala Gly Gly Gly Pro Leu Gly Leu Ser Phe Gly Vai Ser Vai 290 295 300 Asn Tyr Gin His Ser Giu Thr Val Ala Gln Giu Trp Gly Thr Ser Thr 305 310 315 320 Gly Asn Thr Ser Gin Phe Asn Thr Ala Ser Ala Gly Tyr Leu Asn Ala 325 330 335 Asn Ile Arg Tyr 340 ()INFORMATION FOR Ui) SEQUENCE CHARACTERISTICS: LENGTH: 1073 base pairs TYPE: nucleic acid STP.ANDEDNESS: single TOPOLOGY: linear (ii) 
MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 185AA-2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: TGGA'PTAATT GGGTATTATT TCCAGGAGCA AAACTTTGAG AAACCCGCTT TGATAGCAAA
TAGACAAGCT
ACAGCAACAC
CTATGTATTrG
TGTCAATCAA
TAGAATTGAA
GAAGTGGTCA
AGATTTCT
ACAAGGAGAT
TGATACAGAT
ACAGGTGGCA
TAACCCTTAT
CAGTATCGAT
TGTTGGTGTA
TCTGATTTGG
ATTCAATCTG
TCAACCTCAT
ACTTCTATGA
TATGTCCCAG
ATTCAGGAG
CATAAACAAG
GAGAAAAAAG
GATGATAGTA
GTGAAATGGG
AAGTCTCGTA
AATGCTGTCA
CATATGGAAA
AAATACCGAA
TTAGATGGCT
CCGACCAACA
CAGAACCGAT
GAGATACACA
CCGAGATAGA
ATCAAGAGAA
TATCACGCAG
TTTCTGATGA
ACGATTCTAT
CAGTAGGAGA
AAGCAGAAGC
GATTAATTGT
AGATGACGTG
TGGCTATATT
GGTCGTGATT
TCAACTAGAA
AGGACAAGAG
ACCAATTCCG
AATCATCCCT
TAAGAGATCT
AAAGAGTTAC
CAGCCACCTC
GAACTCGATG
AAAGATAAAC
AACCTTCTGG
GATCATGCTT
GAAACCAATT
TCAGATAAAG
TATCCAAAGA
AAACAGGAGA
GAAAAACCAT
GCTATAAAAT
ACTTTCAACT
TCCATTTACC
TA'I7TCAGAA
ATCCTGACCG
CA-1CAAAG
AGTATGTGTC
120 180 240 300 360 420 480 540 600 660 720 780 840 ATGGGAAACG AGTGGATATA GAAGGAGCTA GGTTATACCA
TCCATACACA
CAGAAATCCT
CTCCGAACAA
GATTGGGAAA AAGCGGCTGG TTAGTCGCGG CCTATCCAAC CAAAATATAT CAACAGGGCT 88 TGGAAAAACC GTATCTGCGT CTACGTCCGC AAGCAATACC GCAGCGATTA CGGCAGGTAT 900 TGATGCAACA GCTGGTGCCT CTITACTTGG GCCATCTGGA AGTGTCACGG CTCATTTTTC 960 TTACACGGGA TCTAGTACAG CCACCATTGA AGATAGCTCC AGCCGTAATT GGAGTCGAGA 1020 CCGGGATT GATACGGAC AAGLUTGATA T AAATGCC AATATACGAT ATA 1073 INFORMATION FOR SEQ ID NO:36: SEQUENCE CHARACTERISTICS: LENGTH: 357 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 185AA2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: Gly Leu Ile Gly Tyr Tyr Phe Gin Glu Gin Asn Phe Giu Lys Pro Ala 1 5 10 Leu Ile Ala Asn Arg Gin Ala Ser Asp Leu Glu Ile Pro Lys Asp Asp 25 Val Lys Giu Leu Leu Ser Lys Giu Gin Gin His Ile Gin Ser Val Arg 40 Trp Leu Gly Tyr Ile Gin Pro Pro Gin Thr Gly Asp Tyr Val Leu Ser 55 Thr Ser Ser Asp Gin Gin Val Val Ile Giu Leu Asp Gly Lys Thr Ile 70' 75 Val Asn Gin Thr Ser Met Thr Giu Pro Ile Gin Leu Giu Lys Asp Lys 90 Arg Tyr Lys Ile Arg Ile Giu Tyr Val Pro Giy Asp Thr Gin Gly Gin 100 105 110 Glu Asn Leu Leu Asp Phe Gin Lcu Lys Trp Ser Ile Ser Gly Ala Glu 115 120 125 Ile Giu Pro Ile Pro Asp His Ala Phe His Leu Pro Asp Phe Ser His 130 135 140 Lys Gin Asp Gin Giu Lys Ile Ile Pro Giu Thr Asn Leu Phe Gin Lys 145 150 155 160 89 Ser Arg Ser Gin Gly Asp Glu Lys Lys Val 165 Lys Arg 5cr Ser Asp Lys 170 175 Asp Thr Ser Ser 225 Ser Ala Gin Ser Gly 305 Tyr Pro Asp Arg Asp 180 Scr Gly Tyr Thr 195 Met Lys Glu Leu 210 Arg Thr Val Gly le Asp Asn Ala 245 T'vr Pro Thr Val 260 Gin Asn Ile Ser 275 Ala Ser Asn Thr 290 Ala Ser Leu Leu Thr Gly Ser Ser 325 Thr Ile Gly Asp 230 Val Gly Thr Ala Gly 310 Asp Gin Tyr 215 Pro Lys Val Giy Aia 295 Pro Arg 200 Thr Tyr Al a His Leu 280 Ile Ser 185 Gin Lys Thr Glu Met 265 Gly Thr Gly Vai Tyr Asp Al a 250 Giu Lys Al a Ser Ala Val1 Trp 235 Arg Arg Thr Gly Val 315 Asp Asp Asp Ser Ile Ser Vai Se r 220 Giu Asn Leu Val Ile 300 Thr Ser Asp Lys 205 Asn Lys Pro Ile Ser 285 Asp Ala Ser Glu 
190 Trp Pro Aila Leu Val 270 Ala Al a His Ser *Trp Asp Tyr Al a Vai 255 Ser Ser Thr Phe Arg.
Glu Asp Lys Gly 240 Ala Glu Thr Ala Ser 320 a a Thr Ala Thr Ile Glu 330 335 Trp Ser Arg Asp Leu Gly Ile Asp Thr Gly Gln Ala Ala Tyr Leu Asn 340 350 Ala Asn Ile Arg Tyr 355
INFORMATION FOR SEQ ID NO:37: (i) SEQUENCE CHARACTERISTICS: LENGTH: 1073 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 196F3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
TAAATTTCAA
TGGGTTACNT GOOTATTAYT TTCAGGATAC CAACTTGCTT TAATGGCACA
C
C
TAGACAAGCC
TCAACAACAC
TTATGT..TrG
TCTCAATCAA
TAGAATTGAA
CAACTGGTCG
AGATCTTTCT
TCAAGGAGAA
CGATACAGAT
ACAATTGGCA
TAATCCTTAT
ACGTATCGAC
AGTTGGCGTA
GGGAAAAACA
CGATGCAACG
TTATACGGGT
TCTTGGTATT
TCAGATTTAC
ATTCAAGCAC
TCAACTTCA7
TCTTCTATGA
TATGTATCAE
ATTTCAGGTG
CGGGAACAAG
GGGAAACAAG
GATGATGGGA
GTAAGATGGA
AAGTCTCATA
CAAGCTGTGA
CATATGGAAA
GTATCTGCGT
GTTGGTGCCT
TCGAGTACAT
GATACCAGCC
CCGACCAACA
CCGAACCCAT
AAAGTAAAAC
CTACGGTAGA
NTAAAGATAA
TATCTCGAAG
TTTACGATGA
ACGATTCTAT
CTGTAGGAGA
AAATAGAAGC
GACTGATTGT
CTACATCTGC
CTTTACTTGG
CCACTGTTGA
AATCTGCGTA
GGTCTTCACC
TCGATTAGAA
AGAAAAAGAG
ACCAATTCCA
AATCATCCCT
TAAAAGATCT
ATGGGAAACA
GAAGGATCAA
TCCATACACA
CAGAAACCCA
CTCTGAAAAA
AAGTAATACA
GAACTCNATG
AAAGATAAAC
ACGCTCCTAG
GATAATGCTT
GAAACAAGTT
CTAGCTGTGA
AGCGGCTATA
GGCTATACCA
GACTGGGAAA
TrAGTTGCAG
CAAAATATAG
GCGGGGATTA
AAATAAACAA AAATGAMGTC AAGGATTTAC TGAGATGGAT GGGCTATATr CAGCCACCTC TATcAAAGGA
AAACAGGAGA
GAAAAATAAT
AATATAMAAT
ACTTTCAAC -r
TTCAGTTACC
TATTGCAGGA
ATCCTCTACA
CGATTCAAAG
AATATGTGTC
AAGCAGCTGG
CATATCCAAC
CAACAGGACT
CAGCGGGAAT
CCCATTrTTTC
GGAGTCAAGA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1073
ACCTTCGGGA AGTGTCACCG AAATAGCTCG AGTAATAATT CTTAAATGCC AATGTAAGAT
INFORMATION FOR SEQ ID NO:38: (i) SEQUENCE CHARACTERISTICS: LENGTH: 357 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 196F3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: Gly Leu Xaa Gly Tyr Xaa Phe Gln Asp Thr Lys Phe Gln Gln Leu Ala 1 5 10 Leu Met Ala Hii
S
Val Trp Thr Leu Gin Glu Val Glu 145 Gin Asn Thr Ser I Ser I 225 Arg Ala I Lys C Lys Met Ser Asn Tyr Thr Glu 130 Gin Gly Pro Ser Met 210 His Ile ryr In Asp Gly Ser Gin Xaa Leu 115 Pro Xaa Glu Leu Gly 195 Lys Thr Asp Pro Asn I Let Asp Ser Ile 100 Leu Ile Lys Gly His 180 Tyr Asp lal .le lee Arg 1 Leu Ile Gin Ser I Arg Asp I Pro Asp I
I
Lys G 165 Asp Thr I Gin G Gly A 2 Ala V 245 Val G Ala T 91 Gin Ala Ser Asp
S
Sei Glr Gln 70 Met ile ?he tsp ys .50 1n 'hr le ly sp 30 al ly hr Lys I Pro 55 Val Thr Glu Gin Asn 135 Ile Val Asp Gin Tyr 215 Pro Lys Gly I Asp 40 Pro Phe Glu Tyr Leu 120 Ala Ile Ser Asp Arg 200 rhr Lyr lie ;is .eu Gin Gin Thr Pro Val 105 Asn Phe Pro Arg Asp 185 Gin Lys Thr 2 Glu I
C
265 Gly L Gir Thr Glu Ile 90 Ser Trp Gln Glu Ser 170 Gly Leu Tyr ksp Ua 50 flu Jys SHis Gl Leu 75 Arg Xaa Ser Leu Thr 155 Lys Ile Ala Val Trp 235 Arg Arg Thr SIle AsI Xaa Leu Ser Ile Pro 140 Ser Arg Tyr Val Ser 220 Glu Asn Leu Val Gln Gly Glu Lys Ser 125 Asp Leu Ser Asp Arg 205 Asn Lys Pro Ile Ser 2 285 Ala Val Lys Lys Thr 110 Gly Leu Leu Leu Glu 190 Trp Pro Ala Leu Val 270 kla Val Leu Ile Asp Glu Ala Ser Gln Ala 175 Trp Asn Tyr Ala Val 255 Ser Ser Leu Glu Ile Asn Lys Asn Xaa Arg Ser Ile Lys Lys Thr Arg Asp 160 Val Glu Asp Lys Gly 240 Ala Glu Thr Ser Ala 290 Ser Asn Thr Ala Gly 295 Ile Thr Ala Gly Ile Asp Ala Thr Val 300 Gly Ala Ser Leu Leu Gly Pro Ser Gly Ser Val Thr Ala His Phe Ser 305 310 315 320 Tyr Thr Gly Ser Ser Thr Ser Thr Val Glu Asn Ser Ser Ser Asn Asn 325 330 335 Trp Ser Gln Asp Leu Gly Ile Asp Thr Ser Gln Ser Ala Tyr Leu Asn 340 345 350 Ala Asn Val Arg Tyr 355
INFORMATION FOR SEQ ID NO:39: (i) SEQUENCE CHARACTERISTICS: LENGTH: 1073 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 196J4 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: TGGGTTAATT GGGTATTATT TCCAGGATCA AAAGTTTCAA CAACTTGCTT
TAATGGCACA
TAGACAAGCT TCTAATTTAA ACATACCAAA AAATGAAGTG AAACAGTTAT
TATCCGAAGA
TCAACAACAT ATTCAATCCG
TTAGGTGGAT
TTATATATTG
TCTTAATCAA
TAGAATTGAA
GAATTGG'rcA
AGACTTTTCT
ACAAGAAGAT
TGATACAGAT
TCAAATAGCG
TAACCCCTAT
TCAACTTCAG
TCTTCTATGA
TATGTCCCAG
ATTTCAGGAG
CATAAACAAG
GCAAACAAAG
GATGATGCTA
GTGAAATGGG
AATTCGCATA
CCGATCGACA
CAGCACCCAT
AAGATACAAA
ATAAGGTAGA
ATCAAGAGAA
TCTCTCGAAA
TTTATGATGA
CGGATATATC
TGTCGTAATT
TCAATTAGAA
AGGACAGGAA
ACCAATTCCG
AATCATCCCT
TAAACGATCC
ATGGCAAAC-A
AAATCACCTC
GAACTTGACG
AAAGATAAAC
AACCTCTTTG
GAGAATGCAT
GAAGCAAGTr ATAGC-fACAG GDAGGA'1ACA.
AAACGGGAGA
GAAAAACCAT
TTTATAAAAT
ACTTTCAACT
TCTGTTGCC
TATTCCAGGA
Gl1rCT7t'GTA C-ATAr7'AACG
AGTATGTGTC
120 180 240 300 360 420 480 540 600 660 720 ACGATTCTAT GAAGGAGCGA GGTTATACCA CAGTAGGAGA TCCCTACACA GATTGGGAAjA AAGCGGCTGG ACGCATTGAT
CAGGCAATCA
AGTTGGTGTA CATATGGAAA
AAGTAGAAGC
AACTGATTGT
TAGGAATCCA
TTCTGAGAAA
TTAGTGCAG
CAAAATATAT
CCTATCCAAC
CAACTGGGGT
93 TGGAAAAACA GTATCTGCGG CTATGTCCAC TGGTAATACC GCAGCGATTA CGGCAGGAAT TGATGCGACC GCCGGGGCAT CTTTACTTGG ACCTTCTGGA AGTGTGACGG CTCATTTTTC TTATACAGGG TCTAGTACAT CTACAATTGA AAATAGTTCA AGCAATAATT GGAGTAAAGA TCTGGGAATC GATACGGGGC AATCTGCTTA TTTAAATGCC AATGTACGAT ATA 900 960 1020 1073
INFORMATION FOR SEQ ID NO:40: (i) SEQUENCE CHARACTERISTICS: LENGTH: 357 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 196J4 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: Gly Leu Ile Gly Tyr Tyr Phe Gln Asp Gln Lys Phe Gln Gln Leu Ala
0 0 0 6S* *000 0 Leu Met Ala His Arg 20 Val Lys Gin Leu Leu 35 Trp Ile Gly Tyr Ile 50 Thr Ser Ala Asp Arg Leu Asn Gin Ser Ser Gin Ala Ser Asn Leu 25 Ser Glu Asp Gin Gln 40 Lys Ser Pro Gin Thr 55 Asn Ile Pro His Ile Gln Gly Asp Tyr Lys Asn Glu Ser Val Arg Ile Leu Ser His Val Val Ile Glu Leu 70 Met Thr Ala Pro Ile Gin Asp Gly Lys Thr Leu Tyr Lys Gl Asn 1-eu Val Giu Pro 130 Ile Arg 100 Leu Giu Lys Asp Lys Ile Glu Tyr Pro Glu Asp Thr Lys Gly Gin 110 Gly Asp Lys Phe Asp Phe Gin Leu 120 Ile Pro Giu Asn Ala 135 Trp Ser Ile Phe Leu Leu Pro Asp Phe Ser His 140 Lys Gin Asp Gin Giu Lys Ile Ile Pro Giu Ala Ser Leu Phe Gin Glu 145 ISO 155 160 Gin Glu Gly Ser Thr G1 Ser Met 210 Ser His 225 Arg Ile Ala Tyr Lys Gin Asp Leu Gly 195 Lys Thr Asp Pro Asn 275 Ala Tyr 180 Tyr Glu Val Gin Thr 260 Ile Asn 165 Asp Thr Arg Gly Ala 245 Val Ser Lys Thr Ile Gly Asp 230 Ile Gly Thr 94 Val Ser Arg Asp Asp Asp 185 Gin Arg Gin 200 Tyr Thr Lys 215 Pro Tyr Thr Lys Val Glu Val His Met 265 Gly Val Gly 280 Asn 170 Ala Ile Tyr Asp Ala 250 Glu Lys Lys Ile Ala Val Trp 235 Arg Lys Thr Arg Ser Tyr Asp Val Lys 205 Ser Asn 220 Glu Lys Asn Pro Leu Ile Val Ser 285 Ile Asp 300 Thr Ala Ser Ser Ile Ala Thr 175 Glu Trp Glu 190 r Trp Pro Ala Leu Val 270 Ala Ala His Ser Asp Tyr Ala Val 255 Ser Ala Thr Phe Asn Asp Asn Gly 240 Ala Glu Met Ala Ser 320 Asn Ser Thr Gly Asn Thr Ala Ala Ile Thr Ala Gly 290 295
C
Gly Ala Ser Leu Leu 305 Tyr Thr Gly Ser Ser 325 330 335 Trp Ser Lys Asp Leu Gly Ile Asp Thr Gly Gln Ser Ala Tyr Leu Asn 340 345 350 Ala Asn Val Arg Tyr 355
INFORMATION FOR SEQ ID NO:41: (i) SEQUENCE CHARACTERISTICS: LENGTH: 1046 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 197T1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: TTAAAGGAAA AGATTTTAAT TGGATTAATT GGGTATTATT AATCT'rACTA TATTTGCTCC AACACGTGAG AATACTCTTA TTTATGATTT AGAAACAGCG ACAACAAACC TATCAATCTA TTCGTTGGAT
TTTTACCTTT
TTCGCAAAAA
CAAAATrGAA GAAATTA11rT
AAATCCTGAA
CTGTT TAGC
AGATGGAGAT
AGCTGTTAAA
TTTTAGACAG
GGAITATCT
TGTTAGCTTG
TACTTCATCG
TGGACCAGAA
GGCCAAAGAG
ATATCTAAAT
CAATTIATC5G
GGCCAAAAGA
TATCAATCTG
AAAATAAATA
TTTGGTAAAG
AATAAAAGTA
GCCATTCCTG
TGGGACGAAG
CACACTGCTG
AATGCAAAAG
GAAAATGTCA
AATAATTGGT
GGTI'TGTTGT
TGGGGTACAA
GCCAATGTAC
ATGATGAGCA
AACAAGTTGT
ATAAAGCGTT
GTCAAA.AACA
AAAAAACTCA,
AACGAGATAT
ATGTATGGGA
GATTAGCTGA
GTGACCCCTA
AAACATTTAA
CCATATCAAA
CCTATACAAA
CTTTTGGAGT
CTAAGGGAGA
GATATA
CGGTTTAATA
TGCTATTATA
TCATTTAGAA
AAACCCAGAC
ATCTCAGCAA
AACATATTTA
AGATGAAGAT
AGAAAATGGG
TAAGGGATAT
TAGTGACTAT
TCCATTGGTG
AGATGAAAAT
TACAGAGGGG
AAGTGCCAAT
CGCAACACAA
AATTCTTTAT
AAAAGCAAAA
GAAATCGATG
AAAGATAAAT
AGTCAAATGT-
GTGCAACAAG
AAGAAAGCAT
ATAGATGAGG
TATACCATCA
AAAAAGTTTG
GAAAAGGCAT
GCTGCTTTTC
PLAAACTGCTG
GCATCTATTG
TATCAACATT
TATAATACAG*
TAGATAAGCA
AAGCTGGAGA
GGAAAGTTAT
TAGTTCCCAT
TAAAGAATT
ACGAATTGAG
CGAAAAGCAG
ATACAGATAC
AAGGAAGAGT
'I'TCCAATCC
CAAAAGATTT
CAAGTGTCAA
AAATTGCGTC
AAGCTGGAAT
CTGAAACAGT
CTICAGCAGG
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1046 INFORMATION FOR SEQ ID NO:42: Wi SEQUENCE CHARACTERISTICS: LENGTH: 348 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 197T1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: Gly Leu Ile Gly Tyr Tyr Phe Lys Gly Lys Asp Phe Asn Asn Leu Thr 1 5 10 is 96 Asn Thr Leu 25 Ile Phe Ala Pro Thr Arg Glu Ile Tyr Asp Leu Giu Thr Ala Trp Leu Ser Leu Asp Lys Gly 145 Leu Asp Gly Ala Thr 225 Asp Pro E Asn I Asr Ile Ser Gin Val Ser Gin 130 Lys Phe Thr Tyr Asp 210 kla .eu er ~ys Ser Gly Asp Lys Pro Gin 115 Ser Glu Ser Asp Thr 195 Lys Gly Ser Vai l Thr Lei Leu Asp Gly Ile 100 Met Gin Lys Asn Thr 180 Ile 3ly ksp ksn k, n Ua 1 Leu I ile Glu Gin Lys Phe Gin Thr Lys 165 Asp Lys Tyr.
Pro Ala 245 Val Glu AsI Lys His 70 Lys Ile Lys Val Gin 150 Ser Gly Gly Lys Tyr 230 Lys Ser Ile Lys Ser 55 Ala Lys Glu Glu Gin 135 Thr Lys Asp Arg Lys 215 Ser Glu Leu Ala Gin Gin 40 Lys Lys Ile le Gin Val Tyr Gin 105 Leu Lys 120 Gin Asp Tyr Leu Arg Asp Ala Ile 185 Val Ala 200 Phe Val Asp Tyr Thr Phe Glu Asn 265 Ser Thr 280 Gl Al.
Gli.
Val 90 Ser Leu Glu Lys Ile 170 Pro Val Ser Glu Asn 250 Val Ser Th: L Gi' 1 Il 75 His Asr Phe Leu Lys 155 Asp Asp Lys Asn Lys 235 Pro Thr Ser c r Tyr Asp Aso Leu Lys Lys Arg 140 Ala Glu Val Trp Pro 220 Ala Leu Ile I Asn Ile G 300 Gir Phe Gly Glu Ala Ile 125 Asn Ser Asp Trp Asp 205 Phe Ser lal ;er sn Ser Thr Lys Lys Leu 110 Asn Pro Lys Ile Glu 190 Glu Arg Lys Ala Lys 2 270 Trp I Ile Phe Val Asp Asn Ser Glu Ser Asp 175 Glu Gly Gln Asp kla 255 ksp ;er Arg Gln Ile Lys Pro Gln Phe Ser 160 Glu Asn Leu His Leu 240 Phe Glu Tyr 275 Thr Asn Thr Glu Gly Ala Ser Ile Glu Ala Gly 290 295 ;ly Pro Glu Gly Leu Leu Ser Phe Gly Val Ser Ala Asn Tyr Gln His Ser Glu Thr Val 305 310 315 320 Ala Lys Glu Trp Gly Thr Thr Lys Gly Asp Ala Thr Gln Tyr Asn Thr 325 330 335 Ala Ser Ala Gly Tyr Leu Asn Ala Asn Val Arg Tyr 340 345
INFORMATION FOR SEQ ID NO:43: (i) SEQUENCE CHARACTERISTICS: LENGTH: 1002 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 197U2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: TGGGTTAATT GGGTATTATT TTACGGATGA GCAGCATAAG GAAGTAGCTT
AGGTGAAAAA
TTCAGCGCAA
CTCTI'CTGAT
TATGGAGAAA
CAACCCGAAT
GCTCATCCCA
TGARAAAAAA
TAAGGACCTT
TGTTCCTTGG
TCAATCGCGT
TGCCGAAACA
TACGATGGAA
AACTTCAAAA
ATTACATTG
AMTACATTAG
TGGATTGGWA
AAAGATACTA
CCCATATATT
AGTGAGAAAA
GAAAAATACA
GACGCATCGA
ATCCCAGATG
GATGAATCTC
ACAGCGCAGG
CAACTGGAAjA CAGTTTA'rrT
AGTATGACAG
AATCCATTCG
CAGATTCAGC
ATATACAGGT
TTTTAAAACT
TAGAAAAAGA
CTTTACGATT-
TTCTGTCTCC
GACATTTATT
AATTTGAAAA
TTCAAGAACA
ATCCATATAC
CGCGTGACCC
TCTCTAAAAA
AAAG'rrCTGA.
GAAAATGAAG
ACCTCAAACA
AAAAACGACA
GGGGAATATA
CAATGGGGAA ACGATTATTC
TAAAGTATAC
ATCTTGGAAA
CGATTTTTCT
ATTTACTAAG
AAATGGGTAT
GGGCTTTAAA
AGATTTTGAA
TTA-GT i'jC
TGATAATGTG
AACGACTTAC
GAAATTCAAA
ATGGGGGGCA
AAAATAGCAG
GATGAATTGA
ACATTCAATG
AAATATATrT
AAAGTAACCG
GCTTATCCGG
CAGGAATCTA
TCTGTT-GAGA
TTAYTCAATT
AAAAGATTCT
CGTT TTCCAC
AAAAATCTAA
TCGAGCATAA
CCAATTCAGA
ATCAAGAAAA
AAGAT1'CTGA
GGATTCAAAT
CCAATCCATA
GATATATGCC
CTGTAGGGGT
ATGGTGGAGG
TAGGAGGGAA
PLCAGTTGGAA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 CACTGGCGGA AATTTCTCCT AAATATTCTC AAATGGAGCA TCTACAACAG AGGGAGAAAG TA CTT CCTGG AGCTCACAAA TTGGTATTAA 98 CACGGCTGAA CGCGCGTTTT TTAAATGCCA ATATTCGATA TA INFORMATION FOR SEQ ID NO:44: SEQUENCE CHARACTERISTICS: LENGTH: 333 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 197U2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: 1002 Gly Leu lie Gly Tyr Tyr Phe Thr Asp Glu Gin His Lys Glu Val Ala 1 5 10 4 4 Phe Lys Gin Asp 65 Met Ile Lys Ser Ala Xaa Lys Val 50 Thr Glu Glu Met Pro 130 Ser Gin Asn Pro Ile Lys His Gly 115 Asp Arg Leu Asp Gin Leu Pro Asn 100 Gly Phe His Gly Lys Thr Lys Ile Asn Thr Ser Leu Glu Lys Xaa Lys Ile Leu 40 Gly Glu Tyr 55 Leu Asn Gly 70 Tyr Leu Glu Pro Asn Ser Asn Ser Glu 120 Lys Ile Ala 135 Leu Phe Thr 150 Thr 25 Ser Thr Glu Lys Glu 105 Leu Asp Lys Leu Ala Phe Thr Asp 90 Lys Ile Gin Ala Gin Ser Ile 75 Lys Thr Pro Glu Asp Trp Thr Ile Val Leu Glu Asn 140 Ser Ile Ser Gin Tyr Arg Lys 125 Xaa Ala Xaa Ser Lys Glu Leu 110 Tyr Lys Lys Asn Asp Ser Ile Ser Ile Lys Met Ile Lys Asn Gin Trp Leu Asp Asp Glu Leu Lys Asp Ser Asp 155 1n Lys Asp Leu Ile Pro Asp Glu Phe Glu Lys Asn Gly Tyr Thr Phe Asn 165 170 Ser Leu Gin Glu Gly Ile Gin Ile 180 Val Pro Trp Asp 175 Gin Gly Phe 190 99 Lys Lys Tyr Ile Ser Asn Pro Tyr Gin Ser Arg Thr Ala Gin Asp Pro 195 200 205 Tyr Thr Asp Phe Giu Lys Val Thr Gly Tyr Met Pro Ala Giu Thr Gin 210 215 220 Leu Giu Thr Arg Asp Pro Leu Val Ala Ala Tyr Pro Ala Val Gly Val 225 230 235 240 Thr Met Glu Gin Phe Ile Phe Ser Lys Asn Asp Asn Val Gin C-lu Ser 245 250 255 Asn Gly Gly Gly Thr Ser Lys Ser Met Thr Giu Ser Ser Giu Thr Thr 260 265 270 Tvr Ser Val Giu Ile Gly Gly Lys Phe Thr Leu Asn Pro Phe Ala Leu 275 280 285 **Ala Giu Ile Ser Pro Lys Tyr Ser His Ser Trp, Lys Asn Gly Ala Ser 290 295 300 Thr Thr Giu Gly Glu Ser Thr Ser Trp, Ser Ser Gin Ile Giy Ile Asn 305 310 315 320 Thr Ala Giu Arg Ala Phe Phe Lys Cys Gin Tyr 
Ser Ile 325 330
INFORMATION FOR SEQ ID NO:45: (i) SEQUENCE CHARACTERISTICS: LENGTH: 1073 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 202E1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:45: TGGGTTAATT GGGTACTATT TTCAGGATCA AAAGTTTCAA CAACTCGCTT TGATGGCACA 60 1AGACAAGCT TCAGATTTAG AAATACCTAA AAATGAAGTG AAGGATATAT TATCTAAAGA 120 TCAACAAAT ATTCAATCAG TGAGATGGAG GGGGTATATT AAGCCACCTC AAACAGGAGA 180 CTATATATTG TCAACCTCAT CCGACCAACA GGTCGTGATT GAACTCGATG GAAAAAACAT 240 TGTCAATCAA ACTTCTATGA CAGAACCAAT TCAACTCGAA AAAGATAAAC TCTATAAAAT 300 TAGAATTGAA TATGTCCCAG GAGATACAAA AGGACAAGAG AGCCTCTTG ACTTTCAACT 360
TAACTGGTCA
AGACTTTTCT
ACAAGGAGAT
TGATACAGAT
ACAACTAGCG
TAACCCTTAC
CCGTATCGAT
TGTT-GGTGTA
TGGAAAAACC
TAATGCAACA
TTATACAGGA
TCTTGGAATC
ATITCAGGAG
CATCAAC.AAG
GAGAAAAAAG
GATGATGGTA
GTGAAATGGG
AAGGCTCATA
AACGCTGTCA
CATATGGAAA
GTATCTGTGT
GCCGGTGCCT
TCTAGTACAT
GATACGGGAC
ATACGGTGGA
ATCAAGAGAA
TATCTCGTAG
TTTATGATGA
ACGATTCTAT
CAGTAGGAGA
AAGCAGAAGC
GACTAATTG-T
CTATGTCCGC
CTTACTTGG
CCACTGTTGA
AATCTGCGTA
ACCAATTCCG
ACTCATCCCT
TAAGAGGTCT
ATGGGAAACG
GAAGGAGCGA
TCCCTACACA
TAGGAATCCT
CTCCGAAAAA
AAGCAATACC
GCCATCTGGA
AAATAGCTCA
TTTAAATGCC
GAGAATGCAT
GAAATCAGTC
TTAGCTACAA
GAAGGATACA
GGTTATACTA
GATTGGGAAA
TTAGTCGCOG
CAAAATATAT
GCAGCGAT"TA
AACGTCACGG
AGTAATAATT
AATGTAAGAT
TTCTGTTACC
TATTTCAGGA
ACCCTCTCCT
CAATACAGGG
AGTATGTGTC
AAGCGGCTGG
CCTATCCAAC
CAACAGGACT
CGGCAGGAAT
CTCATTTTTC
GGAGTCAAGA
ATA
420 480 540 600 660 '720 780 840 900 960 1020 1073 INFORMATION FOR SEQ ID NO:46: SEQUENCE CHARACTERISTICS: LENGTH: 357 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 202E1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: Gly Leu Ile Giy Tyr Tyr Phe Gin Asp Gin Lys Phe 1 5 10 Leu Met Ala Fi.s Arg Gin Ala Ser Asp Leu Glu Ile 25 Val Lyc A~sp Ile Leu Ser.Lyz Asp Gin Gin His Ile 40 Trp Arg Giy Tyr Ile Lys Pro Pro Gin Thr Gly Asp 55 Thr Ser Ser Asp Gin Gin Vai Val Ile Glu Leu Asp 70 75 Gin Gin Leu Ala is Pro Gin Tyr Lys Asn Glu 5cr Vai Arg Ile Leu Ser Giy Lys Asn Ile 101 Val Asn Gin Thr Ser Met Thr Giu Pro Ile Gin Leu Glu Lys Asp Lys 90 Leu Tyr Lys Ile Arg Ile Giu Tyr Vai Pro Gly Asp Thr Lys Gly Gin 100 i05 110 Giu Ser Leu Leu AsD Phe Gin Leu Asn Trp Ser Ile Ser Giy Asp Thr 115 :20 125 Vai Giu Pro Ile Pro Giu Asn A-1a Phe Leu Leu Pro Asp Phe Ser His 130 135 140 Gin Gin Asp Gin Giu Lys Leu Ile Pro Giu le Ser Leu Phe Gin Giu 145 150 155 160 Gin Giy Asp Giu Lys Lys Val Ser Arg Ser Lys Ara Ser Leu Ala Thr 165 170 175 .Asn Pro Leu Leu Asp Thr Asp Asp Asp Gly Ile Tyr Asp Giu Trp Giu 180 185 190 Thr Giu Gly Tyr Thr Ile Gin Gly Gin Leu Ala Val Lys Trp Asp Asp 195 200 205 Ser Met Lys Giu Arg Gly Tyr Thr Lys Tyr Val Ser Asn Pro Tyr Lys 210 215 220 *Ala His Thr Val Gly Asp Pro Tyr Thr Asp Trp Glu Lys Ala Ala Gly 225 230 235 240 Arg Ile Asp Asn Ala Val Lys Ala Giu Ala Arg Asn Pro Leu Val Al a *245 250 255 Ala Tyr Pro Thr Vai Gly Vai His Met Giu Arg Leu Ile Val Ser Giu 260 265 270 Lys Gin Asn Ile Ser Thr Gly Leu Gly Lys Thr Val Ser Val Ser Met 275 280 285 Ser Ala Ser Asn Thr Ala Ala Ile Thr Ala Gly Ile Asn Ala Thr Ala 290 295 300 Gly Ala Ser Leu Leu Gly Pro Ser Gly Asn Val Thr Ala His Phe Ser 305 310 :4 5 320 Tyr Thr Giy Ser Ser Thr Ser Thr Val Glu ALn -1er Ser Se:.A Aim 325 3C335 Trp Ser Gin Asp Leu Gly Ile Asp Thr Gly Gin Ser Ala Tyr Leu Asn 340 345 350 Ala Asn Val Arg Tyr 355 102 INFORMATION FOR SEQ 
ID NO:47: (i) SEQUENCE CHARACTERISTICS: LENGTH: 967 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: KE33 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: TGGATTACTT GGGTACTATT TTGAAGAACC AAACTTAAT
AAAAAACAAC
AAATAAAGGC
ATATGTTTT
TGTAATGGGT
CCGCTI'TGAA
TACAAAAGAA
TTACCCAAAA
TGGAATACCA
ATGGAATCCT
TACAGTAGGT
CARGAAAYCC
CRTCTCTKTA
AGRACAGAAG
ACAGGATTGG
AGTAATTTAT
ATTCAATCTG
TTTAGTCCTT
AGAAAAATTA
AAAACAAATA
ATCATTTCTC
ACAAATTTAT
GATGACTGGG
GCTTATGA-AG
GATCCATATA
TTKTAGCAGA
WAARTGKTGA
GCACTTCASG
CTCTAGAAAA
CTAGATGGTT
CCAACCATGA
TGTTAGAAGA
ATCTAGATAT
AAAACTGTTT
'I7GGGGATGT
AAATTAATGG
GGTTATATAC
CAGATTTAGA
AGCTWATCCG
TKTWTTCAAA
*AGAACATAT7 AGGTT 1rTA
AATCATGATT
AGGAAAGGTA
AAACTGCGAA
GCTGGCACCT
ATCTACTACG
TTATACGTTT
TAAATATATT
GAACGTMCAA
AAAAATTGGA
TGCTCAAGAA
GACC'TTCTAT
TCATCGT-TAT
AAACCAAAGC
CAAATCGATA
TATCCAATTC
CTACITGGA
GATTATCATA
ACTAGTGATA
GATGGTACAA
TCTAACCCTA
AGCTAAAKGG
BTTAGCATGG
AATKACTACT
GAGCAG.AAGG
CAACAAACAC
TAATCACACA
CTAGTAT TAG
A.AACGGATGA
ACAAAATTAT
GAATTGAATG
CGCATTCTGA
ATACAGAATT
CTGATAATGA
ATATAATTCA
AACAAGC.AAG
ATCAAAGAAS
AAGAATTACT
TACTTCTAGT
AA.AAAAACCT
AACGGAACAA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 TAGYGCAGGC ATTGAGGGAG =TCAGCCTC CTTI'CGCAT TCATCTTCAA AT2AAT GGAA
AGATATA
CAkATGATTCA TCT TGATACA GGAGAATCAG CGTATTTAAA TGCCAATGTA
INFORMATION FOR SEQ ID NO:48: (i) SEQUENCE CHARACTERISTICS: LENGTH: 972 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: KB38 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: TGGATTACTT GGGTATTATT TTGAAGAACC AAACI7AAT AAC=TCTAT TAATCACACA
AAAAAACAAC
AAATAAAGGC
ATATGTTTTT
TGTAATGGGT
CCGCTT TGAA
TACAAAAGAA
TTATCCAAAA
ATGGAATACC
AGTGGAATTC
GTACAGTAGG
CCTCTCTAGA
AAGAATTACT
CTAGTAGAAC
AACCTACAGG
AGTAATTTAT
ATTCAATCTG
TITTAGTCCTT
AGAAAAATTA
AAAACAAATA
ATCATTTCTC
ACAAATTTAT
AGATGACTGG
TGCTTATGAA
TGATCCATAT
AGCAAGGAAT
CATCTCT=A
AGAAGGCACT
ATTGGTTTCA
CTCTAGAAAA
CTAGATGGTT
CCAACCATGA
TGTTAGAAAA
ATATAGATAT
AAAACTTTTT
TTGGAGATGT
GAAATTAATG
GGGTTATATA
ACAGATTTAG
CCTTTAGTAG
AATGTTGATT
TCACGTAGCG
GCCTCCTTI
AGAACATATT
AGGTTTTTTA
AATTATGATT
AGGAAAGGTA
AAACTGCGAA
GCTGGCACCT
ATCTACTACG
GTTATACCTT
CTAAATATGT
AGAAAGTAAC
CAGCTTATCC
TTI'CAAATGC
CAGGCATTGA
CGCATTCATC
TCATCGTTAT
AAACCAGAGC
CAAATCGATA
TATCCAATTC
CTACTTTGGA
GATTATAACA
ACTWAGTGAT
TGATGGTACA
TTCTAATCCT
AGCTCAAATG
AAAAATTGGA
TCAAGAAAAT
GGGAGGAGCA
TTCAACAACA
CTAGTATTAG.
AAACGGATGA
ACAAAA!TTAT
GAATTGAATG
CGCACTCTGA
ATACAGAATr
ACTGATAATG
AATATAATrC
AAACAAGCAA
GATCGAGCAA
GTTAGCATGG
ACTACTTCTT
GAAGGAAAAA
AACACAACGG
120 180 240 300 360 420 480 540 600 660 720 780 840 900
C
AACAAATGAA
ATGTAAGATA
TGGAACAATG
TA
AT1'CATCTTG ATACAGGAGA ATCAGCGTAT 'rrAAATGCCA
INFORMATION FOR SEQ ID NO:49: (i) SEQUENCE CHARACTERISTICS: LENGTH: 21 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: CTTGAYTTTA AARATGATRT A
INFORMATION FOR SEQ ID NO:50: (i) SEQUENCE CHARACTERISTICS: LENGTH: 21 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50: AATRGCSWAT AAATAMGCAC C
INFORMATION FOR SEQ ID NO:51: (i) SEQUENCE CHARACTERISTICS: LENGTH: 1341 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 177C8 vip2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: ATGTTTATGG TrCTAAAAA ATTACAAGTA GTTACTAAAA CTGTATTGCT TAGTACAGTT
TTCTCTATAT
CAAAGTAAAT
GAAGATAAGG
GCTACTGAAA
TATAAAGAAA
GAAATTGATA
GTGGAACCGA
GATGCAATGG
CTAGATACGC
CTTTATTAAA
ATACTAACTT
AAAAAGCGAA
AAGGAAAAAT
TTACrT*ITC
AGATGTTTGA
CAACAATTGG
CACAGTTAA
ATTTAACTGC
TAATGAAGTG
GCAAAATCTA
AGAATGGGGG
GAATAATTTT
TATGGCAGGC
TAAAACCAAT
ATTTAATAAA
AGAACAATTT
TCAACAAG'rr
ATAAAAGCTG
AAAATCACTG
AAAGAAAAAG
TTAGATAATA
TCATTGAAG
CTATCAAATT
TCTTAACAG
TTAGATAGGG
TCCAGTAAAG
AACAATTAAA
ACAAGGTAGA
AAAAAGAGTG
AAAATGATAT
ATGAAAJ'AAA.
CTATTATCAC
AAGGTAATAC
ATATTAAGTT
AAAGAGTTAT
TATAAATTCT
GGATTTTAAA
GAAACTAACT
AAAGACAAAT
AGArrA A'
CTATAAAAAT
GATTAATTCT
TGATAGTTAT
=~GAAGGTTr ACGGTTCCGA GTGGGAAAGG TTCTACTACT AGTGAATACA AAATGCTCAT TGATAATGGG GTGGTGAAAA AAGGGGTGGA GTGCTTACAA TTTAAAAATG ATATAA.ATGC TGAAGCGCAT GCTAAAGATT TAACCGATTC GCAAAGGGAA AAAGAAATCA ATAATTATTT AAGAAATCAA CAAATAAAAA ATATTTCTGA TGCTTTAGGG TATAGATGGT GTGGCATGCC GGAATTTGGT AAAGAT7=TG AAGAACAATT TTTAAATACA *AGCTTATCGA GTGAACGTCT TGCAGCTTTT GTTCCGAAAG GAAGTACGGG TGCGTATTTA GAGATCCTAC TI'GATAAAGA TAGTAAATAT AAGGTGT1'AA GCGATATGTA G 105
CCA-ACAAAAG,
TATATGGTCC
ATTGAAGGGA
AGCTGGGGTA
GCTTTAGATG
GGCGGAAGTG
AAGAAACCAA
TATCAAATTA
ATCAAAGAAG
GGATCTAGAA
AGTGCCATTG
CATATTGATA
CAGGTGTCAT
ATGTAGATAA
CTTTAAAAAA
TGAAGAATTA
GGTATGCTAG
GAAATGAAAA
TACCGGAAAA
GTGATCCGTT
ACAAAGGATA
AAATTATATT
GTGGATTTGC
AAGTAACAGA
TTTAAATAAT
GGTATCAAAA
GAGTCTTGAC
TGAAGAGTGG
GCAAGATTAT
ACTAGATGCT
TATACTGTG
ACCTTCT= A
TATGAGTACA
ACGATTACAA
AAGTGAAAAA
GGTAATTATT-
660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1341 9 INFORMATION FOR SEQ ID NO:52: Wi SEQUENCE CHARACTERISTICS: LENGTH: 446 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vi) ORIGINAL SOURCE: INDIVIDUAL ISOLATE: 177C8 vip2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52: Met Phe Met Val 5cr Lys Lys Leu Gin Val Val Thr Lys 1 5 10 Leu Ser Thr A'Eal Phe Ser Ile 5cr Leu Leu Asn Asn Giu Thr Val Leu Val Ile Lys Ala Glu Gin Leu Asn Ile Asn Ser Gin Ser Lys 40 Asn Leu Lys Ile Thr Asp Lys Val Glu Asp Phe s0 55 Tyr Thr Asn Leu Gin Lys Giu Asp Lys Glu Lys Ala Lys Glu Trp, Gly Lys Glu Lys Glu Lys Glu Trp Lys Leu Thr 70 75 Ala Thr Glu Lys Gly Lys Met Asn Asn Phe Leu Asp Asn Lys Asn Asp 90 Ile Lys Thr Asn Tyr Lys Giu Ile Thr Phe Ser Met Ala Gly Ser Phe 100 105 110 Glu Asp Glu Ile Lys Asp Leu Lys Giu Ile Asp Lys Met Phe Asp Lys 115 120 125 Thr Asn Leu Ser Asn Ser Ile Ile Thr Tyr Lys Asn Val Giu Pro Thr 130 135 140 Thr Ile Gly Phe Asn Lys Ser Leu Thr Giu Gly Asn Thr Ile Asn Ser 145 150 155 160 *Asp Ala Met Ala Gin Phe Lys Giu Gin Phe Leu Asp Arg Asp Ile Lys 165 170 175 Phe Asp Ser Tyr Leu Asp Thr His Leu Thr Ala Gin Gin Val Ser Ser 180 185 190 Lys Giu Arg Val Ile Leu Lys Val Thr Vai Pro Ser Gly Lys Gly Ser 195 200 205 Thr Thr Pro Thr Lys Ala Gly Val Ile Leu Asn Asn Ser Giu Tyr Lys 210 215 220 Met Leu Ile Asp Asn Gly Tyr Met Val His Val Asp Lys Val Ser Lys *225 230 235 240 *Val Val Lys Lys Gly Val Giu Cys Leu Gin Ile Glu Gly Thr Leu Lys 245 250 255 Lys Ser Leu Asp Phe Lys Asn Asp Ile Asn Ala Giu Ala His Ser Ti-p 260 265 270 Gly Met Lys Asn Tyr Giu Giu Ti-p Ala Lys Asp Leu Thr Asp Ser Gin 275 280 285 Arg Giu Ala Leu Asp Gly Tyr Ala Arg Gin Asp Tyr Lys Giu Ile Asn 290 295 300 Asn Tyr Leu Arg Asn C-in Gly Gly 13e- 73v Asn Giu Lys Leu Asp Ala 305 3 10 315 320 Gin Ile Lys Asn Ile Ser Asp Ala Leu Gly Lys Lys Pro Ile Pro Giu 325 330 335- Asn Ile Thr Val Tyr Arg Ti-p Cys Gly Met Pro Giu Phe Giy Tyr Gin 340 345 350 107 Ile Ser Asp Pro Leu Pro 
Ser Leu Lys Asp Phe Glu Glu Gln Phe Leu 355 360 365 Asn Thr Ile Lys Glu Asp Lys Gly Tyr Met Ser Thr Ser Leu Ser Ser 370 375 380 Glu Arg Leu Ala Ala Phe Gly Ser Arg Lys Ile Ile Leu Arg Leu Gln 385 390 395 400 Val Pro Lys Gly Ser Thr Gly Ala Tyr Leu Ser Ala Ile Gly Gly Phe 405 410 415 Ala Ser Glu Lys Glu Ile Leu Leu Asp Lys Asp Ser Lys Tyr His Ile 420 425 430 Asp Lys Val Thr Glu Val Ile Ile Lys Val Leu Ser Asp Met 435 440 445
INFORMATION FOR SEQ ID NO:53: (i) SEQUENCE CHARACTERISTICS: LENGTH: 17 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:53: GGATTCGTTA TCAGAAA 17
INFORMATION FOR SEQ ID NO:54: (i) SEQUENCE CHARACTERISTICS: LENGTH: 17 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54: CTGTYGCTAA CAATGTC 17
INFORMATION FOR SEQ ID NO:55: (i) SEQUENCE CHARACTERISTICS: LENGTH: 8 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: Ala Asp Glu Pro Phe Asn Ala Asp 1
INFORMATION FOR SEQ ID NO:56: (i) SEQUENCE CHARACTERISTICS: LENGTH: 21 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: GCTGATGAAC CATTTAATGC C 21
INFORMATION FOR SEQ ID NO:57: (i) SEQUENCE CHARACTERISTICS: LENGTH: 8 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:57: Leu Phe Lys Val Asp Thr Lys Gln 1
INFORMATION FOR SEQ ID NO:58: (i) SEQUENCE CHARACTERISTICS: LENGTH: 22 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: CTCTTTAAAG TAGATACTAA GC 22
INFORMATION FOR SEQ ID NO:59: (i) SEQUENCE CHARACTERISTICS: LENGTH: 9 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59: Pro Asp Glu Asn Leu Ser Asn Ile Glu 1
INFORMATION FOR SEQ ID NO:60: (i) SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60: GATGAGAACT TATCAAATAG TATC 24
INFORMATION FOR SEQ ID NO:61: (i) SEQUENCE CHARACTERISTICS: LENGTH: 12 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61: Ala Asn Ser Leu Leu Asp Lys Gln Gln Gln Thr Tyr 1 5
INFORMATION FOR SEQ ID NO:62: (i) SEQUENCE CHARACTERISTICS: LENGTH: 33 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62: CGAATTCTTT ATTAGATAAG CAACAACAAA CCT
INFORMATION FOR SEQ ID NO:63: (i) SEQUENCE CHARACTERISTICS: LENGTH: 8 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63: Val Ile Ser Gln Lys Gly Gln Lys 1
INFORMATION FOR SEQ ID NO:64: (i) SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64: GTTATTTCGC AAAAAGGCCA AAAG 24
INFORMATION FOR SEQ ID NO:65: (i) SEQUENCE CHARACTERISTICS: LENGTH: 11 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:65: Glu Tyr Gln Ser Asp Lys Ala Leu Asn Pro Asp 1 5
INFORMATION FOR SEQ ID NO:66: (i) SEQUENCE CHARACTERISTICS: LENGTH: 31 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66: GAATATCAAT CTGATAAAGC GTTAAACCCA G 31
INFORMATION FOR SEQ ID NO:67: (i) SEQUENCE CHARACTERISTICS: LENGTH: 9 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:67: Ser Ser Leu Phe Ser Asn Lys Ser Lys 1
INFORMATION FOR SEQ ID NO:68: (i) SEQUENCE CHARACTERISTICS: LENGTH: 23 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:68: GCAGCYTGTT TAGCAATAAA AGT 23
INFORMATION FOR SEQ ID NO:69: (i) SEQUENCE CHARACTERISTICS: LENGTH: 8 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:69: Ile Lys Gly Arg Val Ala Val Lys 1
INFORMATION FOR SEQ ID NO:70: (i) SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70: CAAAGGAAGA GTAGCTGTTA
INFORMATION FOR SEQ ID NO:71: (i) SEQUENCE CHARACTERISTICS: LENGTH: 9 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:71: Val Asn Val Ser Leu Glu Asn Val Thr 1
INFORMATION FOR SEQ ID NO:72: (i) SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72: CAATGTTAGC TTGGAAAATG TCACC
INFORMATION FOR SEQ ID NO:73: (i) SEQUENCE CHARACTERISTICS: LENGTH: 8 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:73: Thr Ala Phe Ile Gln Val Gly Glu 1
INFORMATION FOR SEQ ID NO:74: (i) SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:74: AGCATTTATT CAAGTAGGAG
INFORMATION FOR SEQ ID NO:75: (i) SEQUENCE CHARACTERISTICS: LENGTH: 7 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:75: Tyr Leu Leu Ser Thr Ser Ser 1
INFORMATION FOR SEQ ID NO:76: (i) SEQUENCE CHARACTERISTICS: LENGTH: 19 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:76: TCTACTTTCC ACGTCCTCT 19
INFORMATION FOR SEQ ID NO:77: (i) SEQUENCE CHARACTERISTICS: LENGTH: 7 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:77: Gln Ile Gln Pro Gln Gln Arg 1
INFORMATION FOR SEQ ID NO:78: (i) SEQUENCE CHARACTERISTICS: LENGTH: 19 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: CAGATACAAC CGCAACAGC 19
INFORMATION FOR SEQ ID NO:79: (i) SEQUENCE CHARACTERISTICS: LENGTH: 8 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:79: Pro Gln Gln Arg Ser Thr Gln Ser 1
INFORMATION FOR SEQ ID NO:80: (i) SEQUENCE CHARACTERISTICS: LENGTH: 23 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:80: CCGCAACAGC GTTCAACTCA ATC 23
INFORMATION FOR SEQ ID NO:81: (i) SEQUENCE CHARACTERISTICS: LENGTH: 7 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:81: Asp Gly Ala Ile Val Ala Trp 1 5
INFORMATION FOR SEQ ID NO:82: (i) SEQUENCE CHARACTERISTICS: LENGTH: 21 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:82: GACGGTGCGA TTGTTGCCTG G 21
INFORMATION FOR SEQ ID NO:83: (i) SEQUENCE CHARACTERISTICS: LENGTH: 7 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:83: Glu Gly Asp Ser Gly Thr Val
INFORMATION FOR SEQ ID NO:84: (i) SEQUENCE CHARACTERISTICS: LENGTH: 19 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: 
SEQ ID NO:84: GAAGGAGACT CAGGTACTG 19 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 6 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID Thr Val Thr Asn Thr Ser INFORMATION FOR SEQ ID NO:86: SEQUENCE
CHARACTERISTICS:
LENGTH: 19 base pairs TYPE: nucleic acid STRANDEDNESS: single S TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:86: CCGTAACCAA
TACAAGCAC
19 INFORMATION FOR SEQ ID NO:87: SEQUENCE CHARACTERISTICS: LENGTH: 9 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 117 (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:87: Ser Ser Gin Leu Ala Tyr Asn Pro Ser 1 INFORMATION FOR SEQ ID NO:88: SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:88: CTTCACAATT AGCGTATAAT CCTTC INFORMATION FOR SEQ ID NO:89: SEQUENCE CHARACTERISTICS: LENGTH: 7 amino acids TYPEr amino acid STRANDEDNESS: single TOPOLOGY: linear S(ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:89: Glu Gln His Lys Glu Val Ala 1 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 19 base pairs TYPE: nucleic acid iC) STRANDEDNESS: single TOPOLOGf: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID GAGCAGCATA AGGAAGTAG 19 INFORMATION FOR SEQ ID NO:91: 118 SEQUENCE CHARACTERISTICS: LENGTH: 8 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE iPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:91: Phe Asn Gly Ile Gin Ile Val Pro 1 INFORMATION FOR SEQ ID NO:92: SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single S(D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:92: CATTCAATGG GATTCAAATT GTTCC INFORMATION FOR SEQ ID NO:93: SEQUENCE
CHARACTERISTICS:
LENGTH: 8 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:93: Val Gln Glu Ser Asn Gly Gly Gly 1 INFORMATION FOR SEQ ID NO:94: SEQUENCE CHARACTERISTICS: LENGTH: 23 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) 119 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:94: GTGCAGGAAT CTAATGGTGG
AGG
23 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 9 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID Glu Ile Gly Gly Lys Phe Thr Leu Asn INFORMATION FOR SEQ ID NO:96: SEQUENCE CHARACTERISTICS: LENGTH: 22 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:96: SGATAGGAGGG AAATTTACAT TG 22 22 INFORMATION FOR SEQ ID NO:97: SEQUENCE
CHARACTERISTICS:
LENGTH: 19 base pairs a: TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:97: CGAATTGAAT
GCCGCTTTG
19 INFORMATION FOR SEQ ID NO:98: SEQUENCE CHARACTERISTICS: LENGTH: 22 base pairs 120 TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:98: CTCAAAACTK TTTGCTGGCA
CC
22 INFORMATION FOR SEQ ID NO:99: SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:99: GGATCRAGCA
ACCTCTCTAG
INFORMATION FOR SEQ ID NO:100: SEQUENCE CHARACTERISTICS: LENGTH: 18 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:100: ACTACTTACT
TCTAGTAG
18 INFORMATION FOR SEQ ID NO:101: SEQUENCE
CHARACTERISTICS:
LENGTH: 8 amino acids TYPE: amino acid STRANDEDNESS: Siiole TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:101: Ser Asp Gln Gln Val Val Ile Glu 1 121 INFORMATION FOR SEQ ID NO:102: SEQUENCE CHARACTERISTICS: LENGTH: 21 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:102: CCGAYCRACA KGTCRTRATT G 21 INFORMATION FOR SEQ ID NO:103: SEQUENCE CHARACTERISTICS: LENGTH: 7 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:103: Asn Gln Thr Ser Met Thr Glu 1 INFORMATION FOR SEQ ID NO:104: SEQUENCE CHARACTERISTICS: LENGTH: 21 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:104: TCARDCTTCT ATGACAGMAC C 21 INFORMATION FOR SEQ ID NO:105: SEQUENCE CHARACTERISTICS: LENGTH: 8 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 122 (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:105: Gin Asp Gin Glu Lys Ile Ile Pro 1 INFORMATION FOR SEQ ID NO:106: SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:106: CAAGATCAAG ARAARMTYAT YCCT 24 INFORMATION FOR SEQ ID NO:107: SEQUENCE CHARACTERISTICS: LENGTH: 7 amino acids TYPE: amino acid STRANDEDNESS: single S(D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:107: Ser His Lys Gin Asp Gin Glu 1 INFORMATION FOR SEQ ID NO:108: SEQUENCE CHARACTERISTICS: LENGTH: 18 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (il) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:108: CTCRTMAACA
AGATCAAG
18 INFORMATION FOR SEQ ID NO:109: 123 SEQUENCE CHARACTERISTICS: LENGTH: 7 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:109: Ser Gly Ser Val Thr Ala His 1 INFORMATION FOR SEQ ID NO:110: SEQUENCE CHARACTERISTICS: LENGTH: 18 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:110: CTGGAARYGT SACGGCTC 18 S. INFORMATION FOR SEQ ID NO:111: So S(i) SEQUENCE
CHARACTERISTICS:
LENGTH: 22 base pairs TYPE: nucleic acid S* STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:111: GCTTAGTATC TACTTTAAAG AG 22 INFORMATION FOR SEQ ID NO:112: SEQUENCE CHAPRACTERISTICS.
LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:112: 124 GATACTATTT GATAAGTTCT CATC 24 INFORMATION FOR SEQ ID NO:113: SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:113: CTTTTGGCCT TTTTGCGAAA TAAC 24 INFORMATION FOR SEQ ID NO:114: SEQUENCE
CHARACTERISTICS:
LENGTH: 31 base pairs TYPE: nucleic acid STRANDEDNESS: single ea** TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:114: CTGGGTTTAA CGCTTTATCA GATTGATATT C 31 e INFORMATION FOR SEQ ID NO:115: SEQUENCE
CHARACTERISTICS:
LENGTH: 23 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:115: ACTTTTATTG CTAAACARGC TGC -3 INFORMATION FOR SEQ ID NO:116: SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 125 (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:116: TAACAGCTAC TCTTCCTTTG INFORMATION FOR SEQ ID NO:117: SEQUENCE CHARACTERISTICS: LENGTH: 25 base Dairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:117: GGTGACATTT TCCAAGCTAA CATTG INFORMATION FOR SEQ ID NO:118: SEQUENCE CHARACTERISTICS: LENGTH: 19 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:118: AGAGGACGTG GAAAGTAGA 19 S* INFORMATION FOR SEQ ID NO:119: S(i) SEQUENCE CHARACTERISTICS: LENGTH: 19 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) ;MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:119: GCTGTTGCGG TTGTATCTG 19 INFORMATION FOR SEQ ID NO:120: SEQUENCE CHARACTERISTICS: 126 LENGTH: 23 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:120: GATTGAGTTG AACGCTGTTG CGG 23 INFORMATION FOR SEQ ID NO:121: SEQUENCE CHARACTERISTICS: LENGTH: 21 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:121: CCAGGCAACA ATCGCACCGT
C
21 INFORMATION FOR SEQ ID NO:122: SEQUENCE CHARACTERISTICS: LENGTH: 19 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:122: CAGTACCTGA
GTCTCCTTC
19 INFORMATION FOR SEQ ID NO:123: SEQUENCE CHARACTERISTICS.
LENGTH: 19 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:123: GTGCTTGTAT
TGGTTACGG
127 INFORMATION FOR SEQ ID NO:124: SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:124: GAAGGATTAT ACGCTAATTG TGAAG INFORMATION FOR SEQ ID NO:125: SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:125: GGAACAATTT GAATCCCATT GAATG INFORMATION FOR SEQ ID NO:126: SEQUENCE CHARACTERISTICS: LENGTH: 23 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:126: CCTCCACCAT TAGATTCCTG CAC 23 iW INFORMATION FOR SEQ ID NO:127: SEQUENCE CHARACTERISTICS: LENGTH: 22 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) 128 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:127: CAATGTAAAT TTCCCTCCTA TC 22 INFORMATION FOR SEQ ID NO:128: SEQUENCE CHARACTERISTICS: LENGTH: 22 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:128: GGTGCCAGCA AAMAGTTTTG AG 22 INFORMATION FOR SEQ ID NO:129: SEQUENCE
CHARACTERISTICS:
LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:129: CTAGAGAGGT TGCTYGATCC INFORMATION FOR SEQ ID NO:130: SEQUENCE CHARACTERISTICS: LENGTH: 18 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEO ID :0:130: CTACTAGAAG TAAGTAGT INFORMATION FOR SEQ ID NO:131: SEQUENCE CHARACTERISTICS: LENGTH: 21 base pairs TYPE: nucleic acid 129 STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:131: GGTKCTGTCA TAGAAGHYTG A 21 INFORMATION FOR SEQ ID NO:132: SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:132: AGGRATRAKY TTYTCTTGAT
CTTG
24 INFORMATION FOR SEQ ID NO:133: SEQUENCE CHARACTERISTICS: LENGTH: 18 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear S(ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:133: CTTGATCTTG TTKAYGAG 18 INFORMATION FOR SEQ ID NO:134: SEQUENCE CHARACTERISTICS: LENGTH: 18 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:134: GAGCCGTSAC
RYTTCCAG
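Several of the later oligonucleotides in this listing read as reverse complements of earlier entries. For example, SEQ ID NO:113 appears to be the reverse complement of SEQ ID NO:64, and SEQ ID NO:128 that of SEQ ID NO:98 once the IUPAC ambiguity code K is complemented to M. The Python sketch below is not part of the specification. It is an illustrative check only, and the helper name `reverse_complement` is an assumption introduced here.

```python
# Illustrative sketch, not taken from the specification.
# Complement table covering the IUPAC nucleotide ambiguity codes.
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "K": "M", "M": "K",
    "S": "S", "W": "W", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (hypothetical helper).

    Spaces used to group bases in blocks of ten are ignored.
    """
    cleaned = seq.replace(" ", "").upper()
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(cleaned))


# Sequences copied from the listing above.
SEQ_64 = "GTTATTTCGC AAAAAGGCCA AAAG"
SEQ_113 = "CTTTTGGCCT TTTTGCGAAA TAAC"
SEQ_98 = "CTCAAAACTK TTTGCTGGCA CC"
SEQ_128 = "GGTGCCAGCA AAMAGTTTTG AG"

if __name__ == "__main__":
    print(reverse_complement(SEQ_64) == SEQ_113.replace(" ", ""))
    print(reverse_complement(SEQ_98) == SEQ_128.replace(" ", ""))
```

The table maps each ambiguity code to the code for the complementary base set, so degenerate primers such as SEQ ID NO:98 can be handled the same way as unambiguous ones.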

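Many entries in the sequence listing pair a short peptide with the oligonucleotide that follows it. As a hedged illustration, the Python sketch below translates an unambiguous oligonucleotide in its first reading frame using the standard genetic code and compares the result with the listed peptide (SEQ ID NO:64 against SEQ ID NO:63, and SEQ ID NO:82 against SEQ ID NO:81). The function name `translate` and the choice of reading frame are assumptions made for this example. Degenerate oligonucleotides containing IUPAC ambiguity codes are outside its scope.

```python
# Illustrative sketch, not taken from the specification.
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate unambiguous DNA into three-letter residues (hypothetical helper).

    Trailing bases that do not fill a complete codon are ignored.
    A KeyError is raised if the input contains ambiguity codes.
    """
    cleaned = dna.replace(" ", "").upper()[frame:]
    usable = len(cleaned) - len(cleaned) % 3
    codons = (cleaned[i:i + 3] for i in range(0, usable, 3))
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


# Sequences copied from the listing above.
SEQ_63 = "Val Ile Ser Gln Lys Gly Gln Lys"
SEQ_64 = "GTTATTTCGC AAAAAGGCCA AAAG"
SEQ_81 = "Asp Gly Ala Ile Val Ala Trp"
SEQ_82 = "GACGGTGCGA TTGTTGCCTG G"

if __name__ == "__main__":
    print(translate(SEQ_64) == SEQ_63)
    print(translate(SEQ_82) == SEQ_81)
```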
Claims (53)

1. An isolated pesticidal toxin belonging to the SUP-1 family, wherein said toxin is obtainable from a Bacillus thuringiensis isolate selected from the group consisting of PS49C (NRRL B-21532) and PS158C2 (NRRL B-18872).
2. The toxin according to claim 1, which is encoded by a polynucleotide wherein the complement of a nucleotide sequence selected from the group consisting of SEQ ID NOS. 9 to 13, 15 and 53 hybridizes with said polynucleotide when said complement is used as a hybridization probe.
3. The toxin according to claim 1, which is encoded by a polynucleotide wherein the reverse sequence of a nucleotide sequence selected from the group consisting of SEQ ID NOS. 14 and 54 hybridizes with said polynucleotide when said reverse sequence is used as a hybridization probe.
4. The toxin according to claim 1, wherein said isolate is PS158C2 and said isolate comprises a polynucleotide that encodes said toxin wherein the complement of SEQ ID NO:10 hybridizes with said polynucleotide when said complement is used as a hybridization probe.
5. The toxin according to claim 1, wherein said isolate is PS49C and said isolate comprises a polynucleotide that encodes said toxin wherein the complement of a nucleotide sequence selected from the group consisting of SEQ ID NO:12 and SEQ ID NO:15 hybridizes with said polynucleotide when said complement is used as a hybridization probe.
6. An isolated pesticidal toxin belonging to the SUP-1 family, wherein a polynucleotide sharing greater than 90% identity with a nucleotide sequence selected from the group consisting of SEQ ID NOS. 10, 12 and 15 encodes at least a portion of said toxin.
7. An isolated pesticidal toxin belonging to the SUP-1 family, wherein a polynucleotide selected from the group consisting of SEQ ID NOS. 10, 12 and encodes at least a portion of said toxin.
8. The isolated pesticidal toxin according to any one of claims 1, 6 or 7, which is encoded by a polynucleotide wherein a portion of said polynucleotide can be amplified by PCR utilising a primer pair selected from the group consisting of SEQ ID NOS. 53 and 54, and SEQ ID NOS. 53 and 14.
9. An isolated pesticidal toxin belonging to the SUP-1 family, which is encoded by a polynucleotide wherein a portion of said polynucleotide can be amplified by PCR utilising the primer pair SEQ ID NOS. 53 and 54.
10. The toxin according to any one of claims 1 to 9, wherein said toxin can be obtained from the supernatant of a culture of said Bacillus thuringiensis isolate.
11. An isolated pesticidal SUP-1 family toxin from a Bacillus thuringiensis isolate, said toxin being substantially as hereinbefore described with reference to any one of the examples.
12. An isolated polynucleotide which encodes a pesticidal toxin according to any one of claims 1 to 11.
13. The polynucleotide according to claim 12, wherein said polynucleotide is optimised for expression in plants.
14. The polynucleotide according to claim 12, at least a portion of which comprises a nucleotide sequence sharing greater than 90% identity with a nucleotide sequence selected from the group consisting of SEQ ID NOS. 10, 12 and
15. The polynucleotide according to claim 12, at least a portion of which comprises a nucleotide sequence selected from the group consisting of SEQ ID NOS. 12 and
16. An isolated polynucleotide sequence according to claim 12, wherein a portion of said polynucleotide can be amplified by PCR utilising a primer pair selected from the group consisting of SEQ ID NOS. 53 and 54, and SEQ ID NOS. 53 and 14.
17. The polynucleotide of claim 16, which hybridises with a probe selected from the group consisting of SEQ ID NOS. 10, 12, and 15, or complements thereof.
18. An isolated polynucleotide which encodes a pesticidal SUP-1 family toxin from a Bacillus thuringiensis isolate, said polynucleotide being substantially as hereinbefore described with reference to any one of the examples.
19. A PCR primer pair for the detection and/or isolation of a nucleotide sequence encoding a SUP-1 family toxin, wherein said primer pair is selected from the group consisting of SEQ ID NOS. 53 and 54, and SEQ ID NOs 53 and 14.
20. An isolated polynucleotide useful as a PCR primer or a hybridisation probe for the detection and/or isolation of a nucleotide sequence encoding a SUP-1 family toxin, wherein said polynucleotide is selected from the group consisting of SEQ ID NOS. 9 to 53 and 54, or complements thereof.
21. An isolated polynucleotide useful as a PCR primer or a hybridisation probe for the detection and/or isolation of a nucleotide sequence encoding a SUP-1 family toxin, substantially as hereinbefore described with reference to any one of the Examples.
22. A method for detecting and/or isolating a nucleotide sequence encoding a SUP-1 family toxin in or from a sample, said method comprising submitting said sample to a PCR primer pair of claim 19 or a polynucleotide of claim 20 or claim 21 under hybridising conditions, and amplifying any hybridised sequences by PCR.
23. An isolated polynucleotide which encodes a pesticidal SUP-1 family toxin according to any one of claims 1 to 11, detected and/or isolated by a method according to claim 22.
24. A recombinant DNA molecule comprising a polynucleotide according to any one of claims 12 to 18 or 23.
25. A recombinant DNA molecule comprising a polynucleotide which encodes a pesticidal SUP-1 family toxin, substantially as hereinbefore described with reference to any one of the Examples.
26. A plasmid comprising a polynucleotide according to any one of claims 12 to 18 or 23, or a recombinant DNA molecule according to claim 24 or claim
27. A plasmid comprising a polynucleotide which encodes a pesticidal SUP-1 family toxin, substantially as hereinbefore described with reference to any one of the Examples.
28. A vector comprising a polynucleotide according to any one of claims 12 to 18 or 23, a recombinant DNA molecule according to claim 24 or claim 25, or a plasmid according to claim 26 or claim 27.
29. A vector comprising a polynucleotide which encodes a pesticidal SUP-1 family toxin, substantially as hereinbefore described with reference to any one of the Examples.
30. A method for transforming a host cell with a polynucleotide encoding a SUP-1 family toxin, said method comprising contacting said host cell with a polynucleotide according to any one of claims 12 to 18 or 23, a recombinant DNA molecule according to claim 24 or claim 25, a plasmid according to claim 26 or claim 27, or a vector according to claim 28 or claim 29.
31. A method for transforming a host cell with a polynucleotide encoding a SUP-1 family toxin, substantially as hereinbefore described with reference to any one of the examples.
32. A host cell transformed by a method according to claim 30 or claim 31.
33. The transformed host cell according to claim 32, which is a bacterium.
34. The transformed host cell according to claim 32, which is a plant cell.
35. A transformed host cell comprising a heterologous nucleotide sequence encoding a pesticidal SUP-1 family toxin, said host being substantially as hereinbefore described with reference to any one of the examples.
36. A transformed host comprising a heterologous nucleotide sequence encoding a pesticidal toxin according to any one of claims 1 to 11.
37. The transformed host according to claim 36, wherein said host is a bacterium.
38. The transformed host according to claim 36, wherein said host is a plant.
39. A transformed host comprising a heterologous nucleotide sequence encoding a pesticidal toxin belonging to the SUP-1 family, wherein said toxin can be encoded by a polynucleotide wherein a portion of said polynucleotide can be amplified by the primer pair SEQ ID NOS. 53 and 54.
40. The transformed host according to claim 39, wherein said host is a bacterium.
41. The transformed host according to claim 39, wherein said host is a plant.
42. A transformed host comprising a heterologous nucleotide sequence encoding a pesticidal SUP-1 family toxin, said host being substantially as hereinbefore described with reference to any one of the examples.
43. A method of producing a transgenic plant expressing a Bacillus thuringiensis SUP-1 family toxin, said method comprising regeneration of a whole plant from a transformed plant cell according to claim 34.
44. A recombinant plant prepared by a method according to claim 43.
45. A plant comprising a plurality of cells according to claim 34.
46. A transgenic plant expressing a Bacillus thuringiensis SUP-1 family toxin, substantially as hereinbefore described with reference to any one of the examples.
47. Transgenic seed from a plant according to any one of claims 38, 41, or 44 to 46 and which comprises said heterologous nucleotide sequence.
48. A method for preparing a pesticidal composition comprising at least a SUP-1 family toxin, said method comprising culturing a plurality of transgenic cells according to any one of claims 32 to 35, a transformed host according to any one of claims 36 to 42, or a transgenic plant according to any one of claims 44 to 46 under conditions promoting expression of said toxin, and optionally isolating the toxin.
49. The method according to claim 48, wherein said cultured cells are treated so as to release at least said SUP-1 family toxin from said cells.
50. The method according to claim 49, further comprising isolating or purifying at least said SUP-1 family toxin from said treated cell culture.
51. A method for preparing a pesticidal composition comprising at least a SUP-1 family toxin, said method being substantially as hereinbefore described.
52. A pesticidal composition prepared by a method according to any one of claims 48 to 51.
53. A pesticidal composition comprising a toxin according to any one of claims 1 to 11.
54. A pesticidal composition according to claim 52 or claim 53, further comprising agriculturally/pesticidally acceptable additives, including carriers, adjuvants, stabilising agents, rheological agents, emulsifiers, dispersants, polymers and surfactants.
55. A pesticidal composition comprising a SUP-1 toxin, substantially as hereinbefore described.
56. A method for controlling a non-mammalian pest, wherein said method comprises contacting said pest with a toxin according to claim 1.
57. A method for controlling a non-mammalian pest, wherein said method comprises contacting said pest with a toxin according to any one of claims 2 to 11.
58. A method for controlling a non-mammalian pest comprising applying to the situs or environment of said pest a pesticidally effective amount of transgenic cells according to any one of claims 32 to 35, a transformed host according to any one of claims 36 to 42, a transgenic plant according to any one of claims 44 to 46, or a pesticidal composition according to any one of claims 52 to
59. A method for controlling a non-mammalian pest with a SUP-1 family toxin, substantially as hereinbefore described.
60. A PCR primer pair according to claim 19, or a polynucleotide of claim 20 or claim 21, when used for detecting and/or isolating a nucleotide sequence encoding a SUP-1 family toxin in or from a sample.
61. A PCR primer pair according to claim 19, or a polynucleotide of claim 20 or claim 21, when used for detecting and/or isolating a nucleotide sequence encoding a SUP-1 family toxin in or from a sample, substantially as hereinbefore described with reference to any one of the examples.
62. A recombinant DNA molecule comprising a polynucleotide according to any one of claims 12 to 18 or 23, when used for transforming a host cell or organism to express a SUP-1 family toxin.
63. A recombinant DNA molecule comprising a polynucleotide according to any one of claims 12 to 18 or 23, when used for transforming a host cell or organism to express a SUP-1 family toxin, substantially as hereinbefore described with reference to any one of the examples.
64. Transgenic cells according to any one of claims 32 to 35, a transformed host according to any one of claims 36 to 42, or a transgenic plant according to any one of claims 44 to 46, when used for controlling a non-mammalian pest.
65. Transgenic cells according to any one of claims 32 to 35, a transformed host according to any one of claims 36 to 42, or a transgenic plant according to any one of claims 44 to 46, when used for controlling a non-mammalian pest, substantially as hereinbefore described.
Dated 31 May, 2004
Mycogen Corporation
Patent Attorneys for the Applicant/Nominated Person
SPRUSON & FERGUSON
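Claims 6 and 14 recite a polynucleotide "sharing greater than 90% identity" with a reference sequence. The Python sketch below illustrates one simple reading of that threshold: a position-by-position comparison of two sequences that are assumed to be already aligned and of equal length. This is an assumption made for illustration, not the method defined by the specification, and real comparisons would normally rely on a sequence alignment program. Because SEQ ID NOS. 10, 12 and 15 are not reproduced in this section, SEQ ID NO:64 stands in as the reference, and `VARIANT` is a hypothetical sequence with two substitutions.

```python
# Illustrative sketch, not taken from the specification.
def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions in two pre-aligned sequences.

    Hypothetical helper: assumes equal length and no gaps.
    """
    a = a.replace(" ", "").upper()
    b = b.replace(" ", "").upper()
    if not a or not b:
        raise ValueError("sequences must not be empty")
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# SEQ ID NO:64 from the sequence listing, used here only as a stand-in.
REFERENCE = "GTTATTTCGC AAAAAGGCCA AAAG"
# Hypothetical variant: position 3 (T to C) and position 24 (G to A) changed.
VARIANT = "GTCATTTCGC AAAAAGGCCA AAAA"

if __name__ == "__main__":
    identity = percent_identity(REFERENCE, VARIANT)
    print(f"{identity:.1f}% identity; exceeds 90%: {identity > 90.0}")
```

With 22 of 24 positions matching, the hypothetical variant sits just above the 90% line, while a third substitution would bring it to 87.5% and below the threshold.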
AU10114/02A 1996-10-30 2002-01-10 Novel pesticidal toxins and nucleotide sequences which encode these toxins Ceased AU775377B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US60029848 1996-10-30
AU50983/98A AU5098398A (en) 1996-10-30 1997-10-30 Novel pesticidal toxins and nucleotide sequences which encode these toxins
PCT/US1997/019804 WO1998018932A2 (en) 1996-10-30 1997-10-30 Novel pesticidal toxins and nucleotide sequences which encode these toxins

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU50983/98A Division AU5098398A (en) 1996-10-30 1997-10-30 Novel pesticidal toxins and nucleotide sequences which encode these toxins

Publications (2)

Publication Number Publication Date
AU1011402A AU1011402A (en) 2002-03-07
AU775377B2 AU775377B2 (en) 2004-07-29

Family

ID=32739139

Family Applications (1)

Application Number Title Priority Date Filing Date
AU10114/02A Ceased AU775377B2 (en) 1996-10-30 2002-01-10 Novel pesticidal toxins and nucleotide sequences which encode these toxins

Country Status (1)

Country Link
AU (1) AU775377B2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1994021795A1 (en) * 1993-03-25 1994-09-29 Ciba-Geigy Ag Novel pesticidal proteins and strains
AU1305997A (en) * 1996-01-15 1997-08-11 Ciba-Geigy Ag Method of protecting crop plants against insect pests

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ESTRUCH ET AL, PNAS, 1996, 93(11): 5389-5394 *

Also Published As

Publication number Publication date
AU1011402A (en) 2002-03-07

Similar Documents

Publication Publication Date Title
US6204435B1 (en) Pesticidal toxins and nucleotide sequences which encode these toxins
US6656908B2 (en) Pesticidal toxins and nucleotide sequences which encode these toxins
US6274721B1 (en) Toxins active against pests
US20040128716A1 (en) Polynucleotides, pesticidal proteins, and novel methods of using them
US6752992B2 (en) Toxins active against pests
US6570005B1 (en) Toxins active against pests
US6603063B1 (en) Plants and cells transformed with a nucleic acid from Bacillus thuringiensis strain KB59A4-6 encoding a novel SUP toxin
AU738922B2 (en) Bacillus thuringiensis toxins
AU775377B2 (en) Novel pesticidal toxins and nucleotide sequences which encode these toxins
EP1078067B1 (en) Bacillus thuringiensis toxins and genes for controlling coleopteran pests
US6051550A (en) Materials and methods for controlling homopteran pests

Legal Events

Date Code Title Description
SREP Specification republished
TH Corrigenda

Free format text: IN VOL 18, NO 29, PAGE(S) 824 UNDER THE HEADING APPLICATIONS ACCEPTED - NAME INDEX IN THE NAME OF MYCOGEN CORPORATION, SERIAL NO. 775377, INID (54), AMEND THE TITLE TO READ NOVEL PESTICIDAL TOXINS AND NUCLEOTIDE SEQUENCES WHICH ENCODE THESE TOXINS