AU9712501A - Insecticidal protein toxins from photorhabdus - Google Patents

Publication number
AU9712501A
Authority
AU
Australia
Prior art keywords
seq
protein
photorhabdus
toxin
dna
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU97125/01A
Inventor
Michael B. Blackburn
David J. Bowen
Todd A. Ciche
Jerald C. Ensign
Raymond Fatig
Richard H. Ffrench-Constant
Lining Guo
Timothy D. Hey
Donald J. Merlo
Gregory L Orr
James Petell
Jean L Roberts
Thomas A. Rocheleau
Sue Schoonover
James A Strickland
Kitisri Sukhapinda
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wisconsin Alumni Research Foundation
Original Assignee
Dow AgroSciences LLC
Wisconsin Alumni Research Foundation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dow AgroSciences LLC, Wisconsin Alumni Research Foundation filed Critical Dow AgroSciences LLC
Priority to AU97125/01A priority Critical patent/AU9712501A/en
Publication of AU9712501A publication Critical patent/AU9712501A/en
Assigned to WISCONSIN ALUMNI RESEARCH FOUNDATION reassignment WISCONSIN ALUMNI RESEARCH FOUNDATION Alteration of Name(s) of Applicant(s) under S113 Assignors: DOW AGROSCIENCES LLC, WISCONSIN ALUMNI RESEARCH FOUNDATION
Abandoned legal-status Critical Current

Description

S&F Ref: 453730D1
AUSTRALIA
PATENTS ACT 1990 COMPLETE SPECIFICATION FOR A STANDARD PATENT
ORIGINAL
Name and Address of Applicants:
Dow AgroSciences LLC, 9330 Zionsville Road, Indianapolis, Indiana 46268-1054, United States of America
Wisconsin Alumni Research Foundation, 614 North Walnut Street, P.O. Box 7365, Madison, Wisconsin 53707-7365, United States of America
Actual Inventor(s): Jerald C. Ensign, David J. Bowen, James Petell, Raymond Fatig, Sue Schoonover, Richard H. Ffrench-Constant, Thomas A. Rocheleau, Michael B. Blackburn, Timothy D. Hey, Donald J. Merlo, Gregory L. Orr, Jean L. Roberts, James A. Strickland, Lining Guo, Todd A. Ciche, Kitisri Sukhapinda
Address for Service: Spruson & Ferguson, St Martins Tower, Level 31 Market Street, Sydney NSW 2000 (CCN 3710000177)
Invention Title: Insecticidal Protein Toxins from Photorhabdus
The following statement is a full description of this invention, including the best method of performing it known to me/us:
Insecticidal Protein Toxins from Photorhabdus
Field of the Invention
The present invention relates to toxins isolated from bacteria and the use of said toxins as insecticides.
Background of the Invention
Many insects are widely regarded as pests to homeowners, to picnickers, to gardeners, and to farmers and others whose investments in agricultural products are often destroyed or diminished as a result of insect damage to field crops. Particularly in areas where the growing season is short, significant insect damage can mean the loss of all profits to growers and a dramatic decrease in crop yield. Scarce supply of particular agricultural products invariably results in higher costs to food processors and, then, to the ultimate consumers of food plants and products derived from those plants.
Preventing insect damage to crops and flowers and eliminating the nuisance of insect pests have typically relied on strong organic pesticides and insecticides with broad toxicities. These synthetic products have come under attack by the general population as being too harsh on the environment and on those exposed to such agents. Similarly, in non-agricultural settings, homeowners would be satisfied to have insects avoid their homes or outdoor meals without needing to kill the insects.
The extensive use of chemical insecticides has raised environmental and health concerns for farmers, companies that produce the insecticides, government agencies, public interest groups, and the public in general. The development of less intrusive pest management strategies has been spurred along both by societal concern for the environment and by the development of biological tools which exploit mechanisms of insect management.
Biological control agents present a promising alternative to chemical insecticides.
Organisms at every evolutionary development level have devised means to enhance their own success and survival. The use of biological molecules as tools of defense and aggression is known throughout the animal and plant kingdoms. In addition, the relatively new tools of the genetic engineer allow modifications to biological insecticides to accomplish particular solutions to particular problems.
One such agent, Bacillus thuringiensis, is an effective insecticidal agent and is widely used commercially as such. In fact, the insecticidal agent of the Bt bacterium is a protein which has such limited toxicity that it can be used on human food crops on the day of harvest. To non-targeted organisms, the Bt toxin is a digestible non-toxic protein.
Another known class of biological insect control agents comprises certain genera of nematodes known to be vectors of transmission for insect-killing bacterial symbionts. Nematodes containing insecticidal bacteria invade insect larvae. The bacteria then kill the larvae. The nematodes reproduce in the larval cadaver. The nematode progeny then eat the cadaver from within. The bacteria-containing nematode progeny thus produced can then invade additional larvae.
In the past, insecticidal nematodes in the Steinernema and Heterorhabditis genera were used as insect control agents.
Apparently, each genus of nematode hosts a particular species of bacterium. In nematodes of the Heterorhabditis genus, the symbiotic bacterium is Photorhabdus luminescens.
Although these nematodes are effective insect control agents, it is presently difficult, expensive, and inefficient to produce, maintain, and distribute nematodes for insect control.
It has been known in the art that one may isolate an insecticidal toxin from Photorhabdus luminescens that has activity only when injected into Lepidopteran and Coleopteran insect larvae.
This has made it impossible to effectively exploit the insecticidal properties of the nematode or its bacterial symbiont. What would be useful is a more practical, less labor-intensive wide-area delivery method for an insecticidal toxin which would retain its biological properties after delivery. It would be highly desirable to discover toxins with oral activity produced by the genus Photorhabdus. The isolation and use of these toxins are desirable for reasons of efficacy. Until applicants' discoveries, these toxins had not been isolated or characterized.
Summary of the Invention
According to a first embodiment of the invention, there is provided a substantially pure microorganism culture of a Photorhabdus strain designated B2, DEP1, DEP2, DEP3, P. zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL155, GL217, or GL257.
According to a second embodiment of the invention, there is provided a composition, comprising an effective amount of a Photorhabdus protein toxin that has functional activity against an insect, wherein the toxin is produced by a purified culture of Photorhabdus strain designated B2, DEP1, DEP2, DEP3, P. zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, A. Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL155, GL217, or GL257.
According to a third embodiment of the invention, there is provided a purified preparation comprising, a protein having an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
According to a fourth embodiment of the invention, there is provided a chimeric DNA construct, adapted for expression in a prokaryotic or eukaryotic host, comprising, 5' to 3', a transcriptional promoter active in the host; a DNA sequence encoding a Photorhabdus protein that has functional activity against an insect; and a transcriptional terminator, wherein the protein encoded by the DNA sequence has an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
According to a fifth embodiment of the invention, there is provided a method of controlling an insect, comprising orally delivering to an insect an effective amount of a protein toxin that has functional activity against an insect, wherein the protein is produced by a purified culture of Photorhabdus strain designated B2, DEP1, DEP2, DEP3, P. zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL155, GL217, or GL257.
According to a sixth embodiment of the invention, there is provided a DNA or RNA oligonucleotide coding for an amino acid sequence selected from SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
According to a seventh embodiment of the invention, there is provided a method for expressing a protein produced by a purified bacterial culture of the genus Photorhabdus in a prokaryotic or eukaryotic host in an effective amount so that the protein has functional activity against an insect, wherein the method comprises: constructing a chimeric DNA construct having, 5' to 3', a promoter, a DNA sequence encoding a protein, and a transcription terminator, and then transferring the chimeric DNA construct into the host, wherein the protein encoded by the DNA sequence has an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
According to an eighth embodiment of the invention, there is provided a method of making an antibody against a protein fragment that is part of a protein having functional activity, where the protein is produced by bacteria of the Enterobacteriaceae family, wherein the method comprises: a) isolating a fragment of the protein, where the protein fragment is at least six amino acids; b) immunizing a mammalian species with the protein fragment; and c) harvesting serum containing antibody or antibody from the spleen of the mammalian species, where the antibody harvested is antibody to the protein fragment having functional activity.
The native toxins are protein complexes that are produced and secreted by growing bacterial cells of the genus Photorhabdus. Of particular interest are the proteins produced by the species Photorhabdus luminescens. The protein complexes, with a molecular size of approximately 1,000 kDa, can be separated by SDS-PAGE gel analysis into numerous component proteins. The toxins contain no hemolysin, lipase, type C phospholipase, or nuclease activities. The toxins exhibit significant toxicity upon administration to a number of insects.
The present invention provides an easily administered insecticidal protein as well as the expression of toxin in a heterologous system.
The present invention also provides a method for delivering insecticidal toxins that are functionally active and effective against many orders of insects.
Objects, advantages, and features of the present invention will become apparent from the following specification.
Brief Description of the Drawings
Fig. 1 is an illustration of a match of cloned DNA isolates used as part of sequencing the genes for the toxin of the present invention.
Fig. 2 is a map of three plasmids used in the sequencing process.
Fig. 3 is a map illustrating the inter-relationship of several partial DNA fragments.
Fig. 4 is an illustration of a homology analysis between the protein sequences of TcbAii and TcaBii proteins.
Fig. 5 is a phenogram of Photorhabdus strains. Relationship of Photorhabdus Strains was defined by rep-PCR.
The upper axis of Fig. 5 measures the percentage similarity of strains based on scoring of rep-PCR products (i.e., 0.0 [no similarity] to 1.0 [100% similarity]). At the right axis, the numbers and letters indicate the various strains tested: 14=W-14, Hm=Hm, H9=H9, 7=WX-7, 1=WX-1, 2=WX-2, 88=HP88, NC-1=NC-1, 4=WX-4, 9=WX-9, 8=WX-8, 10=WX-10, WIR=WIR, 3=WX-3, 11=WX-11, 5=WX-5, 6=WX-6, 12=WX-12, x14=WX-14, 15=WX-15, Hb=Hb, B2=B2, 48 through 52=ATCC 43948 through ATCC 43952. Vertical lines separating horizontal lines indicate the degree of relatedness (as read from the extrapolated intersection of the vertical line with the upper axis) between strains or groups of strains at the base of the horizontal lines (e.g., strain W-14 is approximately 60% similar to strains H9 and Hm).
Fig. 6 is an illustration of the genomic maps of the W-14 Strain.
Fig. 6A is an illustration of the tca and tcb loci and primary gene products.
Fig. 7 is a phenogram of Photorhabdus strains as defined by rep-PCR. The upper axis of Fig. 7 measures the percentage similarity of strains based on scoring of rep-PCR products (i.e., 0.0 [no similarity] to 1.0 [100% similarity]). At the right axis, the numbers and letters indicate the various strains tested.
Vertical lines separating horizontal lines indicate the degree of relatedness (as read from the extrapolated intersection of the vertical line with the upper axis) between strains or groups of strains at the base of the horizontal lines (e.g., strain Indicus is approximately 30% similar to strains MP1 and HB Oswego). Note that the Photorhabdus strains on the phenogram are as follows: 14 = W-14; Hm = Hm; H9 = H9; 7 = WX-7; 1 = WX-1; 2 = WX-2; 88 = HP88; NC1 = NC-1; 4 = WX-4; 9 = WX-9; 8 = WX-8; 10 = WX-10; 30 = W30; WIR = WIR; 3 = WX-3; 11 = WX-11; 5 = WX-5; 6 = WX-6; 12 = WX-12; 15 = WX-15; X14 = WX-14; Hb = Hb; B2 = B2; 48 = ATCC 43948; 49 = ATCC 43949; 50 = ATCC 43950; 51 = ATCC 43951; 52 = ATCC 43952.
Detailed Description of the Invention
The present inventions are directed to the discovery of a unique class of insecticidal protein toxins from the genus Photorhabdus that have oral toxicity against insects. A unique feature of Photorhabdus is its bioluminescence. Photorhabdus may be isolated from a variety of sources. One such source is nematodes, more particularly nematodes of the genus Heterorhabditis. Another such source is human clinical samples from wounds; see Farmer et al. 1989 J. Clin. Microbiol. 27, pp. 1594-1600. These saprophytic strains are deposited in the American Type Culture Collection (Rockville, MD) as ATCC #s 43948, 43949, 43950, 43951, and 43952, and are incorporated herein by reference. It is possible that other sources could harbor Photorhabdus bacteria that produce insecticidal toxins. Such sources in the environment could be either terrestrial or aquatic.
The genus Photorhabdus is taxonomically defined as a member of the Family Enterobacteriaceae, although it has certain traits atypical of this family. For example, strains of this genus are nitrate reduction negative, yellow and red pigment producing and bioluminescent. This latter trait is otherwise unknown within the Enterobacteriaceae. Photorhabdus has only recently been described as a genus separate from the Xenorhabdus (Boemare et al., 1993 Int. J. Syst. Bacteriol. 43, 249-255). This differentiation is based on DNA-DNA hybridization studies, phenotypic differences (e.g., presence (Photorhabdus) or absence (Xenorhabdus) of catalase and bioluminescence) and the Family of the nematode host (Xenorhabdus: Steinernematidae; Photorhabdus: Heterorhabditidae). Comparative cellular fatty-acid analyses (Janse et al. 1990, Lett. Appl. Microbiol. 10, 131-135; Suzuki et al. 1990, J. Gen. Appl. Microbiol., 36, 393-401) support the separation of Photorhabdus from Xenorhabdus.
In order to establish that the strain collection disclosed herein was comprised of Photorhabdus strains, the strains were characterized based on recognized traits which define Photorhabdus and differentiate it from other Enterobacteriaceae and Xenorhabdus species (Farmer, 1984 Bergey's Manual of Systematic Bacteriology Vol. 1 pp. 510-511; Akhurst and Boemare 1988, J. Gen. Microbiol. 134 pp. 1835-1845; Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255, which are incorporated herein by reference). The traits studied were the following: gram stain negative rods, organism size, colony pigmentation, inclusion bodies, presence of catalase, ability to reduce nitrate, bioluminescence, dye uptake, gelatin hydrolysis, growth on selective media, growth temperature, survival under anaerobic conditions and motility. Fatty acid analysis was used to confirm that the strains herein all belong to the single genus Photorhabdus.
Currently, the bacterial genus Photorhabdus is comprised of a single defined species, Photorhabdus luminescens (ATCC Type strain #29999, Poinar et al., 1977, Nematologica 23, 97-102). A variety of related strains have been described in the literature (e.g., Akhurst et al. 1988 J. Gen. Microbiol., 134, 1835-1845; Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255; Putz et al. 1990, Appl. Environ. Microbiol., 56, 181-186). Numerous Photorhabdus strains have been characterized herein. Because there is currently only one species (luminescens) defined within the genus Photorhabdus, the luminescens species traits were used to characterize the strains herein. As can be seen in Fig. 5, these strains are quite diverse. It is foreseeable that in the future there may be other Photorhabdus species that will have some of the attributes of the luminescens species as well as some different characteristics that are presently not defined as a trait of Photorhabdus luminescens. However, the scope of the invention herein extends to any Photorhabdus species or strains which produce proteins that have functional activity as insect control agents, regardless of other traits and characteristics.
Furthermore, as is demonstrated herein, the bacteria of the genus Photorhabdus produce proteins that have functional activity as defined herein. Of particular interest are proteins produced by the species Photorhabdus luminescens. The inventions herein should in no way be limited to the strains which are disclosed herein.
These strains illustrate for the first time that proteins produced by diverse isolates of Photorhabdus are toxic upon exposure to insects. Thus, included within the inventions described herein are the strains specified herein and any mutants thereof, as well as any strains or species of the genus Photorhabdus that have the functional activity described herein.
There are several terms that are used herein that have a particular meaning and are as follows:
By "functional activity" it is meant herein that the protein toxin(s) function as insect control agents in that the proteins are orally active, or have a toxic effect, or are able to disrupt or deter feeding, which may or may not cause death of the insect.
When an insect comes into contact with an effective amount of toxin delivered via transgenic plant expression, formulated protein composition(s), sprayable protein composition(s), a bait matrix or other delivery system, the results are typically death of the insect, or the insects do not feed upon the source which makes the toxins available to the insects.
By the use of the term "genetic material" herein, it is meant to include all genes, nucleic acid, DNA and RNA.
By "homolog" it is meant an amino acid sequence that is identified as possessing homology to a reference W-14 toxin polypeptide amino acid sequence.
By "homology" it is meant an amino acid sequence that has a similarity index of at least 33% and/or an identity index of at least 26% to a reference W-14 toxin polypeptide amino acid sequence, as scored by the GAP algorithm using the BLOSUM62 protein scoring matrix (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, WI).
By "identity" is meant an amino acid sequence that contains an identical residue at a given position, following alignment with a reference W-14 toxin polypeptide amino acid sequence by the GAP algorithm.
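The "homology" and "identity" definitions above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (for example by GAP) and are supplied as equal-length strings with "-" marking gaps; it does not implement the GAP algorithm or BLOSUM62 scoring. The function names, the choice to count only columns where neither sequence has a gap, and the reading of "and/or" as a logical OR are assumptions made for illustration, not details taken from the specification.

```python
# Thresholds taken from the definition of "homology" in the text.
SIMILARITY_THRESHOLD_PCT = 33.0
IDENTITY_THRESHOLD_PCT = 26.0


def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent of identical residues over aligned, non-gap columns.

    Assumption: both inputs are pre-aligned, equal-length strings that
    use '-' for gaps. Columns with a gap in either sequence are skipped.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for ref_res, query_res in zip(aligned_ref.upper(), aligned_query.upper()):
        if ref_res == "-" or query_res == "-":
            continue  # gap column: not scored
        compared += 1
        if ref_res == query_res:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def meets_homology_definition(similarity_pct: float, identity_pct: float) -> bool:
    """Apply the 'at least 33% similarity and/or at least 26% identity' test.

    The similarity index is taken as an input because computing it
    requires a scoring matrix that is not reproduced here.
    """
    return (similarity_pct >= SIMILARITY_THRESHOLD_PCT
            or identity_pct >= IDENTITY_THRESHOLD_PCT)
```

For a hypothetical aligned pair such as "MKT-LV" and "MRTALV", five columns are compared and four match, so the identity index would be 80%.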
The protein toxins discussed herein are typically referred to as "insecticides". By insecticides it is meant herein that the protein toxins have a "functional activity" as further defined herein and are used as insect control agents.
By the use of the term "oligonucleotides" it is meant a macromolecule consisting of a short chain of nucleotides of either RNA or DNA. Such length could be at least one nucleotide, but typically is in the range of about 10 to about 12 nucleotides.
The determination of the length of the oligonucleotide is well within the skill of an artisan and should not be a limitation herein. Therefore, oligonucleotides may be less than 10 or greater than 12.
By the use of the term "Photorhabdus toxin" it is meant any protein produced by a Photorhabdus microorganism strain which has functional activity against insects, where the Photorhabdus toxin could be formulated as a sprayable composition, expressed by a transgenic plant, formulated as a bait matrix, delivered via baculovirus, or delivered by any other applicable host or delivery system.
By the use of the term "toxic" or "toxicity" as used herein it is meant that the toxins produced by Photorhabdus have "functional activity" as defined herein.
By "truncated peptide" it is meant herein to include any peptide that is a fragment of the peptides observed to have functional activity.
By "substantial sequence homology" is meant either: a DNA fragment having a nucleotide sequence sufficiently similar to another DNA fragment to produce a protein having similar biochemical properties; or a polypeptide having an amino acid sequence sufficiently similar to another polypeptide to exhibit similar biochemical properties.
Fermentation broths from selected strains reported in Table were used to determine the breadth of insecticidal toxin production by the Photorhabdus genus and the insecticidal spectrum of these toxins, and to provide source material for purifying the toxin complexes. The strains characterized herein have been shown to have oral toxicity against a variety of insect orders.
Such insect orders include but are not limited to Coleoptera, Homoptera, Lepidoptera, Diptera, Acarina, Hymenoptera and Dictyoptera.
As with other bacterial toxins, the rate of mutation of the bacteria in a population causes many related toxins slightly different in sequence to exist. Toxins of interest here are those which produce protein complexes toxic to a variety of insects upon exposure, as described herein. Preferably, the toxins are active against Lepidoptera, Coleoptera, Homoptera, Diptera, Hymenoptera, Dictyoptera and Acarina. The inventions herein are intended to capture the protein toxins homologous to protein toxins produced by the strains herein and any derivative strains thereof, as well as any protein toxins produced by Photorhabdus. These homologous proteins may differ in sequence, but do not differ in function from those toxins described herein. Homologous toxins are meant to include protein complexes of between 300 kDa and 2,000 kDa that are comprised of at least two subunits, where a subunit is a peptide which may or may not be the same as the other subunit.
Various protein subunits have been identified and are taught in the Examples herein. Typically, the protein subunits are between about 18 kDa to about 230 kDa; between about 160 kDa to about 230 kDa; 100 kDa to 160 kDa; about 80 kDa to about 100 kDa; and about 50 kDa to about 80 kDa.
As discussed above, some Photorhabdus strains can be isolated from nematodes. Some nematodes, elongated cylindrical parasitic worms of the phylum Nematoda, have evolved an ability to exploit insect larvae as a favored growth environment. The insect larvae provide a source of food for growing nematodes and an environment in which to reproduce. One dramatic effect that follows invasion of larvae by certain nematodes is larval death. Larval death results from the presence of, in certain nematodes, bacteria that produce an insecticidal toxin which arrests larval growth and inhibits feeding activity.
Interestingly, it appears that each genus of insect parasitic nematode hosts a particular species of bacterium, uniquely adapted for symbiotic growth with that nematode. In the interim since this research was initiated, the bacterial genus Xenorhabdus was reclassified into the genera Xenorhabdus and Photorhabdus.
Bacteria of the genus Photorhabdus are characterized as being symbionts of Heterorhabditis nematodes while Xenorhabdus species are symbionts of the Steinernema species. This change in nomenclature is reflected in this specification, but in no way should a change in nomenclature alter the scope of the inventions described herein.
The peptides and genes that are disclosed herein are named according to the guidelines recently published in the Journal of Bacteriology "Instructions to Authors" p. i-xii (Jan. 1996), which is incorporated herein by reference. The following peptides and genes were isolated from Photorhabdus strain W-14.
Table 1. Peptide/Gene Nomenclature, Toxin Complex

Peptide Name    Peptide Sequence ID No.*                       Gene Name    Gene Sequence ID No.*

tca genomic region
TcaA            34c                                            tcaA         33
TcaAi           pro-peptide                                    tcaA
TcaAii          [15]a, 34c                                     tcaA
TcaAiii         35c                                            tcaA
TcaAiv          [62]a                                          tcaA
TcaB            (19, 20)b, 26c                                 tcaB         25
TcaBi           [3]a, (19, 20)b, 28c                           tcaB         27
TcaBii          30c                                            tcaB         29
TcaC            32c                                            tcaC         31

tcb genomic region
TcbA            12c, [16]a, (21, 22, 23, 24)b                  tcbA         11
TcbAi           pro-peptide                                    tcbA
TcbAii          (21, 22, 23, 24)b, 53c                         tcbA         52
TcbAiii         [40]a, 55c                                     tcbA         54

tcc genomic region
TccA            57c                                            tccA         56
TccB            59c                                            tccB         58
TccC            61c                                            tccC

tcd genomic region
TcdA            (17, 18, 37, 38, 39, 42, 43)b, 47c             tcdA         36, 46
TcdAi           pro-peptide                                    tcdA
TcdAii          [13]a, (17, 18, 37, 38, 39)b, 49c              tcdA         48
TcdAiii         [41]a, (42, 43)b                               tcdA
TcdB            [14]a                                          tcdB

*a Sequence ID Nos. in brackets are peptide N-termini; b numbers in parentheses are N-termini of internal peptide tryptic fragments; c deduced from gene sequence; d internal gene fragment.

The sequences listed above are grouped by genomic region. More specifically, the Photorhabdus luminescens bacterium (W-14) has at least four distinct genomic regions: tca, tcb, tcc and tcd. As can be seen in Table 1, peptide products are produced from these distinct genomic regions. Furthermore, as illustrated in the Examples, specifically Examples 15 and 21, individual gene products produced from three genomic regions are associated with insect activity. There is also considerable homology between these four genomic regions.
As is further illustrated in the Examples, the tcbA gene was expressed in E. coli as two possible biologically active protein fragments (TcbA and TcbAii/iii). The tcdA gene was also expressed in E. coli. As illustrated in Example 16, when the native unprocessed TcbA toxin was treated with the endogenous metalloproteases or protease-containing insect gut contents, the TcbA protein toxin was processed into smaller subunits that were less than the size of the native peptides, and Southern Corn Rootworm activity increased. The smaller toxin peptides remained associated as part of a toxin complex. It may be desirable in some situations to increase activation of the toxin(s) by proteolytic processing or by using truncated peptides. Thus, it may be more desirable to use truncated peptide(s) in some applications, e.g., commercial transgenic plant applications.
In addition to the W-14 strain, there are other species within the Photorhabdus genus that have functional activity which is differential (specifically see Tables 20 and 36). Even though there is differential activity, the amino acid sequences in some cases have substantial sequence homology. Moreover, the molecular probes indicate that some genes contained in the strains are homologous to the genes contained in the W-14 strain. In fact all of the strains illustrated herein have one or more homologs of W-14 toxin genes.
The antibody data in Example 26 and the N-terminal sequence data in Example 25 further support the conclusion that there is homology and identity (based on amino acid sequence) between the protein toxin(s) produced by these strains. At the molecular level, the W-14 gene probes indicated that the homologs or the W-14 genes themselves (Tables 37, 38, and 39) are dispersed throughout the Photorhabdus genus. Further, it is possible that new toxin genes exist in other strains which are not homologous to W-14, but maintain overall protein attributes (see specifically Examples 14 and ...). Even though there is homology or identity between toxin genes produced by the Photorhabdus strains, the strains themselves are quite diverse. Using polymerase chain reaction technology further discussed in Example 22, most of the strains illustrated herein are quite distinguishable. For example, as can be seen in Fig. 5, the relative similarity of some of the strains, such as HP88 and NC-1, was about 0.8, which indicates that the strains are similar, while that of HP88 and Hb was about 0.1, which indicates substantial diversity. Therefore, even though the insect toxin genes or gene products that the strains produce are the same or similar, the strains themselves are diverse.
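As a rough illustration of how a 0.0 to 1.0 similarity score between two strains might be derived from rep-PCR banding patterns, the sketch below uses the Jaccard coefficient (shared bands divided by all distinct bands). This coefficient is an assumption chosen for illustration; the specification does not state which scoring method was used, and the function name and band labels are hypothetical.

```python
def band_similarity(bands_a, bands_b) -> float:
    """Jaccard similarity of two rep-PCR band patterns.

    Each pattern is an iterable of band identifiers (for example,
    binned fragment sizes). Returns 0.0 for no shared bands and 1.0
    for identical patterns. The choice of Jaccard is an assumption.
    """
    set_a, set_b = set(bands_a), set(bands_b)
    union = set_a | set_b
    if not union:
        raise ValueError("at least one pattern must contain a band")
    return len(set_a & set_b) / len(union)


# Hypothetical band labels: four of five distinct bands are shared.
pattern_x = {"b1", "b2", "b3", "b4"}
pattern_y = {"b1", "b2", "b3", "b4", "b5"}
score = band_similarity(pattern_x, pattern_y)
```

With these made-up patterns the score is 4/5 = 0.8, which on the scale used in the phenograms would be read as closely related strains.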
In view of the data further disclosed in the Examples and discussions herein, it is clear that a new and unique family of insecticidal protein toxin(s) has been discovered. It has been further illustrated herein that these toxin(s) widely exist within bacterial strains of the Photorhabdus genus. It may also be the case that these toxin genes widely exist within the family Enterobacteriaceae. Antibodies prepared as described in Example 21 or gene probes prepared as described in Example 25 may be used to further screen for bacterial strains within the family Enterobacteriaceae that produce the homologous toxin(s) that have functional activity. It may also be the case that specific primer sets exist that could facilitate the identification of new genes within the Photorhabdus genus or family Enterobacteriaceae.
As stated above, the antibodies may be used to rapidly screen bacteria of the genus Photorhabdus or the family Enterobacteriaceae for homologous toxin products as illustrated in Example 26. Those skilled in the art are quite familiar with the use of antibodies as an analysis or screening tool (see US Patent No. 5,430,137, which is incorporated herein by reference). Moreover, it is generally accepted in the literature that antibodies are elicited against 6 to 20 amino acid residue segments that tend to occupy the exposed surface of polypeptides (Current Protocols in Immunology, Coligan et al, National Institutes of Health, John Wiley & Sons, Inc.). Usually the amino acid segments consist of contiguous amino acid residues; however, in certain cases they may be formed by non-contiguous amino acids that are constrained by a specific conformation. The amino acid segments recognized by antibodies are highly specific and are commonly referred to as epitopes. The amino acid fragment can be generated by chemical and/or enzymatic cleavage of the native protein, by automated, solid-phase peptide synthesis, or by production from genetically engineered organisms. Polypeptide fragments can be isolated by a variety and/or combination of HPLC and FPLC chromatographic methods known in the art. Selection of a polypeptide fragment can be aided by the use of algorithms, for example Kyte and Doolittle, 1982, Journal of Molecular Biology 157: 105-132 and Chou and Fasman, 1974, Biochemistry 13: 222-245, that predict those sequences most likely to be exposed on the surface of the protein. For preparation of an immunogen containing the polypeptide fragment of interest, in general, polypeptides are covalently coupled using chemical reactions to carrier proteins such as keyhole limpet hemocyanin via free amino (lysine), sulfhydryl (cysteine), phenolic (tyrosine) or carboxylic (aspartate or glutamate) groups. Immunogen with an adjuvant is injected into animals, such as mice, rabbits, or chickens, to elicit an immune response against the immunogen.
The antibody titer in antisera of injected animals against the polypeptide fragment can be determined by a variety of immunological methods such as ELISA and Western blot. Alternatively, monoclonal antibodies can be prepared using spleen cells of the injected animal for fusion with tumor cells to produce immortalized hybridoma cells producing a single antibody species. Hybridoma cells are screened using immunological methods to select lines that produce a specific antibody to the polypeptide fragment of interest. Purification of antibodies from different sources can be performed by a variety of antigen affinity or antibody affinity columns or other chromatographic HPLC or FPLC methods.
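The use of a hydropathy algorithm to pick candidate surface-exposed fragments can be sketched as follows. This is a minimal illustration, not the procedure used in the Examples: it applies the published Kyte and Doolittle hydropathy values in a sliding window and reports the most hydrophilic segment. The function names and the window length of 7 residues are assumptions chosen for illustration (the text above only states that epitopes span 6 to 20 residues).

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
# More negative values indicate more hydrophilic residues.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Return the mean hydropathy of every window of `window` residues.

    The window length is an illustrative assumption, not a value from the text.
    """
    seq = seq.upper()
    if len(seq) < window:
        return []
    return [
        sum(KD[res] for res in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_segment(seq, window=7):
    """Return (start index, segment, mean hydropathy) of the lowest-scoring window."""
    profile = hydropathy_profile(seq, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window].upper(), profile[start]


if __name__ == "__main__":
    # Toy sequence: a lysine-rich stretch flanked by leucine-rich stretches.
    print(most_hydrophilic_segment("LLLLLLLKKKKKKKLLLLLLL"))
```

A low mean hydropathy only suggests surface exposure; a candidate fragment would still need to be confirmed experimentally, for example by the titer assays described above.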
The toxins described herein are quite unique in that the toxins have functional activity, which is key to developing an insect management strategy. In developing an insect management strategy, it is possible to delay or circumvent the protein degradation process by injecting a protein directly into an organism, avoiding its digestive tract. In such cases, the protein administered to the organism will retain its function until it is denatured, non-specifically degraded, or eliminated by the immune system in higher organisms. Injection into insects of an insecticidal toxin has potential application only in the laboratory, and then only on large insects which are easily injected. The observation that the insecticidal protein toxins herein described exhibit their toxic activity after oral ingestion or contact with the toxins permits the development of an insect management plan based solely on the ability to incorporate the protein toxins into the insect diet. Such a plan could result in the production of insect baits.
The Photorhabdus toxins may be administered to insects in a purified form. The toxins may also be delivered in amounts from about 1 to about 100 mg/liter of broth. This may vary depending upon formulation conditions, conditions of the inoculum source, techniques for isolation of the toxin, and the like. The toxins may be administered as an exudate secretion or as a cellular protein originally expressed in a heterologous prokaryotic or eukaryotic host. Bacteria are typically the hosts in which proteins are expressed. Eukaryotic hosts could include but are not limited to plants, insects and yeast. Alternatively, the toxins may be produced in bacteria or transgenic plants in the field or in the insect by a baculovirus vector. Typically the toxins will be introduced to the insect by incorporating one or more of the toxins into the insects' feed.
Complete lethality to feeding insects is useful but is not required to achieve useful toxicity. If the insects avoid the toxin or cease feeding, that avoidance will be useful in some applications, even if the effects are sublethal. For example, if insect resistant transgenic crop plants are desired, a reluctance of insects to feed on the plants is as useful as lethal toxicity to the insects since the ultimate objective is protection of the plants rather than killing the insect.
There are many other ways in which toxins can be incorporated into an insect's diet. As an example, it is possible to adulterate the larval food source with the toxic protein by spraying the food with a protein solution, as disclosed herein. Alternatively, the purified protein could be genetically engineered into an otherwise harmless bacterium, which could then be grown in culture, and either applied to the food source or allowed to reside in the soil in an area in which insect eradication was desirable. Also, the protein could be genetically engineered directly into an insect food source. For instance, the major food source of many insect larvae is plant material.
By incorporating genetic material that encodes the insecticidal properties of the Photorhabdus toxins into the genome of a plant eaten by a particular insect pest, the adult or larvae would die after consuming the food plant. Numerous members of the monocotyledonous and dicotyledonous genera have been transformed.
Transgenic agronomic crops as well as fruits and vegetables are of commercial interest. Such crops include but are not limited to maize, rice, soybeans, canola, sunflower, alfalfa, sorghum, wheat, cotton, peanuts, tomatoes, potatoes, and the like. Several techniques exist for introducing foreign genetic material into plant cells, and for obtaining plants that stably maintain and express the introduced gene. Such techniques include acceleration of genetic material coated onto microparticles directly into cells (U.S. Patents 4,945,050 to Cornell and 5,141,131 to DowElanco). Plants may be transformed using Agrobacterium technology, see U.S. Patent 5,177,010 to University of Toledo, 5,104,310 to Texas A&M, European Patent Application 0131624B1, European Patent Applications 120516, 159418B1 and 176,112 to Schilperoot, U.S. Patents 5,149,645, 5,469,976, 5,464,763 and 4,940,838 and 4,693,976 to Schilperoot, European Patent Applications 116718, 290799, 320500 all to Max Planck, European Patent Applications 604662 and 627752 to Japan Tobacco, European Patent Applications 0267159, and 0292435 and U.S. Patent 5,231,019 all to Ciba Geigy, U.S. Patents 5,463,174 and 4,762,785 both to Calgene, and U.S. Patents 5,004,863 and 5,159,135 both to Agracetus. Other transformation technology includes whiskers technology, see U.S. Patents 5,302,523 and 5,464,765 both to Zeneca. Electroporation technology has also been used to transform plants, see WO 87/06614 to Boyce Thompson Institute, 5,472,869 and 5,384,253 both to Dekalb, WO9209696 and WO9321335 both to PGS. All of these transformation patents and publications are incorporated by reference. In addition to numerous technologies for transforming plants, the type of tissue which is contacted with the foreign genes may vary as well. Such tissue would include but would not be limited to embryogenic tissue, callus tissue type I and II, hypocotyl, meristem, and the like. Almost all plant tissues may be transformed during dedifferentiation using appropriate techniques within the skill of an artisan.
Another variable is the choice of a selectable marker. The preference for a particular marker is at the discretion of the artisan, but any of the following selectable markers may be used along with any other gene not listed herein which could function as a selectable marker. Such selectable markers include but are not limited to the aminoglycoside phosphotransferase gene of transposon Tn5 (Aph II), which encodes resistance to the antibiotics kanamycin, neomycin and G418, as well as those genes which code for resistance or tolerance to glyphosate; hygromycin; methotrexate; phosphinothricin (bialaphos); imidazolinones, sulfonylureas and triazolopyrimidine herbicides, such as chlorsulfuron; bromoxynil, dalapon and the like.
In addition to a selectable marker, it may be desirable to use a reporter gene. In some instances a reporter gene may be used without a selectable marker. Reporter genes are genes which are typically not present or expressed in the recipient organism or tissue. The reporter gene typically encodes a protein which provides for some phenotypic change or enzymatic property. Examples of such genes are provided in K. Weising et al. Ann. Rev. Genetics, 22, 421 (1988), which is incorporated herein by reference. A preferred reporter gene is the glucuronidase (GUS) gene.
Regardless of transformation technique, the gene is preferably incorporated into a gene transfer vector adapted to express the Photorhabdus toxins in the plant cell by including in the vector a plant promoter. In addition to plant promoters, promoters from a variety of sources can be used efficiently in plant cells to express foreign genes. For example, promoters of bacterial origin, such as the octopine synthase promoter, the nopaline synthase promoter, and the mannopine synthase promoter; promoters of viral origin, such as the cauliflower mosaic virus promoters (35S and 19S); the reengineered 35S promoter, known as 35T (see PCT/US96/16582, WO 97/13402 published April 17, 1997, which is incorporated herein by reference); and the like may be used. Plant promoters include, but are not limited to, the ribulose-1,5-bisphosphate (RUBP) carboxylase small subunit (ssu) promoter, beta-conglycinin promoter, phaseolin promoter, ADH promoter, heat-shock promoters and tissue specific promoters.
Promoters may also contain certain enhancer sequence elements that may improve the transcription efficiency. Typical enhancers include but are not limited to Adh-intron 1 and Adh-intron 6.
Constitutive promoters may be used. Constitutive promoters direct continuous gene expression in all cell types and at all times (e.g., actin, ubiquitin, CaMV 35S). Tissue specific promoters are responsible for gene expression in specific cell or tissue types, such as the leaves or seeds (e.g., zein, oleosin, napin, ACP), and these promoters may also be used. Promoters may also be active during a certain stage of the plant's development as well as active in particular plant tissues and organs. Examples of such promoters include but are not limited to pollen-specific, embryo specific, corn silk specific, cotton fiber specific, root specific, seed endosperm specific promoters and the like.
Under certain circumstances it may be desirable to use an inducible promoter. An inducible promoter is responsible for expression of genes in response to a specific signal, such as: physical stimulus (heat shock genes); light (RUBP carboxylase); hormone metabolites; and stress. Other desirable transcription and translation elements that function in plants may be used. Numerous plant-specific gene transfer vectors are known to the art.
In addition, it is known that to obtain high expression of bacterial genes in plants it is preferred to reengineer the bacterial genes so that they are more efficiently expressed in the cytoplasm of plants. Maize is one such plant where it is preferred to reengineer the bacterial gene(s) prior to transformation to increase the expression level of the toxin in the plant. One reason for the reengineering is the very low G+C content of the native bacterial gene(s) (and consequent skewing towards high A+T content). This results in the generation of sequences mimicking or duplicating plant gene control sequences that are known to be highly A+T rich. The presence of some A+T-rich sequences within the DNA of the gene(s) introduced into plants (e.g., TATA box regions normally found in gene promoters) may result in aberrant transcription of the gene(s). On the other hand, the presence of other regulatory sequences residing in the transcribed mRNA (e.g., polyadenylation signal sequences (AAUAAA), or sequences complementary to small nuclear RNAs involved in pre-mRNA splicing) may lead to RNA instability. Therefore, one goal in the design of reengineered bacterial gene(s), more preferably referred to as plant optimized gene(s), is to generate a DNA sequence having a higher G+C content, and preferably one close to that of plant genes coding for metabolic enzymes.
Another goal in the design of the plant optimized gene(s) is to generate a DNA sequence that not only has a higher G+C content, but in which the sequence changes are made so as not to hinder translation.
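As a rough illustration of how a coding sequence might be checked for the kind of A+T-rich signal discussed above, the following sketch lists every position at which a motif occurs in a DNA sequence. The polyadenylation signal AAUAAA corresponds to AATAAA in the DNA; the function name and the example sequence are hypothetical and serve only to show the scan.

```python
def find_motif(dna, motif="AATAAA"):
    """Return the 0-based start of every occurrence of `motif` in `dna`.

    Overlapping occurrences are reported. The default motif is the DNA form
    of the polyadenylation signal AAUAAA mentioned in the text.
    """
    dna = dna.upper()
    motif = motif.upper()
    positions = []
    i = dna.find(motif)
    while i != -1:
        positions.append(i)
        i = dna.find(motif, i + 1)  # advance by one base to catch overlaps
    return positions


if __name__ == "__main__":
    # Hypothetical sequence containing two copies of the signal.
    print(find_motif("GGAATAAACCAATAAA"))
```

Each reported position marks a site that a designer could consider altering with a synonymous codon change.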
An example of a plant that has a high G+C content is maize. The table below illustrates how high the G+C content is in maize. As in maize, it is thought that G+C content in other plants is also high.
Table 2
Compilation of G+C Contents of Protein Coding Regions of Maize Genes

Protein Class (a)             Range %G+C    Mean %G+C (b)
Metabolic Enzymes (40)        44.4-75.3     59.0
Storage Proteins
  Group I (23)                46.0-51.9     48.1 (1.3)
  Group II (13)               60.4-74.3     67.5 (3.2)
  Group I + II (36)           46.0-74.3     55.1 (9.6) (c)
Structural Proteins (18)      48.6-70.5     63.6 (6.7)
Regulatory Proteins           57.2-68.9     62.0 (4.9)
Uncharacterized Proteins      41.5-70.3     64.3 (7.2)
All Proteins (108)            44.4-75.3     60.8 (5.2)

(a) Number of genes in class given in parentheses.
(b) Standard deviations given in parentheses.
(c) Combined groups mean ignored in calculation of overall mean.
For the data in Table 2, coding regions of the genes were extracted from GenBank (Release 71) entries, and base compositions were calculated using the MacVector™ program (IBI, New Haven, CT). Intron sequences were ignored in the calculations. Group I and II storage protein gene sequences were distinguished by their marked difference in base composition.
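The base composition calculation behind a table of this kind can be sketched as follows. This is only an illustration of the arithmetic, not the MacVector program: the function names are hypothetical, ambiguous bases are simply skipped, and the use of the sample standard deviation is an assumption, since the text does not state which form was used.

```python
import statistics


def gc_content(seq):
    """Return the percentage of G and C among the A, C, G and T bases of `seq`.

    Characters other than A, C, G and T (for example N) are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def mean_and_sd(values):
    """Return (mean, sample standard deviation) for a list of %G+C values."""
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Hypothetical coding regions (introns assumed to be removed already).
    coding_regions = ["ATGGCCGGCAAG", "ATGAAATTTGCA", "ATGGGGCCCTGA"]
    values = [gc_content(s) for s in coding_regions]
    print(values, mean_and_sd(values))
```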
Due to the plasticity afforded by the redundancy of the genetic code (i.e., some amino acids are specified by more than one codon), evolution of the genomes of different organisms or classes of organisms has resulted in differential usage of redundant codons. This "codon bias" is reflected in the mean base composition of protein coding regions. For example, organisms with relatively low G+C contents utilize codons having A or T in the third position of redundant codons, whereas those having higher G+C contents utilize codons having G or C in the third position. It is thought that the presence of "minor" codons within a gene's mRNA may reduce the absolute translation rate of that mRNA, especially when the relative abundance of the charged tRNA corresponding to the minor codon is low. An extension of this is that the diminution of translation rate by individual minor codons would be at least additive for multiple minor codons. Therefore, mRNAs having high relative contents of minor codons would have correspondingly low translation rates. This rate would be reflected by the synthesis of low levels of the encoded protein.
In order to reengineer the bacterial gene(s), the codon bias of the plant is determined. The codon bias is the statistical codon distribution that the plant uses for coding its proteins. After determining the bias, the percent frequency of the codons in the gene(s) of interest is determined. The primary codons preferred by the plant should be determined as well as the second and third choice of preferred codons. The amino acid sequence of the protein of interest is reverse translated so that the resulting nucleic acid sequence codes for the same protein as the native bacterial gene, but the resulting nucleic acid sequence corresponds to the first preferred codons of the desired plant. The new sequence is analyzed for restriction enzyme sites that might have been created by the modification. The identified sites are further modified by replacing the codons with second or third choice preferred codons. Other sites in the sequence which could affect the transcription or translation of the gene of interest are the exon:intron 5' or 3' junctions, poly A addition signals, or RNA polymerase termination signals. The sequence is further analyzed and modified to reduce the frequency of TA or GC doublets. In addition to the doublets, G or C sequence blocks that have more than about four residues that are the same can affect transcription of the sequence. Therefore, these blocks are also modified by replacing the codons of first or second choice, etc. with the next preferred codon of choice. It is preferred that the plant optimized gene(s) contain about 63% of first choice codons, between about 22% and about 37% second choice codons, and between about 15% and about 0% third choice codons, wherein the total percentage is 100%. Most preferably, the plant optimized gene(s) contain about 63% of first choice codons, at least about 22% second choice codons, about 7.5% third choice codons, and about 7.5% fourth choice codons, wherein the total percentage is 100%. The method described above enables one skilled in the art to modify gene(s) that are foreign to a particular plant so that the genes are optimally expressed in plants. The method is further illustrated in application PCT/US96/16582, WO 97/13402 published April 17, 1997.
Thus, in order to design plant optimized gene(s), the amino acid sequence of the toxins is reverse translated into a DNA sequence, utilizing a nonredundant genetic code established from a codon bias table compiled for the gene DNA sequence for the particular plant being transformed. The resulting DNA sequence, which is completely homogeneous in codon usage, is further modified to establish a DNA sequence that, besides having a higher degree of codon diversity, also contains strategically placed restriction enzyme recognition sites, desirable base composition, and a lack of sequences that might interfere with transcription of the gene, or translation of the product mRNA.
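The reverse translation and site removal steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the design procedure of PCT/US96/16582: the ranked codon table is hypothetical and covers only a few amino acids (a real table would be compiled from coding sequences of the plant being transformed), the greedy substitution strategy is an assumption, and all function names are invented for the example.

```python
from collections import Counter

# HYPOTHETICAL ranked codon choices (first choice listed first), for
# illustration only. These ranks are not taken from any real plant.
RANKED_CODONS = {
    "M": ["ATG"],
    "K": ["AAG", "AAA"],
    "G": ["GGC", "GGG", "GGT", "GGA"],
    "P": ["CCC", "CCG", "CCA", "CCT"],
}


def reverse_translate(protein, ranked=RANKED_CODONS):
    """Return a list of first-choice codons encoding `protein`."""
    return [ranked[aa][0] for aa in protein.upper()]


def remove_site(codons, protein, site, ranked=RANKED_CODONS):
    """Greedily swap in lower-ranked synonymous codons until `site` is gone.

    Each accepted swap must lower the number of occurrences of `site`, so the
    loop always terminates. Raises ValueError if no single swap helps.
    """
    codons = list(codons)
    protein = protein.upper()
    while True:
        dna = "".join(codons)
        pos = dna.find(site)
        if pos == -1:
            return codons
        first, last = pos // 3, (pos + len(site) - 1) // 3
        swapped = False
        for idx in range(first, last + 1):  # codons overlapping the site
            for alt in ranked[protein[idx]][1:]:
                trial = codons[:idx] + [alt] + codons[idx + 1:]
                if "".join(trial).count(site) < dna.count(site):
                    codons, swapped = trial, True
                    break
            if swapped:
                break
        if not swapped:
            raise ValueError("site cannot be removed by one synonymous swap")


def choice_percentages(codons, protein, ranked=RANKED_CODONS):
    """Return {choice rank: percent of codons} for a designed sequence."""
    ranks = Counter(
        ranked[aa].index(codon) + 1 for aa, codon in zip(protein.upper(), codons)
    )
    return {rank: 100.0 * n / len(codons) for rank, n in sorted(ranks.items())}


if __name__ == "__main__":
    protein = "MGP"
    first_pass = reverse_translate(protein)           # ATG GGC CCC
    cleaned = remove_site(first_pass, protein, "GGGCCC")
    print("".join(first_pass), "".join(cleaned), choice_percentages(cleaned, protein))
```

In this toy case the first-choice sequence contains the site GGGCCC across a codon boundary; the second-choice glycine codon does not remove it, so the third choice is used. The resulting rank percentages can then be compared with the preferred proportions stated above.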
It is theorized that bacterial genes may be more easily expressed in plants if the bacterial genes are expressed in the plastids. Thus, it may be possible to express bacterial genes in plants, without optimizing the genes for plant expression, and obtain high expression of the protein. See U.S. Patent Nos. 4,762,785; 5,451,513 and 5,545,817, which are incorporated herein by reference.
One of the issues regarding commercially exploiting transgenic plants is resistance management. This is of particular concern with Bacillus thuringiensis toxins. There are numerous companies commercially exploiting Bacillus thuringiensis and there has been much concern about insects becoming resistant to Bt toxins. One strategy for insect resistance management would be to combine the toxins produced by Photorhabdus with toxins such as Bt, vegetative insect proteins (Ciba Geigy) or other toxins. The combinations could be formulated for a sprayable application or could be molecular combinations. Plants could be transformed with Photorhabdus genes that produce insect toxins together with other insect toxin genes such as Bt. European Patent Application 0400246A1 describes transformation of a plant with two Bt genes, which could be any two genes. Another way to produce a transgenic plant that contains more than one insect resistance gene would be to produce two plants, with each plant containing an insect resistance gene. These plants would be backcrossed using traditional plant breeding techniques to produce a plant containing more than one insect resistance gene.
In addition to producing a transformed plant containing plant optimized gene(s), there are other delivery systems where it may be desirable to reengineer the bacterial gene(s). Along the same lines, a genetically engineered, easily isolated protein toxin fusing together both a molecule attractive to insects as a food source and the insecticidal activity of the toxin may be engineered and expressed in bacteria or in eukaryotic cells using standard, well-known techniques. After purification in the laboratory such a toxic agent with "built-in" bait could be packaged inside standard insect trap housings.
Another delivery scheme is the incorporation of the genetic material of toxins into a baculovirus vector. Baculoviruses infect particular insect hosts, including those desirably targeted with the Photorhabdus toxins. Infectious baculovirus harboring an expression construct for the Photorhabdus toxins could be introduced into areas of insect infestation to thereby intoxicate or poison infected insects.
Transfer of the insecticidal properties requires nucleic acid sequences encoding the amino acid sequences for the Photorhabdus toxins integrated into a protein expression vector appropriate to the host in which the vector will reside. One way to obtain a nucleic acid sequence encoding a protein with insecticidal properties is to isolate the native genetic material which produces the toxins from Photorhabdus, using information deduced from the toxin's amino acid sequence, large portions of which are set forth below. As described below, methods of purifying the proteins responsible for toxin activity are also disclosed.
Using N-terminal amino acid sequence data, such as set forth below, one can construct oligonucleotides complementary to all, or a section of, the DNA bases that encode the first amino acids of the toxin. These oligonucleotides can be radiolabeled and used as molecular probes to isolate the genetic material from a genomic genetic library built from genetic material isolated from strains of Photorhabdus. The genetic library can be cloned in plasmid, cosmid, phage or phagemid vectors. The library could be transformed into Escherichia coli and screened for toxin production by the transformed cells using antibodies raised against the toxin or direct assays for insect toxicity.
This approach requires the production of a battery of oligonucleotides, since the degenerate genetic code allows an amino acid to be encoded in the DNA by any of several three-nucleotide combinations. For example, the amino acid arginine can be encoded by nucleic acid triplets CGA, CGC, CGG, CGT, AGA, and AGG. Since one cannot predict which triplet is used at those positions in the toxin gene, one must prepare oligonucleotides with each potential triplet represented. More than one DNA molecule corresponding to a protein subunit may be necessary to construct a sufficient number of oligonucleotide probes to recover all of the protein subunits necessary to achieve oral toxicity.
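The size of the oligonucleotide battery implied by this degeneracy can be sketched as follows. The example uses the standard genetic code; the function names and the example peptides are hypothetical and are not taken from the toxin sequences. The code enumerates coding-strand sequences, so an actual probe would be the sequence itself or its complement depending on the strand to be detected.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invert the table: amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def degeneracy(peptide):
    """Return the number of distinct DNA sequences that encode `peptide`."""
    count = 1
    for aa in peptide.upper():
        count *= len(CODONS_FOR[aa])
    return count


def all_coding_oligos(peptide):
    """Return every coding-strand DNA sequence that encodes `peptide`."""
    choices = (CODONS_FOR[aa] for aa in peptide.upper())
    return ["".join(codons) for codons in product(*choices)]


if __name__ == "__main__":
    print(sorted(CODONS_FOR["R"]))     # the six arginine triplets
    print(degeneracy("MRW"))           # 1 x 6 x 1 possible sequences
    print(all_coding_oligos("MR"))
```

Because the count multiplies across positions, stretches rich in methionine and tryptophan (one codon each) give far smaller batteries than stretches rich in arginine, leucine or serine (six codons each).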
From the amino acid sequence of the purified protein, genetic materials responsible for the production of toxins can readily be isolated and cloned, in whole or in part, into an expression vector using any of several techniques well-known to one skilled in the art of molecular biology. A typical expression vector is a DNA plasmid, though other transfer means including, but not limited to, cosmids, phagemids and phage are also envisioned. In addition to features required or desired for plasmid replication, such as an origin of replication and antibiotic resistance or other form of a selectable marker such as the bar gene of Streptomyces hygroscopicus or viridochromogenes, protein expression vectors normally additionally require an expression cassette which incorporates the cis-acting sequences necessary for transcription and translation of the gene of interest. The cis-acting sequences required for expression in prokaryotes differ from those required in eukaryotes and plants.
A eukaryotic expression cassette requires a transcriptional promoter upstream to the gene of interest, a transcriptional termination region such as a poly-A addition site, and a ribosome binding site upstream of the gene of interest's first codon. In bacterial cells, a useful transcriptional promoter that could be included in the vector is the T7 RNA Polymerase-binding promoter.
Promoters, as previously described herein, are known to efficiently promote transcription of mRNA. Also upstream from the gene of interest the vector may include a nucleotide sequence encoding a signal sequence known to direct a covalently linked protein to a particular compartment of the host cells such as the cell surface.
Insect viruses, or baculoviruses, are known to infect and adversely affect certain insects. The effect of the viruses on insects is slow, and viruses do not stop the feeding of insects. Thus viruses are not viewed as being useful as insect pest control agents. Combining the Photorhabdus toxin genes into a baculovirus vector could provide an efficient way of transmitting the toxins while increasing the lethality of the virus. In addition, since different baculoviruses are specific to different insects, it may be possible to use a particular toxin to selectively target particularly damaging insect pests. A particularly useful vector for the toxin genes is the nuclear polyhedrosis virus. Transfer vectors using this virus have been described and are now the vectors of choice for transferring foreign genes into insects. The virus-toxin gene recombinant may be constructed in an orally transmissible form. Baculoviruses normally infect insect victims through the mid-gut intestinal mucosa. The toxin gene inserted behind a strong viral coat protein promoter would be expressed and should rapidly kill the infected insect.
In addition to an insect virus or baculovirus or transgenic plant delivery system for the protein toxins of the present invention, the proteins may be encapsulated using Bacillus thuringiensis encapsulation technology such as but not limited to U.S. Patent Nos. 4,695,455; 4,695,462; 4,861,595 which are all incorporated herein by reference. Another delivery system for the protein toxins of the present invention is formulation of the protein into a bait matrix, which could then be used in above and below ground insect bait stations. Examples of such technology include but are not limited to PCT Patent Application WO 93/23998, which is incorporated herein by reference.
As is described above, it might become necessary to modify the sequence encoding the protein when expressing it in a non-native host, since the codon preferences of other hosts may differ from that of Photorhabdus. In such a case, translation may be quite inefficient in a new host unless compensating modifications to the coding sequence are made. Additionally, modifications to the amino acid sequence might be desirable to avoid inhibitory crossreactivity with proteins of the new host, or to refine the insecticidal properties of the protein in the new host. A genetically modified toxin gene might encode a toxin exhibiting, for example, enhanced or reduced toxicity, altered insect resistance development, altered stability, or modified target species specificity.
In addition to the Photorhabdus genes encoding the toxins, the scope of the present invention is intended to include related nucleic acid sequences which encode amino acid biopolymers homologous to the toxin proteins and which retain the toxic effect of the Photorhabdus proteins in insect species after oral ingestion.
For instance, the toxins used in the present invention seem to first inhibit larval feeding before death ensues. By manipulating the nucleic acid sequence of Photorhabdus toxins or its controlling sequences, genetic engineers placing the toxin gene into plants could modulate its potency or its mode of action to, for example, keep the eating-inhibitory activity while eliminating the absolute toxicity to the larvae. This change could permit the transformed plant to survive until harvest without having the unnecessarily dramatic effect on the ecosystem of wiping out all target insects.
All such modifications of the gene encoding the toxin, or of the protein encoded by the gene, are envisioned to fall within the scope of the present invention.
Other envisioned modifications of the nucleic acid include the addition of targeting sequences to direct the toxin to particular parts of the insect larvae for improving its efficiency.
Strains W-14, ATCC 55397, 43948, 43949, 43950, 43951, 43952 have been deposited in the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA. Amino acid and nucleotide sequence data for the W-14 native toxin (ATCC 55397) is presented below. Isolation of the genomic DNA for the toxins from the bacterial hosts is also exemplified herein. The other strains identified herein have been deposited with the United States Department of Agriculture, 1815 North University Drive, Peoria, IL 61604.
Standard and molecular biology techniques were followed and taught in the specification herein. Additional information may be found in Sambrook, J., Fritsch, E. F. and Maniatis, T. (1989), Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press; Current Protocols in Molecular Biology, ed. F. M. Ausubel et al., (1997), which are both incorporated herein by reference.
The following abbreviations are used throughout the Examples: Tris = tris(hydroxymethyl)aminomethane; SDS = sodium dodecyl sulfate; EDTA = ethylenediaminetetraacetic acid; IPTG = isopropylthio-β-galactoside; X-gal = 5-bromo-4-chloro-3-indolyl-β-D-galactoside; CTAB = cetyltrimethylammonium bromide; kbp = kilobase pairs; dATP, dCTP, dGTP, dTTP, dITP = 2'-deoxynucleoside 5'-triphosphates of adenine, cytosine, guanine, thymine, and inosine, respectively; ATP = adenosine 5'-triphosphate.
Example 1
Purification of Toxin from Photorhabdus luminescens and Demonstration of Toxicity after Oral Delivery of Purified Toxin
The insecticidal protein toxin of the present invention was purified from Photorhabdus luminescens strain W-14, ATCC Accession Number 55397. Stock cultures of Photorhabdus luminescens were maintained on petri dishes containing 2% Proteose Peptone No. 3 (PP3, Difco Laboratories, Detroit, MI) in 1.5% agar, incubated at 25°C and transferred weekly. Colonies of the primary form of the bacteria were inoculated into 200 ml of PP3 broth supplemented with 0.5% polyoxyethylene sorbitan mono-stearate (Tween 60, Sigma Chemical Company, St. Louis, MO) in a one liter flask. The broth cultures were grown for 72 hours at 30°C on a rotary shaker. The toxin proteins can be recovered from cultures grown in the presence or absence of Tween; however, the absence of Tween can affect the form of the bacteria grown and the profile of proteins produced by the bacteria. In the absence of Tween, a variant shift occurs insofar as the molecular weight of at least one identified toxin subunit shifts from about 200 kDa to about 185 kDa.
The 72 hour cultures were centrifuged at 10,000 x g for minutes to remove cells and debris. The supernatant fraction that contained the insecticidal activity was decanted and brought to 50 mM K2HPO4 by adding an appropriate volume of 1.0 M K2HPO4. The pH was adjusted to 8.6 by adding potassium hydroxide. This supernatant fraction was then mixed with DEAE-Sephacel (Pharmacia LKB Biotechnology) which had been equilibrated with 50 mM K2HPO4. The toxic activity was adsorbed to the DEAE resin. This mixture was then poured into a 2.6 x 40 cm column and washed with 50 mM K2HPO4 at room temperature at a flow rate of 30 ml/hr until the effluent reached a steady baseline UV absorbance at 280 nm. The column was then washed with 150 mM KCl until the effluent again reached a steady 280 nm baseline. Finally the column was washed with 300 mM KCl and fractions were collected.
Fractions containing the toxin were pooled and filter sterilized using a 0.2 micron pore membrane filter. The toxin was then concentrated and equilibrated to 100 mM KPO4, pH 6.9, using an ultrafiltration membrane with a molecular weight cutoff of 100 kDa at 4°C (Centriprep 100, Amicon Division, W.R. Grace and Company). A 3 ml sample of the toxin concentrate was applied to the top of a 2.6 x 95 cm Sephacryl S-400 HR gel filtration column (Pharmacia LKB Biotechnology). The eluent buffer was 100 mM KPO4, pH 6.9, which was run at a flow rate of 17 ml/hr, at 4°C. The effluent was monitored at 280 nm.
Fractions were collected and tested for toxic activity.
Toxicity of chromatographic fractions was examined in a biological assay using Manduca sexta larvae. Fractions were either applied directly onto the insect diet (Gypsy moth wheat germ diet, ICN Biochemicals Division, ICN Biomedicals, Inc.) or administered by intrahemocelic injection of a 5 µl sample through the first proleg of 4th or 5th instar larvae using a 30 gauge needle. The weight of each larva within a treatment group was recorded at 24 hour intervals. Toxicity was presumed if the insect ceased feeding and died within several days of consuming treated insect diet or if death occurred within 24 hours after injection of a fraction.
The toxic fractions were pooled and concentrated using the Centriprep-100 and were then analyzed by HPLC using a 7.5 mm x cm TSK-GEL G-4000 SW gel permeation column with 100 mM potassium phosphate, pH 6.9 eluent buffer running at 0.4 ml/min. This analysis revealed the toxin protein to be contained within a single sharp peak that eluted from the column with a retention time of approximately 33.6 minutes. This retention time corresponded to an 25 estimated molecular weight of 1,000 kDa. Peak fractions were collected for further purification while fractions not containing this protein were discarded. The peak eluted from the HPLC absorbs UV light at 218 and 280 nm but did not absorb at 405 nm.
Absorbance at 405 nm was shown to be an attribute of xenorhabdin. Electrophoresis of the pooled peak fractions in a nondenaturing agarose gel (Metaphor Agarose, FMC BioProducts) showed that two protein complexes were present in the peak. The peak material, buffered in 50 mM Tris-HCl, pH 7.0, was separated on a 1.5% agarose stacking gel buffered with 100 mM Tris-HCl at pH and a 1.9% agarose resolving gel buffered with 200 mM Tris-borate at pH 8.3 under standard buffer conditions (anode buffer 1 M Tris-HCl, pH 8.3; cathode buffer 0.025 M Tris, 0.192 M glycine). The gels were run at 13 mA constant current at 15°C until the phenol red tracking dye reached the end of the gel. Two protein bands were visualized in the agarose gels using Coomassie brilliant blue staining.
The slower migrating band was referred to as "protein band 1" and the faster migrating band was referred to as "protein band 2." The two protein bands were present in approximately equal amounts. The Coomassie stained agarose gels were used as a guide to precisely excise the two protein bands from unstained portions of the gels.
The excised pieces containing the protein bands were macerated and a small amount of sterile water was added. As a control, a portion of the gel that contained no protein was also excised and treated in the same manner as the gel pieces containing the protein.
Protein was recovered from the gel pieces by electroelution into 100 mM Tris-borate, pH 8.3, at 100 volts (constant voltage) for two hours. Alternatively, protein was passively eluted from the gel pieces by adding an equal volume of 50 mM Tris-HCl, pH 7.0, to the gel pieces, then incubating at 30°C for 16 hours. This allowed the protein to diffuse from the gel into the buffer, which was then collected.
Results of insect toxicity tests using HPLC-purified toxin (33.6 min. peak) and agarose gel purified toxin demonstrated toxicity of the extracts. Injection of 1.5 µg of the HPLC-purified protein killed within 24 hours. Both protein bands 1 and 2, recovered from agarose gels by passive elution or electroelution, were lethal upon injection. The protein concentration estimated for these samples was less than 50 ng/larva. A comparison of the weight gain and the mortality between the groups of larvae injected with protein bands 1 or 2 indicated that protein band 1 was more toxic by injection delivery.
When HPLC-purified toxin was applied to larval diet at a concentration of 7.5 µg/larva, it caused a halt in larval weight gain (24 larvae tested). The larvae began to feed, but after consuming only a very small portion of the toxin-treated diet they began to show pathological symptoms induced by the toxin and ceased feeding. The insect frass became discolored and most larvae showed signs of diarrhea. Significant insect mortality resulted when several 5 µg toxin doses were applied to the diet over a 7-10 day period.
Agarose-separated protein band 1 significantly inhibited larval weight gain at a dose of 200 ng/larva. Larvae fed similar concentrations of protein band 2 were not inhibited and gained weight at the same rate as the control larvae. Twelve larvae were fed eluted protein and 45 larvae were fed protein-containing agarose pieces. These two sets of data indicate that protein band 1 was orally toxic to Manduca sexta. In this experiment it appeared that protein band 2 was not toxic to Manduca sexta.
Further analysis of protein bands 1 and 2 by SDS-PAGE under denaturing conditions showed that each band was composed of several smaller protein subunits. Proteins were visualized by Coomassie brilliant blue staining followed by silver staining to achieve maximum sensitivity.
The protein subunits in the two bands were very similar.
Protein band 1 contained 8 protein subunits of 25.1, 56.2, 60.8, 65.6, 166, 171, 184 and 208 kDa. Protein band 2 had an identical profile except that the 25.1, 60.8, and 65.6 kDa proteins were not present. The 56.2, 60.8, 65.6, and 184 kDa proteins were present in the complex of protein band 1 at approximately equal concentrations and represented 80% or more of the total protein content of that complex.
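As a rough consistency check, the subunit sizes listed above can be summed and compared with the native estimate of about 1,000 kDa from gel permeation HPLC. The sketch below assumes one copy of each subunit per complex, which the text does not state; the variable and function names are illustrative only.

```python
# Subunit sizes (kDa) reported for protein band 1 by SDS-PAGE.
BAND1_SUBUNITS_KDA = [25.1, 56.2, 60.8, 65.6, 166, 171, 184, 208]
# Subunits reported as absent from protein band 2.
MISSING_FROM_BAND2_KDA = [25.1, 60.8, 65.6]
# Native size estimated from the gel permeation retention time.
NATIVE_ESTIMATE_KDA = 1000


def summed_mass(subunits_kda):
    """Sum subunit masses, ASSUMING one copy of each subunit per complex."""
    return round(sum(subunits_kda), 1)


band1_total = summed_mass(BAND1_SUBUNITS_KDA)
band2_subunits = [m for m in BAND1_SUBUNITS_KDA if m not in MISSING_FROM_BAND2_KDA]
band2_total = summed_mass(band2_subunits)

print(f"band 1 sum: {band1_total} kDa")
print(f"band 2 sum: {band2_total} kDa")
print(f"band 1 sum / native estimate: {band1_total / NATIVE_ESTIMATE_KDA:.2f}")
```

Under that single-copy assumption the band 1 subunits add up to slightly less than the native estimate. Both figures are gel-based estimates, so the comparison is only indicative.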
The native HPLC-purified toxin was further characterized as follows. The toxin was heat labile in that after being heated to 60°C for 15 minutes it lost its ability to kill or to inhibit weight gain when injected into or fed to Manduca sexta larvae. Assays were designed to detect lipase, type C phospholipase, nuclease or red blood cell hemolysis activities and were performed with purified toxin. None of these activities were present. Antibiotic zone inhibition assays were also done and the purified toxin failed to inhibit growth of Gram-negative or Gram-positive bacteria, yeast or filamentous fungi, indicating that the toxin is not a xenorhabdin antibiotic.
The native HPLC-purified toxin was tested for ability to kill insects other than Manduca sexta. Table 3 lists insects killed by the HPLC-purified Photorhabdus luminescens toxin in this study.
Table 3
Insects Killed by Photorhabdus luminescens Toxin

Common Name         Order          Genus and species       Route of Delivery
Tobacco hornworm    Lepidoptera    Manduca sexta           Oral and injected
Mealworm            Coleoptera     Tenebrio molitor        Oral
Pharaoh ant         Hymenoptera    Monomorium pharaonis    Oral
German cockroach    Dictyoptera    Blattella germanica     Oral and injected
Mosquito            Diptera        Aedes aegypti           Oral

Further Characterization of the High Molecular Weight Toxin Complex

In a further analysis, the toxin protein complex from W-14 growth medium was characterized in more detail. The culture conditions and initial purification steps through the S-400 HR column were identical to those described above. After isolation of the high molecular weight toxin complex from the S-400 HR column fractions, the toxic fractions were equilibrated with 10 mM Tris-HCl, pH 8.6, and concentrated in Centriplus 100 (Amicon) concentrators. The protein toxin complex was then applied to a weak anion exchange (WAX) column, Vydac 301VPH575 (Hesperia, CA), at a flow rate of 0.5 ml/min. The proteins were eluted with a linear potassium chloride gradient, 0-250 mM KCl, in 10 mM Tris-HCl, pH 8.6, for 50 min. Eight protein peaks were detected by absorbance at 280 nm.
Bioassays using neonate southern corn rootworm (Diabrotica undecimpunctata howardi, SCR) larvae and tobacco hornworm (Manduca sexta, THW) were performed on all fractions eluted from the HPLC column. THW were grown on Gypsy Moth wheat germ diet (ICN) at 25°C with a 16 hr light/8 hr dark cycle. SCR were grown on Southern Corn Rootworm Larval Insecta-Diet (BioServ) at 25°C with a 16 hr light/8 hr dark cycle.
The highest mortality for SCR and THW larvae was observed for peak 6, which eluted with ca. 112 mM to 132 mM KCl. SDS-PAGE analysis of peak 6 showed predominant peptides of 170 kDa, 66 kDa, 63 kDa, 59.5 kDa and 31 kDa. Western blot analysis was performed on the peak 6 protein fraction with a mixture of polyclonal antibodies made against TcaAii-syn, TcaAiii-syn, TcaBii-syn, TcaC-syn, and TcbAii-syn peptides (described in Example 21) and C5F2, a monoclonal antibody against the TcbAiii peptide. Peak 6 contained immuno-reactive bands of 170 kDa, 90 kDa, 66 kDa, 59.5 kDa and 31 kDa. These are very close to the predicted sizes for TcaC (166 kDa), TcaAii + TcaAiii (92 kDa), TcaAiii (66 kDa), TcaBii (60 kDa) and TcaAii (25 kDa), respectively. Peak 6, which was further analyzed by native agarose gel electrophoresis as described herein, migrated as a single band with similar mobility to that of band 1.
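The WAX elution described above used a linear gradient of 0-250 mM KCl over 50 min, so the KCl concentration at any time into the gradient follows from simple linear interpolation. The sketch below maps the reported peak 6 elution window (ca. 112 to 132 mM KCl) back to gradient time. It assumes an ideal gradient with no delay volume or mixing lag, which the text does not address, and the function names are illustrative.

```python
GRADIENT_START_MM = 0.0    # KCl at the start of the gradient (mM)
GRADIENT_END_MM = 250.0    # KCl at the end of the gradient (mM)
GRADIENT_MINUTES = 50.0    # gradient duration (min)

# Slope of the ideal linear gradient, in mM KCl per minute.
SLOPE_MM_PER_MIN = (GRADIENT_END_MM - GRADIENT_START_MM) / GRADIENT_MINUTES


def kcl_at(minutes):
    """KCl concentration (mM) at a given time into the ideal linear gradient."""
    if not 0 <= minutes <= GRADIENT_MINUTES:
        raise ValueError("time is outside the gradient")
    return GRADIENT_START_MM + SLOPE_MM_PER_MIN * minutes


def minutes_at(kcl_mm):
    """Time into the gradient (min) at which a KCl concentration is reached."""
    if not GRADIENT_START_MM <= kcl_mm <= GRADIENT_END_MM:
        raise ValueError("concentration is outside the gradient")
    return (kcl_mm - GRADIENT_START_MM) / SLOPE_MM_PER_MIN


# Peak 6 eluted at roughly 112-132 mM KCl.
peak6_window_min = (minutes_at(112.0), minutes_at(132.0))
print(f"gradient slope: {SLOPE_MM_PER_MIN} mM/min")
print(f"peak 6 window: {peak6_window_min[0]:.1f} to {peak6_window_min[1]:.1f} min into the gradient")
```

On a real instrument the observed retention times would be shifted by the system delay volume, so these times are only a nominal guide.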
The protein concentration of the purified peak 6 toxin protein was determined using the BCA reagents (Pierce). Dilutions of the protein were made in 10 mM Tris, pH 8.6, and applied to the diet bioassays. After 240 hours all neonate larvae on diet bioassays that received 450 ng or greater of the peak 6 protein fraction were dead. The group of larvae that received 90 ng of the same fraction had 40% mortality. After 240 hrs the survivors that received 90 ng and 20 ng of peak 6 protein fraction were ca. 10% and respectively, of the control weight.
Example 2
Insecticide Utility

The Photorhabdus luminescens utility and toxicity were further characterized. Photorhabdus luminescens (strain W-14) culture broth was produced as follows. The production medium was 2% Bacto Proteose Peptone Number 3 (PP3, Difco Laboratories, Detroit, Michigan) in Milli-Q deionized water. Seed culture flasks consisted of 175 ml medium placed in a 500 ml tribaffled flask with a Delong neck, covered with a Kaput and autoclaved for 20 minutes, T=250°F. Production flasks consisted of 500 mls in a 2.8 liter 500 ml tribaffled flask with a Delong neck, covered by a Shin-etsu silicon foam closure. These were autoclaved for 45 minutes, T=250°F. The seed culture was incubated at 28°C at 150 rpm in a gyrotory shaking incubator with a 2 inch throw. After 16 hours of growth, 1% of the seed culture was placed in the production flask, which was allowed to grow for 24 hours before harvest. Production of the toxin appears to occur during log phase growth. The microbial broth was transferred to a 1L centrifuge bottle and the cellular biomass was pelleted (30 minutes at 2500 RPM at 4°C, about 1600 x g, HG-4L Rotor, RC3 Sorvall centrifuge, Dupont, Wilmington, DE).
The primary broth was chilled at 4°C for 8-16 hours and recentrifuged at least 2 hours (conditions above) to further clarify the broth by removal of a putative mucopolysaccharide which precipitated upon standing. (An alternative processing method combined both steps and involved the use of a 16 hour clarification centrifugation, same conditions as above.) This broth was then stored at 4°C prior to bioassay or filtration.
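The centrifugation conditions above give both a rotor speed (2500 RPM) and an approximate relative centrifugal force (about 1600 x g). The two are linked by the standard conversion RCF = 1.118e-5 x r x RPM^2, with r in centimeters. The rotor radius is not stated in the text, so the sketch below only back-calculates the radius implied by the quoted figures; it is not a specification of the HG-4L rotor.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in RPM


def rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def implied_radius_cm(rcf_g, rpm):
    """Rotor radius (cm) that would give the stated RCF at the stated speed."""
    return rcf_g / (RCF_CONSTANT * rpm ** 2)


# Figures quoted in the text: 2500 RPM, about 1600 x g.
radius = implied_radius_cm(1600, 2500)
print(f"implied radius: {radius:.1f} cm")
print(f"check: {rcf(2500, radius):.0f} x g at 2500 RPM")
```

The same helper can be used to match the quoted force on a different rotor by solving for the speed instead of the radius.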
Photorhabdus culture broth and protein toxin(s) purified from this broth showed activity (mortality and/or growth inhibition, reduced adult emergence) against a number of insects. More specifically, activity was seen against corn rootworm (larvae and adult), Colorado potato beetle, and turf grubs, which are members of the insect order Coleoptera. Other members of the Coleoptera include wireworms, pollen beetles, flea beetles, seed beetles and weevils. Activity has also been observed against aster leafhopper, which is a member of the order Homoptera. Other members of the Homoptera include planthoppers, pear psylla, apple sucker, scale insects, whiteflies, and spittle bugs, as well as numerous host specific aphid species. The broth and purified fractions are also active against beet armyworm, cabbage looper, black cutworm, tobacco budworm, European corn borer, corn earworm, and codling moth, which are members of the order Lepidoptera.
Other typical members of this order are clothes moth, Indian mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm, and fall armyworm. Activity is also seen against fruitfly and mosquito larvae, which are members of the order Diptera. Other members of the order Diptera are pea midge, carrot fly, cabbage root fly, turnip root fly, onion fly, crane fly, house fly, and various mosquito species. Activity is seen against carpenter ant and Argentine ant, which are members of the order Hymenoptera, which also includes fire ants, odorous house ants, and little black ants.
The broth/fraction is useful for reducing populations of insects and was used in a method of inhibiting an insect population. The method may comprise applying to a locus of the insect an effective insect-inactivating amount of the active described. Results are reported in Table 4.
Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth (filter sterilized, cell-free) or purified HPLC fractions were applied directly to the surface (about 1.5 cm²) of 0.25 ml of artificial diet in 30 µl aliquots following dilution in control medium or 10 mM sodium phosphate buffer, pH respectively. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR) hatched from sterilized eggs, with second instar SCR grown on artificial diet, or with second instar Diabrotica virgifera virgifera (Western corn rootworm, WCR) reared on corn seedlings grown in Metromix. Second instar larvae were weighed prior to addition to the diet. The plates were sealed, placed in a humidified growth chamber and maintained at 27°C for the appropriate period (4 days for neonate and adult SCR, 2-5 days for WCR larvae, 7-14 days for second instar SCR). Mortality and weight determinations were scored as indicated. Generally, 16 insects per treatment were used in all studies. Control mortalities were as follows: neonate larvae, adult beetles,

Activity against Colorado potato beetle was tested as follows.
Photorhabdus culture broth or control medium was applied to the surface (about 2.0 cm²) of 1.5 ml of standard artificial diet held in the wells of a 24-well tissue culture plate. Each well received µl of treatment and was allowed to air dry. Individual second instar Colorado potato beetle (Leptinotarsa decemlineata, CPB) larvae were then placed onto the diet and mortality was scored after 4 days. Ten larvae per treatment were used in all studies.
Control mortality was 3.3%.
Activity against Japanese beetle grubs and beetles was tested as follows. Turf grubs (Popillia japonica, 2-3rd instar) were collected from infested lawns and maintained in the laboratory in soil/peat mixture with carrot slices added as additional diet.
Turf beetles were pheromone-trapped locally and maintained in the laboratory in plastic containers with maple leaves as food.
Following application of undiluted Photorhabdus culture broth or control medium to corn rootworm artificial diet (30 µl/1.54 cm², beetles) or carrot slices (larvae), both stages were placed singly in a diet well and observed for any mortality and feeding. In both cases there was a clear reduction in the amount of feeding (and feces production) observed.
Activity against mosquito larvae was tested as follows. The assay was conducted in a 96-well microtiter plate. Each well contained 200 µl of aqueous solution (Photorhabdus culture broth, control medium or H2O) and approximately 20 one-day-old larvae (Aedes aegypti). There were 6 wells per treatment. The results were read at 2 hours after infestation and did not change over the three day observation period. No control mortality was seen.
Activity against fruitflies was tested as follows. Purchased Drosophila melanogaster medium was prepared using 50% dry medium and 50% liquid (water, control medium or Photorhabdus culture broth). This was accomplished by placing 8.0 ml of dry medium in each of 3 rearing vials per treatment and adding 8.0 ml of the appropriate liquid. Ten late instar Drosophila melanogaster maggots were then added to each vial. The vials were held on a laboratory bench, at room temperature, under fluorescent ceiling lights. Pupal or adult counts were made after 3, 7 and 10 days of exposure. Incorporation of Photorhabdus culture broth into the diet media for fruitfly maggots caused a slight but significant reduction in day-10 adult emergence as compared to water and control medium.

Activity against aster leafhopper was tested as follows. The ingestion assay for aster leafhopper (Macrosteles severini) is designed to allow ingestion of the active without other external contact. The reservoir for the active/"food" solution is made by making 2 holes in the center of the bottom portion of a 35 x 10 mm Petri dish. A 2 inch Parafilm M square is placed across the top of the dish and secured with an O-ring. A 1 oz. plastic cup is then infested with approximately 7 leafhoppers and the reservoir is placed on top of the cup, Parafilm down. The test solution is then added to the reservoir through the holes. In tests using undiluted Photorhabdus culture broth, the broth and control medium were dialyzed against water to reduce control mortality. Mortality is reported at day 2, where 26.5% control mortality was seen. In the tests using purified fractions (200 mg protein/ml), a final concentration of 5% sucrose was used in all treatments to improve survivability of the aster leafhoppers. The assay was held in an incubator at 28°C, 70% RH with a 16/8 photoperiod. The assay was graded for mortality at 72 hours. Control mortality was

Activity against Argentine ants was tested as follows. A ml aliquot of 100% Photorhabdus culture broth, control medium or water was pipetted into 2.0 ml clear glass vials. The vials were plugged with a piece of cotton dental wick that was moistened with the appropriate treatment. Each vial was placed into a separate 60 x 16 mm Petri dish with 8 to 12 adult Argentine ants (Linepithema humile). There were three replicates per treatment. Bioassay plates were held on a laboratory bench, at room temperature, under fluorescent ceiling lights. Mortality readings were made after days of exposure. Control mortality was 24%.
Activity against carpenter ant was tested as follows. Black carpenter ant workers (Camponotus pennsylvanicus) were collected from trees on DowElanco property in Indianapolis, IN. Tests with Photorhabdus culture broth were performed as follows. Each plastic bioassay container (7 1/8" x ) held fifteen workers, a paper harborage and 10 ml of broth or control media in a plastic shot glass. A cotton wick delivered the treatment to the ants through a hole in the shot glass lid. All treatments contained 5% sucrose. Bioassays were held in the dark at room temperature and graded at 19 days. Control mortality was

Assays delivering purified fractions utilized artificial ant diet mixed with the treatment (purified fraction or control solution) at a rate of 0.2 ml treatment/2.0 g diet in a plastic test tube. The final protein concentration of the purified fraction was less than 10 µg/g diet. Ten ants per treatment, a water source, harborage and the treated diet were placed in sealed plastic containers and maintained in the dark at 27°C in a humidified incubator. Mortality was scored at day 10. No control mortality was seen.
Activity against various lepidopteran larvae was tested as follows. Photorhabdus culture broth or purified fractions were applied directly to the surface (about 1.5 cm²) of 0.25 ml of standard artificial diet in 30 µl aliquots following dilution in control medium or 10 mM sodium phosphate buffer, pH respectively. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with a single neonate larva. European corn borer (Ostrinia nubilalis) and corn earworm (Helicoverpa zea) eggs were supplied from commercial sources and hatched in-house, whereas beet armyworm (Spodoptera exigua), cabbage looper (Trichoplusia ni), tobacco budworm (Heliothis virescens), codling moth (Laspeyresia pomonella) and black cutworm (Agrotis ipsilon) larvae were supplied internally. Following infestation with larvae, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period. Mortality and weight determinations were scored at days 5-7 for Photorhabdus culture broth and days 4-7 for the purified fraction. Generally, 16 insects per treatment were used in all studies. Control mortality ranged from 4-12.5% for control medium and was less than 10% for phosphate buffer.
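The diet-overlay protocol above fixes the applied volume (30 µl) and the treated surface (about 1.5 cm²), so the volume per unit area and the protein delivered per well follow directly once a sample concentration is known. The sketch below illustrates the arithmetic. The stock concentration used in the example is hypothetical and chosen only for illustration; it is not a value from the text.

```python
def surface_dose_ul_per_cm2(volume_ul, area_cm2):
    """Volume of treatment applied per unit of diet surface (µl/cm²)."""
    if area_cm2 <= 0:
        raise ValueError("area must be positive")
    return volume_ul / area_cm2


def protein_per_well_ng(volume_ul, stock_ng_per_ul):
    """Protein delivered to one well (ng) for a given stock concentration."""
    return volume_ul * stock_ng_per_ul


# Values from the protocol: 30 µl applied to about 1.5 cm² of diet surface.
overlay_ul_per_cm2 = surface_dose_ul_per_cm2(30.0, 1.5)

# HYPOTHETICAL stock concentration (ng/µl), for illustration only.
example_stock_ng_per_ul = 15.0
example_dose_ng = protein_per_well_ng(30.0, example_stock_ng_per_ul)

print(f"overlay: {overlay_ul_per_cm2} µl/cm²")
print(f"dose per well at {example_stock_ng_per_ul} ng/µl: {example_dose_ng} ng")
```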
Table 4
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth and Purified Toxin Fraction on Mortality and Growth Inhibition of Different Insect Orders/Species

                                   Broth                Purified Fraction
Insect Order/Species               % Mort.   % G.I.     % Mort.   % G.I.

COLEOPTERA
  Corn Rootworm
    Southern/neonate larva         100       na         100       na
    Southern/2nd instar            na        38.5       nt        nt
    Southern/adult                 45        nt         nt        nt
    Western/2nd instar             na        35         nt        nt
  Colorado Potato Beetle
    2nd instar                     93        nt         nt        nt
  Turf Grub
    3rd instar                     na        a.f.       nt        nt
    adult                          na        a.f.       nt        nt
DIPTERA
  Fruit Fly (adult emergence)      17        nt         nt        nt
  Mosquito larvae                  100       na         nt        nt
HOMOPTERA
  Aster Leafhopper                 96.5      na         100       na
HYMENOPTERA
  Argentine Ant                    75        na         nt        na
  Carpenter Ant                    71        na         100       na
LEPIDOPTERA
  Beet Armyworm                    12.5      36         18.75     41.4
  Black Cutworm                    nt        nt         0         71.2
  Cabbage Looper                   nt        nt         21.9      66.8
  Codling Moth                     nt        nt         6.25      45.9
  Corn Earworm                     56.3      94.2       97.9      na
  European Corn Borer              96.7      98.4       100       na
  Tobacco Budworm                  13.5      52.5       19.4      85.6

Mort. = mortality, G.I. = growth inhibition, na = not applicable, nt = not tested, a.f. = anti-feedant

Insecticide Utility upon Soil Application

Photorhabdus luminescens (strain W-14) culture broth was shown to be active against corn rootworm when applied directly to soil or a soil-mix (Metromix). Activity against neonate SCR and WCR in Metromix was tested as follows (Table 5). The test was run using corn seedlings (United Agriseeds brand CL614) that were germinated in the light on moist filter paper for 6 days. After roots were approximately 3-6 cm long, a single kernel/seedling was planted in a 591 ml clear plastic cup with 50 gm of dry Metromix. Twenty neonate SCR or WCR were then placed directly on the roots of the seedling and covered with Metromix. Upon infestation, the seedlings were then drenched with 50 ml total volume of a diluted broth solution. After drenching, the cups were sealed and left at room temperature in the light for 7 days. Afterwards, the seedlings were washed to remove all Metromix and the roots were excised and weighed. Activity was rated as the percentage of corn root remaining relative to the control plants and as leaf damage induced by feeding. Leaf damage was scored visually and rated as either or with representing no damage and representing severe damage.
Activity against neonate SCR in soil was tested as follows (Table 6). The test was run using corn seedlings (United Agriseeds brand CL614) that were germinated in the light on moist filter paper for 6 days. After the roots were approximately 3-6 cm long, a single kernel/seedling was planted in a 591 ml clear plastic cup with 150 gm of soil from a field in Lebanon, IN planted the previous year with corn. This soil had not been previously treated with insecticides. Twenty neonate SCR were then placed directly on the roots of the seedling and covered with soil. After infestation, the seedlings were drenched with 50 ml total volume of a diluted broth solution. After drenching, the unsealed cups were incubated in a high relative humidity chamber at 78°F. Afterwards, the seedlings were washed to remove all soil and the roots were excised and weighed. Activity was rated as the percentage of corn root remaining relative to the control plants and as leaf damage induced by feeding. Leaf damage was scored visually and rated as either or with representing no damage and representing severe damage.
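The root-weight rating described above is a simple ratio of the root weight of infested plants to that of the matching uninfested control. The sketch below applies it to mean root weights that appear in Table 5. Because the table is flattened in this copy, the pairing of each infested value with its control is an assumption, although the resulting percentages are close to the ones printed in the table.

```python
def percent_of_control(treated_g, control_g):
    """Root weight of infested plants as a percentage of the matching control."""
    if control_g <= 0:
        raise ValueError("control weight must be positive")
    return 100.0 * treated_g / control_g


# Mean root weights (g) read from Table 5, southern corn rootworm.
# ASSUMED pairing: (uninfested control, infested) for each drench treatment.
water_control_g, water_infested_g = 0.4916, 0.1410
broth_control_g, broth_infested_g = 0.4641, 0.4830

water_pct = round(percent_of_control(water_infested_g, water_control_g), 1)
broth_pct = round(percent_of_control(broth_infested_g, broth_control_g))

print(f"water drench: {water_pct}% of control root weight")
print(f"broth drench: {broth_pct}% of control root weight")
```

Read this way, roots of water-drenched infested plants kept under a third of the control weight, while broth-drenched infested plants kept essentially all of it.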
Table 5
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on Rootworm Larvae after Post-Infestation Drenching (Metromix)

Treatment               Larvae   Leaf Damage   Root Weight (g)    % of Control

Southern Corn Rootworm
  Water                                        0.4916 ± 0.023     100
  Medium ( v/v)                                0.4416 ± 0.029     100
  Broth (6.25% v/v)                            0.4641 ± 0.081     100
  Water                                        0.1410 ± 0.006     28.7
  Media ( v/v)                                 0.1345 ± 0.028     30.4
  Broth (1.56% v/v)                            0.4830 ± 0.031     104
Western Corn Rootworm
  Water                                        0.4446 ± 0.019     100
  Broth ( v/v)                                 0.4069 ± 0.026     100
  Water                                        0.2202 ± 0.015     49
  Broth ( v/v)                                 0.3879 ± 0.013

Table 6
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on Southern Corn Rootworm Larvae after Post-Infestation Drenching

Treatment               Larvae   Leaf Damage   Root Weight (g)

  Water                                        0.2148 ± 0.014
  Broth (50% v/v)                              0.2260 ± 0.016
  Water                                        0.0916 ± 0.009
  Broth (50% v/v)                              0.2428 ± 0.032

Activity of Photorhabdus luminescens (strain W-14) culture broth against second instar turf grubs in Metromix was observed in tests conducted as follows (Table 7). Approximately 50 gm of dry Metromix was added to a 591 ml clear plastic cup. The Metromix was then drenched with 50 ml total volume of a 50% diluted Photorhabdus broth solution. The dilution of crude broth was made with water, with 50% broth being prepared by adding 25 ml of crude broth to 25 ml of water for 50 ml total volume. A 1% (w/v) solution of proteose peptone #3 (PP3), which is a 50% dilution of the normal media concentration, was used as a broth control. After drenching, five second instar turf grubs were placed on the top of the moistened Metromix. Healthy turf grub larvae burrowed rapidly into the Metromix. Those larvae that did not burrow within 1 h were removed and replaced with fresh larvae. The cups were sealed and placed in a 28°C incubator, in the dark. After seven days, larvae were removed from the Metromix and scored for mortality. Activity was rated as the percentage of mortality relative to control.
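The turf grub results are reported both as counts and as percentages, and the conversion is straightforward. The sketch below uses the counts reported in Table 7 and treats each ratio as dead larvae over total larvae scored. That reading is an assumption: the table footnote describes the ratio as dead/living, but 7/15 and 12/19 reproduce the printed 47 and 63 only when the second number is the total.

```python
def percent_mortality(dead, total):
    """Mortality as a percentage of the larvae scored."""
    if total <= 0 or not 0 <= dead <= total:
        raise ValueError("counts must satisfy 0 <= dead <= total and total > 0")
    return 100.0 * dead / total


# Counts from Table 7, ASSUMED to be (dead, total scored) per treatment.
TABLE7_COUNTS = {
    "Water": (7, 15),
    "Control medium": (12, 19),
    "Broth (50% v/v)": (17, 20),
}

mortality_pct = {
    treatment: round(percent_mortality(dead, total))
    for treatment, (dead, total) in TABLE7_COUNTS.items()
}

for treatment, pct in mortality_pct.items():
    print(f"{treatment}: {pct}%")
```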
Table 7
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on Turf Grub after Pre-Infestation Drenching (Metromix)

Treatment                   Mortality*   % Mortality

Water                       7/15         47
Control medium (1% w/v)     12/19        63
Broth (50% v/v)             17/20

*expressed as a ratio of dead/living larvae

Example 4
Insecticide Utility upon Leaf Application

Activity of Photorhabdus broth against European corn borer was seen when the broth was applied directly to the surface of maize leaves (Table 8). In these assays Photorhabdus broth was diluted 100-fold with culture medium and applied manually to the surface of excised maize leaves at a rate of about 6.0 µl/cm² of leaf surface. The leaves were air dried and cut into equal sized strips approximately 2 x 2 inches. The leaves were rolled, secured with paper clips and placed in 1 oz plastic shot glasses with 0.25 inch of 2% agar on the bottom surface to provide moisture. Twelve neonate European corn borers were then placed onto the rolled leaf and the cup was sealed. After incubation for 5 days at 27°C in the dark, the samples were scored for feeding damage and recovered larvae.
Table 8
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on European Corn Borer Larvae Following Pre-Infestation Application to Excised Maize Leaves

Treatment          Leaf Damage   Larvae Recovered   Weight (mg)

Water              Extensive     55/120             0.42 mg
Control Medium     Extensive     40/120             0.50 mg
Broth ( v/v)       Trace         3/120              0.15 mg

Activity of the culture broth against neonate tobacco budworm (Heliothis virescens) was demonstrated using a leaf dip methodology. Fresh cotton leaves were excised from the plant and leaf disks were cut with an 18.5 mm cork-borer. The disks were individually immersed in control medium (PP3) or Photorhabdus luminescens (strain W-14) culture broth which had been concentrated approximately 10-fold using an Amicon (Beverly, MA) Proflux M12 tangential filtration system with a 10 kDa filter. Excess liquid was removed and a straightened paper clip was placed through the center of the disk. The paper clip was then wedged into a plastic, oz shot glass containing approximately 2.0 ml of 1% agar. This served to suspend the leaf disk above the agar. Following drying of the leaf disk, a single neonate tobacco budworm larva was placed on the disk and the cup was capped. The cups were then sealed in a plastic bag and placed in a darkened, 27°C incubator for 5 days. At this time the remaining larvae and leaf material were weighed to establish a measure of leaf damage (Table 9).
Table 9
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on Tobacco Budworm Neonates in a Cotton-Leaf Dip Assay

                        Final Weights (mg)
Treatment               Leaf Disk      Larvae

Control leaves          55.7 ± 1.3     na*
Control Medium          34.0 ± 2.9     4.3 ± 0.91
Photorhabdus broth      54.3 ± 1.4     0.0**

* not applicable
** no live larvae found

Example 5, Part A
Characterization of Toxin Peptide Components

In a subsequent analysis, the toxin protein subunits of the bands isolated as in Example 1 were resolved on a 7% SDS polyacrylamide electrophoresis gel with a ratio of 30:0.8 (acrylamide:BIS-acrylamide). This gel matrix facilitates better resolution of the larger proteins. The gel system used to estimate the Band 1 and Band 2 subunit molecular weights in Example 1 was an 18% gel with a ratio of 38:0.18 (acrylamide:BIS-acrylamide), which allowed for a broader range of size separation, but less resolution of higher molecular weight components.
In this analysis, 10, rather than 8, protein bands were resolved. Table 10 reports the calculated molecular weights of the 10 resolved bands, and directly compares the molecular weights estimated under these conditions to those of the prior example. It is not surprising that additional bands were detected under the different separation conditions used in this example. Variations between the prior and new estimates of molecular weight are also to be expected given the differences in analytical conditions. In the analysis of this example, it is thought that the higher molecular weight estimates are more accurate than in Example 1, as a result of improved resolution. However, these are estimates based on SDS PAGE analysis, which is typically not analytically precise, and the peptides may have been further altered by post- and co-translational modifications.
Amino acid sequences were determined for the N-terminal portions of five of the 10 resolved peptides. Table 10 correlates the molecular weight of the proteins and the identified sequences. In SEQ ID NO:2, certain analyses suggest that the proline at residue 5 may be an asparagine (asn). In SEQ ID NO:3, certain analyses suggest that the amino acid residues at positions 13 and 14 are both arginine (arg). In SEQ ID NO:4, certain analyses suggest that the amino acid residue at position 6 may be either alanine (ala) or serine (ser). In SEQ ID certain analyses suggest that the amino acid residue at position 3 may be aspartic acid (asp).
Table 10

ESTIMATE    NEW ESTIMATE*    SEQ. LISTING

208         200.2 kDa        SEQ ID NO:1
184         175.0 kDa        SEQ ID NO:2
65.6        68.1 kDa         SEQ ID NO:3
60.8        65.1 kDa         SEQ ID NO:4
56.2        58.3 kDa         SEQ ID
25.1        23.2 kDa         SEQ ID

*New estimates are based on SDS PAGE and are not based on gene sequences. SDS PAGE is not analytically precise.
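The size of the shift between the earlier and the new SDS PAGE estimates can be summarized as a percent change for each peptide. The sketch below does this for the six pairs of values in the table above; the list and function names are illustrative only.

```python
# (earlier estimate, new estimate) in kDa, as paired in the table above.
ESTIMATE_PAIRS_KDA = [
    (208, 200.2),
    (184, 175.0),
    (65.6, 68.1),
    (60.8, 65.1),
    (56.2, 58.3),
    (25.1, 23.2),
]


def percent_change(old_kda, new_kda):
    """Signed change of the new estimate relative to the earlier one (%)."""
    return 100.0 * (new_kda - old_kda) / old_kda


changes = [round(percent_change(old, new), 1) for old, new in ESTIMATE_PAIRS_KDA]

for (old, new), change in zip(ESTIMATE_PAIRS_KDA, changes):
    print(f"{old} -> {new} kDa: {change:+.1f}%")

largest_shift = max(abs(c) for c in changes)
print(f"largest shift: {largest_shift}%")
```

All six shifts stay below 10%, which is in line with the statement that SDS PAGE estimates vary with the analytical conditions.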
Example 5, Part B
Characterization of Toxin Peptide Components

New N-terminal sequence, SEQ ID NO:15, Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr, was obtained by further N-terminal sequencing of peptides isolated from native HPLC-purified toxin as described in Example 5, Part A, above. This peptide comes from the tcaA gene. The peptide, labeled TcaAii, starts at position 254 and goes to position 491, where the TcaAiii peptide starts (SEQ ID NO:4). The estimated size of the peptide based on the gene sequence is 25,240 Da.
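The positions quoted for TcaAii allow a quick plausibility check of the 25,240 Da figure. The sketch below assumes that the positions are amino acid residue numbers, that TcaAii ends at the residue just before position 491, and that an average residue mass of about 110 Da applies. The 110 Da figure is a common rule of thumb, not a value from the text.

```python
AVERAGE_RESIDUE_DA = 110.0      # rule-of-thumb average residue mass (assumption)
STATED_SIZE_DA = 25240          # size estimated from the gene sequence (from the text)


def peptide_length(start, next_peptide_start):
    """Residue count, ASSUMING the peptide ends just before the next one starts."""
    if next_peptide_start <= start:
        raise ValueError("next peptide must start after this one")
    return next_peptide_start - start


def approx_mass_da(n_residues):
    """Approximate peptide mass from residue count and an average residue mass."""
    return n_residues * AVERAGE_RESIDUE_DA


tcaaii_length = peptide_length(254, 491)
tcaaii_approx_da = approx_mass_da(tcaaii_length)

print(f"TcaAii length: {tcaaii_length} residues")
print(f"approximate mass: {tcaaii_approx_da:.0f} Da (stated: {STATED_SIZE_DA} Da)")
print(f"approximate / stated: {tcaaii_approx_da / STATED_SIZE_DA:.2f}")
```

The rule-of-thumb mass lands within a few percent of the stated size, which is about as close as an average residue mass can be expected to get.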
Example 6
Characterization of Toxin Peptide Components

In yet another analysis, the toxin protein complex was reisolated from the Photorhabdus luminescens growth medium (after culture without Tween) by performing a 10%-80% ammonium sulfate precipitation followed by an ion exchange chromatography step (Mono Q) and two molecular sizing chromatography steps. These conditions were like those used in Example 1. During the first molecular sizing step, a second biologically active peak was found at about 100 ± 10 kDa. Based upon protein measurements, this fraction was 50 fold less active than the larger, or primary, active peak of about 860 ± 100 kDa (native). During this isolation experiment, a smaller active peak of about 325 ± 50 kDa that retained a considerable portion of the starting biological activity was also resolved. It is thought that the 325 kDa peak is related to or derived from the 860 kDa peak.
A 56 kDa protein was resolved in this analysis. The N-terminal sequence of this protein is presented in SEQ ID NO:6. It is noteworthy that this protein shares significant identity and conservation with SEQ ID NO:5 at the N-terminus, suggesting that the two may be encoded by separate members of a gene family and that the proteins produced by each gene are sufficiently similar to both be operable in the insecticidal toxin complex.
A second, prominent 185 kDa protein was consistently present in amounts comparable to that of protein 3 from Table 10, and may be the same protein or protein fragment. The N-terminal sequence of this 185 kDa protein is shown at SEQ ID NO:7.
Additional N-terminal amino acid sequence data were also obtained from isolated proteins. None of the determined N-terminal sequences appear identical to a protein identified in Table. Other proteins were present in the isolated preparation. One such protein has an estimated molecular weight of 108 kDa and an N-terminal sequence as shown in SEQ ID NO:8. A second such protein has an estimated molecular weight of 80 kDa and an N-terminal sequence as shown in SEQ ID NO:9.
When the protein material in the approximately 325 kDa active peak was analyzed by size, bands of approximately 51, 31, 28, and 22 kDa were observed. As in all cases in which a molecular weight was determined by analysis of electrophoretic mobility, these molecular weights were subject to error effects introduced by buffer ionic strength differences, electrophoresis power differences, and the like. One of ordinary skill would understand that definitive molecular weight values cannot be determined using these standard methods and that each was subject to variation. It was hypothesized that proteins of these sizes are degradation products of the larger protein species (of approximately 200 kDa size) that were observed in the larger primary toxin complex.
Finally, several preparations included a protein having the N-terminal sequence shown in SEQ ID NO:10. This sequence was strongly homologous to known chaperonin proteins, accessory proteins known to function in the assembly of large protein complexes. Although the applicants could not ascribe such an assembly function to the protein identified in SEQ ID NO:10, it was consistent with the existence of the described toxin protein complex that such a chaperonin protein could be involved in its assembly. Moreover, although such proteins have not directly been suggested to have toxic activity, this protein may be important to determining the overall structural nature of the protein toxin, and thus may contribute to the toxic activity or durability of the complex in vivo after oral delivery.
Subsequent analysis of the stability of the protein toxin complex to proteinase K was undertaken. It was determined that after 24 hour incubation of the complex in the presence of a fold molar excess of proteinase K, activity was virtually eliminated (mortality on oral application dropped to about These data confirm the proteinaceous nature of the toxin.
The toxic activity was also retained by a dialysis membrane, again confirming the large size of the native toxin complex.
Example 7
Isolation, Characterization and Partial Amino Acid Sequencing of Photorhabdus Toxins
Isolation and N-Terminal Amino Acid Sequencing
In a set of experiments conducted in parallel to Examples and 6, ammonium sulfate precipitation of Photorhabdus proteins was performed by adjusting Photorhabdus broth, typically 2-3 liters, to a final concentration of either 10% or 20% by the slow addition of ammonium sulfate crystals. After stirring for 1 hour at 4°C, the material was centrifuged at 12,000 x g for 30 minutes. The supernatant was adjusted to 80% ammonium sulfate, stirred at 4°C for 1 hour, and centrifuged at 12,000 x g for 60 minutes. The pellet was resuspended in one-tenth the volume of 10 mM Na2PO4 pH and dialyzed against the same phosphate buffer overnight at 4°C. The dialyzed material was centrifuged at 12,000 x g for 1 hour prior to ion exchange chromatography.
An HR 16/50 Q Sepharose (Pharmacia) anion exchange column was equilibrated with 10 mM Na2PO4 pH 7.0. Centrifuged, dialyzed ammonium sulfate pellet was applied to the Q Sepharose column at a rate of 1.5 ml/min and washed extensively at 3.0 ml/min with equilibration buffer until the optical density (OD 280) reached less than 0.100. Next, either a 60 minute NaCl gradient ranging from 0 to 0.5 M at 3 ml/min, or a series of step elutions using 0.1 M, 0.4 M and finally 1.0 M NaCl for 60 minutes each, was applied to the column. Fractions were pooled and concentrated using a Centriprep 100. Alternatively, proteins could be eluted by a single 0.4 M NaCl wash without prior elution with 0.1 M NaCl.
Two milliliter aliquots of concentrated Q Sepharose samples were loaded at 0.5 ml/min onto an HR 16/50 Superose 12 (Pharmacia) gel filtration column equilibrated with 10 mM Na2PO4 pH 7.0. The column was washed with the same buffer for 240 min at 0.5 ml/min and 2 min samples were collected. The void volume material was collected and concentrated using a Centriprep 100. Two milliliter aliquots of concentrated Superose 12 samples were loaded at ml/min onto an HR 16/50 Sepharose 4B-CL (Pharmacia) gel filtration column equilibrated with 10 mM Na2PO4 pH 7.0. The column was washed with the same buffer for 240 min at 0.5 ml/min and 2 min samples were collected.
The excluded protein peak was subjected to a second fractionation by application to a gel filtration column that used a Sepharose CL-4B resin, which separates proteins ranging from about 30 kDa to 1000 kDa. This fraction was resolved into two peaks: a minor peak at the void volume (>1000 kDa) and a major peak which eluted at an apparent molecular weight of about 860 kDa. Over a one week period, subsequent samples subjected to gel filtration showed the gradual appearance of a third peak (approximately 325 kDa) that seemed to arise from the major peak, perhaps by limited proteolysis. Bioassays performed on the three peaks showed that the void peak had no activity, while the 860 kDa toxin complex fraction was highly active, and the 325 kDa peak was less active, although quite potent. SDS PAGE analysis of Sepharose CL-4B toxin complex peaks from different fermentation productions revealed two distinct peptide patterns, denoted and The two patterns had marked differences in the molecular weights and concentrations of peptide components in their fractions. The pattern, produced most frequently, had 4 high molecular weight peptides (>150 kDa) while the pattern had 3 high molecular weight peptides. In addition, the peptide fraction was found to have 2-3 fold more activity against European Corn Borer. This shift may be related to variations in protein expression due to age of inoculum and/or other factors based on growth parameters of aged cultures.
Milligram quantities of peak toxin complex fractions determined to be or peptide patterns were subjected to preparative SDS PAGE, and transblotted with TRIS-glycine (Seprabuff™) to PVDF membranes (ProBlott™, Applied Biosystems) for 3-4 hours. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. Three peptides in the pattern had unique N-terminal amino acid sequences compared to the sequences identified in the previous example. A 201 kDa (TcdAii) peptide set forth as SEQ ID NO:13 below shared 33% amino acid identity and 50% similarity (similarity and identity were calculated by hand) with SEQ ID NO:1 (TcbAii) (in Table 10 vertical lines denote amino acid identities and colons indicate conservative amino acid substitutions). A second peptide of 197 kDa, SEQ ID NO:14 (TcdB), had 42% identity and 58% similarity with SEQ ID NO:2 (TcaC) (similarity and identity were calculated by hand). Yet a third peptide of 205 kDa was denoted TcdAii. In addition, a limited N-terminal amino acid sequence, SEQ ID NO:16 (TcbA), of a peptide of at least 235 kDa was identical with the amino acid sequence, SEQ ID NO:12, deduced from a cloned gene (tcbA), SEQ ID NO:11, containing a deduced amino acid sequence corresponding to SEQ ID NO:1 (TcbAii). This indicates that the larger 235+ kDa peptide was proteolytically processed to the 201 kDa peptide (TcbAii) (SEQ ID NO:1) during fermentation, possibly resulting in activation of the molecule. In yet another sequence, the sequence originally reported as SEQ ID NO:5 (TcaBii) in Example 5 above was found to contain an aspartic acid residue (Asp) at the third position rather than glycine (Gly), and two additional amino acids, Gly and Asp, at the eighth and ninth positions, respectively. In yet two other sequences, SEQ ID NO:2 (TcaC) and SEQ ID NO:3 (TcaBi), additional amino acid sequence was obtained.
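The hand-calculated identity figures can be checked with a few lines of code. The following is a minimal sketch, assuming a simple ungapped, position-by-position comparison of the two 12-residue sequences listed for the 197 kDa peptide in Table 11; the function name is illustrative and not part of the original disclosure.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions carrying the same residue (no gaps)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("ungapped comparison needs equal-length sequences")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# One-letter sequences as listed in Table 11.
seq_id_14 = "MQNSQTFSVGEL"  # 197 kDa peptide (TcdB)
seq_id_2 = "MQDSPEVSITTL"   # TcaC

print(round(percent_identity(seq_id_14, seq_id_2)))  # 5 of 12 positions match -> 42
```

The 201 kDa comparison involves sequences of unequal length, so it would need a gapped alignment and is not covered by this sketch.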
Densitometric quantitation was performed using a sample that was identical to the preparation sent for N-terminal analysis. This analysis showed that the 201 kDa and 197 kDa peptides represent 7.0% and respectively, of the total Coomassie brilliant blue stained protein in the pattern and are present in amounts similar to the other abundant peptides. It was speculated that these peptides may represent protein homologs, analogous to the situation found with other bacterial toxins, such as various CryI Bt toxins. These proteins vary from 40-90% similarity at their N-terminal amino acid sequence, which encompasses the toxic fragment.
Internal Amino Acid Sequencing
To facilitate cloning of toxin peptide genes, internal amino acid sequences of selected peptides were obtained as follows.
Milligram quantities of peak 2A fractions determined to be or peptide patterns were subjected to preparative SDS PAGE, and transblotted with TRIS-glycine (Seprabuff™) to PVDF membranes (ProBlott™, Applied Biosystems) for 3-4 hours. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. Three peptides, referred to as TcbAii (containing SEQ ID NO:1), TcdAii, and TcaBi (containing SEQ ID NO:3) were subjected to trypsin digestion by Harvard MicroChem followed by HPLC chromatography to separate individual peptides. N-terminal amino acid analysis was performed on selected tryptic peptide fragments. Two internal peptides were sequenced for the peptide TcdAii (205 kDa peptide), referred to as TcdAii-PT111 (SEQ ID NO:17) and TcdAii-PT79 (SEQ ID NO:18). Two internal peptides were sequenced for the peptide TcaBi (68 kDa peptide), referred to as TcaBi-PT158 (SEQ ID NO:19) and TcaBi-PT108 (SEQ ID NO:20). Four internal peptides were sequenced for the peptide TcbAii (201 kDa peptide), referred to as TcbAii-PT103 (SEQ ID NO:21), TcbAii-PT56 (SEQ ID NO:22), TcbAii-PT81(a) (SEQ ID NO:23), and TcbAii-PT81(b) (SEQ ID NO:24).
Table 11
N-Terminal Amino Acid Sequences
(similarity and identity were calculated by hand)
201 kDa (33% identity, 50% similarity to SEQ ID NO:1)
L I G Y N N Q F S G A      SEQ ID NO:13
F I Q G Y S D L F G N A    SEQ ID NO:1
197 kDa (42% identity, 58% similarity to SEQ ID NO:2)
M Q N S Q T F S V G E L    SEQ ID NO:14
M Q D S P E V S I T T L    SEQ ID NO:2
Example 8
Construction of a Cosmid Library of Photorhabdus luminescens W-14 Genomic DNA and its Screening to Isolate Genes Encoding Peptides Comprising the Toxic Protein Preparation
As a prerequisite for the production of Photorhabdus insect toxic proteins in heterologous hosts, and for other uses, it is necessary to isolate and characterize the genes that encode those peptides. This objective was pursued in parallel. One approach, described later, was based on the use of monoclonal and polyclonal antibodies raised against the purified toxin, which were then used to isolate clones from an expression library. The other approach, described in this example, is based on the use of the N-terminal and internal amino acid sequence data to design degenerate oligonucleotides for use in PCR amplification. Either method can be used to identify DNA clones that contain the peptide-encoding genes so as to permit the isolation of the respective genes and the determination of their DNA base sequence.
Genomic DNA Isolation
Photorhabdus luminescens strain W-14 (ATCC accession number 55397) was grown on 2% proteose peptone #3 agar (Difco Laboratories, Detroit, MI) and insecticidal toxin competence was maintained by repeated bioassay after passage, using the method described in Example 1 above. A 50 ml shake culture was produced in a 175 ml baffled flask in 2% proteose peptone #3 medium, grown at 28°C and 150 rpm for approximately 24 hours. 15 ml of this culture was pelleted and frozen in its medium at -20°C until it was thawed for DNA isolation. The thawed culture was centrifuged (700 x g, 30 min) and the floating orange mucopolysaccharide material was removed. The remaining cell material was centrifuged (25,000 x g, 15 min) to pellet the bacterial cells, and the medium was removed and discarded.
Genomic DNA was isolated by an adaptation of the CTAB method described in section 2.4.1 of Current Protocols in Molecular Biology (Ausubel et al. eds, John Wiley & Sons, 1994) [modified to include a salt shock and with all volumes increased 10-fold]. The pelleted bacterial cells were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml, then 12 ml of M NaCl was added; this mixture was centrifuged 20 min at 15,000 x g. The pellet was resuspended in 5.7 ml TE, and 300 µl of 10% SDS and 60 µl of 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY; in sterile distilled water) were added to the suspension. This mixture was incubated at 37°C for 1 hr; then approximately 10 mg lysozyme (Worthington Biochemical Corp., Freehold, NJ) was added. After an additional 45 min, 1 ml of 5 M NaCl and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were added. This preparation was incubated 10 min at 65°C, then gently agitated and further incubated and agitated for approximately 20 min to assist clearing of the cellular material.
An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed gently and centrifuged. After two extractions with an equal volume of PCI (phenol/chloroform/isoamyl alcohol; 50:49:1, v/v/v; equilibrated with 1 M Tris-HCl, pH Intermountain Scientific Corporation, Kaysville, UT), the DNA was precipitated with 0.6 volume of isopropanol. The DNA precipitate was gently removed with a glass rod, washed twice with 70% ethanol, dried, and dissolved in 2 ml STE (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 1 mM EDTA). This preparation contained 2.5 mg/ml DNA, as determined by optical density at 260 nm (OD260).
The molecular size range of the isolated genomic DNA was evaluated for suitability for library construction. CHEF gel analysis was performed in 1.5% agarose (Seakem LE, FMC BioProducts, Rockland, ME) gels with 0.5X TBE buffer (44.5 mM Tris-HCl pH 44.5 mM H3BO3, 1 mM EDTA) on a BioRad CHEF-DR II apparatus with a Pulsewave 760 Switcher (Bio-Rad Laboratories, Inc., Richmond, CA). The running parameters were: initial A time, 3 sec; final A time, 12 sec; 200 volts; running temperature, 4-18°C; run time, 16.5 hr. Ethidium bromide staining and examination of the gel under ultraviolet light indicated the DNA ranged from 30-250 kbp in size.
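The spectrophotometric DNA estimate mentioned in the genomic DNA isolation can be illustrated with a short calculation. This is a sketch only: it assumes the common approximation that one absorbance unit at 260 nm corresponds to about 50 µg/ml of double-stranded DNA, and the absorbance reading and dilution factor in the usage line are hypothetical values chosen for illustration, not figures from the experiment.

```python
# Common approximation: A260 of 1.0 corresponds to ~50 µg/ml double-stranded DNA.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_conc_mg_per_ml(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate dsDNA concentration (mg/ml) of the undiluted sample."""
    ug_per_ml = a260 * dilution_factor * DSDNA_UG_PER_ML_PER_A260
    return ug_per_ml / 1000.0


# Hypothetical reading: A260 = 0.5 measured on a 1:100 dilution.
print(dsdna_conc_mg_per_ml(0.5, dilution_factor=100))
```

A reading of this kind would be consistent with a stock in the low mg/ml range, such as the 2.5 mg/ml reported for this preparation.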
Construction of Library
A partial Sau3A I digest was made of this Photorhabdus genomic DNA preparation. The method was based on section 3.1.3 of Ausubel (supra.). Adaptations included running smaller scale reactions under various conditions until nearly optimal results were achieved. Several scaled-up large reactions with varied conditions were run, the results analyzed on CHEF gels, and only the best large scale preparation was carried forward. In the optimal case, 200 µg of Photorhabdus genomic DNA was incubated with 1.5 units of Sau3A I (New England Biolabs, "NEB", Beverly, MA) for 15 min at 37°C in 2 ml total volume of 1X NEB 4 buffer (supplied as 10X by the manufacturer). The reaction was stopped by adding 2 ml of PCI and centrifuging at 8000 x g for 10 min. To the supernatant were added 200 µl of 5 M NaCl plus 6 ml of ice-cold ethanol. This preparation was chilled for 30 min at -20°C, then centrifuged at 12,000 x g for 15 min. The supernatant was removed and the precipitate was dried in a vacuum oven at 40°C, then resuspended in 400 µl STE.
Spectrophotometric assay indicated about 40% recovery of the input DNA. The digested DNA was size fractionated on a sucrose gradient according to section 5.3.2 of CPMB (op. cit.). A 10% to 40% (w/v) linear sucrose gradient was prepared with a gradient maker in Ultra-Clear™ tubes (Beckman Instruments, Inc., Palo Alto, CA) and the DNA sample was layered on top. After centrifugation (26,000 rpm, 17 hr, Beckman SW41 rotor, 20°C), fractions (about 750 µl) were drawn from the top of the gradient and analyzed by CHEF gel electrophoresis (as described earlier). Fractions containing Sau3A I fragments in the size range 20-40 kbp were selected and DNA was precipitated by a modification (amounts of all solutions increased approximately 6.3-fold) of the method in section 5.3.3 of Ausubel (supra.). After overnight precipitation, the DNA was collected by centrifugation (17,000 x g, 15 min), dried, redissolved in TE, pooled into a final volume of 80 µl, and reprecipitated with the addition of 8 µl 3 M sodium acetate and 220 µl ethanol. The pellet collected by centrifugation as above was resuspended in 12 µl TE. Concentration of the DNA was determined by Hoechst 33258 dye (Polysciences, Inc., Warrington, PA) fluorometry in a Hoefer TKO100 fluorimeter (Hoefer Scientific Instruments, San Francisco, CA). Approximately 2.5 µg of the size-fractionated DNA was recovered.
Thirty µg of cosmid pWE15 DNA (Stratagene, La Jolla, CA) was digested to completion with 100 units of restriction enzyme BamH I (NEB) in the manufacturer's buffer (final volume of 200 µl, 37°C, 1 hr). The reaction was extracted with 100 µl of PCI and DNA was precipitated from the aqueous phase by addition of 20 µl 3 M sodium acetate and 550 µl -20°C absolute ethanol. After 20 min at -70°C, the DNA was collected by centrifugation (17,000 x g, 15 min), dried under vacuum, and dissolved in 180 µl of 10 mM Tris-HCl, pH To this were added 20 µl of 10X CIP buffer (100 mM Tris-HCl, pH 8.3; 10 mM ZnCl2; 10 mM MgCl2) and 1 µl (0.25 units) of 1:4 diluted calf intestinal alkaline phosphatase (Boehringer Mannheim Corporation, Indianapolis, IN). After 30 min at 37°C, the following additions were made: 2 µl 0.5 M EDTA, pH 8.0; 10 µl SDS; 0.5 µl of 20 mg/ml proteinase K (as above), followed by incubation at 55°C for 30 min. Following sequential extractions with 100 µl of PCI and 100 µl phenol (Intermountain Scientific Corporation, equilibrated with 1 M Tris-HCl, pH the dephosphorylated DNA was precipitated by addition of 72 µl of 7.5 M ammonium acetate and 550 µl -20°C ethanol, incubation on ice for min, and centrifugation as above. The pelleted DNA was washed once with 500 µl -20°C 70% ethanol, dried under vacuum, and dissolved in µl of TE buffer.
Ligation of the size-fractionated Sau3A I fragments to the BamH I-digested and phosphatased pWE15 vector was accomplished using T4 ligase (NEB) by a modification (use of premixed ligation buffer supplied by the manufacturer) of the protocol in section 3.33 of Ausubel. Ligation was carried out overnight in a total volume of 20 µl at 15°C, followed by storage at 20°C.
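The cloning strategy above works because Sau3A I (recognition sequence GATC, cut before the G) and BamH I (GGATCC, cut after the first G) leave the same four-base 5' overhang, so partially digested genomic fragments can be ligated into the BamH I-cut vector. The sketch below illustrates that compatibility; the function names and the short demonstration sequence are hypothetical and chosen only for illustration.

```python
import re

SAU3AI_SITE, SAU3AI_CUT = "GATC", 0    # ^GATC
BAMHI_SITE, BAMHI_CUT = "GGATCC", 1    # G^GATCC


def site_positions(sequence: str, site: str) -> list[int]:
    """0-based start positions of every occurrence of a recognition site."""
    return [m.start() for m in re.finditer(f"(?={site})", sequence.upper())]


def five_prime_overhang(site: str, cut_offset: int) -> str:
    """Single-stranded 5' overhang left when a palindromic site is cut
    cut_offset bases from its 5' end on each strand."""
    return site[cut_offset:len(site) - cut_offset]


demo = "AAGGATCCTTGATCAA"  # hypothetical sequence for illustration
print(site_positions(demo, SAU3AI_SITE))  # every BamH I site also holds a Sau3A I site
print(site_positions(demo, BAMHI_SITE))
print(five_prime_overhang(SAU3AI_SITE, SAU3AI_CUT)
      == five_prime_overhang(BAMHI_SITE, BAMHI_CUT))
```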
Four µl of the cosmid DNA ligation reaction, containing about 1 µg of DNA, was packaged into bacteriophage lambda using a commercial packaging extract (Gigapack III Gold Packaging Extract, Stratagene), following the manufacturer's directions. The packaged preparation was stored at 4°C until use. The packaged cosmid preparation was used to infect Escherichia coli XL1 Blue MR cells (Stratagene) according to the Gigapack III Gold protocols ("Titering the Cosmid Library"), as follows. XL1 Blue MR cells were grown in LB medium (g/l: Bacto-tryptone, 10; Bacto-yeast extract, 5; Bacto-agar, 15; NaCl, 5; [Difco Laboratories, Detroit, MI]) containing 0.2% maltose plus 10 mM MgSO4, at 37°C. After hr growth, cells were pelleted at 700 x g (15 min) and resuspended in 6 ml of 10 mM MgSO4. The culture density was adjusted with 10 mM MgSO4 to OD600 of 0.5. The packaged cosmid library was diluted 1:10 or 1:20 with sterile SM medium (0.1 M NaCl, 10 mM MgSO4, 50 mM Tris-HCl pH 7.5, 0.01% w/v gelatin), and µl of the diluted preparation was mixed with 25 µl of the diluted XL1 Blue MR cells. The mixture was incubated at 25°C for 30 min (without shaking), then 200 µl of LB broth was added, and incubation was continued for approximately 1 hr with occasional gentle shaking. Aliquots (20-40 µl) of this culture were spread on LB agar plates containing 100 mg/l ampicillin (LB-Amp100) and incubated overnight at 37°C. To store the library without amplification, single colonies were picked and inoculated into individual wells of sterile 96-well microwell plates, each well containing 75 µl of Terrific Broth (TB media: 12 g/l Bacto-tryptone, 24 g/l Bacto-yeast extract, 0.4% v/v glycerol, 17 mM KH2PO4, 72 mM K2HPO4) plus 100 mg/l ampicillin (TB-Amp100), and incubated (without shaking) overnight at 37°C.
After replicating the 96-well plate into a copy plate, 75 µl/well of filter-sterilized TB:glycerol (v/v; with, or without, 100 mg/l ampicillin) was added to the plate, it was shaken briefly at 100 rpm, 37°C, and then closed with Parafilm (American National Can, Greenwich, CT) and placed in a -70°C freezer for storage. Copy plates were grown and processed identically to the master plates.
A total of 40 such master plates (and their copies) were prepared.
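Whether a library of this size is likely to contain a given gene can be estimated with the usual Clarke-Carbon relation, P = 1 - (1 - f)^N, where f is the ratio of insert size to genome size and N is the number of clones. The sketch below is illustrative only. The insert size of 30 kbp is an assumed midpoint of the 20-40 kbp fraction, and the genome size of 5,500 kbp is a hypothetical round figure that is not stated in this document.

```python
def coverage_probability(n_clones: int, insert_kbp: float, genome_kbp: float) -> float:
    """Clarke-Carbon estimate of the chance that a given locus is in the library."""
    fraction = insert_kbp / genome_kbp
    return 1.0 - (1.0 - fraction) ** n_clones


ASSUMED_INSERT_KBP = 30.0     # assumption: midpoint of the 20-40 kbp fraction
ASSUMED_GENOME_KBP = 5500.0   # assumption: hypothetical genome size, not from the source

full_library = 40 * 96        # 40 master plates of 96 wells each
screened_subset = 800         # colonies represented on the probed membranes

for n in (screened_subset, full_library):
    p = coverage_probability(n, ASSUMED_INSERT_KBP, ASSUMED_GENOME_KBP)
    print(f"{n} clones: P = {p:.4f}")
```

Under these assumptions even the 800-clone subset gives a high chance of recovering any single locus, which is in line with a screen of ten plates yielding multiple positives.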
Screening of the Library with Radiolabeled DNA Probes
To prepare colony filters for probing with radioactively labeled probes, ten 96-well plates of the library were thawed at 25°C (bench top at room temperature). A replica plating tool with 96 prongs was used to inoculate a fresh 96-well copy plate containing 75 µl/well of TB-Amp100. The copy plate was grown overnight (stationary) at 37°C, then shaken about 30 min at 100 rpm at 37°C. A total of 800 colonies was represented in these copy plates, due to nongrowth of some isolates. The replica tool was used to inoculate duplicate impressions of the 96-well arrays onto Magna NT (MSI, Westboro, MA) nylon membranes (0.45 micron, 220 x 250 mm) which had been placed on solid LB-Amp100 (100 ml/dish) in Bio-assay plastic dishes (Nunc, 243 x 243 x 18 mm; Curtin Mathison Scientific, Inc., Wood Dale, IL). The colonies were grown on the membranes at 37°C for about 3 hr.
A positive control colony (a bacterial clone containing a GZ4 sequence insert, see below) was grown on a separate Magna NT membrane (Nunc, 0.45 micron, 82 mm circle) on LB medium supplemented with 35 mg/l chloramphenicol (LB-Cam35), and processed alongside the library colony membranes. Bacterial colonies on the membranes were lysed, and the DNA was denatured and neutralized according to a protocol taken from the Genius™ System User's Guide version 2.0 (Boehringer Mannheim, Indianapolis, IN). Membranes were placed colony side up on filter paper soaked with 0.5 N NaOH plus 1.5 M NaCl for 15 min to denature, and neutralized on filter paper soaked with 1 M Tris-HCl pH 8.0, 1.5 M NaCl for min. After UV-crosslinking using a Stratagene UV Stratalinker set on auto crosslink, the membranes were stored dry at 25°C until use.
Membranes were trimmed into strips containing the duplicate impressions of a single 96-well plate, then washed extensively by the method of section 6.4.1 in CPMB (op. cit.): 3 hr at 25°C in 3X SSC, 0.1% SDS, followed by 1 hr at 65°C in the same solution, then rinsed in 2X SSC in preparation for the hybridization step (20X SSC = 3 M NaCl, 0.3 M sodium citrate, pH).
Amplification of a Specific Genomic Fragment of a TcaC Gene
Based on the N-terminal amino acid sequence determined for the purified TcaC peptide fraction [disclosed herein as SEQ ID NO:2], a pool of degenerate oligonucleotides (pool S4Psh) was synthesized by standard β-cyanoethyl chemistry on an Applied BioSystem ABI394 DNA/RNA Synthesizer (Perkin Elmer, Foster City, CA). The oligonucleotides were deprotected 8 hours at 55°C, dissolved in water, quantitated by spectrophotometric measurement, and diluted for use. This pool corresponds to the determined N-terminal amino acid sequence of the TcaC peptide. The determined amino acid sequence and the corresponding degenerate DNA sequence are given below, where A, C, G, and T are the standard DNA bases, and I represents inosine:
Amino Acid   Met Gln Asp Ser Pro Glu Val
S4Psh        5' ATG CA(A/G) GA(T/C) CCI GA(A/G) GT 3'
Another set of degenerate oligonucleotides was synthesized (pool P2.3.5R), representing the complement of the coding strand for the determined amino acid sequence of SEQ ID NO:17:
Amino Acid   Ala Phe Asn Ile Asp Asp Val
Codons       5' GCN TT(T/C) AA(T/C) AT(A/T/C) GA(T/C) GA(T/C) GT 3'
P2.3.5R      3' CG(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA 5'
These oligonucleotides were used as primers in Polymerase Chain Reactions (PCR, Roche Molecular Systems, Branchburg, NJ) to amplify a specific DNA fragment from genomic DNA prepared from Photorhabdus strain W-14 (see above).
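The size of a degenerate primer pool is the product of the number of alternative bases at each position. The following sketch parses the bracketed notation used above and counts the distinct sequences in the coding-strand version of pool P2.3.5R; it reproduces the figure of 192 possible sequences cited later in this example. The parser and its function names are illustrative assumptions and were not part of the original work. Inosine (I) is counted as a single base.

```python
import re
from math import prod

IUPAC = {"N": "ACGT"}  # only the ambiguity code that appears in the oligo


def parse_degenerate(oligo: str) -> list[set[str]]:
    """Turn e.g. 'GCN TT(T/C)' into a list of allowed-base sets, one per position."""
    positions = []
    for group, single in re.findall(r"\(([^)]+)\)|([A-Z])", oligo.replace(" ", "")):
        if group:
            positions.append(set(group.split("/")))
        else:
            positions.append(set(IUPAC.get(single, single)))
    return positions


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by the degenerate pool."""
    return prod(len(bases) for bases in parse_degenerate(oligo))


p235_coding = "GCN TT(T/C) AA(T/C) AT(A/T/C) GA(T/C) GA(T/C) GT"
print(len(parse_degenerate(p235_coding)), degeneracy(p235_coding))  # 20 positions, 192 sequences
```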
A typical reaction (50 µl) contained 125 pmol of each primer pool P2Psh and P2.3.5R, 253 ng of genomic template DNA, 10 nmol each of dATP, dCTP, dGTP, and dTTP, 1X GeneAmp PCR buffer, and 2.5 units of AmpliTaq DNA polymerase (both from Roche Molecular Systems; 10X GeneAmp buffer is 100 mM Tris-HCl pH 8.3, 500 mM KCl, 0.01% w/v gelatin). Amplifications were performed in a Perkin Elmer Cetus DNA Thermal Cycler (Perkin Elmer, Foster City, CA) using 35 cycles of 94°C (1.0 min), 55°C (2.0 min), 72°C (3.0 min), followed by an extension period of min at 72°C. Amplification products were analyzed by electrophoresis through 2% w/v NuSieve 3:1 agarose (FMC BioProducts) in TEA buffer (40 mM Tris-acetate, 2 mM EDTA, pH). A specific product of estimated size 250 bp was observed amongst numerous other amplification products by ethidium bromide (µg/ml) staining of the gel and examination under ultraviolet light.
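The amounts given for the typical PCR can be converted to working concentrations, which makes the recipe easier to compare with other protocols. This is a minimal arithmetic sketch based on the 50 µl reaction volume stated above; the helper name is illustrative.

```python
def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """pmol per µl is numerically equal to µmol per litre (µM)."""
    return amount_pmol / volume_ul


REACTION_UL = 50.0

primer_uM = micromolar(125.0, REACTION_UL)         # 125 pmol of each primer pool
dntp_uM = micromolar(10.0 * 1000.0, REACTION_UL)   # 10 nmol of each dNTP = 10,000 pmol
template_ng_per_ul = 253.0 / REACTION_UL           # 253 ng genomic template

print(primer_uM, dntp_uM, template_ng_per_ul)
```

So each primer pool is at 2.5 µM and each dNTP at 200 µM, with roughly 5 ng/µl of genomic template. Because each pool is degenerate, the concentration of any single primer sequence is correspondingly lower than the pool concentration.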
The region of the gel containing an approximately 250 bp product was excised, and a small plug (0.5 mm dia.) was removed and used to supply template for PCR amplification (40 cycles). The reaction (50 µl) contained the same components as above, minus genomic template DNA. Following amplification, the ends of the fragments were made blunt and were phosphorylated by incubation at 25°C for 20 min with 1 unit of T4 DNA polymerase (NEB), 1 nmol ATP, and units of T4 kinase (Pharmacia Biotech Inc., Piscataway, NJ). DNA fragments were separated from residual primers by electrophoresis through 1% w/v GTG agarose (FMC) in TEA. A gel slice containing fragments of apparent size 250 bp was excised, and the DNA was extracted using a Qiaex kit (Qiagen Inc., Chatsworth, CA).
The extracted DNA fragments were ligated to plasmid vector pBC (Stratagene) that had been digested to completion with restriction enzyme Sma I and extracted in a manner similar to that described for pWE15 DNA above. A typical ligation reaction (16.3 µl) contained 100 ng of digested pBC DNA, 70 ng of 250 bp fragment DNA, 1 nmol [Co(NH3)6]Cl3, and 3.9 Weiss units of T4 DNA ligase (Collaborative Biomedical Products, Bedford, MA), in 1X ligation buffer (50 mM Tris-HCl, pH 10 mM MgCl; 10 mM dithiothreitol; 1 mM spermidine, 1 mM ATP, 100 mg/ml bovine serum albumin). Following overnight incubation at 14°C, the ligated products were transformed into frozen, competent Escherichia coli DH5α cells (Gibco BRL) according to the suppliers' recommendations, and plated on LB-Cam35 plates containing IPTG (119 µg/ml) and X-gal (µg/ml). Independent white colonies were picked, and plasmid DNA was prepared by a modified alkaline-lysis/PEG precipitation method (PRISM™ Ready Reaction DyeDeoxy™ Terminator Cycle Sequencing Kit Protocols; ABI/Perkin Elmer). The nucleotide sequence of both strands of the insert DNA was determined, using T7 primers [pBC bases 601-623: TAAAACGACGGCCAGTGAGCGCG] and LacZ primers [pBC bases 792-816: ATGACCATGATTACGCCAAGCGCGC] and protocols supplied with the PRISM™ sequencing kit (ABI/Perkin Elmer). Nonincorporated dye-terminator dideoxyribonucleotides were removed by passage through Centri-Sep 100 columns (Princeton Separations, Inc., Adelphia, NJ) according to the manufacturer's instructions. The DNA sequence was obtained by analysis of the samples on an ABI Model 373A DNA Sequencer (ABI/Perkin Elmer). The DNA sequences of two isolates, GZ4 and HB14, were found to be as illustrated in Fig. 1.
This sequence illustrates the following features: i) bases 1- represent one of the 64 possible sequences of the S4Psh degenerate oligonucleotides, ii) the sequence of amino acids 1-3 and 6-12 corresponds exactly to that determined for the N-terminus of TcaC (disclosed as SEQ ID NO:2), iii) the fourth amino acid encoded is a cysteine residue rather than serine; this difference is encoded within the degeneracy for the serine codons (see above), iv) the fifth amino acid encoded is proline, corresponding to the TcaC N-terminal sequence given as SEQ ID NO:2, v) bases 257-276 encode one of the 192 possible sequences designed into the degenerate pool, vi) the TGA termination codon introduced at bases 268-270 is the result of complementarity to the degeneracy built into the oligonucleotide pool at the corresponding position, and does not indicate a shortened reading frame for the corresponding gene.
Labeling of a TcaC Peptide Gene-specific Probe
DNA fragments corresponding to the above 276 bases were amplified (35 cycles) by PCR in a 100 µl reaction volume, using 100 pmol each of P2Psh and P2.3.5R primers, 10 ng of plasmids GZ4 or HB14 as templates, 20 nmol each of dATP, dCTP, dGTP, and dTTP, units of AmpliTaq DNA polymerase, and 1X concentration of GeneAmp buffer, under the same temperature regimes as described above. The amplification products were extracted from a 1% GTG agarose gel by Qiaex kit and quantitated by fluorometry.
The extracted amplification products from plasmid HB14 template (approximately 400 ng) were split into five aliquots and labeled with 32P-dCTP using the High Prime Labeling Mix (Boehringer Mannheim) according to the manufacturer's instructions. Nonincorporated radioisotope was removed by passage through NucTrap Probe Purification Columns (Stratagene), according to the supplier's instructions. The specific activity of the labeled DNA product was determined by scintillation counting to be 3.11 x 10^8 dpm/µg. This labeled DNA was used to probe membranes prepared from 800 members of the genomic library.
Screening with a TcaC-peptide Gene Specific Probe
The radiolabeled HB14 probe was boiled approximately 10 min, then added to "minimal hyb" solution. [Note: The "minimal hyb" method is taken from a CERES protocol; "Restriction Fragment Length Polymorphism Laboratory Manual version sections 4-40 and 4-47; CERES/NPI, Salt Lake City, UT. NPI is now defunct, with its successors operating as Linkage Genetics]. "Minimal hyb" solution contains 10% w/v PEG (polyethylene glycol, M.W. approx. 8000), 7% w/v SDS; 0.6X SSC, 10 mM sodium phosphate buffer (from a 1 M stock containing 95 g/l NaH2PO4·1H2O and 84.5 g/l Na2HPO4), 5 mM EDTA, and 100 mg/ml denatured salmon sperm DNA. Membranes were blotted dry briefly then, without prehybridization, 5 strips of membrane were placed in each of 2 plastic boxes containing 75 ml of "minimal hyb" and 2.6 ng/ml of radiolabeled HB14 probe. These were incubated overnight with slow shaking (50 rpm) at 60°C. The filters were washed three times for approximately 10 min each at 0 C in "minimal hyb wash solution" (0.25X SSC, 0.2% SDS), followed by two 30-min washes with slow shaking at 60°C in the same solution. The filters were placed on paper covered with Saran Wrap (Dow Brands, Indianapolis, IN) in a light-tight autoradiographic cassette and exposed to X-Omat X-ray film (Kodak, Rochester, NY) with two DuPont Cronex Lightning-Plus C1 enhancers (Sigma Chemical Co., St. Louis, MO), for 4 hr at -70°C. Upon development (standard photographic procedures), significant signals were evident in both replicates amongst a high background of weaker, more irregular signals. The filters were again washed for about 4 hr at 68°C in "minimal hyb wash solution" and then placed again in the cassettes, and film was exposed overnight at -70°C. Twelve possible positives were identified due to strong signals on both of the duplicate 96-well colony impressions.
No signal was seen with negative control membranes (colonies of XL1 Blue MR cells containing pWE15), and a very strong signal was seen with positive control membranes (DH5α cells containing the GZ4 isolate of the PCR product) that had been processed concurrently with the experimental samples.
The twelve putative hybridization-positive colonies were retrieved from the frozen 96-well library plates and grown overnight at 37°C on solid LB-Amp100 medium. They were then patched (3/plate, plus three negative controls: XL1 Blue MR cells containing the pWE15 vector) onto solid LB-Amp100. Two sets of membranes (Magna NT nylon, 0.45 micron) were prepared for hybridization. The first set was prepared by placing a filter directly onto the colonies on a patch plate, then removing it with adherent bacterial cells, and processing as below. Filters of the second set were placed on plates containing LB-Amp100 medium, then inoculated by transferring cells from the patch plates onto the filters. After overnight growth at 37°C, the filters were removed from the plates and processed.
Bacterial cells on the filters were lysed and DNA denatured by placing each filter colony-side-up on a pool (1.0 ml) of 0.5 N NaOH in a plastic plate for 3 min. The filters were blotted dry on a paper towel, then the process was repeated with fresh 0.5 N NaOH.
After blotting dry, the filters were neutralized by placing each on a 1.0 ml pool of 1 M Tris-HCl, pH 7.5 for 3 min, blotted dry, and reneutralized with fresh buffer. This was followed by two similar soakings (5 min each) on pools of 0.5 M Tris-HCl pH 7.5 plus 1.5 M NaCl. After blotting dry, the DNA was UV crosslinked to the filter (as above), and the filters were washed (25°C, 100 rpm) in about 100 ml of 3X SSC plus SDS (4 times, 30 min each with fresh solution for each wash). They were then placed in a minimal volume of prehybridization solution [6X SSC plus 1% w/v each of Ficoll 400 (Pharmacia), polyvinylpyrrolidone (av. M.W. 360,000; Sigma), and bovine serum albumin (Fraction V; Sigma)] for 2 hr at 0 C, 50 rpm. The prehybridization solution was removed, and replaced with the HB14 32P-labeled probe that had been saved from the previous hybridization of the library membranes and which had been denatured at 95°C for 5 min. Hybridization was performed at 60°C for 16 hr with shaking at 50 rpm.
Following removal of the labeled probe solution, the membranes were washed 3 times at 25°C (50 rpm, 15 min) in 3X SSC (about 150 ml each wash). They were then washed for 3 hr at 68°C (50 rpm) in 0.25X SSC plus 0.2% SDS (minimal hyb wash solution), and exposed to X-ray film as described above for 1.5 hr at 25°C (no enhancer screens). This exposure revealed very strong hybridization signals to cosmid isolates 22G12, 25A10, 26A5, and 26B10, and a very weak signal with cosmid isolate 8B10. No signal was seen with the negative control (pWE15) colonies, and a very strong signal was seen with positive control membranes (DH5α cells containing the GZ4 isolate of the PCR product) that had been processed concurrently with the experimental samples.
Amplification of a Specific Genomic Fragment of a TcaB Gene
Based on the N-terminal amino acid sequence determined for the purified TcaBi peptide fraction (disclosed here as SEQ ID NO:3), a pool of degenerate oligonucleotides (pool P8F) was synthesized as described for peptide TcaC. The determined amino acid sequence and the corresponding degenerate DNA sequence are given below, where A, C, G, and T are the standard DNA bases, and I represents inosine:

Amino Acid     Leu      Phe  Thr  Gln      Thr  Leu      Lys  Glu  Ala  Arg
P8F        5'  (C/T)TI  TTT  ACI  CA(A/G)  ACI  (C/T)TI  AAA  GAA  GCI  (A/C)G  3'

Another set of degenerate oligonucleotides was synthesized (pool P8.108.3R), representing the complement of the coding strand for the determined amino acid sequence of the TcaBi-PT108 internal peptide (disclosed herein as SEQ ID):

Amino Acid     Met  Tyr      Tyr      Ile        Gln      Ala          Gln      Gln
Codons         ATG  TA(T/C)  TA(T/C)  AT(T/C/A)  CA(A/G)  GC(A/C/G/T)  CA(A/G)  CA(A/G)
P8.108.3R  3'  TAC  AT(A/G)  AT(A/G)  TA(A/G/T)  GT(T/C)  CGI          GT(T/C)  GT       5'
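The size of each degenerate pool follows from the notation used in the two tables above: every parenthesized position multiplies the number of distinct oligonucleotides by its number of alternatives, while inosine (I) is a single residue and adds no degeneracy. The following is a minimal Python sketch of that count, offered only as an illustration. The function name `pool_degeneracy` is hypothetical, and the primer strings assume that the stray Leu codon "(C/T)TI" and the stray "TAC" belong at the start of P8F and P8.108.3R, respectively.

```python
import re
from math import prod


def pool_degeneracy(primer: str) -> int:
    """Count the distinct oligonucleotides in a degenerate primer pool.

    Each "(A/G)"-style group contributes a factor equal to its number of
    alternatives; fixed bases and inosine (I) contribute a factor of 1.
    """
    groups = re.findall(r"\(([ACGT/]+)\)", primer)
    return prod(len(group.split("/")) for group in groups)


# Primer strings as read from the tables (layout reconstruction is an assumption).
P8F = "(C/T)TI TTT ACI CA(A/G) ACI (C/T)TI AAA GAA GCI (A/C)G"
P8_108_3R = "TAC AT(A/G) AT(A/G) TA(A/G/T) GT(T/C) CGI GT(T/C) GT"

print(pool_degeneracy(P8F))        # 16
print(pool_degeneracy(P8_108_3R))  # 48
```

Using inosine at fourfold-degenerate positions is what keeps both pools this small.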
These oligonucleotides were used as primers for PCR® using HotStart 50 Tubes (Molecular Bio-Products, Inc., San Diego, CA) to amplify a specific DNA fragment from genomic DNA prepared from Photorhabdus strain W-14 (see above). A typical reaction (50 µl) contained (bottom layer) 25 pmol of each primer pool P8F and P8.108.3R, with 2 nmol each of dATP, dCTP, dGTP, and dTTP, in 1X GeneAmp PCR buffer, and (top layer) 230 ng of genomic template DNA, 8 nmol each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of AmpliTaq DNA polymerase, in 1X GeneAmp PCR buffer. Amplifications were performed by 35 cycles as described for the TcaC peptide.
Amplification products were analyzed by electrophoresis through 0.7% w/v SeaKem LE agarose (FMC) in TEA buffer. A specific product of estimated size 1600 bp was observed.
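The amounts in this reaction can be converted to final concentrations once the two layers mix. The sketch below is illustrative only: it takes the reaction volume as 50 µl, and the helper name `final_concentration_uM` is hypothetical. It relies on the identity that 1 pmol per µl equals 1 µM.

```python
def final_concentration_uM(amount_pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in µl to µM (pmol/µl == µM)."""
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 50.0  # assumed total volume after the layers mix

# Each primer pool: 25 pmol in the bottom layer.
primer_uM = final_concentration_uM(25.0, REACTION_VOLUME_UL)

# Each dNTP: 2 nmol (bottom layer) + 8 nmol (top layer) = 10 nmol = 10,000 pmol.
dntp_uM = final_concentration_uM((2.0 + 8.0) * 1000.0, REACTION_VOLUME_UL)

print(primer_uM, dntp_uM)  # 0.5 200.0
```

On these assumptions each primer pool is present at 0.5 µM and each dNTP at 200 µM.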
Four such reactions were pooled, and the amplified DNA was extracted from a 1.0% SeaKem LE gel by Qiaex kit as described for the TcaC peptide. The extracted DNA was used directly as the template for sequence determination (PRISM Sequencing Kit) using the P8F and P8.108.3R primer pools. Each reaction contained about 100 ng template DNA and 25 pmol of one primer pool, and was processed according to standard protocols as described for the TcaC peptide. An analysis of the sequence derived from extension of the P8F primers revealed the short DNA sequence (and encoded amino acid sequence):

GAT GCA TTG NTT   GCT
Asp Ala Leu (Val) Ala

which corresponds to a portion of the N-terminal peptide sequence disclosed as SEQ ID NO:3 (TcaBi).
Labeling of a TcaBi-peptide Gene-specific Probe
Approximately 50 ng of gel-purified TcaBi DNA fragment was labeled with 32P-dCTP as described above, and nonincorporated radioisotopes were removed by passage through a NICK Column (Pharmacia). The specific activity of the labeled DNA was determined to be 6 x 10^9 dpm/µg. This labeled DNA was used to probe colony membranes prepared from members of the genomic library that had hybridized to the TcaC-peptide specific probe.
The membranes containing the 12 colonies identified in the TcaC-probe library screen (see above) were stripped of radioactive TcaC-specific label by boiling twice for approximately 30 min each time in 1 liter of 0.1X SSC plus 0.1% SDS. Removal of radiolabel was checked with a 6 hr film exposure. The stripped membranes were then incubated with the TcaBi peptide-specific probe prepared above. The labeled DNA was denatured by boiling for 10 min, and then added to the filters that had been incubated for 1 hr in 100 ml of "minimal hyb" solution at 60°C. After overnight hybridization at this temperature, the probe solution was removed, and the filters were washed as follows (all in 0.3X SSC plus 0.1% SDS): once for 5 min at 25°C, once for 1 hr at 60°C in fresh solution, and once for 1 hr at 63°C in fresh solution. After hr exposure to X-ray film by standard procedures, 4 strongly hybridizing colonies were observed. These were, as with the TcaC-specific probe, isolates 22G12, 25A10, 26A5, and 26B10.
The same TcaBi probe solution was diluted with an equal volume (about 100 ml) of "minimal hyb" solution, and then used to screen the membranes containing the 800 members of the genomic library. After hybridization, washing, and exposure to X-ray film as described above, only the four cosmid clones 22G12, 25A10, 26A5, and 26B10 were found to hybridize strongly to this probe.
Isolation of Subclones Containing Genes Encoding TcaC and TcaBi Peptides, and Determination of DNA Base Sequence Thereof
Three hybridization-positive cosmids in strain XL1 Blue MR were grown with shaking overnight (200 rpm) at 30°C in 100 ml TB-Amp100. After harvesting the cells by centrifugation, cosmid DNA was prepared using a commercially available kit (BIGprep, 5 Prime 3 Prime, Inc., Boulder, CO), following the manufacturer's protocols. Only one cosmid, 26A5, was successfully isolated by this procedure. When digested with restriction enzyme EcoR I (NEB) and analyzed by gel electrophoresis, fragments of approximate sizes 14, 10, 8 (vector), 5, 3.3, 2.9, and 1.5 kbp were detected. A second attempt to isolate cosmid DNA from the same three strains (8 ml cultures; TB-Amp100; 30°C) utilized a boiling miniprep method (Evans G. and G. Wahl., 1987, "Cosmid vectors for genomic walking and rapid restriction mapping." in Guide to Molecular Cloning Techniques. Meth. Enzymology, Vol. 152, S. Berger and A. Kimmel, eds., pgs. 604-610). Only one cosmid, 25A10, was successfully isolated by this method. When digested with restriction enzyme EcoR I (NEB) and analyzed by gel electrophoresis, this cosmid showed a fragmentation pattern identical to that previously seen with cosmid 26A5.
A 0.15 µg sample of 26A5 cosmid DNA was used to transform ml of E. coli DH5α cells (Gibco BRL), by the supplier's protocols. A single colony isolate of that strain was inoculated into 4 ml of TB-Amp100, and grown for 8 hr at 37°C. Chloramphenicol was added to a final concentration of 225 µg/ml, incubation was continued for another 24 hr, then cells were harvested by centrifugation and frozen at -20°C. Isolation of the 26A5 cosmid DNA was by a standard alkaline lysis miniprep (Maniatis et al., op. cit., p. 382), modified by increasing all volumes by 50% and with stirring or gentle mixing, rather than vortexing, at every step.
After washing the DNA pellet in 70% ethanol, it was dissolved in TE containing 25 µg/ml ribonuclease A (Boehringer Mannheim).
Identification of EcoR I Fragments Hybridizing to GZ4-derived and TcaBi Probes
Approximately 0.4 µg of cosmid 25A10 (from XL1 Blue MR cells) and about 0.5 µg of cosmid 26A5 (from chloramphenicol-amplified DH5α cells) were each digested with about 15 units of EcoR I (NEB) for 85 min, frozen overnight, then heated at 65°C for five min, and electrophoresed in a 0.7% agarose gel (SeaKem LE, 1X TEA, 80 volts, min). The DNA was stained with ethidium bromide as described above, and photographed under ultraviolet light. The EcoR I digest of cosmid 25A10 was a complete digestion, but the sample of cosmid 26A5 was only partially digested under these conditions. The agarose gel containing the DNA fragments was subjected to depurination, denaturation and neutralization, followed by Southern blotting onto a Magna NT nylon membrane, using a high salt (SSC) protocol, all as described in section 2.9 of Ausubel et al. (CPMB, op. cit.). The transferred DNA was then UV-crosslinked to the nylon membrane as before.
A TcaC-peptide specific DNA fragment corresponding to the insert of plasmid isolate GZ4 was amplified by PCR® in a 100 µl reaction volume as described above. The amplification products from three such reactions were pooled, extracted from a 1% GTG agarose gel by Qiaex kit, as described above, and quantitated by fluorometry. The gel-purified DNA (100 ng) was labeled with 32P-dCTP using the High Prime Labeling Mix (Boehringer Mannheim) as described above, to a specific activity of 6.34 x 10^8 dpm/µg.
The 32P-labeled GZ4 probe was boiled 10 min, then added to "minimal hyb" buffer (at 1 ng/ml), and the Southern blot membrane containing the digested cosmid DNA fragments was added, and incubated for 4 hr at 60°C with gentle shaking at 50 rpm. The membrane was then washed 3 times at 25°C for about 5 min each (minimal hyb wash solution), followed by two washes for 30 min each at 60°C. The blot was exposed to film (with enhancer screens) for about 30 min at -70°C. The GZ4 probe hybridized strongly to the 5.0 kbp (apparent size) EcoR I fragment of both cosmids, 26A5 and 25A10.
The membrane was stripped of radioactivity by boiling for about 30 min in 0.1X SSC plus 0.1% SDS, and absence of radiolabel was checked by exposure to film. It was then hybridized at 60°C for 3.5 hours with the (denatured) TcaBi probe in "minimal hyb" buffer previously used for screening the colony membranes (above), washed as described previously, and exposed to film for 40 min at 0 C with two enhancer screens. With both cosmids, the TcaBi probe hybridized lightly with the about 5.0 kbp EcoR I fragment, and strongly with a fragment of approximately 2.9 kbp.
The sample of cosmid 26A5 DNA previously described (from DH5α cells) was used as the source of DNA from which to subclone the bands of interest. This DNA (2.5 µg) was digested with about 3 units of EcoR I (NEB) in a total volume of 30 µl for 1.5 hr, to give a partial digest, as confirmed by gel electrophoresis. Ten µg of pBC KS DNA (Stratagene) were digested for 1.5 hr with units of EcoR I in a total volume of 20 µl, leading to total digestion as confirmed by electrophoresis. Both EcoR I-cut DNA preparations were diluted to 50 µl with water, to each an equal volume of PCI was added, the suspension was gently mixed and spun in a microcentrifuge, and the aqueous supernatant was collected. DNA was precipitated by 150 µl ethanol, and the mixture was placed at -20°C overnight. Following centrifugation and drying, the EcoR I digested pBC KS was dissolved in 100 µl TE; the partially digested 26A5 was dissolved in 20 µl TE. DNA recovery was checked by fluorometry.
In separate reactions, approximately 60 ng of EcoR I-digested pBC DNA was ligated with approximately 180 ng or 270 ng of partially digested cosmid 26A5 DNA. Ligations were carried out in a volume of 20 µl at 15°C for 5 hr, using T4 ligase and buffer from New England BioLabs. The ligation mixture, diluted to 100 µl with sterile TE, was used to transform frozen, competent DH5α cells (Gibco BRL) according to the supplier's instructions. Varying amounts (25-200 µl) of the transformed cells were plated on freshly prepared solid LB-Cam35 medium with 1 mM IPTG and 50 mg/l X-gal. Plates were incubated at 37°C about 20 hr, then chilled in the dark for approximately 3 hr to intensify color for insert selection. White colonies were picked onto patch plates of the same composition and incubated overnight at 37°C.
Two colony lifts of each of the selected patch plates were prepared as follows. After picking white colonies to fresh plates, round Magna NT nylon membranes were pressed onto the patch plates, the membrane was lifted off, and subjected to denaturation, neutralization and UV crosslinking as described above for the library colony membranes. The crosslinked colony lifts were vigorously washed, including gently wiping off the excess cell debris with a tissue. One set was hybridized with the GZ4 (TcaC) probe solution described earlier, and the other set was hybridized with the TcaBi probe solution described earlier, according to the "minimal hyb" protocol, followed by washing and film exposure as described for the library colony membranes.
Colonies showing hybridization signals either only with the GZ4 probe, with both GZ4 and TcaBi probes, or only with the TcaBi probe, were selected for further work and cells were streaked for single colony isolation onto LB-Cam35 media with IPTG and X-gal as before. Approximately 35 single colonies, from 16 different isolates, were picked into liquid LB-Cam35 media and grown overnight at 37°C; the cells were collected by centrifugation and plasmid DNA was isolated by a standard alkaline lysis miniprep according to Maniatis et al. (op. cit. p. 368). DNA pellets were dissolved in TE containing 25 µg/ml ribonuclease A and DNA concentration was determined by fluorometry. The EcoR I digestion pattern was analyzed by gel electrophoresis. The following isolates were picked as useful.
Isolate A17.2 contains religated pBC only and was used for a (negative) control. Isolates D38.3 and C44.1 each contain only the 2.9 kbp, TcaBi-hybridizing EcoR I fragment inserted into pBC. These plasmids, named pDAB2000 and pDAB2001, respectively, are illustrated in Fig. 2.
Isolate A35.3 contains only the approximately 5 kbp, GZ4-hybridizing EcoR I fragment, inserted into pBC. This plasmid was named pDAB2002 (also Fig. 2). These isolates provided templates for DNA sequencing.
Plasmids pDAB2000 and pDAB2001 were prepared using the BIGprep kit as before. Cultures (30 ml) were grown overnight in TB-Cam35 to an OD600 of 2, then plasmid was isolated according to the manufacturer's directions. DNA pellets were redissolved in 100 µl TE each, and sample integrity was checked by EcoR I digestion and gel electrophoretic analysis.
Sequencing reactions were run in duplicate, with one replicate using as template pDAB2000 DNA, and the other replicate using as template pDAB2001 DNA. The reactions were carried out using the dideoxy dye terminator cycle sequencing method, as described above for the sequencing of the GZ4/HB14 DNAs. Initial sequencing runs utilized as primers the LacZ and T7 primers described above, plus primers based on the determined sequence of the TcaBi PCR amplification product (TH1: ATTGCAGACTGCCAATCGCTTCGG; TH12: GAGAGTATCCAGACCGCGGATGATCTG).
After alignment and editing of each sequencing output, each was truncated to between 250 and 350 bases, depending on the integrity of the chromatographic data as interpreted by the Perkin Elmer Applied Biosystems Division SeqEd 675 software. Subsequent sequencing "steps" were made by selecting appropriate sequence for new primers. With a few exceptions, primers (synthesized as described above) were 24 bases in length with a 50% G+C composition. Sequencing by this method was carried out on both strands of the approximately 2.9 kbp EcoR I fragment.
To further serve as template for DNA sequencing, plasmid DNA from isolate pDAB2002 was prepared by BIGprep kit. Sequencing reactions were performed and analyzed as described above. Initially, a T3 primer (pBS SK bases 774-796: CGCGCAATTAACCCTCACTAAAG) and a T7 primer (pBS KS bases 621-643: GCGCGTAATACGACTCACTATAG) were used to prime the sequencing reactions from the flanking vector sequences, reading into the insert DNA. Another set of primers (GZ4F: GTATCGATTACAACGCTGTCACTTCCC; TH13: GGGAAGTGACAGCGTTGTAATCGATAC; TH14: ATGTTGGGTGCGTCGGCTAATGGACATAAC; and LW1-204: GGGAAGTGACAGCGTTGTAATCGATAC) was made to prime from internal sequences, which were determined previously by degenerate oligonucleotide-mediated sequencing of subcloned TcaC-peptide PCR products. From the data generated during the initial rounds of sequencing, new sets of primers were designed and used to walk the entire length of the about 5 kbp fragment. A total of 55 oligo primers was used, enabling the identification of 4832 total bp of contiguous sequence.
When the DNA sequence of the EcoR I fragment insert of pDAB2002 is combined with part of the determined sequence of the pDAB2000/pDAB2001 isolates, a total contiguous sequence of 6005 bp is generated (disclosed herein as SEQ ID NO:25). When long open reading frames were translated into the corresponding amino acids, the sequence clearly shows the TcaBi N-terminal peptide (disclosed as SEQ ID NO:3), encoded by bases 68-124, immediately following a methionine residue (start of translation). Upstream lies a potential ribosome binding site (bases 51-58), and downstream, at bases 215-277, is encoded the TcaBi-PT158 internal peptide (disclosed herein as SEQ ID NO:19). Further downstream, in the same reading frame, at bases 1787-1822, exists a sequence encoding the TcaBi-PT108 internal peptide (disclosed herein as SEQ ID). Also in the same reading frame, at bases 1946-1972, is encoded the TcaBii N-terminal peptide (disclosed herein as SEQ ID), and the reading frame continues uninterrupted to a translation termination codon at nucleotides 3632-3634.
The lack of an in-frame stop codon between the end of the sequence encoding TcaBi-PT108 and the start of the TcaBii encoding region, and the lack of a discernible ribosome binding site immediately upstream of the TcaBii coding region, indicate that peptides TcaBii and TcaBi are encoded by a single open reading frame of 3567 bp (beginning at base pair 65 in SEQ ID NO:25), and are most likely derived from a single primary gene product TcaB of 1189 amino acids (131,586 Daltons; disclosed herein as SEQ ID NO:26) by post-translational cleavage. If the amino acid immediately preceding the TcaBii N-terminal peptide represents the C-terminal amino acid of peptide TcaBi, then the predicted mass of TcaBi (627 amino acids) is 70,814 Daltons (disclosed herein as SEQ ID NO:28), somewhat higher than the size observed by SDS-PAGE (68 kDa). This peptide would be encoded by a contiguous stretch of 1881 base pairs (disclosed herein as SEQ ID NO:27). It is thought that the native C-terminus of TcaBi lies somewhat closer to the C-terminus of TcaBi-PT108. The molecular mass of PT108 [3.438 kDa; determined during N-terminal amino acid sequence analysis of this peptide] predicts a size of 30 amino acids. Using the size of this peptide to designate the C-terminus of the TcaBi coding region [Glu at position 604 of SEQ ID NO:28], the derived size of TcaBi is determined to be 604 amino acids or 68,463 Daltons, more in agreement with experimental observations.
Translation of the TcaBii peptide coding region of 1686 base pairs (disclosed herein as SEQ ID NO:29) yields a protein of 562 amino acids (disclosed herein as SEQ ID NO:30) with predicted mass of 60,789 Daltons, which corresponds well with the observed 61 kDa.
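The coordinates and lengths quoted for the tcaB reading frame can be cross-checked with simple codon arithmetic. The short Python sketch below is offered only as a consistency check on the figures stated above. The helper name `codon_count` is hypothetical, and the check assumes the quoted 3567 bp excludes the stop codon at nucleotides 3632-3634.

```python
def codon_count(base_pairs: int) -> int:
    """Return the number of codons in a coding stretch; length must be a multiple of 3."""
    if base_pairs % 3 != 0:
        raise ValueError("coding length is not a whole number of codons")
    return base_pairs // 3


ORF_START = 65          # first base of the tcaB reading frame in SEQ ID NO:25
STOP_CODON_END = 3634   # last base of the termination codon (3632-3634)

# Coding length without the stop codon (assumption stated in the lead-in).
coding_bp = (STOP_CODON_END - 3) - ORF_START + 1

tcab_aa = codon_count(coding_bp)   # whole TcaB product
tcabi_aa = codon_count(1881)       # TcaBi portion (SEQ ID NO:27)
tcabii_aa = codon_count(1686)      # TcaBii portion (SEQ ID NO:29)

# Codons preceding the TcaBii N-terminal peptide, which starts at base 1946.
codons_before_tcabii = (1946 - ORF_START) // 3

print(coding_bp, tcab_aa)              # 3567 1189
print(tcabi_aa, tcabii_aa)             # 627 562
print(tcabi_aa + tcabii_aa == tcab_aa)  # True
print(codons_before_tcabii)            # 627
```

The two cleavage products account for the whole 1189-residue product, and the 627 codons upstream of base 1946 match the 627 amino acids predicted for TcaBi.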
A potential ribosome binding site (bases 3682-3687) is found 48 bp downstream of the stop codon for the tcaB open reading frame. At bases 3694-3726 is found a sequence encoding the N-terminus of peptide TcaC (disclosed as SEQ ID NO:2). The open reading frame initiated by this N-terminal peptide continues uninterrupted to base 6005 (2361 base pairs, disclosed herein as the first 2361 base pairs of SEQ ID NO:31). A gene (tcaC) encoding the entire TcaC peptide (apparent size about 165 kDa; about 1500 amino acids) would comprise about 4500 bp.
Another isolate containing cloned EcoR I fragments of cosmid 26A5, E20.6, was also identified by its homology to the previously mentioned GZ4 and TcaBi probes. Agarose gel analysis of EcoR I digests of the DNA of the plasmid harbored by this strain (pDAB2004; Fig.) revealed insert fragments of estimated sizes 2.9, 5, and 3.3 kbp. DNA sequence analysis initiated from primers designed from the sequence of plasmid pDAB2002 revealed that the 3.3 kbp EcoR I fragment of pDAB2004 lies adjacent to the 5 kbp EcoR I fragment represented in pDAB2002. The 2361 base pair open reading frame discovered in pDAB2002 continues uninterrupted for another 2094 bases in pDAB2004 [disclosed herein as base pairs 2362 to 4458 of SEQ ID NO:31]. DNA sequence analysis using the parent cosmid 26A5 DNA as template confirmed the continuity of the open reading frame. Altogether, the open reading frame (tcaC; SEQ ID NO:31) comprises 4455 base pairs, and encodes a protein (TcaC) of 1485 amino acids [disclosed herein as SEQ ID NO:32]. The calculated molecular size of 166,214 Daltons is consistent with the estimated size of the TcaC peptide (165 kDa), and the derived amino acid sequence matches exactly that disclosed for the TcaC N-terminal sequence [SEQ ID NO:2].
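The relationship between the apparent size of TcaC and the length of its gene can be illustrated with a short calculation. The sketch below is illustrative only. It assumes an average residue mass of 110 Da, which is a common rule of thumb and not a value given in this description, and the function name `estimate_gene_size` is hypothetical. It then compares that estimate with the figures reported for the sequenced reading frame.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb value; an assumption, not from the text


def estimate_gene_size(protein_mass_da: float) -> tuple[int, int]:
    """Estimate (amino acids, coding base pairs) from a protein mass in Daltons."""
    residues = round(protein_mass_da / AVERAGE_RESIDUE_MASS_DA)
    return residues, residues * 3


# Estimate from the apparent size of TcaC on gels (about 165 kDa).
estimated_aa, estimated_bp = estimate_gene_size(165_000)

# Figures reported for the sequenced open reading frame.
orf_bp = 2361 + 2094            # pDAB2002 portion + pDAB2004 continuation
orf_aa = orf_bp // 3
observed_residue_mass = round(166_214 / orf_aa, 1)

print(estimated_aa, estimated_bp)  # 1500 4500
print(orf_bp, orf_aa)              # 4455 1485
print(observed_residue_mass)       # 111.9
```

The estimate of about 4500 bp agrees closely with the 4455 bp actually found.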
The lack of an amino acid sequence corresponding to SEQ ID NO:17 (used to design the degenerate oligonucleotide primer pool) in the discovered sequence indicates that the PCR® products found in isolates GZ4 and HB14, which were used as probes in the initial library screen, were fortuitously generated by reverse-strand priming by one of the primers in the degenerate pool. Further, the derived protein sequence does not include the internal fragment disclosed herein as SEQ ID NO:18. These sequences reveal that plasmid pDAB2004 contains the complete coding region for the TcaC peptide.
Further analysis of SEQ ID NO:25 reveals the end of an open reading frame (bases 1-43), which encodes the final 13 amino acids of the TcaAiii peptide, disclosed herein as SEQ ID NO:35. Only 24 bases separate the end of the TcaAiii coding region and the start of the TcaBi coding region. Included within the 24 bases are sequences that may serve as a ribosome binding site. Although possible, it is not likely that a Photorhabdus gene promoter is encoded within this short region. We propose that genomic region tca, which includes three long open reading frames [tcaA (SEQ ID NO:33), tcaB (SEQ ID NO:25, bases 65-3634), and tcaC (SEQ ID NO:31), which is separated from the end of tcaB by only 59 bases], is regulated as an operon, with transcription initiating upstream of the start of the tcaA gene (SEQ ID NO:33), and resulting in a polycistronic messenger RNA.
Example 9
Screening of the Photorhabdus Genomic Library for Genes Encoding the TcbAii Peptide
This example describes a method used to identify DNA clones that contain the TcbAii peptide-encoding genes, the isolation of the gene, and the determination of its partial DNA base sequence.
Primers and PCR Reactions
The TcbAii polypeptide of the insect active preparation is about 206 kDa. The amino acid sequence of the N-terminus of this peptide is disclosed as SEQ ID NO:1. Four pools of degenerate oligonucleotide primers ("Forward primers": TH-4, TH-5, TH-6, and TH-7) were synthesized to encode a portion of this amino acid sequence, as described in Example 8, and are shown below.
Table 12
Amino Acid      Phe      Ile  Gln      Gly  Tyr      Ser      Asp      Leu      Phe
TH-4        5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  TCI      GA(T/C)  CTI      TT-3'
TH-5        5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  AG(T/C)  GA(T/C)  CTI      TT-3'
TH-6        5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  TCI      GA(T/C)  TT(A/G)  TT-3'
TH-7        5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  AG(T/C)  GA(T/C)  TT(A/G)  TT-3'

In addition, a primary and a secondary sequence of an internal peptide preparation (TcbAii-PT81) have been determined and are disclosed herein as SEQ ID NO:23 and SEQ ID NO:24, respectively. Four pools of degenerate oligonucleotides ("Reverse Primers": TH-8, TH-9, TH-10 and TH-11) were similarly designed and synthesized to encode the reverse complement of sequences that encode a portion of the peptide of SEQ ID NO:23, as shown below.
Table 13
Amino Acid      Thr  Tyr      Leu      Thr  Ser      Phe      Glu      Gln      Val  Ala  Asn
TH-8        3'-TGI  AT(A/G)  GAI      TGI  AGI      AA(A/G)  CT(T/C)  GT(T/C)  CAI  CGI  TT-5'
TH-9        3'-TGI  AT(A/G)  TT(A/G)  TGI  AGI      AA(A/G)  CT(T/C)  GT(T/C)  CAI  CGI  TT-5'
TH-10       3'-TGI  AT(A/G)  GAI      TGI  TC(G/A)  AA(A/G)  CT(T/C)  GT(T/C)  CAI  CGI  TT-5'
TH-11       3'-TGI  AT(A/G)  TT(A/G)  TGI  TC(G/A)  AA(A/G)  CT(T/C)  GT(T/C)  CAI  CGI  TT-5'

Sets of these primers were used in PCR® reactions to amplify TcbAii-encoding gene fragments from the genomic Photorhabdus luminescens W-14 DNA prepared in Example 6. All PCR® reactions were run with the "Hot Start" technique using AmpliWax gems and other Perkin Elmer reagents and protocols. Typically, a mixture (total volume 11 µl) of MgCl2, dNTPs, 10X GeneAmp PCR Buffer II, and the primers was added to tubes containing a single wax bead. [10X GeneAmp PCR Buffer II is composed of 100 mM Tris-HCl, pH 8.3; and 500 mM KCl.] The tubes were heated to 80°C for 2 minutes and allowed to cool. To the top of the wax seals, a solution containing 10X GeneAmp PCR Buffer II, DNA template, and AmpliTaq DNA polymerase was added. Following melting of the wax seal and mixing of components by thermal cycling, final reaction conditions (volume of 50 µl) were: 10 mM Tris-HCl, pH 8.3; 50 mM KCl; 2.5 mM MgCl2; 200 µM each in dATP, dCTP, dGTP, dTTP; 1.25 µM in a single Forward primer pool; 1.25 µM in a single Reverse primer pool; 1.25 units of AmpliTaq DNA polymerase; and 170 ng of template DNA.
The reactions were placed in a thermocycler (as in Example 8) and run with the following program:

Table 14
Temperature    Time          Cycle Repetition
94°C           2 minutes     1X
94°C           15 seconds
55-65°C        30 seconds
72°C           1 minute
72°C           minutes       1X
 °C            Constant

A series of amplifications was run at three different annealing temperatures (55°C, 60°C, 65°C) using the degenerate primer pools. Reactions with annealing at 65°C had no amplification products visible following agarose gel electrophoresis. Reactions having a 60°C annealing regime and containing primers TH-5+TH-10 produced an amplification product that had a mobility corresponding to 2.9 kbp. A lesser amount of the 2.9 kbp product was produced under these conditions with primers TH-7+TH-10. When reactions were annealed at 55°C, these primer pairs produced more of the 2.9 kbp product, and this product was also produced by primer pairs TH-5+TH-8 and TH-5+TH-11. Additional very faint 2.9 kbp bands were seen in lanes containing amplification products from primer pairs TH-7 plus TH-8, TH-9, TH-10, or TH-11.
To obtain sufficient PCR amplification product for cloning and DNA sequence determination, 10 separate PCR reactions were set up using the primers TH-5+TH-10, and were run using the above conditions with a 55°C annealing temperature. All reactions were pooled and the 2.9 kbp product was purified by Qiaex extraction from an agarose gel as described above.
Additional sequences determined for TcbAii internal peptides are disclosed herein as SEQ ID NO:21 and SEQ ID NO:22. As before, degenerate oligonucleotides (Reverse primers TH-17 and TH-18) were made corresponding to the reverse complement of sequences that encode a portion of the amino acid sequence of these peptides.
Table 15
From SEQ ID NO:21
Amino Acid      Met  Glu      Thr  Gln      Asn      Ile  Gln      Glu      Pro
TH-17       3'-TAC  CT(T/C)  TGI  GT(T/C)  TT(A/G)  TAI  GT(T/C)  GT(T/C)

Table 16
From SEQ ID NO:22
Amino Acid      Asn      Pro  Ile  Asn      Ile  Asn      Thr  Gly  Ile  Asp
TH-18       3'-TT(A/G)  GGI  TAI  TT(A/G)  TAI  TT(A/G)  TGI  CCI  TAI

Degenerate oligonucleotides TH-18 and TH-17 were used in an amplification experiment with Photorhabdus luminescens W-14 DNA as template and primers TH-4, TH-5, TH-6, or TH-7 as the (Forward) primers. These reactions amplified products of approximately 4 kbp and 4.5 kbp, respectively. These DNAs were transferred from agarose gels to nylon membranes and hybridized with a 32P-labeled probe (as described above) prepared from the 2.9 kbp product amplified by the TH-5+TH-10 primer pair. Both the 4 kbp and the 4.5 kbp amplification products hybridized strongly to the 2.9 kbp probe. These results were used to construct a map ordering the TcbAii internal peptide sequences as shown in Fig. 3. Approximate distances between the primers are shown in nucleotides in Fig. 3.
DNA Sequence of the 2.9 kbp TcbAii-encoding Fragment
Approximately 200 ng of the purified 2.9 kbp fragment (prepared above) was precipitated with ethanol and dissolved in 17 µl of water. One-half of this was used as sequencing template with pmol of the TH-5 pool as primers; the other half was used as template for TH-10 priming. Sequencing reactions were as given in Example 8. No reliable sequence was produced using the TH-10 primer pool; however, reactions with the TH-5 primer pool produced the sequence disclosed below:
  1 AATCGTGTTG ATCCCTATGC CGNGCCGGGT TCGGTGGAAT CGATGTCCTC ACCGGGGGTT
 61 TATTNGAGGG ANTNGTCCCG TGAGGCCAAA AANTGGAATG AAAGAAGTTC AATTTNTTAC
121 CTAGATAAAC GTCGCCCGGN TTTAGAAAGN TTANTGNTCA GCCAGAAAAT TTTGGTTGAG
181 GAAATTCCAC CGNTGGTTCT CTCTATTGAT TNGGGCCTGG CCGGGTTCGA ANNAAAACNA
241 GGAAATNCAC AAGTTGAGGT GATGGNTTTG TNGCNANCTT NTCGTTTAGG TGGGGAGAAA
301 CCTTNTCANC ACGNTTNTGA AACTGTCCGG GAAATCGTCC ATGANCGTGA NCCAGGNTTN
361 CGCCATTGG

Based on this sequence, a sequencing primer (TH-21, 5'-CCGGGCGACGTTTATCTAGG-3') was designed to reverse complement bases 120-139, and initiate polymerization towards the 5' end of the gel-purified 2.9 kbp TcbAii-encoding PCR fragment. The determined sequence is shown below, and is compared to the biochemically determined N-terminal peptide sequence of TcbAii, SEQ ID NO:1.
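The design of TH-21 can be checked directly against the sequence listed above: taking bases 120-139 and forming their reverse complement should reproduce the stated primer. The following is a minimal Python sketch of that check. The function and variable names are hypothetical, and only the first 140 bases of the listed sequence are transcribed.

```python
# Complement table; N (an undetermined base) is left as N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Bases 1-140 of the sequence obtained with the TH-5 primer pool.
FRAGMENT_1_140 = (
    "AATCGTGTTG" "ATCCCTATGC" "CGNGCCGGGT" "TCGGTGGAAT" "CGATGTCCTC" "ACCGGGGGTT"
    "TATTNGAGGG" "ANTNGTCCCG" "TGAGGCCAAA" "AANTGGAATG" "AAAGAAGTTC" "AATTTNTTAC"
    "CTAGATAAAC" "GTCGCCCGGN"
)

# Bases 120-139 (1-based, inclusive) correspond to the slice [119:139].
target = FRAGMENT_1_140[119:139]
primer = reverse_complement(target)

print(target)  # CCTAGATAAACGTCGCCCGG
print(primer)  # CCGGGCGACGTTTATCTAGG
```

The result matches the TH-21 sequence given in the text, so the primer reads back toward the TH-5 end of the fragment.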
TcbAii 2.9 kbp PCR Fragment Sequence Confirmation
[Underlined amino acids encoded by degenerate oligonucleotides]

SEQ ID NO:1      F I Q G Y S D L F G A
2.9 kbp seq      GC ATG CAG GGG TAT AGT GAC CTG TTT GGT AAT CGT GCT
                 M Q G Y S D L F G N R A

From the homology of the derived amino acid sequence to the biochemically determined one, it is clear that the 2.9 kbp PCR fragment represents the TcbA coding region. This 2.9 kbp fragment was then used as a hybridization probe to screen the Photorhabdus W-14 genomic library prepared in Example 8 for cosmids containing the TcbAii-encoding gene.
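The derived amino acid sequence shown in the comparison can be reproduced by translating the determined DNA sequence in the frame that begins at its third base. The sketch below is illustrative only. It uses the standard genetic code, and the names `CODON_TABLE`, `translate`, and `READ` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate DNA from the given offset; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Determined sequence from the 2.9 kbp fragment, as listed in the comparison.
READ = "GC ATG CAG GGG TAT AGT GAC CTG TTT GGT AAT CGT GCT".replace(" ", "")

print(translate(READ, offset=2))  # MQGYSDLFGNRA
```

The ATG read at the second codon falls at a position where the forward primer pools carry ATI (Ile), so the M there reflects how the inosine position was read rather than a difference in the protein.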
Screening the Photorhabdus Cosmid Library

The 2.9 kb gel-purified PCR fragment was labeled with 32P using the Boehringer Mannheim High Prime labeling kit as described in Example 8. Filters containing remnants of approximately 800 colonies from the cosmid library were screened as described previously, and positive clones were streaked for isolated colonies and rescreened. Three clones (8A11, 25G8, and 26D1) gave positive results through several screening and characterization steps. No hybridization of the TcbAii-specific probe was ever observed with any of the four cosmids identified in Example 8, which contain the tcaB and tcaC genes. DNA from cosmids 8A11, 25G8, and 26D1 was digested with restriction enzymes Bgl II, EcoR I or Hind III (either alone or in combination with one another), and the fragments were separated on an agarose gel and transferred to a nylon membrane as described in Example 8. The membrane was hybridized with 32P-labeled probe prepared from the 4.5 kbp fragment (generated by amplification of Photorhabdus genomic DNA with primers TH-5+TH-17). The patterns generated from cosmid DNAs 8A11 and 26D1 were identical to those generated with similarly-cut genomic DNA on the same membrane. It is concluded that cosmids 8A11 and 26D1 are accurate representations of the genomic TcbAii-encoding locus. However, cosmid 25G8 has a single Bgl II fragment which is slightly larger than the corresponding genomic DNA fragment. This may result from positioning of the insert within the vector.
DNA Sequence of the tcbA-encoding Gene

The membrane hybridization analysis of cosmid 26D1 revealed that the 4.5 kbp probe hybridized to a single large EcoR I fragment (greater than 9 kbp). This fragment was gel purified and ligated into the EcoR I site of pBC KS as described in Example 8, to generate plasmid pBC-Sl/R1. The partial DNA sequence of the insert DNA of this plasmid was determined by "primer walking" from the flanking vector sequence, using procedures described in Example 8. Further sequence was generated by extension from new oligonucleotides designed from the previously determined sequence.
When compared to the determined DNA sequence for the tcbA gene identified by other methods (disclosed herein as SEQ ID NO:11 as described in Example 12 below), complete homology was found to nucleotides 1-272, 319-826, 2578-3036, and 3068-3540 (total bases 1712). It was concluded that both approaches can be used to identify DNA fragments encoding the TcbAii peptide.
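The stated total of 1712 matching bases follows directly from the four nucleotide ranges. A minimal sketch of that arithmetic is shown here; the variable and function names are illustrative only.

```python
# Nucleotide ranges (1-based, inclusive) reported as completely homologous
# between the primer-walking sequence and SEQ ID NO:11.
MATCHED_RANGES = [(1, 272), (319, 826), (2578, 3036), (3068, 3540)]


def span_length(start, end):
    """Number of bases in a 1-based, inclusive range."""
    return end - start + 1


lengths = [span_length(start, end) for start, end in MATCHED_RANGES]
total_bases = sum(lengths)

print(lengths)      # per-range lengths
print(total_bases)  # total matching bases
```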
Analysis of the Derived Amino Acid Sequence of the tcbA Gene

The sequence of the DNA fragment identified as SEQ ID NO:11 encodes a protein whose derived amino acid sequence is disclosed herein as SEQ ID NO:12. Several features verify the identity of the gene as that encoding the TcbAii protein. The TcbAii N-terminal peptide (SEQ ID NO:1; Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala) is encoded as amino acids 88-100. The TcbAii internal peptide TcbAii-PT81(a) (SEQ ID NO:23) is encoded as amino acids 1065-1077, and TcbAii-PT81(b) (SEQ ID NO:24) is encoded as amino acids 1571-1592. Further, the internal peptide TcbAii-PT56 (SEQ ID NO:22) is encoded as amino acids 1474-1488, and the internal peptide TcbAii-PT103 (SEQ ID NO:21) is encoded as amino acids 1614-1639. It is obvious that this gene is an authentic clone encoding the TcbAii peptide as isolated from insecticidal protein preparations of Photorhabdus luminescens strain W-14.
The protein isolated as peptide TcbAii is derived from cleavage of a longer peptide. Evidence for this is provided by the fact that the nucleotides encoding the TcbAii N-terminal peptide SEQ ID NO:1 are preceded by 261 bases (encoding 87 N-terminal-proximal amino acids) of a longer open reading frame (SEQ ID NO:11). This reading frame begins with nucleotides that encode the amino acid sequence Met Gln Asn Ser Leu, which corresponds to the N-terminal sequence of the large peptide TcbA, and is disclosed herein as SEQ ID NO:16. It is thought that TcbA is the precursor protein for TcbAii.
Relationship of tcbA, tcaB and tcaC Genes

The tcaB and tcaC genes are closely linked and may be transcribed as a single mRNA. The tcbA gene is borne on cosmids that apparently do not overlap the ones harboring the tcaB and tcaC cluster, since the respective genomic library screens identified different cosmids. However, comparison of the amino acid sequences encoded by the tcaB and tcaC genes with the tcbA gene reveals a substantial degree of homology. The amino acid conservation (Protein Alignment Mode of MacVector™ Sequence Analysis Software, scoring matrix pam250, hash value 2; Oxford Molecular Group, Campbell, CA) is shown in Fig. 4. On the score line of each panel in Fig. 4, up carats indicate homology or conservative amino acid changes, and down carats indicate nonhomology.
This analysis shows that the amino acid sequence of the TcbA peptide from residues 1739 to 1894 is highly homologous to amino acids 441 to 603 of the TcaBi peptide (162 of the total 627 amino acids of TcaB; SEQ ID NO:28). In addition, the sequence of TcbA amino acids 1932 to 2459 is highly homologous to amino acids 12 to 531 of peptide TcaBii (520 of the total 562 amino acids). Considering that the TcbA peptide (SEQ ID NO:12) comprises 2505 amino acids, a total of 684 amino acids at the C-proximal end of it is homologous to the TcaBi or TcaBii peptides, and the homologies are arranged colinear to the arrangement of the putative TcaB preprotein (SEQ ID NO:26). A sizeable gap in the TcbA homology coincides with the junction between the TcaBi and TcaBii portions of the TcaB preprotein. Clearly the TcbA and TcaB gene products are evolutionarily related, and it is proposed that they share some common function(s) in Photorhabdus.
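The 684-residue figure is the sum of the two homologous TcbA segments, and the same bookkeeping gives the size of the intervening gap and the share of TcbA involved. The sketch below uses only the residue numbers quoted above; the dictionary layout and names are illustrative assumptions.

```python
TCBA_LENGTH = 2505  # amino acids in TcbA (SEQ ID NO:12)

# TcbA residue ranges (1-based, inclusive) and the peptide each one matches.
HOMOLOGOUS_SEGMENTS = {
    "TcaBi": (1739, 1894),
    "TcaBii": (1932, 2459),
}

segment_lengths = {
    name: end - start + 1 for name, (start, end) in HOMOLOGOUS_SEGMENTS.items()
}
total_homologous = sum(segment_lengths.values())

# Residues lying between the two homologous segments of TcbA.
gap_start = HOMOLOGOUS_SEGMENTS["TcaBi"][1] + 1
gap_end = HOMOLOGOUS_SEGMENTS["TcaBii"][0] - 1
gap_length = gap_end - gap_start + 1

fraction_of_tcba = total_homologous / TCBA_LENGTH

print(segment_lengths, total_homologous, gap_length)
print(f"{fraction_of_tcba:.1%} of TcbA")
```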
Example
Characterization of Zinc-metalloproteases in Photorhabdus Broth: Protease Inhibition, Classification and Purification

Protease Inhibition and Classification Assays: Protease assays were performed using FITC-casein dissolved in water as substrate (0.08% final assay concentration). Proteolysis reactions were performed at 25 °C for 1 h in the appropriate buffer with 25 µl of Photorhabdus broth (150 µl total reaction volume). Samples were also assayed in the presence and absence of dithiothreitol. After incubation, an equal volume of 12% trichloroacetic acid was added to precipitate undigested protein. Following precipitation and subsequent centrifugation, 100 µl of the supernatant was placed into a 96-well microtiter plate and the pH of the solution was adjusted by addition of an equal volume of 4N NaOH.
Proteolysis was then quantitated using a Fluoroskan II fluorometric plate reader at excitation and emission wavelengths of 485 and 538 nm, respectively. Protease activity was tested over a range from pH 5.0-10.0 in 0.5 unit increments. The following buffers were used at 50 mM final concentration: sodium acetate (from pH 5.0), Tris-HCl (from pH 7.0), and bis-Tris propane (pH 8.5-10.0). To identify the class of protease(s) observed, crude broth was treated with a variety of protease inhibitors (0.5 mg/ml final concentration) and then examined for protease activity at pH 8 using the substrate described above. The protease inhibitors used included E-64 (L-trans-epoxysuccinyl-leucylamido[4-guanidino]butane), 3,4 dichloroisocoumarin, leupeptin, pepstatin, amastatin, ethylenediaminetetraacetic acid (EDTA) and 1,10 phenanthroline.
Protease assays performed over a pH range revealed that protease(s) were indeed present which exhibited maximal activity at about pH 8.0 (Table 17). Addition of DTT did not have any effect on protease activity. Crude broth was then treated with a variety of protease inhibitors (Table 18). Treatment of crude broth with the inhibitors described above revealed that 1,10 phenanthroline caused complete inhibition of all protease activity when added at a final concentration of 50 µg, with an IC50 of 5 µg in 100 µl of a 2 mg/ml crude broth solution. These data indicate that the most abundant protease(s) found in the Photorhabdus broth are from the zinc-metalloprotease class of enzymes.
Table 17
Effect of pH on the Protease Activity Found in a Day 1 Production of Photorhabdus luminescens (Strain W-14)

pH      Flu. Units(a)     Percent Activity(b)
5.0      3013 ± 78         17
5.5      7994 ± 448
6.0     12965 ± 483        74
6.5     14390 ± 1291       82
7.0     14386 ± 1287       82
7.5     14135 ± 198
8.0     17582 ± 831       100
8.5     16183 ± 953        92
9.0     16795 ± 760        96
9.5     16279 ± 1022       93
10.0    15225 ± 210        87

a Flu. Units = Fluorescence Units (maximum about 28,000; background about 2200).
b Percent activity relative to the maximum at pH 8.0.

Table 18
Effect of Different Protease Inhibitors on the Protease Activity at pH 8 Found in a Day 1 Production of Photorhabdus luminescens (Strain W-14)

Inhibitor                      Corrected Flu. Units(a)    Percent Inhibition(b)
Control                        13053                       0
E-64                           14259                       0
1,10 Phenanthroline(c)            15                      99
3,4 Dichloroisocoumarin(d)      7956                      39
Leupeptin                      13074                       0
Pepstatin(c)                   13441                       0
Amastatin                      12474                       4
DMSO Control                   12005                       8
Methanol Control               12125                       7

a Corrected Flu. Units = Fluorescence Units - background (2200 flu. units).
b Percent Inhibition relative to protease activity at pH 8.
c Inhibitors were dissolved in methanol.
d Inhibitors were dissolved in DMSO.
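The percentage columns of Tables 17 and 18 can be recomputed from the fluorescence readings. The sketch below is an illustration under stated assumptions: Table 17 values are taken relative to the highest mean reading and rounded to the nearest integer, while Table 18 values are taken relative to the untreated control, clamped at zero, and truncated to an integer. The truncation is an assumption chosen only because it reproduces the tabulated figures; the source does not state how the values were rounded.

```python
# Mean fluorescence readings from Table 17, in the order tabulated
# (lowest pH first).
TABLE_17_MEANS = [3013, 7994, 12965, 14390, 14386, 14135,
                  17582, 16183, 16795, 16279, 15225]


def percent_of_maximum(readings):
    """Express each reading as a rounded percentage of the largest reading."""
    top = max(readings)
    return [round(100 * value / top) for value in readings]


# Background-corrected readings from Table 18.
CONTROL = 13053
TABLE_18 = {
    "E-64": 14259,
    "1,10 Phenanthroline": 15,
    "3,4 Dichloroisocoumarin": 7956,
    "Leupeptin": 13074,
    "Pepstatin": 13441,
    "Amastatin": 12474,
    "DMSO Control": 12005,
    "Methanol Control": 12125,
}


def percent_inhibition(control, treated):
    """Inhibition relative to control, clamped at 0 and truncated (assumed)."""
    inhibition = 100 * (control - treated) / control
    return int(max(0.0, inhibition))


relative_activity = percent_of_maximum(TABLE_17_MEANS)
inhibition = {name: percent_inhibition(CONTROL, value)
              for name, value in TABLE_18.items()}

print(relative_activity)
for name, value in inhibition.items():
    print(f"{name}: {value}%")
```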
The isolation of a zinc-metalloprotease was performed by applying dialyzed 10-80% ammonium sulfate pellet to a Q Sepharose column equilibrated at 50 mM Na2PO4, pH 7.0, as described previously for Photorhabdus toxin. After extensive washing, a 0 to 0.5 M NaCl gradient was used to elute toxin protein. The majority of biological activity and protein was eluted from 0.15-0.45 M NaCl.
However, it was observed that the majority of proteolytic activity was present in the 0.25-0.35 M NaCl fraction with some activity in the 0.15-0.25 M NaCl fraction. SDS-PAGE analysis of the 0.25-0.35 M NaCl fraction showed a major peptide band of approximately 60 kDa. The 0.15-0.25 M NaCl fraction contained a similar 60 kDa band but at lower relative protein concentration. Subsequent gel filtration of this fraction using a Superose 12 HR 16/50 column resulted in a major peak migrating at 57.5 kDa that contained a predominant (90% of total stained protein) 58.5 kDa band by SDS-PAGE analysis. Additional analysis of this fraction using various protease inhibitors as described above determined that the protease was a zinc-metalloprotease. Nearly all of the protease activity present in Photorhabdus broth at day 1 of fermentation corresponded to the about 58 kDa zinc-metalloprotease.
In yet a second isolation of zinc-metalloprotease(s), W-14 Photorhabdus broth grown for three days was taken and protease activity was visualized using sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) laced with gelatin as described in Schmidt, Bleakley, B. and Nealson, K.M. 1988. SDS running gels (5.5 x 8 cm) were made with 12.5% polyacrylamide (40% stock solution of acrylamide/bis-acrylamide; Sigma Chemical Co., St. Louis, MO) into which 0.1% gelatin final concentration (Biorad EIA grade reagent; Richmond, CA) was incorporated upon dissolving in water. SDS-stacking gels (1.0 x 8 cm) were made with polyacrylamide, also laced with 0.1% gelatin. Typically, 2.5 µg of protein to be tested was diluted in 0.03 ml of SDS-PAGE loading buffer without dithiothreitol (DTT) and loaded onto the gel. Proteins were electrophoresed in SDS running buffer (Laemmli, U.K. 1970. Nature 227, 680) at 0 °C and at 8 mA. After electrophoresis was complete, the gel was washed for 2 h in 2.5% Triton X-100. Gels were then incubated for 1 h at 37 °C in 0.1 M glycine. After incubation, gels were fixed and stained overnight with 0.1% amido black in methanol-acetic acid-water (30:10:60, vol./vol./vol.; Sigma Chemical Co.). Protease activity was visualized as light areas against a dark, amido black stained background due to proteolysis and subsequent diffusion of incorporated gelatin. At least three distinct bands produced by proteolytic activity at 58, 41, and 38 kDa were observed.
Activity assays of the different proteases in W-14 day three culture broth were performed using FITC-casein dissolved in water as substrate (0.02% final assay concentration). Proteolysis experiments were performed at 37 °C for 0-0.5 h in 0.1 M Tris-HCl with different protein fractions in a total volume of 0.15 ml. Reactions were terminated by addition of an equal volume of 12% trichloroacetic acid (TCA) dissolved in water. After incubation at room temperature for 0.25 h, samples were centrifuged at 10,000 x g for 0.25 h and 0.10 ml aliquots were removed and placed into 96-well microtiter plates. The solution was then neutralized by the addition of an equal volume of 2 N sodium hydroxide, followed by quantitation using a Fluoroskan II fluorometric plate reader with excitation and emission wavelengths of 485 and 538 nm, respectively. Activity measurements were performed using FITC-casein with different protease concentrations at 37 °C for 0-10 min.
A unit of activity was arbitrarily defined as the amount of enzyme needed to produce 1000 fluorescent units/min and specific activity was defined as units/mg of protease.
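The unit definition above translates into a simple calculation. The following sketch applies it to hypothetical numbers; the function names and the example readings are assumptions for illustration and do not come from the source.

```python
# One unit = the amount of enzyme producing 1000 fluorescence units per minute.
FLU_UNITS_PER_MIN_PER_UNIT = 1000.0


def activity_units(fluorescence_increase, minutes):
    """Units of activity from a fluorescence increase over a time interval."""
    rate = fluorescence_increase / minutes  # fluorescence units per minute
    return rate / FLU_UNITS_PER_MIN_PER_UNIT


def specific_activity(units, protease_mg):
    """Specific activity in units per mg of protease."""
    return units / protease_mg


# Hypothetical example: a rise of 25,000 fluorescence units in 10 min
# produced by 0.5 mg of protease.
units = activity_units(25000, 10)
print(units)                          # units of activity
print(specific_activity(units, 0.5))  # units/mg
```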
Inhibition studies were performed using two zinc-metalloprotease inhibitors, 1,10 phenanthroline and N-(α-rhamnopyranosyloxyhydroxyphosphinyl)-Leu-Trp (phosphoramidon), with stock solutions of the inhibitors dissolved in 100% ethanol and water, respectively. Stock concentrations were typically 10 mg/ml and 5 mg/ml for 1,10 phenanthroline and phosphoramidon, respectively, with final concentrations of inhibitor at 0.5-1.0 mg/ml per reaction. Treatment of three day W-14 crude broth with 1,10 phenanthroline, an inhibitor of all zinc metalloproteases, resulted in complete elimination of all protease activity, while treatment with phosphoramidon, an inhibitor of thermolysin-like proteases (Weaver, Kester, and Matthews, B.W. 1977. J. Mol. Biol. 114, 119-132), resulted in about 56% reduction of protease activity. The residual proteolytic activity could not be further reduced with additional phosphoramidon.
The proteases of three day W-14 Photorhabdus broth were purified as follows: 4.0 liters of broth were concentrated using an Amicon spiral ultrafiltration cartridge Type S1Y100 attached to an Amicon M-12 filtration device. The flow-through material having native proteins less than 100 kDa in size (3.8 L) was concentrated to 0.375 L using an Amicon spiral ultrafiltration cartridge Type S1Y10 attached to an Amicon M-12 filtration device. The retentate material contained proteins ranging in size from 10-100 kDa. This material was loaded onto a Pharmacia HR16/10 column which had been packed with PerSeptive Biosystems (Framingham, MA) Poros® 50 HQ strong anion exchange packing that had been equilibrated in 10 mM sodium phosphate buffer (pH 7.0). Proteins were loaded on the column at a flow rate of 5 ml/min, followed by washing unbound protein with buffer until A280 = 0.00. Afterwards, proteins were eluted using a NaCl gradient of 0-1.0 M NaCl in 40 min at a flow rate of 7.5 ml/min. Fractions were assayed for protease activity, supra, and active fractions were pooled. Proteolytically active fractions were diluted with 50% 10 mM sodium phosphate buffer (pH 7.0) and loaded onto a Pharmacia HR 10/10 Mono Q column equilibrated in 10 mM sodium phosphate. After washing the column with buffer until A280 = 0.00, proteins were eluted using a NaCl gradient of 0-0.5 M NaCl for 1 h at a flow rate of 2.0 ml/min.
Fractions were assayed for protease activity. Those fractions having the greatest amount of phosphoramidon-sensitive protease activity (the phosphoramidon-sensitive activity being due to the 41/38 kDa protease, infra) were pooled. These fractions were found to elute at a range of 0.15-0.25 M NaCl. Fractions containing a predominance of phosphoramidon-insensitive protease activity, the 58 kDa protease, were also pooled. These fractions were found to elute at a range of 0.25-0.35 M NaCl. The phosphoramidon-sensitive protease fractions were then concentrated to a final volume of 0.75 ml using a Millipore centrifugal filter device with a Biomax-5K NMWL membrane. This material was applied at a flow rate of 0.5 ml/min to a Pharmacia HR 10/30 column that had been packed with Pharmacia Sephadex equilibrated in 10 mM sodium phosphate buffer (pH 7.0), 0.1 M NaCl. Fractions having the maximal phosphoramidon-sensitive protease activity were then pooled and centrifuged over a Millipore Ultrafree®-15 centrifugal filter device with a Biomax-50K NMWL membrane. Proteolytic activity analysis, supra, indicated this material to have only phosphoramidon-sensitive protease activity. Pooling of the phosphoramidon-insensitive protease, the 58 kDa protein, was followed by concentrating in a Millipore Ultrafree®-15 centrifugal filter device with a Biomax-50K NMWL membrane and further separation on a Pharmacia Superdex-75 column. Fractions containing the protease were pooled.
Analysis of the purified 58 kDa and 41/38 kDa proteases revealed that, while both types of protease were completely inhibited with 1,10 phenanthroline, only the 41/38 kDa protease was inhibited with phosphoramidon. Further analysis of crude broth indicated that 23% of the total protease activity of day 1 W-14 broth is due to the 41/38 kDa protease, increasing to 44% in day three W-14 broth.
Standard SDS-PAGE analysis for examining protein purity and obtaining amino terminal sequence was performed using 4-20% gradient MiniPlus SepraGels purchased from Integrated Separation Systems (Natick, MA). Proteins to be amino-terminal sequenced were blotted onto PVDF membrane following purification, infra (ProBlott™ Membranes; Applied Biosystems, Foster City, CA), visualized with 0.1% amido black, excised, and sent to Cambridge Prochem, Cambridge, MA, for sequencing.
Deduced amino terminal sequence of the 58- (SEQ ID NO:45) and 41/38 kDa (SEQ ID NO:44) proteases from three day old W-14 broth were DV-GSEKANEKLK (SEQ ID NO: 45) and DSGDDDKVTNTDIHR (SEQ ID NO:44), respectively.
Sequencing of the 41/38 kDa protease revealed several amino termini, each one having an additional amino acid removed by proteolysis. Examination of the primary, secondary, tertiary and quaternary sequences for the 38 and 41 kDa polypeptides allowed for deduction of the sequence shown above and revealed that these two proteases are homologous.
Example 11. Part A
Screening of Photorhabdus Genomic Library Via Use of Antibodies for Genes Encoding TcbA Peptide

In parallel to the sequencing described above, suitable probing and sequencing was done based on the TcbAii peptide (SEQ ID NO:1). This sequencing was performed by preparing bacterial culture broths and purifying the toxin as described in Examples 1 and 2 above.
Genomic DNA was isolated from the Photorhabdus luminescens strain W-14 grown in Grace's insect tissue culture medium. The bacteria were grown in 5 ml of culture medium in a 250 ml Erlenmeyer flask at 28 °C and 250 rpm for approximately 24 hours. Bacterial cells from 100 ml of culture medium were pelleted at 5000 x g for 10 minutes. The supernatant was discarded, and the cell pellets then were used for the genomic DNA isolation.
The genomic DNA was isolated using a modification of the CTAB method described in Section 2.4.3 of Ausubel (supra). The section entitled "Large Scale CsCl prep of bacterial genomic DNA" was followed through step 6. At this point, an additional chloroform/isoamyl alcohol (24:1) extraction was performed, followed by a phenol/chloroform/isoamyl alcohol (25:24:1) extraction step and a final chloroform/isoamyl alcohol (24:1) extraction. The DNA was precipitated by the addition of a 0.6 volume of isopropanol. The precipitated DNA was hooked and wound around the end of a bent glass rod, dipped briefly into 70% ethanol as a final wash, and dissolved in 3 ml of TE buffer.
The DNA concentration, estimated by optical density at 280/260 nm, was approximately 2 mg/ml.
Using this genomic DNA, a library was prepared. Approximately 50 µg of genomic DNA was partly digested with Sau3A I. Then NaCl density gradient centrifugation was used to size fractionate the partially digested DNA fragments. Fractions containing DNA fragments with an average size of 12 kb, or larger, as determined by agarose gel electrophoresis, were ligated into the plasmid Bluescript (Stratagene, La Jolla, California) and transformed into an E. coli DH5α or DH10B strain.
Separately, purified aliquots of the protein were sent to the biotechnology hybridoma center at the University of Wisconsin, Madison for production of monoclonal antibodies to the proteins. The material that was sent was the HPLC purified fraction containing native bands 1 and 2 which had been denatured at 65 °C, and 20 µg of which was injected into each of four mice. Stable monoclonal antibody-producing hybridoma cell lines were recovered after spleen cells from unimmunized mouse were fused with a stable myeloma cell line. Monoclonal antibodies were recovered from the hybridomas.
Separately, polyclonal antibodies were created by taking native agarose gel purified band 1 (see Example 1) protein, which was then used to immunize a New Zealand white rabbit. The protein was prepared by excising the band from the native agarose gels, briefly heating the gel pieces to 65 °C to melt the agarose, and immediately emulsifying with adjuvant. Freund's complete adjuvant was used for the primary immunizations and Freund's incomplete was used for 3 additional injections at monthly intervals. For each injection, approximately 0.2 ml of emulsified band 1, containing up to 100 micrograms of protein, was delivered by multiple subcutaneous injections into the back of the rabbit. Serum was obtained 10 days after the final injection and additional bleeds were performed at weekly intervals for 3 weeks. The serum complement was inactivated by heating to 56 °C for 15 minutes and then stored at -20 °C.
The monoclonal and polyclonal antibodies were then used to screen the genomic library for the expression of antigens which could be detected by the epitope. Positive clones were detected on nitrocellulose filter colony lifts. An immunoblot analysis of the positive clones was undertaken.
An analysis of the clones as defined by both immunoblot and Southern analysis resulted in the tentative identification of four genomic regions.
In the first region was a gene encoding the peptide designated here as TcbAii. Full DNA sequence of this gene (tcbA) was obtained. It is set forth as SEQ ID NO:11. Confirmation that the sequence encodes the internal sequence of SEQ ID NO:1 is demonstrated by the presence of SEQ ID NO:1 at amino acid number 88 from the deduced amino acid sequence created by the open reading frame of SEQ ID NO:11. This can be confirmed by referring to SEQ ID NO:12, which is the deduced amino acid sequence created by SEQ ID NO:11.
The second region of toxin peptides contains the segments referred to above as TcaBi, TcaBii and TcaC. Following the screening of the library with the polyclonal antisera, this second region of toxin genes was identified by several clones which produced different size proteins, all of which cross-reacted with the polyclonal antibody on an immunoblot and were also found to share DNA homology on a Southern blot. Sequence comparison revealed that they belonged to the gene complex designated TcaB and TcaC above.
Two other regions of antibody toxin clones were also isolated in the polyclonal screen. These regions produced proteins that cross-react with a polyclonal antibody and also shared DNA homology with the regions as determined by Southern blotting. Thus, it appears that the Photorhabdus luminescens extracellular protein genes represent a family of genes which are evolutionarily related.
To further pursue the concept that there might be evolutionarily related variations in the toxin peptides contained within this organism, two approaches have been undertaken to examine other strains of Photorhabdus luminescens for the presence of related proteins. This was done both by PCR amplification of genomic DNA and by immunoblot analysis using the polyclonal and monoclonal antibodies.
The results indicate that related proteins are produced by Photorhabdus luminescens strains WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-11, WX-12, WX-15 and W-14.
Example 11. Part B
Sequence and Analysis of tcc Toxin Clones

Further DNA sequencing was performed on plasmids isolated from E. coli clones described in Example 11, Part A. The nucleotide sequence from the third region of E. coli clones was shown to be three closely linked open reading frames at this genomic locus. This locus was designated tcc with the three open reading frames designated tccA (SEQ ID NO:56), tccB (SEQ ID NO:58) and tccC (SEQ ID NO:60). The close linkage between these open reading frames is revealed by examination of SEQ ID NO:56, in which 93 bp separate the stop codon of tccA from the start codon of tccB (bases 2992-2994 of SEQ ID NO:56), and by examination of SEQ ID NO:58, in which 131 bases separate the stop codon of tccB and the start codon of tccC (bases 4930-4932 of SEQ ID NO:58). The physical map is presented in Fig. 6B.
The deduced amino acid sequence from the tccA open reading frame indicates that the gene encodes a protein of 105,459 Da.
This protein was designated TccA (SEQ ID NO:57). The first 12 amino acids of this protein match the N-terminal sequence obtained from a 108 kDa protein, SEQ ID NO:8, previously identified as part of the toxin complex.
The deduced amino acid sequence from the tccB open reading frame indicates that this gene encodes a protein of 175,716 Da. This protein was designated TccB (SEQ ID NO:59). The first 11 amino acids of this protein match the N-terminal sequence obtained from a protein with estimated molecular weight of 185 kDa, SEQ ID NO:7. Similarity analysis revealed that the TccB protein is related to the proteins identified as TcbA (SEQ ID NO:12; 37% similarity and 28% identity), TcdA (SEQ ID NO:47; 35% similarity and 28% identity), and TcaB (SEQ ID NO:26; 32% similarity and 26% identity), using the GAP algorithm (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wisconsin).
The deduced amino acid sequence of tccC indicated that this open reading frame encodes a protein of 111,694 Da and the protein product was designated TccC (SEQ ID NO:61).

Example 12
Characterization of Photorhabdus Strains

In order to establish that the collection described herein was comprised of Photorhabdus strains, the strains herein were assessed in terms of recognized microbiological traits that are characteristic of Photorhabdus and which differentiate it from other Enterobacteriaceae and Xenorhabdus spp. (Farmer, J. J. 1984. Bergey's Manual of Systematic Bacteriology, Vol 1. pp. 510-511. (ed. Krieg, N. R. and Holt, J.), Williams & Wilkins, Baltimore; Akhurst and Boemare, 1988; Boemare et al., 1993). These characteristic traits are as follows: Gram's stain negative rods, organism size of 0.5-2 µm in width and 2-10 µm in length, red/yellow colony pigmentation, presence of crystalline inclusion bodies, presence of catalase, inability to reduce nitrate, presence of bioluminescence, ability to take up dye from growth media, positive for protease production, growth-temperature range below 37 °C, survival under anaerobic conditions and positively motile (Table 20). Reference Escherichia coli, Xenorhabdus and Photorhabdus strains were included in all tests for comparison.
The overall results are consistent with all strains being part of the family Enterobacteriaceae and the genus Photorhabdus.
A luminometer was used to establish the bioluminescence of each strain and provide a quantitative and relative measurement of light production. For measurement of relative light emitting units, the broths from each strain (cells and media) were measured at three time intervals after inoculation in liquid culture (including 12 and 24 hr) and compared to background luminosity (uninoculated media and water). Prior to measuring light emission from the various broths, cell density was established by measuring light absorbance (560 nm) in a Gilford Systems (Oberlin, OH) spectrophotometer using a sipper cell. Appropriate dilutions were then made (to normalize optical density to 1.0 unit) before measuring luminosity. Aliquots of the diluted broths were then placed into cuvettes (300 µl each) and read in a Bio-Orbit 1251 Luminometer (Bio-Orbit Oy, Turku, Finland). The integration period for each sample was 45 seconds. The samples were continuously mixed (spun in baffled cuvettes) while being read to provide oxygen availability. A positive test was determined as being 2x background luminescence (about 5-10 units). In addition, colony luminosity was detected with photographic film overlays and visually, after adaptation in a darkroom.

The Gram's staining characteristics of each strain were established with a commercial Gram's stain kit (BBL, Cockeysville, MD) used in conjunction with Gram's stain control slides (Fisher Scientific, Pittsburgh, PA).
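The luminometer protocol above (dilute each broth to an optical density of 1.0, then compare the reading with background) can be sketched as follows. The function names and example readings are hypothetical, and treating the criterion as "at least twice background" is an assumption about how the threshold was applied.

```python
def dilution_factor(measured_od, target_od=1.0):
    """Fold dilution needed to bring a broth down to the target OD (560 nm)."""
    if measured_od < target_od:
        raise ValueError("broth is already below the target optical density")
    return measured_od / target_od


def is_luminescent(sample_units, background_units, fold=2.0):
    """Positive test: sample reading at least `fold` times background (assumed)."""
    return sample_units >= fold * background_units


# Hypothetical readings: a broth at OD 3.0 and a background of 10 units.
print(dilution_factor(3.0))    # fold dilution to reach OD 1.0
print(is_luminescent(25, 10))  # reading well above twice background
print(is_luminescent(15, 10))  # reading below twice background
```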
Microscopic evaluation was then performed using a Zeiss microscope (Carl Zeiss, Germany) 100X oil immersion objective lens (with 10X ocular and 2X body magnification). Microscopic examination of individual strains for organism size, cellular description and inclusion bodies (the latter after logarithmic growth) was performed using wet mount slides (10X ocular, 2X body and objective magnification) with oil immersion and phase contrast microscopy with a micrometer (Akhurst, R.J. and Boemare, N.E. 1990. Entomopathogenic Nematodes in Biological Control (ed. Gaugler, R. and Kaya, H.), pp. 75-90. CRC Press, Boca Raton, USA; Baghdiguian, Boyer-Giglio, Thaler, Bonnot and Boemare, N. 1993. Biol. Cell 79, 177-185). Colony pigmentation was observed after inoculation on Bacto nutrient agar (Difco Laboratories, Detroit, MI) prepared as per label instructions. Incubation occurred at 28 °C and descriptions were produced after 5-7 days. To test for the presence of the enzyme catalase, a colony of the test organism was removed on a small plug from a nutrient agar plate and placed into the bottom of a glass test tube. One ml of a household hydrogen peroxide solution was gently added down the side of the tube. A positive reaction was recorded when bubbles of gas (presumptive oxygen) appeared immediately or within 5 seconds.
Controls of uninoculated nutrient agar and hydrogen peroxide solution were also examined. To test for nitrate reduction, each culture was inoculated into 10 ml of Bacto Nitrate Broth (Difco Laboratories, Detroit, MI). After 24 hours incubation at 28 °C, nitrite production was tested by the addition of two drops of sulfanilic acid reagent and two drops of alpha-naphthylamine reagent (see Difco Manual, 10th edition, Difco Laboratories, Detroit, MI, 1984). The generation of a distinct pink or red color indicates the formation of nitrite from nitrate. The ability of each strain to take up dye from growth media was tested with Bacto MacConkey agar containing the dye neutral red, Bacto Tergitol-7 agar containing the dye bromothymol blue, and Bacto EMB Agar containing the dye eosin-Y (agars from Difco Laboratories, Detroit, MI, all prepared according to label instructions). After inoculation on these media, dye uptake was recorded after incubation at 28 °C for 5 days. Growth on these latter media is characteristic for members of the family Enterobacteriaceae.
Motility of each strain was tested using a solution of Bacto Motility Test Medium (Difco Laboratories, Detroit, MI) prepared as per label instructions. A butt-stab inoculation was performed with each strain and motility was judged macroscopically by a diffuse zone of growth spreading from the line of inoculum. In many cases, motility was also observed microscopically from liquid culture under wet mount slides. Biochemical nutrient evaluation for each strain was performed using BBL Enterotube II (Becton Dickinson, Germany). Product instructions were followed with the exception that incubation was carried out at 28 °C for 5 days. Results were consistent with previously cited reports for Photorhabdus. The production of protease was tested by observing hydrolysis of gelatin using Bacto gelatin (Difco Laboratories, Detroit, MI) plates made as per label instructions. Cultures were inoculated and the plates were incubated at 28 °C for 5 days. To assess growth at different temperatures, agar plates [proteose peptone #3 with two percent Bacto-Agar (Difco, Detroit, MI) in deionized water] were streaked from a common source of inoculum. Plates were sealed with Nesco® film and incubated at 20, 28 and 37 °C for up to three weeks. Plates showing no growth at 37 °C showed no cell viability after transfer to a 28 °C incubator for one week. Oxygen requirements for Photorhabdus strains were tested in the following manner. A butt-stab inoculation into fluid thioglycolate broth medium (Difco, Detroit, MI) was made. The tubes were incubated at room temperature for one week and cultures were then examined for type and extent of growth. The indicator resazurin demonstrates the level of medium oxidation or the aerobiosis zone (Difco Manual, 10th edition, Difco Laboratories, Detroit, MI). Growth zone results obtained for the Photorhabdus strains tested were consistent with those of a facultative anaerobic microorganism.
[Table: Taxonomic Traits of Photorhabdus Strains. Columns: Strain; Traits Assessed* (A through Q). Table body not recoverable.]
* A=Gram's stain, B=Crystalline inclusion bodies, C=Bioluminescence, D=Cell form, E=Motility, F=Nitrate reduction, G=Presence of catalase, H=Gelatin hydrolysis, I=Dye uptake, J=Pigmentation, K=Growth on EMB agar, L=Growth on MacConkey agar, M=Growth on Tergitol-7 agar, N=Facultative anaerobe, O=Growth at 20°C, P=Growth at 28°C, Q=Growth at 37°C; +/- = positive or negative for trait, rd=rod, S=sized within Genus descriptors, RO=red-orange, LR=light red, R=red, O=orange, Y=yellow, T=tan, LY=light yellow, YT=yellow tan, and LO=light orange.
Cellular fatty acid analysis is a recognized tool for bacterial characterization at the genus and species level (Tornabene, T. G. 1985. Lipid Analysis and the Relationship to Chemotaxonomy in Methods in Microbiology, Vol. 18, 209-234; Goodfellow, M. and O'Donnell, A. G. 1993. Roots of Bacterial Systematics in Handbook of New Bacterial Systematics (ed. Goodfellow, M. and O'Donnell, A. G.), pp. 3-54. London: Academic Press Ltd.). These references are incorporated herein by reference, and were used to confirm that our collection was related at the genus level. Cultures were shipped to an external contract laboratory for fatty acid methyl ester (FAME) analysis using a Microbial ID (MIDI, Newark, DE, USA) Microbial Identification System (MIS). The MIS system consists of a Hewlett Packard HP5890A gas chromatograph with a 25 m x 0.2 mm 5% methylphenyl silicone fused silica capillary column. Hydrogen is used as the carrier gas and a flame-ionization detector functions in conjunction with an automatic sampler, integrator and computer. The computer compares the sample fatty acid methyl esters to a microbial fatty acid library and against a calibration mix of known fatty acids. As selected by the contract laboratory, strains were grown for 24 hours at 28°C on trypticase soy agar prior to analysis. Extraction of samples was performed by the contract lab as per standard FAME methodology. There was no direct identification of the strains to any luminescent bacterial group other than Photorhabdus. When the cluster analysis, which compares the fatty acid profiles of a group of isolates, was performed, the strain fatty acid profiles were related at the genus level.
The evolutionary diversity of the Photorhabdus strains in our collection was measured by analysis of PCR (Polymerase Chain Reaction) mediated genomic fingerprinting using genomic DNA from each strain. This technique is based on families of repetitive DNA sequences present throughout the genome of diverse bacterial species (reviewed by Versalovic, Schneider, de Bruijn, F. J. and Lupski, J. R. 1994. Methods Mol. Cell. Biol., 5, 25-40). Three of these, the repetitive extragenic palindromic sequence (REP), the enterobacterial repetitive intergenic consensus (ERIC) and the BOX element, are thought to play an important role in the organization of the bacterial genome. Genomic organization is believed to be shaped by selection, and the differential dispersion of these elements within the genome of closely related bacterial strains can be used to discriminate these strains (Louws, F. J., Fulbright, D., Stephens, C. T. and de Bruijn, F. J. 1994. Appl. Environ. Micro. 60, 2286-2295). Rep-PCR utilizes oligonucleotide primers complementary to these repetitive sequences to amplify the variably sized DNA fragments lying between them. The resulting products are separated by electrophoresis to establish the DNA "fingerprint" for each strain.
To isolate genomic DNA from our strains, cell pellets were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml and 12 ml of 5 M NaCl was then added. This mixture was centrifuged 20 min at 15,000 x g. The resulting pellet was resuspended in 5.7 ml of TE, and 300 µl of 10% SDS and proteinase K (20 mg/ml; Gibco BRL Products, Grand Island, NY) were added. This mixture was incubated at 37°C for 1 hr, approximately 10 mg of lysozyme was then added, and the mixture was incubated for an additional 45 min. One milliliter of 5 M NaCl and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were then added and the mixture was incubated 10 min at 65°C, gently agitated, then incubated and agitated for an additional 20 min to aid in clearing of the cellular material. An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed gently, then centrifuged. Two extractions were then performed with an equal volume of phenol/chloroform/isoamyl alcohol (50:49:1). Genomic DNA was precipitated with 0.6 volume of isopropanol. Precipitated DNA was removed with a glass rod, washed twice with 70% ethanol, dried and dissolved in 2 ml of STE (10 mM Tris-HCl, 10 mM NaCl, 1 mM EDTA). The DNA was then quantitated by optical density at 260 nm. To perform rep-PCR analysis of Photorhabdus genomic DNA the following primers were used: REP1R-I, 5'-IIIICGICGICATCIGGC-3' and REP2-I, 5'-ICGICTTATCIGGCCTAC-3'.
PCR was performed using the following 25 µl reaction: 7.75 µl H2O, 1 10X LA buffer (PanVera Corp., Madison, WI), 16 µl dNTP mix (mM each), 1 µl of each primer at 50 pM/µl, 1 µl DMSO, 1.5 µl genomic DNA (concentrations ranged from 0.075-0.480 µg/µl) and 0.25 µl TaKaRa EX Taq (PanVera Corp., Madison, WI). The PCR amplification was performed in a Perkin Elmer DNA Thermal Cycler (Norwalk, CT) using the following conditions: 95°C/7 min, then cycles of 94°C/1 min, 44°C/1 min, 65°C/8 min, followed by 15 min at 65°C. After cycling, the 25 µl reaction was added to 5 µl of 6X gel loading buffer (0.25% bromophenol blue, 40% w/v sucrose in H2O). A 15 x 20 cm 1% agarose gel was then run in TBE buffer (0.09 M Tris-borate, 0.002 M EDTA) using 8 µl of each reaction. The gel was run for approximately 16 hours at 45 V. Gels were then stained in 20 µg/ml ethidium bromide for 1 hour and destained in TBE buffer for approximately 3 hours. Polaroid® photographs of the gels were then taken under UV illumination.
The presence or absence of bands at specific sizes for each strain was scored from the photographs and entered as a similarity matrix in the numerical taxonomy software program NTSYS-pc (Exeter Software, Setauket, NY). Controls of E. coli strain HB101 and Xanthomonas oryzae pv. oryzae assayed at the same time produced PCR "fingerprints" corresponding to published reports (Versalovic, J., Koeuth, T. and Lupski, J. R. 1991. Nucleic Acids Res. 19, 6823-6831; Vera Cruz, C. Halda-Alija, Louws, Skinner, D. Z., George, M. Nelson, R. de Bruijn, F. Rice, C. and Leach, J. E. 1995. Int. Rice Res. Notes, 20, 23-24; Vera Cruz, C. M., Ardales, E. Skinner, D. Talag, Nelson, R. Louws, F. Leung, Mew, T. W. and Leach, J. E. 1996. Phytopathology (in press), respectively). The data from Photorhabdus strains were then analyzed with a series of programs within NTSYS-pc: SIMQUAL (Similarity for Qualitative data) to generate a matrix of similarity coefficients (using the Jaccard coefficient) and SAHN (Sequential, Agglomerative, Hierarchical and Nested) clustering [using the UPGMA (Unweighted Pair-Group Method with Arithmetic Averages) method], which groups related strains and can be expressed as a phenogram (Fig. [...]). The COPH (cophenetic values) and MXCOMP (matrix comparison) programs were used to generate a cophenetic value matrix and compare the correlation between this and the original matrix upon which the clustering was based. A resulting normalized Mantel statistic was generated, which is a measure of the goodness of fit for a cluster analysis (r=0.8-0.9 represents a very good fit). In our case r = 0.919. Therefore, our collection is comprised of a diverse group of easily distinguishable strains representative of the Photorhabdus genus.
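The Jaccard coefficient used in the similarity step above can be illustrated with a minimal Python sketch. It is not the NTSYS-pc implementation; the strain names and band sizes are hypothetical placeholders, and the function names are chosen here only for illustration. Each strain is represented by the set of band sizes scored as present, and the coefficient is the number of shared bands divided by the number of bands present in either strain.

```python
from itertools import combinations


def jaccard(bands_a, bands_b):
    """Jaccard coefficient: shared bands / bands present in either profile."""
    union = bands_a | bands_b
    if not union:
        # Convention chosen for this sketch: two empty profiles count as identical.
        return 1.0
    return len(bands_a & bands_b) / len(union)


def similarity_matrix(profiles):
    """Return sorted strain names and a dict keyed by (name, name) pairs."""
    names = sorted(profiles)
    matrix = {(n, n): 1.0 for n in names}
    for x, y in combinations(names, 2):
        s = jaccard(profiles[x], profiles[y])
        matrix[(x, y)] = matrix[(y, x)] = s
    return names, matrix


# Hypothetical band sizes (bp) scored as present; not data from this document.
profiles = {
    "strain_A": {300, 450, 800, 1200},
    "strain_B": {300, 450, 900, 1200},
    "strain_C": {500, 800, 1500},
}

names, sim = similarity_matrix(profiles)
for x, y in combinations(names, 2):
    print(f"{x} vs {y}: {sim[(x, y)]:.3f}")
```

A matrix of this kind is the input that a UPGMA clustering step would then group into a phenogram; the clustering itself is not shown here.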
Example 13 - Insecticidal Utility of Toxin(s) Produced by Various Photorhabdus Strains
Initial "seed" cultures of the various Photorhabdus strains were produced by inoculating 175 ml of 2% Proteose Peptone #3 (PP3) (Difco Laboratories, Detroit, MI) liquid media with a primary variant subclone in a 500 ml tribaffled flask with a Delong neck, covered with a Kaput. Inoculum for each seed culture was derived from oil-overlay agar slant cultures or plate cultures. After inoculation, these flasks were incubated for 16 hrs at 28°C on a rotary shaker at 150 rpm. These seed cultures were then used as uniform inoculum sources for a given fermentation of each strain. Additionally, overlaying the post-log seed culture with sterile mineral oil, adding a sterile magnetic stir bar for future resuspension and storing the culture in the dark at room temperature provided long-term preservation of inoculum in a toxin-competent state. The production broths were inoculated by adding 1% of the actively growing seed culture to fresh 2% PP3 media (1.75 ml per 175 ml fresh media). Production of broths occurred in either 500 ml tribaffled flasks (see above) or 2800 ml baffled, convex bottom flasks (500 ml volume) covered by a silicon foam closure. Production flasks were incubated for 24-48 hrs under the above mentioned conditions. Following incubation, the broths were dispensed into sterile 1 L polyethylene bottles, spun at 2600 x g for 1 hr at 10°C and decanted from the cell and debris pellet. The liquid broth was then vacuum filtered through Whatman GF/D (2.7 µm retention) and GF/B (1.0 µm retention) glass filters to remove debris. Further broth clarification was achieved with a tangential flow microfiltration device (Pall Filtron, Northborough, MA) using a 0.5 µm open-channel filter. When necessary, additional clarification could be obtained by chilling the broth (to 4°C) and centrifuging for several hours at 2600 x g. Following these procedures, the broth was filter sterilized using a 0.2 µm nitrocellulose membrane filter. Sterile broths were then used directly for biological assay or biochemical analysis, or were concentrated (up to 15-fold) using a 10,000 MW cut-off, M12 ultra-filtration device (Amicon, Beverly, MA) or centrifugal concentrators (Millipore, Bedford, MA and Pall Filtron, Northborough, MA) with a 10,000 MW pore size. In the case of centrifugal concentrators, the broth was spun at 2000 x g for approximately 2 hr. The 10,000 MW permeate was added to the corresponding retentate to achieve the desired concentration of components greater than 10,000 MW. Heat inactivation of processed broth samples was achieved by heating the samples at 100°C in a sand-filled heat block for 10 minutes.
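The volume arithmetic in the fermentation and concentration steps above can be sketched as follows. This is an illustrative Python sketch only; the function names and the example volumes (150 ml of broth reduced to 8 ml of retentate) are assumptions introduced here, not values from the procedure. It computes the 1% inoculum volume and the amount of permeate to add back to a retentate to reach a chosen fold concentration of the components greater than 10,000 MW.

```python
def inoculum_volume_ml(fresh_media_ml, inoculum_fraction=0.01):
    """Volume of seed culture for a given fraction of fresh media (1% by default)."""
    return fresh_media_ml * inoculum_fraction


def target_volume_ml(start_volume_ml, fold):
    """Final volume at which retained components are `fold` times concentrated."""
    if fold < 1:
        raise ValueError("fold concentration must be >= 1")
    return start_volume_ml / fold


def permeate_to_add_back_ml(start_volume_ml, retentate_ml, fold):
    """Permeate volume to return to the retentate to land on the target fold."""
    target = target_volume_ml(start_volume_ml, fold)
    if retentate_ml > target:
        raise ValueError("retentate is not yet concentrated enough for this fold")
    return target - retentate_ml


# 1% inoculum for a 175 ml production flask, as described in the text.
print(inoculum_volume_ml(175.0))

# Hypothetical run: 150 ml of broth spun down to 8 ml, adjusted to 15-fold.
print(permeate_to_add_back_ml(150.0, 8.0, 15))
```

Adding permeate rather than fresh buffer keeps the low molecular weight background of the broth unchanged while only the retained components are concentrated.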
The broth(s) and toxin complex(es) from different Photorhabdus strains are useful for reducing populations of insects and were used in a method of inhibiting an insect population which comprises applying to a locus of the insect an effective insect inactivating amount of the active described. A demonstration of the breadth of insecticidal activity observed from broths of a selected group of Photorhabdus strains fermented as described above is shown in Table [...]. It is possible that additional insecticidal activities could be detected with these strains through increased concentration of the broth or by employing different fermentation methods. Consistent with the activity being associated with a protein, the insecticidal activity of all strains tested was heat labile (see above). Culture broth(s) from diverse Photorhabdus strains show differential insecticidal activity (mortality and/or growth inhibition, reduced adult emergence) against a number of insects. More specifically, the activity is seen against corn rootworm larvae and boll weevil larvae, which are members of the insect order Coleoptera. Other members of the Coleoptera include wireworms, pollen beetles, flea beetles, seed beetles and Colorado potato beetle. Activity is also observed against aster leafhopper and corn planthopper, which are members of the order Homoptera. Other members of the Homoptera include planthoppers, pear psylla, apple sucker, scale insects, whiteflies and spittle bugs, as well as numerous host specific aphid species. The broths and purified toxin complex(es) are also active against tobacco budworm, tobacco hornworm and European corn borer, which are members of the order Lepidoptera. Other typical members of this order are beet armyworm, cabbage looper, black cutworm, corn earworm, codling moth, clothes moth, Indian mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm and fall armyworm. Activity is also seen against fruitfly and mosquito larvae, which are members of the order Diptera. Other members of the order Diptera are pea midge, carrot fly, cabbage root fly, turnip root fly, onion fly, crane fly, house fly and various mosquito species. Activity with broth(s) and toxin complex(es) is also seen against two-spotted spider mite, which is a member of the order Acarina, which includes strawberry spider mites, broad mites, citrus red mite, European red mite, pear rust mite and tomato russet mite.
Activity against corn rootworm larvae was tested as follows. Photorhabdus culture broth(s) (0-15 fold concentrated, filter sterilized), control medium (2% Proteose Peptone #3), purified toxin complex(es), or 10 mM sodium phosphate buffer, pH 7.0 were applied directly to the surface (about 1.5 cm²) of artificial diet (Rose, R. I. and McCabe, J. M. (1973). J. Econ. Entomol. 66, 398-400) in 40 µl aliquots. Toxin complex was diluted in 10 mM sodium phosphate buffer, pH 7.0. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with a single neonate Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR) hatched from surface sterilized eggs. The plates were sealed, placed in a humidified growth chamber and maintained at 27°C for the appropriate period (3-5 days). Mortality and larval weight determinations were then scored. Generally, 16 insects per treatment were used in all studies. Control mortality was generally less than [...]%.
Activity against boll weevil (Anthonomus grandis) was tested as follows. Concentrated (1-10 fold) Photorhabdus broths, control medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied in [...] µl aliquots to the surface of 0.35 g of artificial diet (Stoneville Yellow lepidopteran diet) and allowed to dry. A single, 12-24 hr boll weevil larva was placed on the diet, and the wells were sealed and held at 25°C, 50% RH for 5 days. Mortality and larval weights were then assessed. Control mortality ranged between 0-13%.
Activity against mosquito larvae was tested as follows. The assay was conducted in a 96-well microtiter plate. Each well contained 200 µl of aqueous solution (10-fold concentrated Photorhabdus culture broth(s), control medium (2% Proteose Peptone #3), 10 mM sodium phosphate buffer, toxin complex(es) at 0.23 mg/ml, or H2O) and approximately 20 1-day old larvae (Aedes aegypti). There were 6 wells per treatment. The results were read at 3-4 days after infestation. Control mortality was between 0-20%.
Activity against fruitflies was tested as follows. Purchased Drosophila melanogaster medium was prepared using 50% dry medium and 50% liquid, the liquid being either water, control medium (2% Proteose Peptone #3), 10-fold concentrated Photorhabdus culture broth(s), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0. This was accomplished by placing 4.0 ml of dry medium in each of 3 rearing vials per treatment and adding 4.0 ml of the appropriate liquid. Ten late instar Drosophila melanogaster maggots were then added to each 25 ml vial. The vials were held on a laboratory bench, at room temperature, under fluorescent ceiling lights. Pupal or adult counts were made after 15 days of exposure. Adult emergence was compared to that with water and control medium (0-16% reduction).
Activity against aster leafhopper adults (Macrosteles severini) and corn planthopper nymphs (Peregrinus maidis) was tested with an ingestion assay designed to allow ingestion of the active without other external contact. The reservoir for the active/"food" solution is made by making 2 holes in the center of the bottom portion of a 35 x 10 mm Petri dish. A 2 inch Parafilm M® square is placed across the top of the dish and secured with an O-ring. A 1 oz. plastic cup is then infested with approximately 7 hoppers and the reservoir is placed on top of the cup, Parafilm down. The test solution is then added to the reservoir through the holes. In tests using 10-fold concentrated Photorhabdus culture broth(s), the broth and control medium (2% Proteose Peptone #3) were dialyzed against 10 mM sodium phosphate buffer, pH 7.0, and sucrose (to [...]%) was added to the resulting solution to reduce control mortality. Purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 was also tested. The assay was held in an incubator at 28°C, [...]% RH with a 16/8 photoperiod. The assays were graded for mortality at 72 hours (day 3). Control mortality was less than 6%.
Activity against lepidopteran larvae was tested as follows. Concentrated (10-fold) Photorhabdus culture broth(s), control medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied directly to the surface (about 1.5 cm²) of standard artificial lepidopteran diet (Stoneville Yellow diet) in 40 µl aliquots. The diet plates were allowed to air-dry in a sterile flow-hood and each well was infested with a single, neonate larva. European corn borer (Ostrinia nubilalis) and tobacco hornworm (Manduca sexta) eggs were obtained from commercial sources and hatched in-house, whereas tobacco budworm (Heliothis virescens) larvae were supplied internally. Following infestation with larvae, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period. Mortality and weight determinations were scored at day 5. Generally, 16 insects per treatment were used in all studies. Control mortality generally ranged from about 4 to about 12.5% for control medium and was less than 10% for phosphate buffer.
Activity against two-spotted spider mite (Tetranychus urticae) was determined as follows. Young squash plants were trimmed to a single cotyledon and sprayed to run-off with 10-fold concentrated broth(s), control medium (2% Proteose Peptone #3), purified toxin complex(es), or 10 mM sodium phosphate buffer, pH 7.0. After drying, the plants were infested with a mixed population of spider mites and held at lab temperature and humidity for 72 hr. Live mites were then counted to determine levels of control.
Table [...]
Observed Insecticidal Spectrum of Broths from Different Photorhabdus Strains

Photorhabdus Strain    Sensitive* Insect Species
WX-1                   4, 5, 6, 7, 8
WX-2                   2, 4
WX-3                   1, 4
WX-4                   1, 4
WX-5                   4
WX-6                   4
WX-7                   3, 4, 5, 6, 7, 8
[illegible]            4
WX-9                   1, 2, 4
[illegible]            4
WX-11                  1, 2, 4
WX-14                  1, 2, 4
[illegible]            1, 2, 4
W30                    3, 4, 5, 8
NC-1                   1, 2, 3, 4, 5, 6, 7, 8, 9
WIR                    2, 3, 5, 6, 7, 8
HP88                   1, 3, 4, 5, 7, 8
Hb                     3, 4, 5, 7, 8
Hm                     1, 2, 3, 4, 5, 7, 8
H9                     1, 2, 3, 4, 5, 6, 7, 8
W-14                   1, 2, 3, 4, 5, 6, 7, 8
ATCC 43948             4
ATCC 43949             4
ATCC 43950             4
ATCC 43951             4
ATCC 43952             4

* = 25% mortality and/or growth inhibition vs. control
1 = Tobacco budworm; 2 = European corn borer; 3 = Tobacco hornworm; 4 = Southern corn rootworm; 5 = Boll weevil; 6 = Mosquito; 7 = Fruit fly; 8 = Aster leafhopper; 9 = Corn planthopper; 10 = Two-spotted spider mite.
Example 14 - Non W-14 Photorhabdus Strains: Purification, Characterization and Activity Spectrum
Purification
The protocol, as follows, is similar to that developed for the purification of W-14 and was established based on purifying those fractions having the most activity against Southern corn rootworm (SCR), as determined in bioassays (see Example 13). Typically, 4-20 L of broth that had been filtered, as described in Example 13, were received and concentrated using an Amicon spiral ultrafiltration cartridge Type S1Y100 attached to an Amicon M-12 filtration device. The retentate contained native proteins with molecular sizes greater than 100 kDa, whereas the flow-through material contained native proteins less than 100 kDa in size. The majority of the activity against SCR was contained in the 100 kDa retentate. The retentate was then continually diafiltered with 10 mM sodium phosphate (pH 7.0) until the filtrate reached an A280 < 0.100. Unless otherwise stated, all procedures from this point were performed in buffer as defined by 10 mM sodium phosphate (pH 7.0). The retentate was then concentrated to a final volume of approximately 0.20 L and filtered using a 0.45 µm Nalgene Filterware sterile filtration unit. The filtered material was loaded at 7.5 ml/min onto a Pharmacia HR16/10 column which had been packed with PerSeptive Biosystem Poros® 50 HQ strong anion exchange matrix equilibrated in buffer using a PerSeptive Biosystem Sprint® HPLC system. After loading, the column was washed with buffer until an A280 < 0.100 was achieved. Proteins were then eluted from the column at 2.5 ml/min using buffer with 0.4 M NaCl for 20 min for a total volume of 50 ml. The column was then washed using buffer with 1.0 M NaCl at the same flow rate for an additional 20 min (final volume 50 ml). Proteins eluted with 0.4 M and 1.0 M NaCl were placed in separate dialysis bags (Spectra/Por® Membrane MWCO: 2,000) and allowed to dialyze overnight at 4°C in 12 L buffer. The majority of the activity against SCR was contained in the 0.4 M fraction. The 0.4 M fraction was further purified by application of 20 ml to a Pharmacia XK 26/100 column that had been prepacked with Sepharose CL4B (Pharmacia) using a flow rate of 0.75 ml/min. Fractions were pooled based on A280 peak profile and concentrated to a final volume of 0.75 ml using a Millipore Ultrafree®-15 centrifugal filter device with a Biomax-50K NMWL membrane. Protein concentrations were determined using a Biorad Protein Assay Kit with bovine gamma globulin as a standard.
Characterization
The native molecular weight of the SCR toxin complex was determined using a Pharmacia HR 16/50 column that had been prepacked with Sepharose CL4B in buffer. The column was then calibrated using proteins of known molecular size, thereby allowing for calculation of the approximate native molecular size of the toxin. As shown in Table 21, the molecular size of the toxin complex ranged from 777 kDa with strain Hb to 1,900 kDa with strain WX-14. The yield of toxin complex also varied, from strain WX-12 producing 0.8 mg/L to strain Hb, which produced 7.0 mg/L.
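The column calibration described above is commonly handled by fitting log10(molecular weight) of the standards against their elution volumes and interpolating the sample. The Python sketch below illustrates that general approach under stated assumptions: the standards, elution volumes and function names are hypothetical, and the source does not say which standards or fitting method were actually used.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_native_mw(standards, sample_elution_ml):
    """Estimate molecular weight (Da) from a log-linear calibration curve.

    `standards` is a list of (elution volume in ml, molecular weight in Da).
    """
    volumes = [ve for ve, _ in standards]
    log_mw = [math.log10(mw) for _, mw in standards]
    slope, intercept = fit_line(volumes, log_mw)
    return 10 ** (slope * sample_elution_ml + intercept)


# Hypothetical calibration points, chosen to be exactly log-linear.
standards = [(50.0, 1_000_000), (100.0, 100_000), (150.0, 10_000)]

estimate = estimate_native_mw(standards, 55.0)
print(f"Estimated native size: {estimate / 1000:.0f} kDa")
```

Estimates of this kind are only as good as the calibration range, so sizes well above the largest standard should be read as approximate.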
Proteins found in the toxin complex were examined for individual polypeptide size using SDS-PAGE analysis. Typically, [...] mg protein of the toxin complex from each strain was loaded onto a 2-15% polyacrylamide gel (Integrated Separation Systems) and electrophoresed at 20 mA in Biorad SDS-PAGE buffer. After completion of electrophoresis, the gels were stained overnight in Biorad Coomassie blue R-250 in methanol:acetic acid:water (40:10:40). Subsequently, gels were destained in methanol:acetic acid:water (40:10:40). The gels were then rinsed with water for 15 min and scanned using a Molecular Dynamics Personal Laser Densitometer®. Lanes were quantitated and molecular sizes were calculated as compared to Biorad high molecular weight standards, which ranged from 200-45 kDa.
Sizes of the individual polypeptides comprising the SCR toxin complex from each strain are listed in Table 22. The sizes of the individual polypeptides ranged from 230 kDa with strain WX-1 to 16 kDa, as seen with strain WX-7. Every strain, with the exception of strain Hb, had polypeptides comprising the toxin complex that were in the 160-230 kDa range, the 100-160 kDa range, and the 50-80 kDa range. These data indicate that the toxin complex may vary in peptide composition and components from strain to strain; however, in all cases the toxin attributes appear to consist of a large, oligomeric protein complex.
Table 21
Characterization of a Toxin Complex from Non W-14 Photorhabdus Strains

Strain    Approx. Native Molecular Wt.(a)    Yield Active Fraction (mg/L)(b)
H9        972,000
Hb        777,000
Hm        1,400,000                          1.1
HP88      813,000
NC1       1,092,000                          3.3
WIR       979,000
WX-1      973,000                            0.8
WX-2      951,000                            2.2
WX-7      1,000,000
WX-12     898,000                            0.4
WX-14     1,900,000                          1.9
W-14      860,000

(a) Native molecular weight determined using a Pharmacia HR 16/50 column packed with Sepharose CL4B.
(b) Amount of toxin complex recovered from culture broth.
Activity Spectrum
As shown in Table 23, the toxin complexes purified from strains Hm and H9 were tested for activity against a variety of insects, with the toxin complex from strain W-14 for comparison. The assays were performed as described in Example 13. The toxin complex from all three strains exhibited activity against tobacco budworm, European corn borer, Southern corn rootworm, and aster leafhopper. Furthermore, the toxin complex from strains Hm and W-14 also exhibited activity against two-spotted spider mite. In addition, the toxin complex from W-14 exhibited activity against mosquito larvae. These data indicate that the toxin complex, while having similarities in activities between certain orders of insects, can also exhibit differential activities against other orders of insects.
Table 22
The Approximate Sizes (in kDa) of Peptides in a Purified Toxin Complex From Non W-14 Photorhabdus Strains

H9, Hb, Hm, HP88, NC-1 (column assignment not recoverable):
150 140 139 130 120 100 98 88 81 69 57 54 49 44 39 37
170 140 100 81 72 68 49 46 30 22 20 19
170 160 140 130 129 110 100 86 81 77 73 60 58 45 39 35
180 170 140 110 44 16
WIR:
170 160 120 110 89 79 74 62 51 40 39 37 33 30 28 27 25 23
WX-1, WX-2:
230 190 170 160 110 98 76 58 53 41 35 31 28 24 22
WX-7:
200 180 110 87 75 43 33 28 26 23 22 21 19 18 16
WX-12:
180 160 140 139 130 110 92 87 80 73 59 56 51 37 33 32 26
WX-14, W-14:
190 180 170 160 150 130 120 110 93 77 69 63 51 46 39 29

Table 23
Observed Insecticidal Spectrum of a Purified Toxin Complex from Photorhabdus Strains

Photorhabdus Strain     Sensitive* Insect Species
Hm Toxin Complex        2, 3, 5, 6, 7, 8
H9 Toxin Complex        1, 2, 3, 6, 7, 8
W-14 Toxin Complex      1, 2, 3, 4, 5, 6, 7, 8

* = 25% mortality or growth inhibition
1 = Tobacco budworm; 2 = European corn borer; 3 = Southern corn rootworm; 4 = Mosquito; 5 = Two-spotted spider mite; 6 = Aster leafhopper; 7 = Fruit fly; 8 = Boll weevil

Example [...] - Sub-Fractionation of Photorhabdus Protein Toxin Complex
The Photorhabdus protein toxin complex was isolated as described in Example 14. Next, about 10 mg toxin was applied to a MonoQ 5/5 column equilibrated with 20 mM Tris-HCl, pH 7.0 at a flow rate of 1 ml/min. The column was washed with 20 mM Tris-HCl, pH 7.0, until the optical density at 280 nm returned to baseline absorbance. The proteins bound to the column were eluted with a linear gradient of 0 to 1.0 M NaCl in 20 mM Tris-HCl, pH 7.0 at 1 ml/min for 30 min. One ml fractions were collected and subjected to Southern corn rootworm (SCR) bioassay (see Example 13). Peaks of activity were determined by a series of dilutions of each fraction in SCR bioassays. Two activity peaks against SCR were observed and were named A (eluted at about 0.2-0.3 M NaCl) and B (eluted at 0.3-0.4 M NaCl). Activity peaks A and B were pooled separately and both peaks were further purified using a 3-step procedure described below.
Solid (NH4)2SO4 was added to the above protein fraction to a final concentration of 1.7 M. Proteins were then applied to a phenyl-Superose 5/5 column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7.0 at 1 ml/min. Proteins bound to the column were eluted with a linear gradient of 1.7 M (NH4)2SO4, 0% ethylene glycol, 50 mM potassium phosphate, pH 7.0 to [...]% ethylene glycol, 25 mM potassium phosphate, pH 7.0 (no (NH4)2SO4) at 0.5 ml/min. Fractions were dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0. Activities in each fraction against SCR were determined by bioassay.
The fractions with the highest activity were pooled and applied to a MonoQ 5/5 column which was equilibrated with 20 mM Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in 20 mM Tris-HCl, pH 7.0.
For the final step of purification, the most active fractions above (determined by SCR bioassay) were pooled and subjected to a second phenyl-Superose 5/5 column. Solid (NH4)2SO4 was added to a final concentration of 1.7 M. The solution was then loaded onto the column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7.0 at 1 ml/min. Proteins bound to the column were eluted with a linear gradient of 1.7 M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0 to 10 mM potassium phosphate, pH 7.0, at 0.5 ml/min. Fractions were dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0. Activities in each fraction against SCR were determined by bioassay.
The final purified protein by the above 3-step procedure from peak A was named toxin A and the final purified protein from peak B was named toxin B.
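The linear NaCl gradients used in the MonoQ steps above lend themselves to a small worked calculation. The following Python sketch is illustrative only: it assumes the programmed gradient reaches the fraction collector without delay (no column dead volume or mixing lag, which a real system would have), and the function names are introduced here. It maps a salt window, such as the 0.2-0.3 M NaCl range reported for peak A, onto 1 ml fractions of a 0 to 1.0 M gradient run at 1 ml/min for 30 min.

```python
def gradient_conc(t_min, start_m=0.0, end_m=1.0, duration_min=30.0):
    """Programmed salt concentration (M) at time t of a linear gradient."""
    if not 0 <= t_min <= duration_min:
        raise ValueError("time is outside the gradient")
    return start_m + (end_m - start_m) * t_min / duration_min


def fractions_in_window(low_m, high_m, flow_ml_min=1.0, fraction_ml=1.0,
                        start_m=0.0, end_m=1.0, duration_min=30.0):
    """1-based fraction numbers whose programmed salt range overlaps the window."""
    n_fractions = int(duration_min * flow_ml_min / fraction_ml)
    hits = []
    for k in range(1, n_fractions + 1):
        t0 = (k - 1) * fraction_ml / flow_ml_min
        t1 = k * fraction_ml / flow_ml_min
        c0 = gradient_conc(t0, start_m, end_m, duration_min)
        c1 = gradient_conc(t1, start_m, end_m, duration_min)
        # Keep the fraction only if it overlaps the window over a non-zero range.
        if c1 > low_m and c0 < high_m:
            hits.append(k)
    return hits


peak_a = fractions_in_window(0.2, 0.3)
peak_b = fractions_in_window(0.3, 0.4)
print("Peak A window:", peak_a)
print("Peak B window:", peak_b)
```

In practice the observed fractions would be shifted later by the system delay volume, so a calculation like this is a starting point for locating activity peaks rather than a substitute for the bioassay.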
Characterization and Amino Acid Sequencing of Toxin A and Toxin B
In SDS-PAGE, both toxin A and toxin B contained two major (> 90% of total Coomassie stained protein) peptides: 192 kDa (named A1 and B1, respectively) and 58 kDa (named A2 and B2, respectively). Both toxin A and toxin B revealed only one major band in native PAGE, indicating that A1 and A2 were subunits of one protein complex, and B1 and B2 were subunits of one protein complex. Further, the native molecular weight of both toxin A and toxin B was determined to be 860 kDa by gel filtration chromatography. The relative molar concentration of A1 to A2 was judged to be a 1 to 1 equivalence as determined by densitometric analysis of SDS-PAGE gels. Similarly, B1 and B2 peptides were present at the same molar concentration.
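A molar ratio can be derived from densitometry by dividing each band's integrated stain intensity by the peptide's molecular weight. The Python sketch below shows that arithmetic under an explicit assumption that stain intensity is proportional to protein mass for both bands; the intensity values are hypothetical and the function name is introduced here for illustration.

```python
def molar_ratio(intensity_a, mw_a_kda, intensity_b, mw_b_kda):
    """Moles of peptide A per mole of peptide B.

    Assumes stain intensity is proportional to protein mass for both bands.
    """
    if min(intensity_a, intensity_b, mw_a_kda, mw_b_kda) <= 0:
        raise ValueError("intensities and molecular weights must be positive")
    return (intensity_a / mw_a_kda) / (intensity_b / mw_b_kda)


# Hypothetical integrated band intensities (arbitrary units) for the
# 192 kDa and 58 kDa peptides; the sizes are those reported in the text.
ratio = molar_ratio(192.0, 192.0, 58.0, 58.0)
print(f"192 kDa : 58 kDa molar ratio = {ratio:.2f}")
```

With a 1 to 1 molar ratio the larger peptide carries about 192/58, or roughly 3.3 times, the stain mass of the smaller one, which is why raw band intensities alone can be misleading.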
Toxin A and toxin B were electrophoresed in 10% SDS-PAGE and transblotted to PVDF membranes. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. The N-terminal amino acid sequence of B1 was determined to be identical to SEQ ID NO:1, the TcbAii region of the tcbA gene (SEQ ID NO:12, position 87 to 99). A unique N-terminal sequence was obtained for peptide B2 (SEQ ID NO:[...]). The N-terminal amino acid sequence of peptide B2 was identical to the TcbAiii region of the derived amino acid sequence for the tcbA gene (SEQ ID NO:12, position 1935 to 1945). Therefore, the B toxin contained predominantly two peptides, TcbAii and TcbAiii, that were observed to be derived from the same gene product, TcbA.
The N-terminal sequence of A2 (SEQ ID NO:41) was unique in comparison to the TcbAiii peptide and other peptides. The A2 peptide was denoted TcdAiii (see Example 17). SEQ ID NO:6 was determined to be a mixture of amino acid sequences SEQ ID NO:40 and 41.
Peptides A1 and A2 were further subjected to internal amino acid sequencing. For internal amino acid sequencing, 10 µg of toxin A was electrophoresed in 10% SDS-PAGE and transblotted to PVDF membrane. After the blot was stained with amido black, peptides A1 and A2, denoted TcdAii and TcdAiii, respectively, were excised from the blot and sent to Harvard MicroChem and Cambridge ProChem. Peptides were subjected to trypsin digestion followed by HPLC chromatography to separate individual peptides. N-terminal amino acid analysis was performed on selected tryptic peptide fragments. Two internal amino acid sequences of peptide A1 (TcdAii-PK71, SEQ ID NO:38 and TcdAii-PK44, SEQ ID NO:39) were found to have significant homologies with deduced amino acid sequences of the TcbAii region of the tcbA gene (SEQ ID NO:12). Similarly, the N-terminal sequence (SEQ ID NO:41) and two internal sequences of peptide A2 (TcdAiii-PK57, SEQ ID NO:42 and TcdAiii-PK20, SEQ ID NO:43) also showed significant homology with deduced amino acid sequences of the TcbAiii region of the tcbA gene (SEQ ID NO:12).
In summary, the toxin complex contains at least two protein toxin complexes active against SCR: toxin A and toxin B. Toxin A and toxin B are similar in their native and subunit molecular weights; however, their peptide compositions are different. Toxin A contains peptides TcdAii and TcdAiii as the major peptides, and toxin B contains TcbAii and TcbAiii as the major peptides.
Purification and Characterization of Toxin C, Tca Peptides
The Photorhabdus protein toxin complex was isolated as described above. Next, about 50 mg toxin was applied to a MonoQ 10/10 column equilibrated with 20 mM Tris-HCl, pH 7.0 at a flow rate of 2 ml/min. The column was washed with 20 mM Tris-HCl, [...] until the optical density at 280 nm returned to baseline level.
The proteins bound to the column were eluted with a linear gradient of 0 to 1M NaCl in 20 mM Tris-HCl, pH 7.0 at 2 ml/min for 60 min.
Fractions of 2 ml were collected and subjected to Western analysis using pAb TcaBii-syn antibody (see Example 21) as the primary antibody. Fractions that reacted with pAb TcaBii-syn antibody were combined, and solid (NH4)2SO4 was added to a final concentration of 1.7 M. Proteins were then applied to a phenyl-Superose 10/10 column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7, at 1 ml/min. Proteins bound to the column were eluted with a linear gradient of 1.7 M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0, to 10 mM potassium phosphate, pH [...], at 1 ml/min for 120 min. Fractions of 2 ml were collected, dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0, and analyzed by Western blots using pAb TcaBii-syn antibody as the primary antibody.
Fractions that cross-reacted with the antibody were pooled and applied to a MonoQ 5/5 column which was equilibrated with 20 mM Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in 20 mM Tris-HCl, pH 7.0 for 30 min.
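The linear salt gradients used in these chromatography steps map directly onto a salt concentration at any point during elution. The following is a minimal sketch, assuming an ideal linear gradient with no mixing delay or dead volume (an assumption not stated in the text); the function names and parameters are illustrative only.

```python
def gradient_concentration(t_min, start_molar=0.0, end_molar=1.0, duration_min=30.0):
    """Salt concentration (M) at time t_min of an ideal linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time lies outside the gradient")
    fraction_elapsed = t_min / duration_min
    return start_molar + fraction_elapsed * (end_molar - start_molar)


def gradient_volume_ml(flow_ml_per_min, duration_min):
    """Total volume delivered over the gradient."""
    return flow_ml_per_min * duration_min


# MonoQ 5/5 step from the text: 0 to 1 M NaCl over 30 min at 1 ml/min.
print(gradient_volume_ml(1.0, 30.0))
print(gradient_concentration(15.0))
```

Under the same assumption, the earlier MonoQ 10/10 step (2 ml/min for 60 min) delivers 120 ml of gradient, which corresponds to sixty 2 ml fractions.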
The fractions above that reacted with pAb TcaBii-syn antibody were pooled and subjected to a phenyl-Superose 5/5 column. Solid (NH4)2SO4 was added to a final concentration of 1.7 M. The solution was then applied onto the column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7, at 1 ml/min. Proteins bound to the column were then eluted with a linear gradient of 1.7 M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0, to 10 mM potassium phosphate, pH 7.0, at 0.5 ml/min for 60 min. Fractions were dialyzed overnight against 10 mM sodium phosphate buffer, pH [...].
For the final purification step, the fractions above that reacted with pAb TcaBii-syn antibody, as determined by Western analysis, were combined and applied to a Mono Q 5/5 column equilibrated with 20 mM Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in 20 mM Tris-HCl, pH 7.0 for 30 min.
The final purified protein fraction contained 6 major peptides as examined by SDS-PAGE: 165 kDa, 90 kDa, 64 kDa, 62 kDa, 58 kDa, and 22 kDa. The LD50 values of this purified fraction were determined to be 100 ng and 500 ng against SCR and ECB, respectively.
The above peptides were blotted to PVDF membranes, and blots were sent for amino acid analysis and 5-amino-acid-long N-terminal sequencing at Harvard MicroChem and Cambridge ProChem, respectively. The N-terminal amino acid sequence of the 165 kDa peptide was determined to be identical to peptide TcaC (SEQ ID NO:2, position 1 to [...]). The N-terminal amino acid sequence of the 90 kDa peptide was determined to be the TcaAii region of the derived amino acid sequence for the tcaA gene (SEQ ID NO:33, position 254 to 258). The N-terminal amino acid sequence of the 64 kDa peptide was determined to be identical to peptide TcaBi (SEQ ID NO:3, position 1 to [...]). The N-terminal amino acid sequence of the 62 kDa peptide was determined to be the TcaAii region of the derived amino acid sequence for the tcaA gene (SEQ ID NO:33, position 489 to 493). The N-terminal amino acid sequence of the 58 kDa peptide was determined to be identical to peptide TcaBii (SEQ ID NO:5, position 1 to [...]). The N-terminal amino acid sequence of the 22 kDa peptide (SEQ ID NO:62) was determined to be the TcaAi region, denoted TcaAiv, of the derived amino acid sequence for the tcaA gene (SEQ ID NO:34, position 98 to 102). It is noted that the tcaA, tcaB, and tcaC genes all reside in the same tca operon (Fig. 6A).
Five µg each of purified Tca fraction, purified toxin A, and purified toxin B were analyzed by Western blot using the following antibodies individually as primary antibody: pAb TcaBii-syn antibody, mAb CF52 antibody, pAb TcdAii-syn antibody, and pAb Tcdiii-syn antibody (Example 21). With pAb TcaBii-syn antibody, only the purified Tca peptides fraction reacted, but not toxin A or toxin B. With mAb CF52 antibody, only toxin B reacted, but not the Tca peptides fraction or toxin A. With either pAb TcdAii-syn antibody or pAb Tcdiii-syn antibody, only toxin A reacted, but not the Tca peptides fraction or toxin B. This indicated that the insecticidal activity observed in the purified Tca peptides fraction is independent of toxin A and toxin B.
The purified Tca peptide fraction is a third unique protein toxin, denoted toxin C.
Example 16
Cleavage and Activation of TcbA Peptide
In the toxin B complex, peptides TcbAii and TcbAiii originate from the single gene product TcbA (Example 15). The processing of TcbA peptide to TcbAii and TcbAiii is presumably by the action of Photorhabdus protease(s), most likely the metalloproteases described in Example 10. In some cases, it was noted that when Photorhabdus W-14 broth was processed, TcbA peptide was present in toxin B complex as a major component, in addition to peptides TcbAii and TcbAiii. Procedures identical to those described for the purification of toxin B complex (Example 15) were used to enrich peptide TcbA from the toxin complex fraction of W-14 broth. The final purified material was analyzed in a 4-20% gradient SDS-PAGE and major peptides were quantified by densitometry. It was determined that TcbA, TcbAii and TcbAiii comprised 58%, 36%, and 6%, respectively, of total protein. The identities of these peptides were confirmed by their respective molecular sizes in SDS-PAGE and Western blot analysis using monospecific antibodies. The native molecular weight of this fraction was determined to be 860 kDa.
The cleavage of TcbA was evaluated by treating the above purified material with purified 38 kDa and 58 kDa W-14 Photorhabdus metalloproteases (Example 10), and trypsin as a control enzyme (Sigma, MO). The standard reaction consisted of 17.5 µg of the above purified fraction, 1.5 units of protease, and 0.1 M Tris buffer, pH [...], in a total volume of 100 µl. For the control reaction, protease was omitted. The reaction mixtures were incubated at 37 °C for [...] min. At the end of the reaction, 20 µl was taken and boiled immediately with SDS-PAGE sample buffer for electrophoresis analysis in a 4-20% gradient SDS-PAGE. It was determined from SDS-PAGE that in both the 38 kDa and 58 kDa protease treatments, the amount of peptides TcbAii and TcbAiii increased about 3-fold while the amount of TcbA peptide decreased proportionally (Table 24). The relative reduction and augmentation of selected peptides was confirmed by Western blot analyses. Furthermore, gel filtration of the cleaved material revealed that the native molecular size of the complex remained the same. Upon trypsin treatment, peptides TcbA and TcbAii were nonspecifically digested into small peptides. This indicated that the 38 kDa and 58 kDa Photorhabdus proteases can specifically process peptide TcbA into peptides TcbAii and TcbAiii.
The remaining 80 µl of the protease-treated and untreated control reaction mixtures were serially diluted with 10 mM sodium phosphate buffer, pH 7.0, and analyzed by SCR bioassay. By comparing activity in several dilutions, it was determined that the 38 kDa protease treatment increased SCR insecticidal activity approximately 3 to 4 fold. The growth inhibition of remaining insects in the protease treatment was also more severe than in the control (Table 24).
Table 24
Conversion and Activation of Peptide TcbA into Peptides TcbAii and TcbAiii by Protease Treatment

                                 Control    38 kDa protease treatment
TcbA (% of total protein)        58         18
TcbAii (% of total protein)      36         64
TcbAiii (% of total protein)     6          18
LD50 (µg protein)                2.1        0.?2
SCR Weight (mg/insect)*          0.2        0.1

* An indication of growth inhibition, obtained by measuring the average weight of live insects after 5 days on diet in the assay.
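The densitometry percentages reported for the control and the 38 kDa protease treatment can be turned into per-peptide fold changes. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the control value for TcbA (58%) is taken from the densitometry result stated earlier in this example.

```python
# Percent of total protein, control versus 38 kDa protease treatment.
control = {"TcbA": 58, "TcbAii": 36, "TcbAiii": 6}
treated = {"TcbA": 18, "TcbAii": 64, "TcbAiii": 18}


def fold_changes(before, after):
    """Ratio of treated to control abundance for each peptide."""
    return {name: after[name] / before[name] for name in before}


for name, ratio in fold_changes(control, treated).items():
    print(f"{name}: {ratio:.2f}-fold of control")
```

On these figures the smallest peptide, TcbAiii, shows the largest relative increase, while the intact TcbA peptide falls to roughly a third of its control level.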
Activation and Processing of Toxin B by SCR Gut Proteases
In a second demonstration of proteolytic activation, it was examined whether W-14 toxins are processed by insects. Toxin B purified from Photorhabdus W-14 broth (see Example 15) was composed predominantly of intact TcbA peptide as judged by SDS-PAGE and Western blot analysis using monoclonal antibody. The LD50 of this fraction against SCR was determined to be around 700 ng.
SCR larvae were grown on coleopteran diet until they reached the fourth instar stage (about 100-125 mg total weight per insect). SCR gut content was collected as follows: the guts were removed using dissecting scissors and forceps. After removing the excess fatty material that coats the gut lining, about 40 guts were homogenized in a microcentrifuge tube containing 100 µl sterile water. The tube was then centrifuged at 14,000 rpm for 10 minutes and the pellet discarded. The supernatant was stored in a -70 °C freezer until use.
The processing of toxin B by insect gut was evaluated by treating the above purified toxin B with the SCR gut content collected. The reaction consisted of 40 µg toxin B (1 mg/ml), 50 µl SCR gut content, and 0.1 M Tris buffer, pH 8.0, in a total volume of 100 µl. For the control reaction, SCR gut content was omitted.
The reaction mixtures were incubated at 37 °C overnight. At the end of the reaction, 10 µl was withdrawn and boiled with an equal volume of 2x SDS-PAGE sample buffer for SDS-PAGE analysis. The remaining 90 µl of reaction mixture was serially diluted with 10 mM sodium phosphate buffer, pH 7.0, and analyzed by SCR bioassay. SDS-PAGE analysis indicated that in the SCR gut content treatment, peptide TcbA was digested completely into smaller peptides. Analysis of the undenatured toxin fraction showed that the native size, about 860 kDa, remained the same even though larger peptides were fragmented. In SCR bioassays, the LD50 of SCR gut-treated toxin B was found to be about 70 ng, representing a 10-fold increase in activity. In a separate experiment, proteinase K treatment completely eliminated toxin activity.
Example 17
Screening of the Library for a Gene Encoding the TcdAii Peptide
The cloning and characterization of a gene encoding the TcdAii peptide, described as SEQ ID NO:17 (internal peptide TcdAii-PT111 N-terminal sequence) and SEQ ID NO:18 (internal peptide TcdAii-PT79 N-terminal sequence), was completed. Two pools of degenerate oligonucleotides, designed to encode the amino acid sequences of SEQ ID NO:17 (Table 25) and SEQ ID NO:18 (Table 26), and the reverse complements of those sequences, were synthesized as described in Example 8. The DNA sequences of the oligonucleotides are given below:

Table 25
Degenerate Oligonucleotide for SEQ ID NO:17 (P2-PT111)

Position      1   2   3   4   5   6   7   8
Amino Acid    Ala Phe Asn Ile Asp Asp Val Ser
Codons*       5' GCN TTY AAY ATH GAY GAY GTN 3'
P2.3.6.CB     5' GC(A/C/G/T) TT AAT ATT GAT GAT GT 3'
P2.3.5        5' GC(A/C/G/T) TT(T/C) AA AT GA GA GT 3'
P2.3.5R       5' AC TC TC AT TT AA (A/C/G/T)GC 3'
P2.3.5RI      5' ACI TCI TCI ATI TTI AAI C 3'
P2.3R.CB      5' CAC (A/G)CT (A/C)AC ATC ATC AAT ATT AAA 3'

Table 26
Degenerate Oligonucleotide for SEQ ID NO:18 (P2-PT79)

Position      1   2   3   4   5   6   7   8   9   10  11  12  13
Amino Acid    Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn
Codons*       5' TTY ATH GTN TAY ACN 6 6 GGN GTN AAY CCN AAY AAY 3'
P2.79.2       5' TTY ATY GTK TAT ACY TCI YTR GGY GTK AAT CCR AAT AAT 3'
P2.79.3       5' TTT ATT GTK TAT ACY AGY YTR GGY GTK AAT C ATAAT 3'
P2.79.R.1     5' ATT ATT YGG ATT MAC RCC YAR RCT RGT ATA MAC AAT AAA 3'
P2.79R.CB     5' ATT ATT YGG ATT MAC ACC CAC RCT GGT ATA MAC AAT AAA 3'

*According to IUPAC-IUB codes for nucleotides, Y = C or T; H = A, C or T; N = A, C, G or T; K = G or T; R = A or G; and M = A or C.

Polymerase Chain Reactions (PCR) were performed essentially as described in Example 8, using as forward primers P2.3.6.CB or P2.3.5, and as reverse primers P2.79.R.1 or P2.79R.CB, in all forward/reverse combinations, using Photorhabdus W-14 genomic DNA as template.
In another set of reactions, primers P2.79.2 or P2.79.3 were used as forward primers, and P2.3.5R, P2.3.5RI, and P2.3R.CB were used as reverse primers in all forward/reverse combinations. Only in the reactions containing P2.3.6.CB as the forward primer combined with P2.79.R.1 or P2.79R.CB as the reverse primer was a non-artifactual amplified product seen, of estimated size (mobility on agarose gels) of 2500 base pairs. The order of the primers used to obtain this amplification product indicates that the peptide fragment TcdAii-PT111 lies amino-proximal to the peptide fragment TcdAii-PT79.
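A degenerate oligonucleotide written in IUPAC codes stands for a pool of distinct sequences, and the size of that pool is the product of the number of bases allowed at each position. The sketch below illustrates this for the first five codons listed for SEQ ID NO:18 (Phe Ile Val Tyr Thr). It is a minimal illustration, not part of the described method: the function names are hypothetical, and inosine (I), which appears in some of the primers, is deliberately not handled.

```python
from itertools import product

# IUPAC nucleotide codes used in the primer tables (inosine not included).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "H": "ACT", "N": "ACGT",
}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in oligo.replace(" ", ""):
        count *= len(IUPAC[base])
    return count


def expand(oligo):
    """List every concrete sequence in the degenerate pool."""
    pools = [IUPAC[base] for base in oligo.replace(" ", "")]
    return ["".join(bases) for bases in product(*pools)]


codons = "TTY ATH GTN TAY ACN"  # Phe Ile Val Tyr Thr
print(degeneracy(codons))       # 2 * 3 * 4 * 2 * 4 = 192
print(expand("TTY ATH"))        # the six sequences encoding Phe-Ile
```

Pool size grows multiplicatively with each degenerate position, which is one reason primer designs often fix likely bases or substitute inosine at highly degenerate positions.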
The 2500 bp PCR products were ligated to the plasmid vector pCR II (Invitrogen, San Diego, CA) according to the supplier's instructions, and the DNA sequences across the ends of the insert fragments of two isolates (HS24 and HS27) were determined using the supplier's recommended primers and the sequencing methods described previously. The sequence of both isolates was the same. New primers were synthesized based on the determined sequence, and used to prime additional sequencing reactions to obtain a total of 2557 bases of the insert [SEQ ID NO:36]. Translation of the partial peptide encoded by SEQ ID NO:36 yields the 845 amino acid sequence disclosed as SEQ ID NO:37. Protein homology analysis of this portion of the TcdAii peptide fragment reveals substantial amino acid homology ([...]% similarity and 53% identity using the Wisconsin Package Version 8.0, Genetics Computer Group (GCG), Madison, WI) to residues 542 to 1390 of protein TcbA [SEQ ID NO:12] (or [...]% similarity and 54% identity using the Wisconsin Package Version [...], Genetics Computer Group (GCG), Madison, WI, to residues 567 to 1389). It is therefore apparent that the gene represented in part by SEQ ID NO:36 produces a protein of similar, but not identical, amino acid sequence to the TcbA protein, and which likely has similar, but not identical, biological activity to the TcbA protein.
In yet another instance, a gene encoding the peptides TcdAii-PK44 and the TcdAiii 58 kDa N-terminal peptide, described as SEQ ID NO:39 (internal peptide TcdAii-PK44 sequence) and SEQ ID NO:41 (TcdAiii 58 kDa N-terminal peptide sequence), was isolated.
Two pools of degenerate oligonucleotides, designed to encode the amino acid sequences described as SEQ ID NO:39 (Table 28) and SEQ ID NO:41 (Table 27), and the reverse complements of those sequences, were synthesized as described in Example 8. Their DNA sequences are given in Tables 27 and 28.
Table 27
Degenerate Oligonucleotide for SEQ ID NO:41

Codon         1   2   3   4   5   6   7   8   9   10  11    12  13  14
Amino Acid    Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu [...] Leu Pro [...]
A2.1, A2.2 and reverse primers: [primer sequences not recoverable from the source text]

Table 28
Degenerate Oligonucleotide for SEQ ID NO:39

Amino Acid    Gly Pro Val Glu Ile Asn Thr Ala Ile
A1.44.1       5' GGY CCR GTK GAA ATT AAT ACC GCI AT 3'
A1.44.1R      5' ATI GCG GTA TT-T C C G C 3'
A1.44.2       5' GGI CCI GTI GAR ATY AAY ACT GCI AT 3'
A1.44.2R      5' ATI GCI GTR TTR ATY TCI ACT GGI CC 3'

Polymerase Chain Reactions (PCR) were performed essentially as described in Example 8, using as forward primers A1.44.1 or A1.44.2, and reverse primers A2.3R or A2.4R, in all forward/reverse combinations, using Photorhabdus W-14 genomic DNA as template. In another set of reactions, primers A2.1 or A2.2 were used as forward primers, and A1.44.1R and A1.44.2R were used as reverse primers in all forward/reverse combinations. Only in the reactions containing A1.44.1 or A1.44.2 as the forward primer combined with A2.3R as the reverse primer was a non-artifactual amplified product seen, of estimated size (mobility on agarose gels) of 1400 base pairs. The order of the primers used to obtain this amplification product indicates that the peptide fragment TcdAii-PK44 lies amino-proximal to the 58 kDa peptide fragment of TcdAiii.
The 1400 bp PCR products were ligated to the plasmid vector pCR II according to the supplier's instructions. The DNA sequences across the ends of the insert fragments of four isolates were determined using primers similar in sequence to the supplier's recommended primers and using sequencing methods described previously. The nucleic acid sequence of all isolates differed as expected in the regions corresponding to the degenerate primer sequences, but the amino acid sequences deduced from these data were the same as the actual amino acid sequences for the peptides determined previously (SEQ ID NOS:41 and 39).
Screening of the W-14 genomic cosmid library as described in Example 8 with a radiolabeled probe comprised of the DNA prepared above (SEQ ID NO:36) identified five hybridizing cosmid isolates, namely 17D9, 20B10, 21D2, 27B10, and 26D1. These cosmids were distinct from those previously identified with probes corresponding to the genes described as SEQ ID NO:11 or SEQ ID NO: [...]. Restriction enzyme analysis and DNA blot hybridizations identified three EcoR I fragments, of approximate sizes 3.7, 3.7, and 1.1 kbp, that span the region comprising the DNA of SEQ ID NO:36. Screening of the W-14 genomic cosmid library using as probe the radiolabeled 1.4 kbp DNA fragment prepared in this example identified the same five cosmids (17D9, 20B10, 21D2, 27B10, and 26D1). DNA blot hybridization to EcoR I-digested cosmid DNAs also showed hybridization to the same subset of EcoR I fragments as seen with the 2.5 kbp TcdAii gene probe, indicating that both fragments are encoded on the genomic DNA.
DNA sequence determination of the cloned EcoR I fragments revealed an uninterrupted reading frame of 7551 base pairs (SEQ ID NO:46), encoding a 282.9 kDa protein of 2516 amino acids (SEQ ID NO:47). Analysis of the amino acid sequence of this protein revealed all expected internal fragments of peptide TcdAii (SEQ ID NOS:17, 18, 37, 38 and 39), the TcdAiii peptide N-terminus (SEQ ID NO:41), and all TcdAiii internal peptides (SEQ ID NOS:42 and 43). The peptides isolated and identified as TcdAii and TcdAiii are each products of the open reading frame, denoted tcdA, disclosed as SEQ ID NO:46. Further, SEQ ID NO:47 shows, starting at position 89, the sequence disclosed as SEQ ID NO:13, which is the N-terminal sequence of a peptide of size approximately 201 kDa, indicating that the initial protein produced from SEQ ID NO:46 is processed in a manner similar to that previously disclosed for SEQ ID NO:12.
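The reported reading-frame length, residue count, and protein mass can be checked against each other with simple arithmetic. The sketch below assumes that the 7551 base pairs include one terminal stop codon, which the text does not state explicitly; the function name is hypothetical.

```python
def codons_in_orf(orf_bp):
    """Number of codons in a reading frame whose length is a multiple of 3."""
    if orf_bp % 3 != 0:
        raise ValueError("reading frame length is not a multiple of 3")
    return orf_bp // 3


orf_bp = 7551        # reading frame length, SEQ ID NO:46
residues = 2516      # amino acids, SEQ ID NO:47
mass_kda = 282.9     # reported protein mass

n_codons = codons_in_orf(orf_bp)
# Assumption: one codon is a stop codon and contributes no residue.
print(n_codons, n_codons - 1 == residues)
# Average mass per residue in daltons.
print(round(mass_kda * 1000 / residues, 1))
```

Under that assumption the numbers agree: 2517 codons less one stop codon gives 2516 residues, at an average of about 112 Da per residue, which is in the range typical for proteins.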
In addition, the protein is further cleaved to generate a product of size 209.2 kDa, encoded by SEQ ID NO:48 and disclosed as SEQ ID NO:49 (TcdAii peptide), and a product of size 63.6 kDa, encoded by SEQ ID NO:50 and disclosed as SEQ ID NO:51 (TcdAiii peptide). Thus, it is thought that the insecticidal activity identified as toxin A (Example 15) derives from the products of SEQ ID NO:46, as exemplified by the full-length protein of 282.9 kDa disclosed as SEQ ID NO:47, which is processed to produce the peptides disclosed as SEQ ID NOS:49 and 51. It is thought that the insecticidal activity identified as toxin B (Example 15) derives from the products of SEQ ID NO:11, as exemplified by the 280.6 kDa protein disclosed as SEQ ID NO:12. This protein is proteolytically processed to yield the 207.6 kDa peptide disclosed as SEQ ID NO:53, which is encoded by SEQ ID NO:52, and the 62.9 kDa peptide having the N-terminal sequence disclosed as SEQ ID NO:40, and further disclosed as SEQ ID NO: [...], which is encoded by SEQ ID NO:54.
Amino acid sequence comparisons between the proteins disclosed as SEQ ID NO:12 and SEQ ID NO:47 reveal that they have 69% similarity and 54% identity using the Wisconsin Package Version [...], Genetics Computer Group (GCG), Madison, WI, or 60% similarity and 54% identity using version 9.0 of the program. This high degree of evolutionary relationship is not uniform throughout the entire amino acid sequence of these peptides, but is higher towards the carboxy-terminal end of the proteins, since the peptides disclosed as SEQ ID NO:51 (derived from SEQ ID NO:47) and SEQ ID NO: [...] (derived from SEQ ID NO:12) have 76% similarity and 64% identity using the Wisconsin Package Version 8.0, Genetics Computer Group (GCG), Madison, WI, or 71% similarity and 64% identity using version 9.0 of the program.
Example 18
Control of European Corn Borer-Induced Leaf Damage on Maize Plants by Spray Application of Photorhabdus (Strain W-14) Broth
The ability of Photorhabdus toxin(s) to reduce plant damage caused by insect larvae was demonstrated by measuring leaf damage caused by European corn borer (Ostrinia nubilalis) infested onto maize plants treated with Photorhabdus broth. Fermentation broth from Photorhabdus strain W-14 was produced and concentrated approximately 10-fold using ultrafiltration (10,000 MW pore-size) as described in Example 13. The resulting concentrated broth was then filter sterilized using 0.2 micron nitrocellulose membrane filters. A similarly prepared sample of uninoculated 2% proteose peptone #3 was used for control purposes. Maize plants (an inbred line) were grown from seed to vegetative stage 7 or 8 in pots containing a soilless mixture in a greenhouse (27 °C day; 22 °C night, about 50% RH, 14 hr day-length, watered/fertilized as needed). The test plants were arranged in a randomized complete block design (3 reps/treatment, 6 plants/treatment) in a greenhouse with temperature about 22 °C day; 18 °C night, no artificial light and with partial shading, about 50% RH and watered/fertilized as needed. Treatments (uninoculated media and concentrated Photorhabdus broth) were applied with a syringe sprayer, 2.0 ml applied from directly (about 6 inches) over the whorl and an additional [...] ml applied in a circular motion from approximately one foot above the whorl. In addition, one group of plants received no treatment. After the treatments had dried (approximately [...] minutes), twelve neonate European corn borer larvae (eggs obtained from commercial sources and hatched in-house) were applied directly to the whorl. After one week, the plants were scored for damage to the leaves using a modified Guthrie Scale (Koziel, M., Beland, G., Bowman, Carozzi, N., Crenshaw, Crossland, L., Dawson, Desai, Hill, Kadwell, Launis, Lewis, Maddox, McPherson, Meghji, M., Merlin, Rhodes, Warren, G., Wright, M. and Evola, S. V. (1993) Bio/Technology, 11, 194-195) and the scores were compared statistically [T-test (LSD) p<0.05 and Tukey's Studentized Range (HSD) Test [...]]. The results are shown in Table 29. For reference, a score of 1 represents no damage, a score of 2 represents fine "window pane" damage on the unfurled leaf with no pinhole penetration, and a score of 5 represents leaf penetration with elongated lesions and/or mid rib feeding evident on more than three leaves (lesions 1 inch). These data indicate that broth or other protein-containing fractions may confer protection against specific insect pests when delivered in a sprayable formulation or when the gene or derivative thereof, encoding the protein or part thereof, is delivered via a transgenic plant or microbe.
Table 29
Effect of Photorhabdus Culture Broth on European Corn Borer-Induced Leaf Damage on Maize

Treatment              Average Guthrie Score
No Treatment           5.02 a
Uninoculated medium    5.15 a
Photorhabdus Broth     2.24 b

Means with different letters are statistically different (p<0.05 or p<0.1).
Example 19
Genetic Engineering of Genes for Expression in E. coli
Summary of Constructions
A series of plasmids was constructed to express the tcbA gene of Photorhabdus W-14 in Escherichia coli. A list of the plasmids is shown in Table 30. A brief description of each construction follows, as well as a summary of the E. coli expression data obtained.
Table 30
Expression Plasmids for the tcbA Gene

Plasmid     Gene    Vector/Selection    Compartment
pDAB2025    tcbA    pBC/Chl             intracellular
pDAB2026    tcbA    pAcGP67B/Amp        baculovirus, [...]
pDAB2027    tcbA    pET27b/Kan          periplasm
pDAB2028    tcbA    pET15-tcbA          intracellular

Abbreviations: Kan = kanamycin, Chl = chloramphenicol, Amp = ampicillin

Construction of pDAB2025
In Example 9, a large EcoR I fragment which hybridizes to the TcbAii probe is described. This fragment was subcloned into pBC (Stratagene, La Jolla, CA) to create pDAB2025. Sequence analysis indicates that the fragment is 8816 base pairs. The fragment encodes the tcbA gene with the initiating ATG at position 571 and the terminating TAA at position 8086. The fragment therefore carries 570 base pairs of Photorhabdus DNA upstream of the ATG and 730 base pairs downstream of the TAA.
Construction of Plasmid pDAB202[...]
The tcbA gene was PCR amplified from plasmid pDAB2025 using the following primers: 5' primer (SlAc51) 5' TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA TC 3' and 3' primer (SlAc31) 5' TTT AAA GCG GCC GCT TAA CGG ATG GTA TAA CGA ATA TG 3'. PCR was performed using a TaKaRa LA PCR kit from PanVera (Madison, WI) in the following reaction: 57.5 microliters water, 10 microliters 10X LA buffer, 16 microliters dNTPs (2.5 mM each stock solution), 20 microliters each primer at 10 pmoles/microliter, 300 ng of the plasmid pDAB2025 containing the W-14 tcbA gene, and one microliter of TaKaRa LA Taq polymerase. The cycling conditions were 98 °C/20 sec, 68 °C/5 min, 72 °C/10 min for 30 cycles. A PCR product of the expected size (about 7526 bp) was isolated in a 0.8% agarose gel in TBE (100 mM Tris, [...] mM boric acid, 1 mM EDTA) buffer and purified using a Qiaex II kit from Qiagen (Chatsworth, CA). The purified tcbA gene was digested with Nco I and Not I, ligated into the baculovirus transfer vector pAcGP67B (PharMingen, San Diego, CA), and transformed into E. coli. The resulting recombinant is called pDAB2026. The tcbA gene was then cut from pDAB2026 and transferred to pET27b to create plasmid pDAB2027. A missense mutation in the tcbA gene was repaired in pDAB2027.
The repaired tcbA gene contains two changes from the sequence shown in SEQ ID NO:11: an A>G at 212 changing asparagine 71 to serine 71, and a G>A at 229 changing alanine 77 to threonine 77. These changes are both upstream of the proposed TcbAii N-terminus.
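The two nucleotide changes can be cross-checked against the amino acid changes they are said to cause. The sketch below assumes that nucleotide positions are numbered from the first base of the coding sequence (an assumption, since the numbering convention is not restated here) and uses only the subset of the standard genetic code needed for the check. Because the actual wild-type codons are not given in this passage, it tests every synonymous codon for the original residue; all names are hypothetical.

```python
# Subset of the standard genetic code needed for this check.
CODON_TO_AA = {
    "AAT": "Asn", "AAC": "Asn",
    "AGT": "Ser", "AGC": "Ser",
    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCT": "Ala",
    "ACA": "Thr", "ACC": "Thr", "ACG": "Thr", "ACT": "Thr",
}


def codon_location(nt_position):
    """Map a 1-based nucleotide position to (codon number, position in codon)."""
    index = nt_position - 1
    return index // 3 + 1, index % 3 + 1


def substitute(codon, position_in_codon, new_base):
    """Return the codon with one base replaced (position is 1-based)."""
    i = position_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


def possible_changes(original_aa, position_in_codon, new_base):
    """Residues that can result when any codon for original_aa is mutated."""
    results = set()
    for codon, aa in CODON_TO_AA.items():
        if aa == original_aa:
            mutated = substitute(codon, position_in_codon, new_base)
            results.add(CODON_TO_AA.get(mutated, "other"))
    return results


codon_a, pos_a = codon_location(212)
codon_b, pos_b = codon_location(229)
print(codon_a, pos_a, possible_changes("Asn", pos_a, "G"))
print(codon_b, pos_b, possible_changes("Ala", pos_b, "A"))
```

Position 212 falls on the second base of codon 71 and position 229 on the first base of codon 77, and in both cases every synonymous starting codon yields the stated replacement residue, so the reported changes are internally consistent under this numbering.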
Construction of pDAB2028
The tcbA coding region of pDAB2027 was transferred to vector [...]. This was accomplished using shotgun ligations; the DNAs were cut with restriction enzymes Nco I and Xho I. The resulting recombinant is called pDAB2028.
Expression of TcbA in E. coli from Plasmid pDAB2028
Expression of tcbA in E. coli was obtained by modification of the methods previously described by Studier et al. (Studier, F.W., Rosenberg, Dunn, and Dubendorff (1990) Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol., 185: 60-89). Competent E. coli cells of strain BL21(DE3) were transformed with plasmid pDAB2028 and plated on LB agar containing 100 µg/mL ampicillin and 40 mM glucose. The transformed cells were plated to a density of several hundred isolated colonies/plate.
Following overnight incubation at 37 °C, the cells were scraped from the plates and suspended in LB broth containing 100 µg/mL ampicillin. Typical culture volumes were from 200-500 mL. At time zero, culture densities (OD600) were from 0.05-0.15 depending on the experiment. Cultures were shaken at one of three temperatures (22 °C, 30 °C or 37 °C) until a density of 0.15-0.5 was obtained, at which time they were induced with 1 mM isopropylthio-β-galactoside (IPTG). Cultures were incubated at the designated temperature for [...] hours and then were transferred to 4 °C until processing (12-72 hours).
Purification and Characterization of TcbA Expressed in E. coli from Plasmid pDAB2028
E. coli cultures expressing TcbA peptides were processed as follows. Cells were harvested by centrifugation at 17,000 x g, and the media was decanted and saved in a separate container.
The media was concentrated about 8x using the M12 (Amicon, Beverly, MA) filtration system and a 100 kD molecular mass cut-off filter. The concentrated media was loaded onto an anion exchange column and the bound proteins were eluted with 1.0 M NaCl. The 1.0 M NaCl elution peak was found to cause mortality against Southern corn rootworm (SCR) larvae (Table 30). The 1.0 M NaCl fraction was dialyzed against 10 mM sodium phosphate buffer, pH [...], concentrated, and subjected to gel filtration on Sepharose CL-4B (Pharmacia, Piscataway, NJ). The region of the CL-4B elution profile corresponding to the same calculated molecular weight (about 900 kDa) as the native W-14 toxin complex was collected, concentrated and bioassayed against larvae. The collected 900 kDa fraction was found to have insecticidal activity (see Table 31 below), with symptomology similar to that caused by native W-14 toxin complex.
When this fraction was subjected to Proteinase K and heat treatment, the activity in both cases was either eliminated or reduced, providing evidence that the activity is proteinaceous in nature. In addition, the active fraction tested immunologically positive for the TcbA and TcbAiii peptides in immunoblot analysis when tested with an anti-TcbAiii monoclonal antibody (Table 31).

Table 31
Results of Immunoblot and SCR Bioassays

Columns: Fraction; SCR Activity (% Mortality, % Growth Inhibition); Immunoblot Peptides Detected; Native Size [CL-4B Estimated Size]

TcbA Media 1.0 M Ion Exchange        [...]   TcbA        [...]
TcbA Media CL-4B                     [...]   TcbAiii     about 900 kDa
TcbA Media CL-4B + Proteinase K      [...]   NT          [...]
TcbA Media CL-4B + heat treatment    [...]   NT          [...]
TcbA Cell Sup CL-4B                  [...]   NT          about 900 kDa

PK = Proteinase K treatment, 2 hours; heat treatment = 100 °C for 1[...] minutes; ND = None Detected; NT = Not Tested. Scoring system for mortality and growth inhibition as compared to control samples: [...]

The cell pellet was resuspended in 10 mM sodium phosphate buffer, pH 7.0, and lysed by passage through a Bio-Neb cell nebulizer (Glas-Col Inc., Terre Haute, IN). The pellets were treated with DNase to remove DNA and centrifuged at 17,000 x g to separate the cell pellet from the cell supernatant. The supernatant fraction was decanted, filtered through a 0.2 micron filter to remove large particles, and subjected to anion exchange chromatography. Bound proteins were eluted with 1.0 M NaCl, dialyzed, and concentrated using Biomax (Millipore Corp, Bedford, MA) concentrators with a molecular mass cut-off of 50,000 Daltons.
The concentrated fraction was subjected to gel filtration chromatography using Sepharose CL-4B beaded matrix. Bioassay data for material prepared in this way is shown in Table 30 and is denoted as "TcbA Cell Sup".
In yet another method to handle large amounts of material, the cell pellets were re-suspended in 10 mM sodium phosphate buffer, pH 7.0, and thoroughly homogenized using a Kontes Glass Company (Vineland, NJ) 40 ml tissue grinder. The cellular debris was pelleted by centrifugation at 25,000 x g, and the cell supernatant was decanted, passed through a 0.2 micron filter, and subjected to anion exchange chromatography using a Pharmacia 10/10 column packed with Poros HQ 50 beads. The bound proteins were eluted with a NaCl gradient of 0.0 to 1.0 M. Fractions containing the TcbA protein were combined, concentrated using a 50 kDa concentrator, and subjected to gel filtration chromatography using Pharmacia CL-4B beaded matrix. The fractions containing TcbA oligomer, with a molecular mass of approximately 900 kDa, were collected and subjected to anion exchange chromatography using a Pharmacia Mono Q 10/10 column equilibrated with 20 mM Tris buffer, pH 7.3.
A gradient of 0.0 to 1.0 M NaCl was used to elute recombinant TcbA protein. Recombinant TcbA eluted from the column at a salt concentration of approximately 0.3-0.4 M NaCl, the same molarity at which native TcbA oligomer is eluted from the Mono Q 10/10 column.
The recombinant TcbA fraction was found to cause SCR mortality in bioassay experiments similar to those in Table 31.
A second set of expression constructs was prepared and tested for expression of the TcbA protein toxin.
Construction of pDAB2030: An Expression Plasmid for the tcbA Coding Region
The plasmid pDAB2028 (see herein) contains the tcbA coding region in the commercial vector pET15 (Novagen, Madison, WI), which encodes an ampicillin selection marker. The plasmid pDAB2030 was created to express the tcbA coding region from a plasmid which encodes a kanamycin selection marker. This was done by cutting pET27 (Novagen, Madison, WI), a kanamycin selection plasmid, and pDAB2028 with Xba I and Xho I. This releases the entire multiple cloning site, including the tcbA coding region, from plasmid pDAB2028. The two cut plasmids were mixed and ligated.
Recombinant plasmids were selected on kanamycin and those containing the pDAB2028 fragment were identified by restriction analysis. The new recombinant plasmid is called pDAB2030.
Construction of Plasmid pDAB2031: Correction of Mutations in tcbA
The two mutations in the N-terminus of the tcbA coding region as described in Example 19 (Sequence ID NO:11; A>G at 212 changing asparagine 71 to serine 71; G>A at 229 changing alanine 77 to threonine 77) were corrected as follows. A PCR product was generated using the primers TH50 (5' ACC GTC TTC TTT ACG ATC AGT G 3') and SlAc51 (5' TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA TC 3') and pDAB2025 as template to generate a 1778 bp product. This PCR product was cloned into plasmid pCR2.1 (Invitrogen, San Diego, CA) and a clone was isolated and sequenced. The clone was digested with Nco I and Pin AI and a 1670 bp fragment was purified from a 1% agarose gel. A plasmid containing the mutated tcbA coding region (pDAB2030) was digested with Nco I and Not I and purified away from the 1670 bp fragment in a 0.8% agarose gel with Qiaex II (Qiagen, Chatsworth, CA). The corrected Nco I/Pin AI fragment was then ligated into pDAB2030. The ligated DNA was transformed into E. coli. A clone was isolated, sequenced and found to be correct.
This plasmid, containing the corrected tcbA coding region, is called pDAB2031.
Construction of pDAB2033 and pDAB2034: Expression Plasmids for tcbA
The expression plasmids pDAB2025 and pDAB2027-2031 all rely on the Bacteriophage T7 expression system. An additional vector system was used for bacterial expression of the tcbA gene and its derivatives. The expression vector Trc99a (Pharmacia Biotech, Piscataway, NJ) contains a strong trc promoter upstream of a multiple cloning site with a 5' Nco I site which is compatible with the tcbA coding region from pDAB2030 and pDAB2031. However, the plasmid does not have a compatible 3' site. Therefore, the Hind III site of Trc99a was cut and made blunt by treatment with T4 DNA polymerase (Boehringer Mannheim, Indianapolis, IN). The vector plasmid was then cut with Nco I followed by treatment with alkaline phosphatase. The plasmids pDAB2030 and pDAB2031 were each cut with Xho I (which cuts at the 3' end of the tcbA coding region) followed by treatment with T4 DNA polymerase to blunt the ends. The plasmids were then cut with Nco I, and the DNAs were extracted with phenol, ethanol precipitated and resuspended in buffer. The Trc99a vector was mixed separately with the pDAB2030 and pDAB2031 fragments, ligated, transformed into DH5α cells and plated on LB media containing ampicillin and 50 mM glucose. Recombinant plasmids were identified by restriction digestion. The new plasmids are called pDAB2033 (contains the tcbA coding sequence with the two mutations in tcbAi) and pDAB2034 (contains the corrected version of tcbA from pDAB2031).
Construction of Plasmid pDAB2032: An Expression Plasmid for tcbAiiAiii
A plasmid encoding the TcbAiiAiii portion of TcbA was created in a similar way to plasmid pDAB2031. A PCR product was generated using TH42 (5' TAG GTC TCC ATG GCT TTT ATA CAA GGT TAT AGT GAT CTG 3') and TH50 (5' ACC GTC TTC TTT ACG ATC AGT G 3') primers and plasmid pDAB2025 as template. This yielded a product of 1521 bp having an initiation codon at the beginning of the coding sequence of tcbAii. This PCR product was isolated in a 1% agarose gel and purified. The purified product was cloned into pCR2.1 as above and a correct clone was identified by DNA sequence analysis. This clone was digested with Nco I and Pin AI, and a 1414 bp fragment was isolated in a 1% agarose gel, ligated into the Nco I and Pin AI sites of plasmid pDAB2030 and transformed into DH5α E. coli. This new plasmid, designed to express TcbAiiAiii in E. coli, is called pDAB2032.
Expression of tcbA and tcbAiiAiii from Plasmids pDAB2030, pDAB2031 and pDAB2032
Expression of tcbA in E. coli from plasmids pDAB2030, pDAB2031 and pDAB2032 was as described herein, except that expression of tcbAiiAiii was done in E. coli strain HMS174(DE3) (Novagen, Madison, WI).
Expression of tcbA from Plasmid pDAB2033
The plasmid pDAB2033 was transformed into BL21 cells (Novagen, Madison, WI) and plated on LB containing 100 micrograms/mL ampicillin and 50 mM glucose. The plates were spread such that several hundred well-separated colonies were present on each plate following incubation at either 30°C or 37°C overnight. The colonies were scraped from the plates and suspended in LB containing 100 micrograms/mL ampicillin, but no glucose. Typical culture volume was 250 mL in a single 1 L baffle-bottom flask. The cultures were induced when the culture reached a density of 0.3-0.6 OD600 nm. Most often this density was achieved immediately after suspension of the cells from the plates and did not require a growth period in liquid media. Two induction methods were used.
Method 1: The cells were induced with 1 mM IPTG at 37°C. The cultures were shaken at 200 rpm on a platform shaker for 5 hours and harvested. Method 2: The cultures were induced with 25 micromolar IPTG at 30°C and shaken at 200 rpm for 15 hours at either 20°C or 30°C. The cultures were stored at 4°C until used for purification.
Purification of TcbA from E. coli
Purification, bioassay and immunoblot analysis of TcbA and TcbAiiAiii were as described herein. Results of several representative E. coli expression experiments are shown in Table 32. All materials shown in Table 32 were purified from the media fraction of the cultures. The predicted native molecular weight is approximately 900 kD as described herein. The purity of the samples, that is, the amount of TcbA relative to contaminating proteins, varied with each preparation.
Table 32
Bioassay Activity and Immunoblot Analysis of TcbA and Derivatives Produced in E. coli and Purified from the Culture Media

Plasmid     Coding Region   E. coli Strain   Peptides Detected by Immunoblot                 Micrograms Protein Applied to Diet
pDAB2030    tcbA            BL21(DE3)        TcbA, TcbAiii                                   1-[illegible]
pDAB2031    tcbA            BL21(DE3)        TcbA, TcbAiii                                   1-10
pDAB2033    tcbA            BL21             TcbA, TcbAiii                                   1-2
pDAB2032    tcbAiiAiii      HMS174(DE3)      TcbAiiAiii, [entry partly illegible], TcbAiii   13-21

[The Southern Corn Rootworm Bioassay Activity columns (% Growth Inhibit., % Mortal.) are not recoverable from the source.] Mortality and growth inhibition on Southern Corn Rootworm were scored relative to control samples.

Example 20
Characterization of Toxin Peptides with Matrix-Assisted Laser Desorption Ionization Time-of-Flight Mass Spectroscopy
Toxins isolated from W-14 broth were purified as described in Example 15. In some cases, the TcaB protein toxin was pretreated with proteases (Example 16) that had been isolated from W-14 broth as previously described (Example 15). Protein molecular mass was determined using matrix-assisted laser desorption ionization time-of-flight mass spectroscopy, hereinafter MALDI-TOF, on a VOYAGER BIOSPECTROMETRY workstation with DELAYED EXTRACTION technology (PerSeptive Biosystems, Framingham, MA). Typically, the protein of interest (100-500 pmoles in 5 µl) was mixed with 1 µl of acetonitrile and dialyzed for 0.5 to 1 h on a Millipore VS filter having a pore size of 0.025 µm (Millipore Corp., Bedford, MA).
Dialysis was performed by floating the filter on water (shiny side up) and then adding the protein-acetonitrile mixture as a droplet to the surface of the filter. After dialysis, the dialyzed protein was removed using a pipette and then mixed with a matrix consisting of sinapinic acid and trifluoroacetic acid according to the manufacturer's instructions. The protein and matrix were allowed to co-crystallize on a gold-plated sample plate of about 3 cm² (PerSeptive Corp.). Excitation of the crystals and subsequent mass analysis were performed using the following conditions: laser setting of 3050; pressure of 4.55e-07; low mass gate of 1500.0; negative ions off; accelerating voltage of 25,000; grid voltage of 90.0%; guide wire voltage of 0.010%; linear mode; and a pulse delay time of 350 ns.
Protein mass analysis data are shown in Table 33. The data obtained from MALDI-TOF was compared to that hypothesized from gene sequence information and as previously determined by SDS-PAGE.
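The comparison between measured and sequence-predicted masses can be illustrated with a short calculation. The sketch below is not part of the original procedure; it simply takes the predicted and MALDI-TOF values reported in Table 33 and expresses each measured mass as a signed percent deviation from the predicted mass. The names `TABLE_33`, `percent_deviation` and `deviations` are illustrative assumptions.

```python
# Masses in Daltons, copied from Table 33: (predicted from gene sequence, MALDI-TOF).
# The dictionary and function names are illustrative, not from the original text.
TABLE_33 = {
    "TcbA": (280_634, 281_040),
    "TcbAi/ii": (217_710, 216_812),
    "TcbAii": (207_698, 206_473),
    "TcbAiii": (62_943, 63_520),
    "TcdAii": (209_218, 208_186),
    "TcdAiii": (63_520, 63_544),
}


def percent_deviation(predicted: float, measured: float) -> float:
    """Signed deviation of the measured mass from the predicted mass, in percent."""
    return 100.0 * (measured - predicted) / predicted


def deviations(table: dict) -> dict:
    """Map each peptide name to its percent deviation (MALDI-TOF vs. predicted)."""
    return {name: percent_deviation(p, m) for name, (p, m) in table.items()}


if __name__ == "__main__":
    for name, dev in deviations(TABLE_33).items():
        print(f"{name:10s} {dev:+.2f}%")
```

For the six peptides listed, every MALDI-TOF value falls within 1% of the mass predicted from the gene sequence, whereas the SDS-PAGE estimates in the same table differ by considerably more.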
Table 33
Molecular Analysis of Peptides by MALDI-TOF, SDS-PAGE and Predicted Determination Based on Gene Sequence

Peptide                       Predicted (Gene)   SDS-PAGE       MALDI-TOF
TcbA                          280,634 Da         240,000 Da     281,040 Da
TcbAi/ii                      217,710 Da         not resolved   216,812 Da
TcbAii                        207,698 Da         201,000 Da     206,473 Da
TcbAiii                       62,943 Da          58,000 Da      63,520 Da
TcdAii                        209,218 Da         188,000 Da     208,186 Da
TcdAiii                       63,520 Da          56,000 Da      63,544 Da
TcbAii Protease Generated     -                  201,000 Da     216,614 Da; 215,123 Da^; 210,391 Da; 208,680 Da^
TcbAiii Protease Generated    -                  56,000 Da      64,111 Da

^ Data normalized to TcbA; multiple fragments observed at TcbAi/ii.

Example 21
Production of Peptide Specific Polyclonal Antibodies
Nine peptide components of the W-14 toxin complex, namely, TcaA, TcaAiii, TcaBi, TcaBii, TcaC, TcbAii, TcbAiii, TcdAii, and TcdAiii, were selected as targets against which antibodies were produced. Comprehensive DNA and deduced amino acid sequence data for these peptides indicated that the sequence homology between some of these peptides was substantial. If a whole peptide were used as the immunogen to induce antibody production, the resulting antibodies might bind to multiple peptides in the toxin preparation. To avoid this problem, antibodies were generated that would bind specifically to a unique region of each peptide of interest. The unique region (subpeptide) of each target peptide was selected based on the analyses described below.
Each entire peptide sequence was analyzed using the MacVector™ Protein Analysis Tool (IBI Sequence Analysis Software, International Biotechnologies, Inc., P. O. Box 9558, New Haven, CT 06535) to determine its antigenicity index. This program was designed to locate possible externally-located amino acid sequences, i.e., regions that might be antigenic sites. This method combined information from hydrophilicity, surface probability, and backbone flexibility predictions along with the secondary structure predictions in order to produce a composite prediction of the surface contour of a protein. The scores for each of the analyses were normalized to a value between -1.0 and +1.0 (MacVector™ Manual). The antigenicity index value was obtained for the entire sequence of the target peptide. From each peptide, an area covering 19 or more amino acids that showed a high antigenicity index in the original sequence was re-analyzed to determine the antigenicity index of the subpeptide without the flanking residues. This re-analysis was necessary because the antigenicity index of a peptide could be influenced by the flanking amino acid residues. If the isolated subpeptide sequence did not maintain a high antigenicity index, a new region was chosen and the analysis was repeated.
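The window-selection step can be sketched in a few lines. The example below is an illustration only: it does not reproduce the MacVector antigenicity index, and it assumes that a per-residue score (for example, a normalized index between -1.0 and +1.0) has already been computed by some other means. The function name `best_window` and the default width of 19 residues are assumptions made for the sketch.

```python
from typing import List, Tuple


def best_window(scores: List[float], width: int = 19) -> Tuple[int, float]:
    """Return (start index, mean score) of the contiguous window with the highest mean.

    `scores` holds one precomputed score per residue; how the scores are
    derived is outside the scope of this sketch.
    """
    if width <= 0 or width > len(scores):
        raise ValueError("width must be between 1 and len(scores)")
    # Running sum avoids re-adding every residue for each window position.
    window_sum = sum(scores[:width])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(scores) - width + 1):
        window_sum += scores[start + width - 1] - scores[start - 1]
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / width
```

In the procedure described above, the residues of the chosen window would then be re-scored in isolation, since removing the flanking residues can change the index; a window that no longer scores highly would be discarded and the next candidate examined.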
Each selected subpeptide sequence was aligned and compared to all seven target peptide sequences using the MacVector™ alignment program. If a selected subpeptide sequence showed identity (greater than 20%) to another target peptide, a new region of 19 or more amino acids was isolated and re-analyzed. Unique subpeptide sequences covering 19 or more amino acids and showing a high antigenicity index were selected from all target peptides.
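The uniqueness screen can be illustrated as follows. This sketch substitutes a simple ungapped sliding comparison for the MacVector alignment program, whose algorithm is not described here, so it should be read as an approximation of the idea rather than the method actually used. The function names and the way the 20% threshold is applied are assumptions.

```python
from typing import Iterable


def max_ungapped_identity(subpeptide: str, target: str) -> float:
    """Highest fraction of identical residues over all ungapped placements
    of `subpeptide` along `target`."""
    n = len(subpeptide)
    if n == 0 or n > len(target):
        raise ValueError("subpeptide must be non-empty and no longer than target")
    best = 0
    for offset in range(len(target) - n + 1):
        window = target[offset:offset + n]
        matches = sum(a == b for a, b in zip(subpeptide, window))
        best = max(best, matches)
    return best / n


def is_unique(subpeptide: str, other_targets: Iterable[str], threshold: float = 0.20) -> bool:
    """True if identity to every other target peptide stays at or below the threshold."""
    return all(max_ungapped_identity(subpeptide, t) <= threshold for t in other_targets)
```

The parent peptide from which a subpeptide was taken has to be left out of `other_targets`, since the subpeptide necessarily matches its own source at 100% identity.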
The sequences of seven subpeptides were sent to Genemed Biotechnology Inc. The last amino acid residue on each subpeptide was deleted because it showed no apparent effect on the antigenicity index. A cysteine residue was added to the N-terminus of each subpeptide sequence, except TcaBi-syn, which contains an internal cysteine residue. The presence of a cysteine residue facilitates conjugation to a carrier protein (KLH). The final peptide products corresponding to the appropriate toxin peptides and SEQ ID NOs. are shown in Table 34.
Table 34
Amino Acid Sequences for Synthetic Peptides

SEQ ID NO.   Peptide        Amino Acid Sequence
63           TcaAii-syn     NH2-(C)LRGNSPTNPDKDGIFAQVA
64           TcaAiii-syn    NH2-(C)YTPDQTPSFYETAFRSADG
65           TcaBi-syn      NH2-HGQSYNDNNYCNFTLSINT
66           TcaBiii-syn    NH2-(C)VDPKTLQRQQAGGDGTGSS
67           TcaC-syn       NH2-(C)YKAPQRQEDGDSNAVTYDK
68           TcbAii-syn     NH2-(C)YNENPSSEDKKWYFSSKDD
69           TcbAiii-syn    NH2-(C)FDSYSQLYEENINAGEQRA
70           TcdAii-syn     NH2-(C)NPNNSSNKLMFYPVYQYSGNT
71           TcdAiii-syn    NH2-(C)VSQGSGSAGSGNNNLAFGAG

Each conjugated synthetic peptide was injected into two rabbits according to the Genemed accelerated program. The pre- and post-immune sera were available for testing after one month.
The preliminary test of both pre- and post-immune sera from each rabbit was performed by Genemed Biotechnologies Inc. Genemed reported that, using both ELISA and Western blot techniques, they detected the reaction of post-immune sera to the respective synthetic peptides. Subsequently, the sera were tested with the whole target peptides by Western blot analysis. Two batches of partially purified Photorhabdus strain W-14 toxin complex were used as the antigen. The two samples had shown activity against the Southern corn rootworm. Their peptide patterns on an SDS-PAGE gel were slightly different.
Pre-cast SDS-polyacrylamide gels with 4-20% gradient (Integrated Separation Systems, Natick, MA 01760) were used.
Between 1 and 8 µg of protein was applied to each gel well.
Electrophoresis was performed and the protein was electroblotted onto Hybond-ECL™ nitrocellulose membrane (Amersham International). The membrane was blocked with 10% milk in TBST (25 mM Tris-HCl pH 7.4, 136 mM NaCl, 2.7 mM KCl, 0.1% Tween 20) for one hour at room temperature. Each rabbit serum was diluted in 10% milk/TBST to 1:500. Other dilutions between 1:50 and 1:1000 were also used. The serum was added to the membrane and placed on a platform rocker for at least one hour. The membrane was washed thoroughly with the blocking solution or TBST. A 1:2000 dilution of secondary antibodies (goat anti-mouse IgG conjugated to horseradish peroxidase; BioRad Laboratories) in 10% milk/TBST was applied to the membrane, which was placed on a platform rocker for one hour. The membrane was subsequently washed with an excess of TBST. The detection of the protein was performed using an ECL (Enhanced Chemiluminescence) detection kit (Amersham International).
Western blot analyses were performed to identify the binding specificity of each anti-synthetic-peptide antibody. All synthetic polyclonal antibodies showed specificity toward processed and, when applicable, unprocessed target peptides from protein fractions derived from Photorhabdus culture broth. Various antibodies were shown to recognize either unprocessed or processed recombinant proteins derived from heterologous expression systems such as bacteria or insect cells, using baculovirus expression constructs. In one case, the anti-TcbAiii-syn antibody showed some cross-reactivity to anti-TcdAiii peptide. In a second case, the anti-TcaC-syn antibody recognized an unidentified 190 kDa peptide in W-14 toxin complex fractions.
Example 22
Characterization of Photorhabdus Strains
In order to establish that the collection described herein was comprised of Photorhabdus strains, the strains herein were assessed in terms of recognized microbiological traits that are characteristic of the bacterial genus Photorhabdus and which differentiate it from other Enterobacteriaceae and Xenorhabdus spp. (Farmer, J. J. 1984. Bergey's Manual of Systematic Bacteriology, Vol. 1, pp. 510-511 (ed. Krieg, N. R. and Holt, J. G.), Williams & Wilkins, Baltimore; Akhurst and Boemare, 1988, J. Gen. Microbiol. 134, 1835-1845; Forst and Nealson, 1996, Microbiol. Rev. 60, 21-43). These characteristic traits are as follows: Gram stain negative rods, organism size of 0.3-2 µm in width and 2-10 µm in length [with occasional filaments (15-50 µm) and spheroplasts], yellow to orange/red colony pigmentation on nutrient agar, presence of crystalline inclusion bodies, presence of catalase, inability to reduce nitrate, presence of bioluminescence, ability to take up dye from growth media, positive for protease production, growth at temperatures below 37°C, survival under anaerobic conditions and positive motility (Table 35). Test methods were checked using reference Escherichia coli, Xenorhabdus and Photorhabdus strains.
The overall results are consistent with all strains being part of the family Enterobacteriaceae and the genus Photorhabdus. Note that DEP1, DEP2, and DEP3 refer to Photorhabdus strains obtained from the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA (#29304, 29999 and 51583, respectively).
A luminometer was used to establish the bioluminescence associated with these Photorhabdus strains. To measure the presence or absence of relative light emitting units, the broths from each strain (cells and media) were measured at three time intervals after inoculation in liquid culture (24, 48, 72 hr) and compared to background luminosity (uninoculated media). Several Xenorhabdus strains were tested as negative controls for luminosity. Prior to measuring light emission from the various broths, cell density was established by measuring light absorbance (560 nm) in a Gilford Systems (Oberlin, OH) spectrophotometer using a sipper cell. The resulting light emitting units could then be normalized to density of cells. Aliquots of the broths were placed into 96-well microtiter plates (100 µl each) and read in a Packard Lumicount™ luminometer (Packard Instrument Co., Meriden, CT). The measurement period for each sample was 0.1 to 1.0 second. The samples were agitated in the luminometer for 10 sec prior to taking readings. A positive test was determined as being about [multiple illegible] background luminescence (about 1-15 relative light units). In addition, degree of colony luminosity was confirmed with photographic film overlays and by eye, after visual adaptation in a darkroom.

The Gram's staining characteristics of each strain were established with a commercial Gram's stain kit (BBL, Cockeysville, MD) used in conjunction with Gram's stain control slides (Fisher Scientific, Pittsburgh, PA). Microscopic evaluation was then performed using a Zeiss microscope (Carl Zeiss, Germany) with a 100X oil immersion objective lens (with 10X ocular and 2X body magnification). Microscopic examination of individual strains for organism size, cellular description and inclusion bodies (the latter two observations after logarithmic growth) was performed using wet mount slides (10X ocular, 2X body and 40X objective magnification) and phase contrast microscopy with a micrometer (Akhurst, R. J. and Boemare, N. E. 1990. Entomopathogenic Nematodes in Biological Control (ed. Gaugler, R. and Kaya), pp. 75-90. CRC Press, Boca Raton, USA; Baghdiguian, Boyer-Giglio, M. H., Thaler, J., Bonnot, Boemare, N. 1993. Biol. Cell 79, 177-185).

Colony pigmentation was observed after inoculation on Bacto nutrient agar (Difco Laboratories, Detroit, MI) prepared as per label instructions. Incubation occurred at 28°C and descriptions were produced after 5 days.

To test for the presence of the enzyme catalase, a colony of the test organism was removed on a small plug from a nutrient agar plate and placed into the bottom of a glass test tube. One ml of a household hydrogen peroxide solution was gently added down the side of the tube. A positive reaction was recorded when bubbles of gas (presumptive oxygen) appeared immediately or within 5 seconds. Controls of uninoculated nutrient agar and hydrogen peroxide solution were also examined.

To test for nitrate reduction, each culture was inoculated into 10 ml of Bacto Nitrate Broth (Difco Laboratories, Detroit, MI). After 24 hours incubation with gentle agitation at 28°C, nitrite production was tested by the addition of two drops of sulfanilic acid reagent and two drops of alpha-naphthylamine reagent (see Difco Manual, 10th edition, Difco Laboratories, Detroit, MI, 1984). The generation of a distinct pink or red color indicates the formation of nitrite from nitrate, whereas the lack of color formation indicates that the strain is nitrate reduction negative. In the latter case, finely powdered zinc was added to further confirm the presence of unreduced nitrate, established by the formation of nitrite and the resultant red color.

The ability of each strain to take up dye from growth media was tested with Bacto MacConkey agar containing the dye neutral red, Bacto Tergitol-7 agar containing the dye bromothymol blue and Bacto EMB Agar containing the dye eosin-Y (formulated agars from Difco Laboratories, Detroit, MI, all prepared according to label instructions).
After inoculation on these media, dye uptake was recorded after incubation at 28°C for [number illegible] days. Growth on these latter media is characteristic for members of the family Enterobacteriaceae.

Motility of each strain was tested using a solution of Bacto Motility Test Medium (Difco Laboratories, Detroit, MI) prepared as per label instructions. A butt-stab inoculation was performed with each strain and motility was judged macroscopically by a diffuse zone of growth spreading from the line of inoculum.

The production of protease was tested by observing hydrolysis of gelatin using Bacto gelatin (Difco Laboratories, Detroit, MI) made as per label instructions. Cultures were inoculated and the tubes or plates were incubated at 28°C for 5 days. Gelatin hydrolysis was then checked at room temperature, i.e. less than 22°C.

To assess growth at different temperatures, agar plates [proteose peptone #3 with two percent Bacto-Agar (Difco, Detroit, MI) in deionized water] were streaked from a common source of inoculum. Plates were incubated at 20, 28 and 37°C for up to three weeks. The incubator temperature levels were checked with an electronic thermocouple and meter to ensure valid temperature settings.

Oxygen requirements for Photorhabdus strains were tested in the following manner. A butt-stab inoculation into fluid thioglycolate broth medium (Difco, Detroit, MI) was made. The tubes were incubated at room temperature for one week and cultures were then examined for type and extent of growth. The indicator resazurin demonstrates the presence of medium oxygenation or the aerobiosis zone (Difco Manual, 10th edition, Difco Laboratories, Detroit, MI). Growth zone results obtained for the Photorhabdus strains tested were consistent with those of a facultative anaerobic microorganism. In the case of unclear results, the final agar concentration of fluid thioglycolate broth medium was raised to 0.75% and the growth characteristics rechecked.
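The normalization of luminometer readings to cell density described earlier in this Example can be illustrated with a small calculation. The sketch below is hypothetical: the function names are invented for illustration, and because the fold-over-background criterion for a positive test is not legible in the text, the threshold is left as a required argument rather than given a value.

```python
def normalized_rlu(rlu: float, od560: float) -> float:
    """Relative light units per unit of absorbance at 560 nm (cell density)."""
    if od560 <= 0:
        raise ValueError("absorbance must be positive")
    return rlu / od560


def is_positive(sample_rlu: float, background_rlu: float, fold: float) -> bool:
    """True if the sample reading is at least `fold` times the background reading.

    `fold` must be supplied by the caller; no value is assumed here.
    """
    return sample_rlu >= fold * background_rlu
```

For example, a broth reading of 300 relative light units at an absorbance of 1.5 corresponds to 200 light units per absorbance unit, which allows strains that grew to different densities to be compared on the same footing.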
Table 35
Taxonomic Traits of Photorhabdus Strains
[Table body not recoverable from the source. Columns: Strain, followed by traits A through Q as defined below. Legible strain labels include zealandica, HB-Arg, HB Oswego, HB Lewiston and K-122.]

A=Gram's stain, B=Crystalline inclusion bodies, C=Bioluminescence, D=Cell form, E=Motility, F=Nitrate reduction, G=Presence of catalase, H=Gelatin hydrolysis, I=Dye uptake, J=Pigmentation on Nutrient Agar (some color shifts after Day [illegible]), K=Growth on EMB agar, L=Growth on MacConkey agar, M=Growth on Tergitol-7 agar, N=Facultative anaerobe, O=Growth at 20°C, P=Growth at 28°C, Q=Growth at 37°C.
†: +=positive for trait, -=negative for trait; rd=rod, S=sized within Genus descriptors.
§: W=white, CR=cream, Y=yellow, YT=yellow tan, T=tan, PO=pale orange, O=orange, PR=pale red, R=red.
The evolutionary diversity of the Photorhabdus strains in our collection was measured by analysis of PCR (Polymerase Chain Reaction) mediated genomic fingerprinting using genomic DNA from each strain. This technique is based on families of repetitive DNA sequences present throughout the genome of diverse bacterial species (reviewed by Versalovic, Schneider, de Bruijn, F. J. and Lupski, J. R. 1994. Methods Mol. Cell. Biol. 5, 25-40). Three of these, repetitive extragenic palindromic sequence (REP), enterobacterial repetitive intergenic consensus (ERIC) and the BOX element, are thought to play an important role in the organization of the bacterial genome. Genomic organization is believed to be shaped by selection, and the differential dispersion of these elements within the genome of closely related bacterial strains can be used to discriminate these strains (Louws, F. J., Fulbright, D., Stephens, C. T. and de Bruijn, F. J. 1994. Appl. Environ. Micro. 60, 2286-2295). Rep-PCR utilizes oligonucleotide primers complementary to these repetitive sequences to amplify the variably sized DNA fragments lying between them. The resulting products are separated by electrophoresis to establish the DNA "fingerprint" for each strain.
To isolate genomic DNA from our strains, cell pellets were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml and 12 ml of 5 M NaCl was then added. This mixture was centrifuged for 20 min at 15,000 x g. The resulting pellet was resuspended in 5.7 ml of TE, and 300 µl of 10% SDS and [volume illegible] µl of 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY) were added. This mixture was incubated at 37°C for 1 hr, approximately 10 mg of lysozyme was then added and the mixture was incubated for an additional 45 min. One milliliter of 5 M NaCl and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were then added and the mixture was incubated for 10 min at 65°C, gently agitated, then incubated and agitated for an additional 20 min to aid in clearing of the cellular material. An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed gently and then centrifuged. Two extractions were then performed with an equal volume of phenol/chloroform/isoamyl alcohol (50:49:1). Genomic DNA was precipitated with 0.6 volume of isopropanol.
Precipitated DNA was removed with a glass rod, washed twice with 70% ethanol, dried and dissolved in 2 ml of STE (10 mM Tris-HCl, 10 mM NaCl, 1 mM EDTA). The DNA was then quantitated by optical density at 260 nm. To perform rep-PCR analysis of Photorhabdus genomic DNA the following primers were used: REP1R-I, 5'-IIIICGICGICATCIGGC-3' and REP2-I, 5'-ICGICTTATCIGGCCTAC-3'.
PCR was performed using the following 25 µl reaction: 7.75 µl H2O, [volume illegible] µl 10X LA buffer (PanVera Corp., Madison, WI), 16 µl dNTP mix ([concentration illegible] mM each), 1 µl of each primer at 50 pM/µl, 1 µl DMSO, 1.5 µl genomic DNA (concentrations ranged from 0.075-0.480 µg/µl) and 0.25 µl TaKaRa EX Taq (PanVera Corp., Madison, WI). The PCR amplification was performed in a Perkin Elmer DNA Thermal Cycler (Norwalk, CT) using the following conditions: 95°C/7 min, then [number illegible] cycles of 94°C/1 min, 44°C/1 min, 65°C/8 min, followed by 15 min at 65°C. After cycling, the 25 µl reaction was added to 5 µl of 6X gel loading buffer (0.25% bromophenol blue, 40% w/v sucrose in H2O). A 15 x 20 cm 1% agarose gel was then run in TBE buffer (0.09 M Tris-borate, 0.002 M EDTA) using 8 µl of each reaction. The gel was run for approximately 16 hours at 45 V. Gels were then stained in 20 µg/ml ethidium bromide for 1 hour and destained in TBE buffer for approximately 3 hours. Polaroid® photographs of the gels were then taken under UV illumination.
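As a worked illustration of the quantitation by optical density at 260 nm and of the template amounts implied by the stated concentration range, the sketch below uses the common rule of thumb that one absorbance unit at 260 nm (1 cm path length) corresponds to roughly 50 µg/ml of double-stranded DNA. That conversion factor, the dilution-factor parameter and the function names are assumptions of the sketch; the text does not state how the readings were converted.

```python
# Rule-of-thumb conversion for double-stranded DNA at 260 nm, 1 cm path length.
# This factor is an assumption of the sketch, not a value given in the text.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_conc_ug_per_ul(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate DNA concentration (µg/µl) of the undiluted stock from an A260 reading."""
    ug_per_ml = a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor
    return ug_per_ml / 1000.0  # 1 ml = 1000 µl


def template_mass_ug(conc_ug_per_ul: float, volume_ul: float = 1.5) -> float:
    """Mass of genomic DNA (µg) delivered by `volume_ul` of stock into one reaction."""
    return conc_ug_per_ul * volume_ul


if __name__ == "__main__":
    # Concentration range stated in the text: 0.075-0.480 µg/µl, 1.5 µl per reaction.
    for conc in (0.075, 0.480):
        print(f"{conc:.3f} µg/µl -> {template_mass_ug(conc):.4f} µg template")
```

With 1.5 µl of genomic DNA per reaction, the stated concentration range corresponds to roughly 0.11 to 0.72 µg of template per 25 µl reaction.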
The presence or absence of bands at specific sizes for each strain was scored from the photographs and entered as a similarity matrix in the numerical taxonomy software program NTSYS-pc (Exeter Software, Setauket, NY). Controls of E. coli strain HB101 and Xanthomonas oryzae pv. oryzae assayed under the same conditions produced PCR fingerprints corresponding to published reports (Versalovic, Koeuth, T. and Lupski, J. R. 1991. Nucleic Acids Res. 19, 6823-6831; Vera Cruz, C., Halda-Alija, Louws, F., Skinner, D., George, M., Nelson, R., de Bruijn, F. J., Rice, C. and Leach, J. E. 1995. Int. Rice Res. Notes 20, 23-24; Vera Cruz, C., Ardales, E., Skinner, D., Talag, J., Nelson, R., Louws, F., Leung, Mew, T. W. and Leach, J. E. 1996. Phytopathology 86, 1352-1359). The data from Photorhabdus strains were then analyzed with a series of programs within NTSYS-pc: SIMQUAL (Similarity for Qualitative data) to generate a matrix of similarity coefficients (using the Jaccard coefficient) and SAHN (Sequential, Agglomerative, Hierarchical and Nested) clustering [using the UPGMA (Unweighted Pair-Group Method with Arithmetic Averages) method], which groups related strains and can be expressed as a phenogram (Fig. [number illegible]). The COPH (cophenetic values) and MXCOMP (matrix comparison) programs were used to generate a cophenetic value matrix and compare the correlation between this and the original matrix upon which the clustering was based. A resulting normalized Mantel statistic was generated, which is a measure of the goodness of fit for a cluster analysis (r=0.8-0.9 represents a very good fit). In our case r=0.924. Therefore, the collection is comprised of a diverse group of easily distinguishable strains representative of the Photorhabdus genus.
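The analysis chain described above (Jaccard coefficients from band presence/absence, UPGMA clustering, and a cophenetic correlation as the goodness-of-fit measure) can be sketched in plain Python. This is not NTSYS-pc and makes no claim to reproduce its output; the function names are invented for illustration, and the correlation is computed on distances (1 minus similarity), which is assumed here to be an acceptable stand-in for the similarity-based comparison.

```python
from itertools import combinations
from math import sqrt


def jaccard(a, b):
    """Jaccard similarity of two 0/1 band vectors; shared absences are ignored."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 0.0


def distance_table(profiles):
    """Map each unordered pair of strain names to 1 - Jaccard similarity."""
    return {frozenset((p, q)): 1.0 - jaccard(profiles[p], profiles[q])
            for p, q in combinations(profiles, 2)}


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering.

    Returns the list of merges as (members, members, height) and the
    cophenetic distance for every pair of strains.
    """
    def avg(c1, c2):
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    clusters = [frozenset([n]) for n in names]
    merges, coph = [], {}
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        height = avg(c1, c2)
        for x in c1:
            for y in c2:
                coph[frozenset((x, y))] = height  # height at which x and y first join
        merges.append((sorted(c1), sorted(c2), height))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges, coph


def cophenetic_correlation(dist, coph):
    """Pearson correlation between original and cophenetic distances."""
    pairs = list(dist)
    xs = [dist[p] for p in pairs]
    ys = [coph[p] for p in pairs]
    n = len(pairs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

A call such as `upgma(list(profiles), distance_table(profiles))`, where `profiles` maps each strain name to its scored band vector, yields the merge order from which a phenogram can be drawn, and `cophenetic_correlation` then gives an r value analogous to the goodness-of-fit statistic quoted above.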
Example 23
Insecticidal Utility of Toxin(s) Produced by Various Photorhabdus Strains
Initial "storage" cultures of the various Photorhabdus strains were produced by inoculating 175 ml of 2% Proteose Peptone #3 (PP3) (Difco Laboratories, Detroit, MI) liquid medium with a primary variant colony in a 500 ml tribaffled flask with a Delong neck, covered with a Kaput closure. After inoculation, the flask was incubated for 24-72 hrs at 28°C on a rotary shaker at 150 rpm, until stationary phase was reached. The culture was transferred to a sterile bottle containing a sterile magnetic stir bar and overlayered with sterile mineral oil to limit exposure to air. The storage culture was kept in the dark, at room temperature. These cultures were then used as inoculum sources for the fermentation of each strain.
"Seed" flasks or cultures were produced by either inoculating 2 ml of an oil-overlayered storage culture or transferring a primary variant colony into 175 ml sterile medium in a 500 ml tribaffled flask covered with a Kaput closure. (The use of other inoculum sources is also possible.) Typically, following 16 hours incubation at 28°C on a rotary shaker at 150 rpm, the seed culture was transferred into production flasks. Production flasks were usually inoculated by adding about 1% of the actively growing seed culture to sterile 2% PP3 medium (2.0 ml per 175 ml sterile medium). Production of broths occurred in 500 ml tribaffled flasks covered with a Kaput closure. Production flasks were agitated at 28°C on a rotary shaker at 150 rpm. Production fermentations were terminated after 24-72 hrs, although successful fermentation is not confined to this time duration. Following appropriate incubation, the broths were dispensed into sterile 1.0 L polyethylene bottles, spun at 2600 x g for 1 hr at 10°C and decanted from the cell and debris pellet. Further broth clarification was achieved with a tangential flow microfiltration device (Pall Filtron, Northborough, MA) using a 0.5 µm open-channel poly-ether sulfone (PES) membrane filter.
The resulting broths were then concentrated (up to 10-fold) using a 10,000 or 100,000 MW cut-off membrane, M12 ultra-filtration device (Amicon, Beverly, MA) or centrifugal concentrators (Millipore, Bedford, MA and Pall Filtron, Northborough, MA) with a 10,000 or 100,000 MW pore size. In the case of centrifugal concentrators, the broth was spun at 2000 x g for approximately 2 hr. The membrane permeate was added to the corresponding retentate to achieve the desired concentration of components greater than the pore size used. Following these procedures, the broth was used for biochemical analysis or filter sterilized using a 0.2 µm cellulose nitrate membrane filter for biological assessment. Heat inactivation of processed broth samples was achieved by heating the samples at 100°C in a sand-filled heat block for 10 minutes.
The broth(s) and toxin complex(es) from different Photorhabdus strains are useful for reducing populations of insects and were used in a method of inhibiting an insect population which comprises applying to a locus of the insect an effective insect-inactivating amount of the active described. A demonstration of the breadth of insecticidal activity observed from broths of a selected group of Photorhabdus strains fermented as described above is shown in Table 36. It is possible that improved or additional insecticidal activities could be detected with these strains through increased concentration of the broth or by employing different fermentation methods. Consistent with the activity being associated with a protein, the insecticidal activity of all strains tested was heat labile.
Culture broth(s) from diverse Photorhabdus strains show differential insecticidal activity (mortality and/or growth inhibition) against a number of insects. More specifically, the activity is seen against corn rootworm which is a member of the insect order Coleoptera. Other members of the Coleoptera include boll weevils, wireworms, pollen beetles, flea beetles, seed beetles and Colorado potato beetle. The broths and purified toxin complex(es) are also active against tobacco budworm, tobacco hornworm and European corn borer which are members of the order Lepidoptera. Other typical members of this order are beet armyworm, cabbage looper, black cutworm, corn earworm, codling moth, clothes moth, Indian mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm and fall armyworm. Activity is also observed against German cockroach which is a member of the order Dictyoptera (or Blattodea). Other members of this order are oriental cockroach and American cockroach.
Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth(s) (10-fold concentrated, filter sterilized), 2% Proteose Peptone #3 (10-fold concentrated), purified toxin complex(es), or 10 mM sodium phosphate buffer, pH 7.0, were applied directly to the surface (about 1.5 cm²) of artificial diet (Rose, R. I. and McCabe, J. M. 1973. J. Econ. Entomol. 66, 398-400) in 40 µl aliquots. Toxin complex was diluted in 10 mM sodium phosphate buffer, pH 7.0. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR) hatched from surface-sterilized eggs. The plates were sealed, placed in a humidified growth chamber and maintained at 27°C for the appropriate period (3-5 days). Mortality and larval weight determinations were then scored. Generally, 16 insects per treatment were used in all studies. Control mortality was generally less than [...].
Activity against lepidopteran larvae was tested as follows.
Concentrated (10-fold) Photorhabdus culture broth(s), control medium (2% Proteose Peptone #3), purified toxin complex(es), or 10 mM sodium phosphate buffer, pH 7.0, were applied directly to the surface (about 1.5 cm²) of standard artificial lepidopteran diet (Stoneville Yellow diet) in 40 µl aliquots. The diet plates were allowed to air-dry in a sterile flow-hood and each well was infested with a single, neonate larva. European corn borer (Ostrinia nubilalis) and tobacco hornworm (Manduca sexta) eggs were obtained from commercial sources and hatched in-house, whereas tobacco budworm (Heliothis virescens) larvae were supplied internally. Following infestation with larvae, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period. Mortality and weight determinations were scored at day 5. Generally, 16 insects per treatment were used in all studies. Control mortality generally ranged from about 0 to about 12.5% for control medium and was less than 10% for phosphate buffer.
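The scoring described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, growth inhibition is assumed to be computed from mean larval weights relative to the control group, and the 25% cut-off is taken from the footnote to Table 36 and assumed to mean "25% or more".

```python
from statistics import mean

def percent_mortality(dead: int, total: int) -> float:
    """Percent of infested larvae scored dead at the end of the assay."""
    return 100.0 * dead / total

def percent_growth_inhibition(treated_weights, control_weights) -> float:
    """Reduction in mean larval weight relative to the control group (assumed definition)."""
    return 100.0 * (1.0 - mean(treated_weights) / mean(control_weights))

def is_sensitive(mortality: float, growth_inhibition: float, threshold: float = 25.0) -> bool:
    """Apply the 'mortality and/or growth inhibition' criterion (threshold assumed inclusive)."""
    return mortality >= threshold or growth_inhibition >= threshold

# With 16 insects per treatment, 2 dead larvae correspond to 12.5% mortality,
# the upper end of the control-medium range quoted in the text.
control_medium_mortality = percent_mortality(2, 16)
```

With 16 insects per treatment, each dead larva contributes 6.25 percentage points, which is why the control range is quoted in steps such as 12.5%.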
Activity against cockroach was tested as follows. Concentrated Photorhabdus culture broth(s) and control medium (2% Proteose Peptone #3) were applied directly to the surface (about 1.5 cm²) of standard artificial lepidopteran diet (Stoneville Yellow diet) in 40 µl aliquots. The diet plates were allowed to air-dry in a sterile flow-hood and each well was infested with a single, CO2-anesthetized first instar German cockroach (Blattella germanica). Following infestation, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period. Mortality and weight determinations were scored at day 5. Control mortality was less than [...].
Table 36
Observed Insecticidal Spectrum of Broths from Different Photorhabdus Strains
Photorhabdus Strain | Sensitive* Insect Species
Strains (in source order): P. zealandica, P. hepialus, HB-Arg, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, Megidis, HF-85, A. Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL155, GL217, GL257, DEP1, DEP2, DEP3
Sensitive insect species (in source order): 2, 4 | 1, 2, 4 | 1, 2, 4 | 1, 2, 4 | 1, 2, 4 | 1, 4 | 1, 4 | 1, 2, 4 | 2, 4 | 1, 2, 4 | 1, 2, 4 | 1, 2, 4 | 1, 4 | 1, 2, 4 | 1, 2, 4 | 4 | 1, 4 | 4 | 1, 4 | 1, 4, | 1, 2, 4 | 1, 4 | 1, 2, 4 | 1, 4 | 1, 4 | 1, 2, 3, 4 | 4
* 25% mortality and/or growth inhibition vs. control.
1 = Tobacco budworm; 2 = European corn borer; 3 = Tobacco hornworm; 4 = Southern corn rootworm; 5 = German cockroach.
Example 24
Southern Analysis of Non-W-14 Photorhabdus Strains Using W-14 Gene Probes
Photorhabdus strains were grown on 2% proteose peptone #3 agar (Difco Laboratories, Detroit, MI) and insecticidal toxin competence was maintained by repeated bioassay after passage. A 50 ml shake culture was produced in 175 ml baffled flasks in 2% proteose peptone #3 medium, grown at 28°C and 150 rpm for approximately 24 hours. Fifteen ml of this culture were centrifuged (700 x g, [...] min) and frozen in its medium at -20°C until it was thawed (slowly in ice water) for DNA isolation. The thawed W-14 culture was centrifuged (900 x g, 15 min, 4°C), and the floating orange mucopolysaccharide material was removed. The remaining cell material was centrifuged (25,000 x g, 4°C) to pellet the bacterial cells, and the medium was removed and discarded.
Total DNA was isolated by an adaptation of the CTAB method described in section 2.4.1 of Ausubel et al. (1994). The modifications included a high salt shock, and all volumes were increased ten-fold over the "miniprep" recommended volumes. All centrifugations were at 4°C unless otherwise specified. The pelleted bacterial cells were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8) to a final volume of 10 ml, then 12 ml of 5 M NaCl were added; this mixture was centrifuged for 20 min at 15,000 x g.
The pellet was resuspended in 5.7 ml TE, and 300 µl of 10% SDS and 60 µl of 20 mg/ml proteinase K (in sterile distilled water, Gibco BRL Products, Grand Island, NY) were added to the suspension. The mixture was incubated at 37°C for 1 hr; then approximately 10 mg lysozyme (Worthington Biochemical Corp., Freehold, NJ) were added.
After an additional 45 min incubation, 1 ml of 5 M NaCl and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were added. This preparation was incubated for 10 min at 65°C, then gently agitated and further incubated and agitated for approximately 20 min to assist clearing of the cellular material. An equal volume of chloroform/isoamyl alcohol solution (24:1, v:v) was added, mixed very gently, and the phases separated by centrifugation at 12,000 x g for 15 min. The upper (aqueous) phase was gently removed with a wide-bore pipette and extracted twice as above with an equal volume of PCI (phenol/chloroform/isoamyl alcohol; 50:49:1, v:v:v; equilibrated with 1 M Tris-HCl, pH 8.0; Intermountain Scientific Corporation, Kaysville, UT). The DNA, precipitated with 0.6 volume of isopropanol, was gently removed on a glass rod, washed twice with ethanol, dried, and dissolved in 2 ml STE (10 mM Tris-HCl, [...] mM NaCl, 1 mM EDTA, pH [...]). This preparation contained 2.5 mg/ml DNA, as determined by optical density at 260 nm.
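As a hedged illustration of how a concentration such as 2.5 mg/ml follows from an optical density reading, the sketch below uses the standard conversion of about 50 µg/ml of double-stranded DNA per A260 unit (1 cm path length). The absorbance value and dilution factor in the example are hypothetical; the source does not report them.

```python
# Standard approximation: 1.0 A260 unit of double-stranded DNA ~ 50 µg/ml (1 cm path).
DSDNA_UG_PER_ML_PER_A260 = 50.0

def dsdna_mg_per_ml(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate dsDNA concentration (mg/ml) of the undiluted stock from an A260 reading."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor / 1000.0

def total_yield_mg(conc_mg_per_ml: float, volume_ml: float) -> float:
    """Total DNA recovered in the dissolved preparation."""
    return conc_mg_per_ml * volume_ml

# Hypothetical reading: A260 = 0.5 measured on a 1:100 dilution of the stock.
stock_conc = dsdna_mg_per_ml(0.5, dilution_factor=100)
# The preparation was dissolved in 2 ml, so the total yield follows directly.
stock_yield = total_yield_mg(stock_conc, 2.0)
```

Under these assumed readings the stock works out to 2.5 mg/ml, or about 5 mg of DNA in the 2 ml preparation.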
Identification of Bgl II/Hind III Fragments Hybridizing to tc-gene Specific Probes
Approximately 10 µg of genomic DNA was digested to completion with about 30 units each of Bgl II and Hind III (NEB) for 180 min, frozen overnight, then heated at 65°C for five min, and electrophoresed in a 0.8% agarose gel (SeaKem LE, 1X TEA, [...] volts, 90 min). The DNA was stained with ethidium bromide ([...] µg/ml) as described earlier, and photographed under ultraviolet light. The DNA fragments in the agarose gel were subjected to depurination (5 min in 0.2 M HCl), denaturation (15 min in 0.5 M NaOH, 1.5 M NaCl), and neutralization (15 min in 0.5 M Tris-HCl pH 8.0, 1.5 M NaCl), with 3 rinses of distilled water between each step. The DNA was transferred by Southern blotting from the gel onto a NYTRAN nylon membrane (Amersham, Arlington Heights, IL) using a high salt (20X SSC) protocol, as described in section 2.9 of Ausubel et al. (CPMB, op. cit.). The transferred DNA was then UV-crosslinked to the nylon membrane using a Stratagene UV Stratalinker set on auto crosslink. The membranes were stored dry at 25°C until use.
Hybridization was performed using the ECL™ direct (Amersham, Arlington Heights, IL) labeling and detection system following protocols provided by the manufacturer. In brief, probes were prepared by covalently linking the denatured DNA to the enzyme horseradish peroxidase. Once labeled, the probe was used under hybridization conditions which maintain the enzymatic activity.
Unhybridized probe was removed by two gentle washes of 20 minutes each at 42°C in 0.5x SSC, 0.4% SDS, and 6 M urea. This was followed by two washes of 5 minutes each at room temperature in 2x SSC. As directed by the manufacturer, ECL™ reagents were used to detect the hybridizing DNA bands. There are several factors which influence the ability to detect gene relatedness between various Photorhabdus strains and strain W-14. First, high stringency conditions have not been employed in these hybridizations. It is known in the art that varying the stringency of hybridization and wash conditions will influence the pattern and intensity of hybridizing bands. Second, blot-to-blot variation among Southern blots will influence the mobility of hybridizing bands and molecular weight estimates. Therefore, W-14 was included as a standard on all Southern blots.
Gene-specific probes derived from the W-14 toxin genes were used in these hybridizations. The following lists the specific coordinates within each gene sequence to which each probe corresponds: a probe specific for tcaBi/Bii, 1174 to 3642 of Sequence ID #25; a probe specific for tcaC, 3637 to 6005 of Sequence ID #25; a probe specific for tcbA, 2097 to 4964 of Sequence ID #11; and a probe specific for tcdA, 1660 to 4191 of Sequence ID #46. The following tables summarize Southern blot analyses of Photorhabdus strains. In the event that hybridization of probes occurred, the hybridized fragment(s) were noted as either identical to or different from the pattern observed for the W-14 strain.
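The probe coordinates listed above translate directly into probe lengths. The sketch below assumes the coordinates are 1-based and inclusive, as is conventional for sequence listings; the dictionary layout and function names are illustrative only.

```python
# Probe coordinates as quoted in the text: (sequence identifier, start, end).
PROBES = {
    "tcaBi/Bii": ("Sequence ID #25", 1174, 3642),
    "tcaC": ("Sequence ID #25", 3637, 6005),
    "tcbA": ("Sequence ID #11", 2097, 4964),
    "tcdA": ("Sequence ID #46", 1660, 4191),
}

def probe_length(start: int, end: int) -> int:
    """Length in bp of a probe given 1-based inclusive coordinates (assumed convention)."""
    return end - start + 1

def extract_probe(sequence: str, start: int, end: int) -> str:
    """Slice a probe region out of a full-length sequence string (1-based inclusive)."""
    return sequence[start - 1:end]

def overlap_bp(a: tuple, b: tuple) -> int:
    """Number of bases shared by two probes on the same sequence (0 if none)."""
    return max(0, min(a[2], b[2]) - max(a[1], b[1]) + 1)

lengths = {name: probe_length(s, e) for name, (_, s, e) in PROBES.items()}
shared = overlap_bp(PROBES["tcaBi/Bii"], PROBES["tcaC"])
```

Under this convention the two probes drawn from Sequence ID #25 share a short overlap at positions 3637 to 3642.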
Table 37
Southern Analysis of Photorhabdus Strains
[Table body not recoverable from the source.]
I = Identical fragment pattern; D = Different fragment pattern.
Table 38
Southern Analysis of Photorhabdus Strains
Columns: Strains | tcdA | tcbA | tcaC | tcaBi/Bii
Strains legible in the source: K-122, Indicus, HF-85, MP3, MP1, A. Cows, HB-Arg, HMGD, HB Lewiston, HB Oswego, W-14
[Individual cell values are not recoverable from the source.]
ND = Not determined; [symbol not legible] = no detectable hybridization product; I = Identical fragment pattern; D = Different fragment pattern; [symbol not legible] = Hybridization fragment pattern not determined.
Table 39
Southern Analysis of Photorhabdus Strains
Columns: Strains | tcdA | tcbA | tcaC | tcaBi/Bii
Strains legible in the source: GL98, GL101, GL155, GL217, MP4, P. hepialus, P. zealandica, DEP3, W-14
[Individual cell values are not recoverable from the source.]
[symbol not legible] = no detectable hybridization product; I = Identical fragment pattern; D = Different fragment pattern; [symbol not legible] = Hybridization fragment pattern not determined.
From these analyses it is apparent that homologs of W-14 genes are dispersed throughout these diverse Photorhabdus strains, as evidenced by differences in gene fragment sizes between W-14 and the other strains.
Example 25
N-Terminal Amino Acid Sequences of Toxin Complex Peptides from Different Photorhabdus Strains
To examine the relationship of toxin peptides among strains, peptides isolated from different Photorhabdus strains, as described in Example 14, were subjected to N-terminal amino acid sequencing. The N-terminal amino acid sequences of toxin peptides in several strains were compared to W-14 toxin peptides. As shown in Table 40, the toxin peptides compared to date indicate that identical or homologous (at least [...]% similarity to W-14 gene/peptides) toxin peptides were present in all of the strains. For example, the N-terminal amino acid sequence of TcaC, SEQ ID NO: 2, was found to be identical to that of the 160 kDa peptide in HP88, and homologs were present in strains WIR, H9, Hb, WX-1, and Hm. Some W-14 peptides or homologs have not been observed in other strains; however, not all peptides have been sequenced for toxin complexes from other strains due to N-terminal blockage or low abundance. In addition, many other N-terminal amino acid sequences (SEQ ID NOS: 82 to 88) have been obtained for toxin complex peptides from other strains that have no similarity to peptides from W-14 and in some cases were identical to each other. For example, an identical amino acid sequence, SEQ ID NO: 82, was obtained for the 64 kDa peptide present in both HP88 and Hb strains, and a homologous sequence for a 70 kDa peptide in the NC-1 strain (SEQ ID NO: 83).
Table 40
Amino Terminal Sequence: A Comparison of Homology Between Proteins Isolated From Non-W-14 Strains
W-14 Peptide | W-14 Gene | W-14 SEQ ID NO | Entries for other strains, in source order (SEQ ID NO where present; strain; peptide size; listed as identical or homologous)
TcaAii | tcaA | |
TcaAiii | tcaA | 4 |
TcaBi | tcaB | 3 | 76, H9 74 kDa, 76, Hm 71 kDa
TcaBii | tcaB | 5 | H9 61 kDa, Hm 61 kDa
TcaC | tcaC | 2 | 72, Hb 160 kDa, HP88 160 kDa, 73, WIR 170 kDa, 74, H9 180 kDa, 75, Hm 170 kDa, WX-1 170 kDa
TcbAii | tcbA | 1 |
TcbAiii | tcbA | |
TccA | tccA | 8 | 77, Hb 61 kDa
TccB | tccB | 7 | WX-1 170 kDa, WX-2 180 kDa, WX-14 180 kDa, WIR 170 kDa, 78, H9 170 kDa, NC-1 140 kDa, 79, Hm 190 kDa
TcdAii | tcdA | |
TcdAiii | tcdA | 41 | Hb 57 kDa, 81, H9 69 kDa, 9, Hb 86 kDa, HP88 86 kDa
Homology refers to amino acid sequences that were at least [...]% similar to W-14 gene/peptides. Similar residues were identified as being a member of one of five groups; the residues legible in the source are A, G, S, N, E, B, D, K, I, V, and Y, W.
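The group-based similarity criterion in the footnote can be sketched as follows. The five residue groups below are an assumption: the source text preserves only fragments of them, and those fragments are consistent with a widely used five-group scheme (P/A/G/S/T, Q/N/E/B/D/Z, H/K/R, L/I/V/M, F/Y/W). The comparison sequences other than the TcaC N-terminus (SEQ ID NO:2, in one-letter code) are hypothetical, and the comparison is assumed to be position by position over equal-length sequences with no gaps.

```python
# Assumed residue groups; only fragments of the original groups survive in the source.
SIMILARITY_GROUPS = ["PAGST", "QNEBDZ", "HKR", "LIVM", "FYW"]
GROUP_OF = {res: idx for idx, group in enumerate(SIMILARITY_GROUPS) for res in group}

def percent_similarity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that are identical or fall in the same residue group."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        same_group = a in GROUP_OF and GROUP_OF.get(a) == GROUP_OF.get(b)
        if a == b or same_group:
            similar += 1
    return 100.0 * similar / len(seq_a)

TCAC_NTERM = "MQDSPEVSITTW"                      # SEQ ID NO:2 in one-letter code
conservative_variant = "MQESPDLSVTSW"            # hypothetical: only within-group changes
one_mismatch = "MKDSPEVSITTW"                    # hypothetical: Q -> K crosses groups
```

A sequence differing only by within-group substitutions scores 100% under this criterion, while a single cross-group change in a 12-residue N-terminus lowers the score to about 92%.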
Example 26
Immunological Analysis of Photorhabdus Strains
Culture broths of Photorhabdus strains were concentrated 10 to [...] times using a Centriprep-10 ultrafiltration device (Amicon, Inc., Beverly, MA 01915). The protein concentration ranged from 0.3 to 3.0 mg per ml. Ten to 20 µg of total protein was loaded in each well of a precast 4-20% polyacrylamide gel (Integrated Separation Systems, Natick, MA 01760). Gel electrophoresis was performed for 1.25 hours using a constant current set at 25 mA per gel. The gel was electro-blotted onto Hybond-ECL™ nitrocellulose membrane (Amersham Corporation, Arlington Hts, IL 60005) using a semi-dry electro-blotter (Pharmacia Biotech Inc., Piscataway, NJ 08854). A constant current was applied at 0.75 mA per cm for [...] hours. The membrane was blocked with 10% milk in TBST (25 mM Tris-HCl pH 7.4, 136 mM NaCl, 2.7 mM KCl, 0.1% Tween 20) for one hour at room temperature. Each primary antibody was diluted in milk/TBST to 1:500. Other dilutions between 1:50 and 1:1000 were also used. The membrane was incubated in primary antibody for at least one hour. Then it was washed thoroughly with the blocking solution or TBST. A 1:2000 dilution of secondary antibodies (goat anti-mouse IgG or goat anti-rabbit IgG conjugated to horseradish peroxidase; BioRad Laboratories, Hercules, CA 94547) in milk/TBST was applied to the membrane, which was placed on a platform rocker for one hour. The membrane was subsequently washed with an excess amount of TBST. The detection of the protein was performed using an ECL (Enhanced Chemiluminescence) detection kit (Amersham International).
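The loading and dilution figures above reduce to simple arithmetic, sketched here with hypothetical helper names. The only fact relied on is that mg/ml is numerically equal to µg/µl; the 10 ml antibody working volume in the example is an assumption, not a value from the source.

```python
def load_volume_ul(protein_ug: float, conc_mg_per_ml: float) -> float:
    """Volume of concentrated broth (µl) that delivers the requested protein mass.

    mg/ml is numerically equal to µg/µl, so the division gives µl directly.
    """
    return protein_ug / conc_mg_per_ml

def antibody_stock_ul(final_volume_ml: float, dilution: int) -> float:
    """Volume of antibody stock (µl) for a 1:dilution working solution."""
    return final_volume_ml * 1000.0 / dilution

# Across the quoted concentration range, a 10 µg load needs widely different volumes.
dilute_broth_load = load_volume_ul(10, 0.3)       # most dilute broth
concentrated_broth_load = load_volume_ul(10, 3.0) # most concentrated broth
primary_ab = antibody_stock_ul(10, 500)           # assumed 10 ml of 1:500 primary antibody
```

At the low end of the quoted concentration range a 10 µg load already requires roughly 33 µl, which may exceed the capacity of a precast gel well; this is one practical reason for concentrating broths before electrophoresis.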
A panel of peptide-specific antibodies generated against W-14 peptides was used to characterize the protein composition of broths from nine non-W-14 Photorhabdus strains using Western blot analysis. In addition, one monoclonal antibody (MAb-C5F2), which recognizes TcbAiii protein in W-14-derived toxin complex, was used.
The results (Table 41) showed cross recognition of the antibodies to some of the proteins in these broths. In some cases, the proteins that were recognized by the antibodies were the same size as the W-14 target peptides. In other cases, the proteins that were recognized by the antibodies were smaller than the W-14 target peptides. These data indicate that some of the non-W-14 Photorhabdus strains may produce proteins similar to those of the W-14 strain. The size difference could be due to deletion, protein processing, or degradation. Some of the strains did not contain protein(s) that could be recognized by some antibodies; however, it is possible that the concentration is significantly lower than that observed for W-14 peptides. When compared for various toxin peptide homologs, these results showed peptide diversity among the Photorhabdus strains.
Table 41
Cross Recognition by Monoclonal Antibodies or Polyclonal Antibodies Generated Against W-14 Peptides to Protein(s) in Broths of Selected Non-W-14 Photorhabdus Strains
Columns: Photorhabdus Strain | MAb-C5F2 | PAb TcdAii-syn | PAb TcdAiii-syn | PAb TcaC-syn | PAb TcaBii-syn | PAb TcbAiii-syn | PAb TcaBi-syn | PAb TcaAii-syn | PAb TcaAiii-syn
Strains legible in the source: MP1, MP2, MP3, A. Cows, HB-Oswego, HB-Arg, HB-Lewiston, Indicus, HF-85, W-14
[Individual cell values are not recoverable from the source.]
+ = Positive reaction; - = Negative reaction; NT = Not tested
Additional non-W-14 Photorhabdus strains were characterized by Western blot analysis using the culture broth and/or partially purified protein fractions as antigen. The panel of antibodies included MAb-C5F2, MAb-DE1 (recognizing TcdAii), PAb-DE2 (recognizing TcaB), PAb-TcbAii-syn, PAb-TcaC-syn, PAb-TcaBii-syn, PAb-TcbAiii-syn, and PAb-TcaBi-syn. These antibodies showed cross-reactivity with proteins in the broth and in the partially purified fractions of non-W-14 strains.
The data indicate that antibodies could be used to identify proteins in the broth as well as in the partially purified protein fractions.
Table 42
Cross Recognition by Monoclonal Antibodies or Polyclonal Antibodies Generated Against W-14 Peptides to Protein(s) in Broths and/or Partially Purified Fractions of Selected Non-W-14 Photorhabdus Strains
Columns: Photorhabdus Strain | Monoclonal Antibodies: MAb-C5F2, MAb-DE1 | Polyclonal Antibodies: PAb-DE2, PAb TcbAii-syn, PAb TcaC-syn, PAb TcaBii-syn, PAb TcbAiii-syn, PAb TcaBi-syn
Strains legible in the source: a series of WX strains (designations only partly legible, including WX-2 and WX-9), Hb, Hm, and one further strain whose designation is not legible
[Individual cell values are not recoverable from the source.]
- = Negative reaction; + = Positive reaction; NT = Not tested
Example 27
[...] of the tcdA [...]
Engineering of the tcdA Gene for Bacterial Expression
The 5' and 3' ends of the tcdA coding region (SEQ ID NO:46) were modified to add useful cloning sites for inserting the segment into heterologous expression vectors. The ends were modified using unique primers in Polymerase Chain Reactions (PCR), performed essentially as described in Example 8. Primer sets, as described below, were used in conjunction with cosmid 21D2.4 as template to create products with the appropriately modified ends.
The first primer set was used to modify the 5' end of the gene. The forward primer AOF1 (GAT CGA TCG ATC CAT GGC CAA CGA GTC TGT AAA AGA GAT ACC TGA TG TAT TAA AAA GCC AGT GTG) inserts a unique Nco I site at the initiator codon, and the reverse primer AOR1 (GAT CGA TCG TAC GCG GCC GCT CGA TCG ATC GTC GAC CCA TTG ATT TGA GAT CTG GGC GGC GGG TAT CCA GAT AAT AAA CGG AGT CAC) adds unique Bgl II, Sal I and Not I sites to facilitate insertion of the remainder of the gene. Another PCR reaction was designed to modify the 3' end of the gene by adding an additional stop codon and convenient restriction sites for cloning. The forward primer AOF2 (ACT GGC TGC GTG GTC GAC TGG CGG CGA TTT ACT) was used to amplify across a unique Sal I site in the gene, later used to clone the modified 3' end. The reverse primer AOR2 (CGA TGC ATG CTG CGG CCG CAG GCC TTC CTC GAG TCA TTA TTT AAT GGT GTA GCG AAT ATG CAA AAT) was used to insert a second stop codon (TGA) and cloning sites Xho I, Stu I and Not I.
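The restriction sites that each primer is said to introduce can be checked by a plain substring search, sketched below. The recognition sequences are the standard ones for these enzymes; all six are palindromic, so searching the primer as written is sufficient. The primer strings are copied from the text; the dictionary and function names are illustrative.

```python
# Standard recognition sequences for the enzymes named in the text (all palindromic).
SITES = {
    "Nco I": "CCATGG",
    "Bgl II": "AGATCT",
    "Sal I": "GTCGAC",
    "Not I": "GCGGCCGC",
    "Xho I": "CTCGAG",
    "Stu I": "AGGCCT",
}

# Primer sequences as printed in the text (spaces are removed before searching).
PRIMERS = {
    "AOF1": "GAT CGA TCG ATC CAT GGC CAA CGA GTC TGT AAA AGA GAT ACC TGA TG TAT TAA AAA GCC AGT GTG",
    "AOR1": "GAT CGA TCG TAC GCG GCC GCT CGA TCG ATC GTC GAC CCA TTG ATT TGA GAT CTG GGC GGC GGG TAT CCA GAT AAT AAA CGG AGT CAC",
    "AOF2": "ACT GGC TGC GTG GTC GAC TGG CGG CGA TTT ACT",
    "AOR2": "CGA TGC ATG CTG CGG CCG CAG GCC TTC CTC GAG TCA TTA TTT AAT GGT GTA GCG AAT ATG CAA AAT",
}

def find_sites(primer: str) -> dict[str, int]:
    """Return {enzyme: 0-based position} for every recognition site present in the primer."""
    seq = primer.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}

sites_by_primer = {name: set(find_sites(seq)) for name, seq in PRIMERS.items()}
```

In AOR2 the stop codons appear as TCA and TTA because the primer is written on the reverse strand; these are the reverse complements of TGA and TAA.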
Bacterial expression vector pET27b (Novagen, Madison, WI), was modified to delete the Bgl II site at position 446, according to standard molecular biology techniques.
The 497-bp PCR product from the first amplification reaction (AOF1+AOR1), to modify the 5' end of the gene, was ligated to the modified pET27b vector according to the supplier's instructions.
The DNA sequences of the amplified portion of three isolates were determined using the supplier's recommended primers and the sequencing methods described previously. The sequence of all isolates was the same.
One isolate was then used as a cloning vector to insert the middle portion of the tcdA gene on a 6341 bp Bgl II to Sal I fragment. The resulting clone was called MC4 and contained all but the 3'-most portion of the tcdA coding sequence. Finally, to complete the full-length coding region, the 832 bp PCR product from the second PCR amplification (AOF2+AOR2), which modifies the 3' end of the gene, was ligated to isolate MC4 on a Sal I to Not I fragment, according to standard molecular biology techniques. The tcdA coding region was sequenced and found to be complete; the resulting plasmid is called pDAB2035.
Construction of Plasmids pDAB2036, pDAB2037 and pDAB2038 for Bacterial Expression of tcdA
The tcdA coding region was cut from plasmid pDAB2035 with restriction enzymes Nco I and Xho I and gel purified. The fragment was ligated into the Nco I and Xho I sites of the expression vector pET15 to create plasmid pDAB2036. Additionally, pDAB2035 was cut with Nco I and Not I to release the tcdA coding region, which was ligated into the Nco I and Not I sites of the expression vector pET28b to create plasmid pDAB2037. Finally, plasmid pDAB2035 was cut with Nco I and Stu I to release the tcdA coding region. This fragment was ligated into the expression vector Trc99a, which had been cut with Hind III and treated with T4 DNA polymerase to blunt the ends. The vector was then cut with Nco I and ligated with the Nco I/Stu I cut tcdA fragment. The resulting plasmid is called pDAB2038.
Expression of tcdA from Plasmid pDAB2038
Plasmid pDAB2038 was transformed into BL21 cells and expressed as described above for plasmid pDAB2033 in Example 19.
Purification of tcdA from E. coli
The expression culture was centrifuged at 10,300 x g for 30 min and the supernatant was collected. It was diluted with two volumes of H2O and applied at a flow rate of [...] ml/min to a Poros 50 HQ (PerSeptive Biosystems, MA) column (1.6 cm x 10 cm) which was pre-equilibrated with 10 mM sodium phosphate buffer, pH 7.0 (Buffer A). The column was washed with Buffer A until the optical density at 280 nm returned to baseline level. The proteins bound to the column were then eluted with 1 M NaCl in Buffer A.
The fraction was loaded in 20 ml aliquots onto a gel filtration column, Sepharose CL-4B (2.6 x 100 cm), which was equilibrated with Buffer A. The protein was eluted in Buffer A at a flow rate of 0.75 ml/min. Fractions with a retention time between 260 minutes and 460 minutes were pooled and applied at 1 ml/min to a Mono Q 5/5 column which was equilibrated with 20 mM Tris-HCl, pH 7.0 (Buffer B). The column was washed with Buffer B until the optical density at 280 nm returned to baseline level. The proteins bound to the column were eluted with a linear gradient of 0 to 1 M NaCl in Buffer B at 1 ml/min for 20 min. One-milliliter fractions were collected, serially diluted, and subjected to SCR bioassay.
Fractions eluted between 0.1 and 0.3 M NaCl were found to have the highest insecticidal activity. Western analysis of the active fractions using pAb TcdAii-syn antibody and pAb TcdAiii-syn antibody indicated the presence of peptides TcdAii and TcdAiii.
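For orientation, the Mono Q gradient described above (0 to 1 M NaCl over 20 min at 1 ml/min, 1 ml fractions) can be mapped onto fraction numbers. This is a simplified sketch that assumes an ideal linear gradient with no system dead volume or gradient delay, numbers fractions from the start of the gradient, and assigns each fraction the salt concentration at its midpoint; none of these details is stated in the source.

```python
def nacl_molar_at(minute: float, start_m: float = 0.0, end_m: float = 1.0,
                  gradient_min: float = 20.0) -> float:
    """NaCl concentration (M) of an ideal linear gradient at a given time."""
    fraction_done = min(max(minute / gradient_min, 0.0), 1.0)
    return start_m + (end_m - start_m) * fraction_done

def fractions_in_window(low_m: float, high_m: float, flow_ml_min: float = 1.0,
                        fraction_ml: float = 1.0, gradient_min: float = 20.0) -> list[int]:
    """Fraction numbers whose midpoint NaCl concentration lies within [low_m, high_m]."""
    n_fractions = int(gradient_min * flow_ml_min / fraction_ml)
    hits = []
    for number in range(1, n_fractions + 1):
        midpoint_min = (number - 0.5) * fraction_ml / flow_ml_min
        if low_m <= nacl_molar_at(midpoint_min, gradient_min=gradient_min) <= high_m:
            hits.append(number)
    return hits

# Fractions expected to span the 0.1-0.3 M NaCl window under these assumptions.
active_window = fractions_in_window(0.1, 0.3)
```

On a real chromatography system the observed fraction numbers would be shifted later by the delay volume between the mixer and the fraction collector.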
Micro-Organism Deposit Details Table 43 illustrates those strains deposited at the American Type Culture Collection, 12301 Park Lawn Drive, Rockville, Maryland 20852, United States of America, together with the date of deposit and the relevant accession number.
Table 43
Microorganism | Accession No. | Deposit Date
Photorhabdus luminescens B2 | 55870 | 5 November 1996
P. luminescens WIR | 55871 | 5 November 1996
The remaining rows of Table 43 are column-spilled in the source. Microorganisms, in source order: P. luminescens Wr0, P. luminescens HP88, P. luminescens 4394, P. luminescens 4394, P. luminescens 4395, P. luminescens 43958, P. luminescens 43952, Xenorhabdus luminescens W-14. Accession numbers, in source order: 55872, 5873, 55874, 55875, 55880, 55881, 5588[...], 55397. Deposit dates: 5 November 1996 for each entry except the last, which is 5 March 1993.

Table 44 illustrates strains deposited in the Agricultural Research Culture Collection (NRRL) International Depository Authority, 1815 North University Street, Peoria, Illinois 61604, United States of America, together with the date of deposit and the relevant accession number.

Table 44
Microorganism | Accession No. | Deposit Date
The first rows of Table 44 are column-spilled in the source. Microorganisms, in source order: Photorhabdus sp. [designation not legible], WXSI, WX6, WX7, WX8, WX9, WX61, WX71, WX81. Accession numbers, in source order: NRRL B-21710, B-21711, B-21712, B-21713, B-21714, B-21715, B-21720, B-21721. Deposit date: 29 April 1997 for each.
Photorhabdus sp. WX14 | NRRL B-21722 | 29 April 1997
Photorhabdus sp. WX15 | NRRL B-21723 | 29 April 1997
Photorhabdus sp. H9 | NRRL B-21727 | 29 April 1997
Photorhabdus sp. Hb | NRRL B-21726 | 29 April 1997
Photorhabdus sp. Hm | NRRL B-21725 | 29 April 1997
Photorhabdus sp. HP88 | NRRL B-21724 | 29 April 1997
Photorhabdus sp. NC-1 | NRRL B-21728 | 29 April 1997
Photorhabdus sp. W30 | NRRL B-21729 | 29 April 1997
Photorhabdus sp. WIR | NRRL B-21730 | 29 April 1997
Photorhabdus sp. B2 | NRRL B-21731 | 29 April 1997
Photorhabdus sp. DEP1 | NRRL B-21707 | 29 April 1997
Photorhabdus sp. DEP2 | NRRL B-21708 | 29 April 1997
Photorhabdus sp. DEP3 | NRRL B-21709 | 29 April 1997
Photorhabdus sp. P. zealandica | NRRL B-21683 | 29 April 1997
Photorhabdus sp. P. hepialus | NRRL B-21684 | 29 April 1997
Photorhabdus sp. HB-Arg | NRRL B-21685 | 29 April 1997
Photorhabdus sp. HB Oswego | NRRL B-21686 | 29 April 1997
Photorhabdus sp. HB Lewiston | NRRL B-21687 | 29 April 1997
Photorhabdus sp. K-122 | NRRL B-21688 | 29 April 1997
Photorhabdus sp. HMGD | NRRL B-21689 | 29 April 1997
Photorhabdus sp. Indicus | NRRL B-21690 | 29 April 1997
Photorhabdus sp. GD | NRRL B-21691 | 29 April 1997
Photorhabdus sp. PWH-5 | NRRL B-21692 | 29 April 1997
Photorhabdus sp. Megidis | NRRL B-21693 | 29 April 1997
Photorhabdus sp. HF-85 | NRRL B-21694 | 29 April 1997
Photorhabdus sp. A. Cows | NRRL B-21695 | 29 April 1997
Photorhabdus sp. MP1 | NRRL B-21696 | 29 April 1997
Photorhabdus sp. MP2 | NRRL B-21697 | 29 April 1997
Photorhabdus sp. MP3 | NRRL B-21698 | 29 April 1997
Photorhabdus sp. MP4 | NRRL B-21699 | 29 April 1997
Photorhabdus sp. MP5 | NRRL B-21700 | 29 April 1997
Photorhabdus sp. GL98 | NRRL B-21701 | 29 April 1997
Photorhabdus sp. [designation not legible] | NRRL B-21703 | 29 April 1997
Photorhabdus sp. GL155 | NRRL B-21704 | 29 April 1997
Photorhabdus sp. GL217 | NRRL B-21705 | 29 April 1997
Photorhabdus sp. GL257 | NRRL B-217[...]6 | 29 April 1997
[The claims pages appear after the sequence listing.]

SEQUENCE LISTING

GENERAL INFORMATION:
(i) APPLICANT: Ensign, Jerald C; Bowen, David J; Petell, James; Fatig, Raymond; Schoonover, Sue; ffrench-Constant, Richard; Orr, Gregory L; Merlo, Donald J; Roberts, Jean L; Rocheleau, Thomas A
(ii) TITLE OF INVENTION: Insecticidal Protein Toxins from Photorhabdus
(iii) NUMBER OF SEQUENCES: 88
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: DowElanco
STREET: 9330 Zionsville Road
CITY: Indianapolis
STATE: IN
COUNTRY: US
ZIP: 46268
(v) COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30
(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER:
FILING DATE:
CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 08/063,615; FILING DATE: 18-MAY-1993
APPLICATION NUMBER: US 08/395,497; FILING DATE: 28-FEB-1995
APPLICATION NUMBER: US 60/007,255; FILING DATE: 06-NOV-1995
APPLICATION NUMBER: US 08/608,423; FILING DATE: 28-FEB-1996
APPLICATION NUMBER: US 08/705,484; FILING DATE: 28-AUG-1996
APPLICATION NUMBER: US 08/743,699; FILING DATE: 06-NOV-1996
(viii) ATTORNEY/AGENT INFORMATION:
NAME: Borucki, Andrea T.
REGISTRATION NUMBER: 33651
REFERENCE/DOCKET NUMBER: 50301E
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: 317-337-4846
TELEFAX: 317-337-4847

INFORMATION FOR SEQ ID NO:1:
SEQUENCE CHARACTERISTICS:
LENGTH: 11 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1 (TcbAii N-terminus):
Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn
1 5

INFORMATION FOR SEQ ID NO:2:
SEQUENCE CHARACTERISTICS:
LENGTH: 12 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2 (TcaC N-terminus):
Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Trp
1 5

INFORMATION FOR SEQ ID NO:3:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3 (TcaBi N-terminus):
Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp Ala
1 5 10
Leu Val Ala

INFORMATION FOR SEQ ID NO:4:
SEQUENCE CHARACTERISTICS:
LENGTH: 14 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4 (TcaAiii N-terminus):
Ala Ser Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn
1 5

INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 9 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5 (TcaBii N-terminus):
Ala Gly Asp Thr Ala Asn Ile Gly Asp
1

INFORMATION FOR SEQ ID NO:6:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Leu Gly Gly Ala Ala Thr Leu Leu Asp Leu Leu Leu Pro Gln Ile
1 5 10

INFORMATION FOR SEQ ID NO:7:
SEQUENCE CHARACTERISTICS:
LENGTH: 11 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7 (TccB N-terminus):
Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu
1 5

INFORMATION FOR SEQ ID NO:8:
SEQUENCE CHARACTERISTICS:
LENGTH: 9 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8 (TccA N-terminus):
Met Asn Leu Ala Ser Pro Leu Ile Ser
1

INFORMATION FOR SEQ ID NO:9:
SEQUENCE CHARACTERISTICS:
LENGTH: 16 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
Met Ile Asn Leu Asp Ile Asn Glu Gln Asn Lys Ile Met Val Val Ser
1 5 10

INFORMATION FOR SEQ ID NO:10:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Ala Ala Lys Asp Val Lys Phe Gly Ser Asp Ala Arg Val Lys Met Leu
1 5 10
Arg Gly Val Asn

INFORMATION FOR SEQ ID NO:11:
SEQUENCE CHARACTERISTICS:
LENGTH: 7515 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..7515
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11 (tcbA gene):
ATG CAA AAC Met Gln Asn TCA TTA TCA AGC ACT ATC CAT ACT ATT TCT CAG AAA 1
CAA
Gin
CGG
Arg
TAT
Tyr
CGT
Arg
GGT
Gly 25
GGT
Gly 30
TTC
Phe 35 TTG Leu
TTA
40 Leu 145
ACG
Thr
ACA
Thr
TTA
Leu
ATC
Ile
ATT
Ile 225
CAT
His
AAA
Lys
ACT
Thr
AAA
Lys
ATT
Ile
TTT
Phe
CGG
Arg
CGT
Arg
CCG
Pro 115
GAC
Asp
AGC
Ser
GCT
Ala
AAA
Lys
GGA
Gly 195
CAT
His
GCT
Ala
TCG
Ser
GAA
Glu Ser Leu
S
TGT CCG Cys Pro ACT CGG Thr Arg GCA CAA Ala Gin GCC TAT Ala Tyr CAA ATG Gin Met 85 GCT CAT Ala Asp 100 GCG GCT Ala Ala AGC AGC Ser Ser TTA ATG Leu Met CTC TCT Leu Ser 165 TCA CAA Ser Gin 180 GAG ACA Glu Thr GAA CGT Glu Arg CCT AAG Ala Lys CCA GAA Pro Glu 245 CCC GCG Ala Ala 260 Ser Ser Thr Ile GCG GAA ATT CCT Ala Glu Ile Ala 25 GGA ATG CTT AAT Gly Met Val Asn 40 GCG GAA CAG GAT Ala Glu Gin Asp 55 CCT AAT CCC CTC Ala Asn Pro Leu 70 TTG GGT TTT ATA Leu Cly Phe Ile AAC TAT CCC CCC Asn Tyr Ala Ala 105 TAT TTC ACC GAA Tyr Leu Thr-Glu 120 TCA ATT TAT TAC Ser Ile Tyr Tyr 135 CTC ACC CAC AAA Leu Ser Gin Lys 150 AAT CAA TTC TGC Asn Glu Leu Cys CAT CAA CTG ATC Asp Glu Val Met 185 CCT TAT CAT CAC Pro Tyr His His 200 CAT CCA GGA TTT Asp Pro Cly Phe 215 CTC CAT CCT GTG Leu Asp Pro Vai 230 CTG TAT AAC TTC Leu Tyr Asn Leu CTT CAT ACC CTT Leu Asp Thr Leu 265 Asp 10
TTC
Leu
TGG
Trp
AGA
Arg
CTC
Leu
CAA
Gin 90
CCC
Pro
TTC
Leu
CTA
Leu
AAT
Asn
CTT
Leu 170
CAT
Asp
CCT
Ala
CGT
Arg
ACT
Thr
CTC
Leu 250
TAT
Tyr Thr Ile Cys TAT CCC TTT Tyr Pro Phe GGG GAA CCA Cly Glu Ala AAC CTA CTT Asn Leu Leu 60 AAA AAC OCT Lys Asn Ala 75 GGT TAT AGT Cly Tyr Ser GGC TCG GTT Gly Ser Val TAC CCT GAA Tyr Arg Glu 125 CAT AAA CGT Asp Lys Arg 140 ATG GAT GAG Met Asp Clu 155 CCC GGG ATC Ala Cly Ile ATC TTG TCA Met Leu Ser TAT CAA ACT Tyr Glu Thr 205 CAT TTC TCA His Leu Ser 220 TTC TTG GGT Leu Leu Cly 235 ATT GAG GAG Ile Giu Glu AAA ACA AAC Lys Thr Asn Lys
ACT
Thr
CGG
Arg
CAA
Glu
CGG
Arg
CTG
Leu
TCG
Ser
AAA
Lys
CCC
Pro
ATT
Ile
ACA
Thr 175
TAT
Tyr
CGT
Arg
GCA
Ala
AGC
Ser
CCC
Pro 255
GGC
Gly
CTC
Leu
TTC
Phe
ATT
Ile
AAA
Lys
TTC
Leu
TTT
Phe
ATG
Met
AAC
Asn
GAT
Asp
TCA
Ser 160
AAA
Lys
CGT
Arg
CAA
Glu
CCC
Pro
TCC
Ser 240
GAA
Clu
CAT
Asp 48 96 144 192 240 288 336 384 432 480 528 576 624 672 720 768 816 864 ATT ACT ACT CCT CAC TTA ATG TCC CCA ACT TAT CTG GCC CGG TAT TAT -154- Ile Thr Thr Ala Gin Leu Met Ser 275 280 Pro Ser.Tyr Leu Ala Arg Tyr Tyr 285 o oo***
GGC
Gly
GTT
Val 305
GGT
Gly
ACC
Thr
TAT
Tyr
TAT
25 Tyr
AAT
Asn 30 385
ACA
Thr 35
AGA
Arg 40 GAC Asp
CGG
Arg
GTT
Val 465
AAG
Lys
GAG
Glu
GGC
Gly
AAT
Asn
AAT
Asn 545
GAT
Asp
GAT
Asp 310
GTT
Val
TAT
Tyr
AAT
Asn
GAT
Asp
ATG
Met 390
GAC
Asp
AGT
Ser
AAA
Lys
ACC
Thr
AGC
Ser 470
AAA
Lys
TTG
Leu
CAG
Gin
GAA
Glu
CTT
Leu 550
TAC
Tyr
GTT
Val
ACC
Thr
CTG
Leu 345
AAT
Asn
GCT
Ala
AAT
Asn
AAT
Asn
TTT
Phe 425
CTG
Leu
TCT
Ser
TCC
Ser
ATT
Ile
ATT
Ile 505
CAA
Gin
GAG
Glu
GAC
Asp TTA TCA Leu Ser GAT GGT Asp Gly GAT AAT Asp Asn 335 GGC GAC Gly Asp 350 GAT GAT Asp Asp ATT GCC Ile Ala TCA CAG Ser Gin GGG TTA Gly Leu 415 TTT AAA Phe Lys 430 AAG GCT Lys Ala GAG CGT Glu Arg GTA TTA Val Leu ATC AGT Ile Ser 495 CAA GCT Gin Ala 510 CCG CCG Pro Pro CAT CTT His Leu GAT CAA Asp Gin CAT 912 His GTG 960 Val 320 TAT 1008 Tyr AAT 1056 Asn TTT 1104 Phe CAT 1152 His GCG 1200 Ala 400 CAA 1248 Gin ATT 1296 Ile ATT 1344 Ile ATT 1392 Ile AAC 1440 Asn 480 GAA 1488 Glu GTT 1536 Val CTC 1584 Leu CCT 1632 Pro CGC 1680 Arg 560 -155- AAG GCG GTT TTA AAA CGC GCG TTT Lys Ala Val Leu Lys Arg Ala Phe 565 CAG GTT AAC Gin ValiAsn 570 GCC AGT GAG Ala Ser Giu TTG TAT 1728 Leu Tyr 575 CAG ATG TTA Gin Met Leu AAC TTA GAG Asn Leu Giu 595 ATT CAT AAC Ile His Asn is 610 ATC ACT GAT CGT Ile Thr Asp Arg GAA GAC GOT GTT Giu Asp Gly Val ATC AAA AAT 1776 Ile Lys Asn 590 CTG GCC CAG 1824 Leu Ala Gin AAT TTG TCT GAT Asn Leu Ser Asp TAT TTG GTT AGT Tyr Leu Vai Ser CTG ACT ATT Leu Thr Ile GAA TTG AAC-ATT Giu Leu Asn Ile TTG GTG ATT TOT 1872 Leu Val Ile Cys GGC TAT GGC GAC ACC AAC ATT TAT CAG ATT ACC GAC GAT AAT TTA GCC 1920 Gly Tyr Gly Asp Thr Asn Ile Tyr Gin Ile Thr Asp Asp Asn Leu Ala 625 630 635 640 AAA ATA GTG GAA Lys Ile Val Giu TTG TTG TGG ATC Leu Leu Trp Ile CAA TGG TTG Gin Trp Leu AAA TGG ACA Lys Trp Thr ACC GAC CTG TTT Thr Asp Leu Phe ATG ACC ACG GCC Met Th-r Thr Ala AAG ACC CAA 1968 Lys Thr Gin 655 ACT TAC AGC 2016 Thr Tyr Ser 670 TTG TCT TCA 2064 Leu Ser Ser ACC ACT TTA ACG CCA GAA ATT AGC Thr Thr Leu Thr Pro Giu Ile Ser 675 680 AAT CTG ACG GCT Asn Leu Thr Ala ACT TTG Thr Leu 690 CAT GGC AAA GAG His Gly Lys Giu CTG ATT GGG GAA Leu Ile Giy Giu CTG AAA AGA GCA 2112 Leu Lys Arg Ala GCG CCT TGC TTC Ala Pro Cys Phe TCG GCT TTG CAT Ser Ala Leu His ACT TCT CAA GAA Thr Ser Gin Giu GTT 2160 Val 720 GCG TAT GAC CTG Ala Tyr Asp Leu TTO TGG ATA GAC Leu Trp Ile Asp ATT CAA CCO Ile Gin Pro ACT GTT OAT Thr Val Asp AAG OTG ATT Lys Val Ile 755 CGT COT ATT Arg Arg Ile 770 TT'r TOG GAA OAA Phe 
Trp Giu Olu CAA ACA ACA CCA Gin Thr Thr Pro GCA CAA ATA 2208 Ala Gin Ile 735 ACC AGC TTG 2256 Thr Ser Leu 750 CTG ATC TAT 2304 Leu Ile Tyr ACC TTT GCT CAG Thr Phe Ala Gin CTO GCA CAA TTG Leu Ala Gin Leu 000 TTA AGT Gly Leu Ser ACG GAA CTO TCA Thr Giu Leu Ser ATC OTO ACT CAA 2352 Ile Val Thr Gin TCT CTG CTA GTG Ser Leu- Leu Val OCA GGC AAA AOC ATA CTG GAT CAC GGT CTG TTA 2400 Ala Gly Lys Ser Ile Leu Asp His Oly Leu Leu 790 795 800 ACC CTG ATG 0CC Thr Leu Met Ala GAA GOT TTT CAT ACC TGG GTT AAT GGC Giu Gly Phe His Thr Trp Vai Asn Oiy 810 TTG GGG 2448 Leu Gly 815 CAA CAT GCC TCC Gin His Ala Ser 820 GTT ACC OAT GTA Vai -hr Asp Val 835 TTG ATA TTG GCG Leu Ile Leu Ala TTG AAA GAC GGA Leu Lys Asp Gly GCC TTG ACA 2496 Ala Leu Thr 830 CTC CTA CAA 2544 Leu Leu Gin OCA CAA GCT Ala Gin Ala AAT AAG, GAG OAA Asn Lys Giu Giu 6- ATG GCA GCT AAT CAG GTG GAG AAG GAT CTA ACA AAA CTG ACC AGT TGG 2592 Met Ala Ala Asn Gin Val Glu Lys Asp Leu Thr Lys ITeu Thr Ser Trp 850 855 860 ACA CAG ATT GAC GCT ATT CTG CAA TGG TTA CAG ATG TCT TCG GCC TTG 2640 Thr Gin Ile Asp Ala Ile Leu Gin Trp Leu Gin Met Ser Ser Ala Leu 865 870 875 880 GCG GTT TCT CCA CTG GAT CTG GCA GGG ATG ATG GCC CTG AAA TAT GGG 2688 Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly 885 890 895 ATA GAT CAT AAC TAT GCT GCC TGG CAA GCT GCG GCG GCT GCG CTG ATG 2736 Ile Asp His Asn Tyr Ala Ala Trp Gin Ala Ala Ala Ala Ala Leu Met 900 905 910 GCT GAT CAT GCT AAT CAG GCA CAG AAA AAA CTG GAT GAG ACG TTC AGT 2784 Ala Asp His Ala Asn Gin Ala Gin Lys Lys Leu Asp Glu Thr Phe Ser 915 920 925 AAG GCA TTA TGT AAC TAT TAT ATT AAT GCT GTT GTC GAT AGT GCT GCT 2832 Lys Ala Leu Cys Asn Tyr Tyr Ile Asn Ala Val Val Asp Ser Ala Ala 930 935 940 GGA GTA CGT GAT CGT AAC GGT TTA TAT ACC TAT TTG CTG ATT GAT AAT 2880 Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu Ile Asp Asn 945 950 955 960 30 CAG GTT TCT GCC GAT GTG ATC ACT TCA CGT ATT GCA GAA GCT ATC GCC 2928 Gin Val Ser Ala Asp Val Ile Thr Ser Arg Ile Ala Glu Ala Ile Ala 965 970 975 
GGT ATT CAA CTG TAC GTT AAC CGG GCT TTA AAC CGA GAT GAA GGT CAG 2976 Gly Ile Gin Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gin 980 985 990 CTT GCA TCG GAC GTT AGT ACC CGT CAG TTC TTC ACT GAC TGG GAA CGT 3024 Leu Ala Ser Asp Val Ser Thr Arg Gin Phe Phe Thr Asp Trp Glu Arg 995 1000 1005 *TAC AAT AAA CGT TAC AGT ACT TGG GCT GGT GTC TCT GAA CTG OTC TAT 3072 Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr 1010 1015 1020 45 TAT CCA GAA AAC TAT GTT GAT CCC ACT CAG CGC ATT GGG CAA ACC AAA 3120 Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gin Arg Ile Gly Gin Thr Lys 1025 1030 1035 1040 ATG ATG GAT GCG CTG TTG CAA TCC ATC AAC CAG AGC CAG CTA AAT GCG 3168 Met Met Asp Ala Leu Leu Gin Ser Ile Asn Gin Ser Gin Leu Asn Ala 1045 1050 1055 GAT ACG GTG GAA GAT GCT TTC AAA ACT TAT TTG ACC AGC TTT GAG CAG 3216 Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gin 1060 1065 1070 GTA GCA AAT CTG AAA GTA ATT AGT GCT TAC CAC GAT AAT GTG AAT GTG 3264 Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Val Asn Val 1075 1080 1085 GAT CAA GGA TTA ACT TAT TTT ATC GGT ATC GAC CAA GCA GCT CCG GGT 3312 Asp Gin Gly Leu Thr Tyr Phe Ile Gly Ile Asp Gin Ala Ala Pro Gly 1090 1095 1100 ACG TAT TAC TGG CGT AGT GTT GAT CAC AGC AAA TGT GAA AAT GGC AAG 3360 Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys 1105 1110 1115 1120 TTT GCC GCT AAT GCT TGG GGT GAG TGG AAT AAA ATT ACC TGT GCT GTC 3408 Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val -157- 1125 AAT CCT TGG AAA AAT ATC Asn Pro Trp Lys Asn Ile 1140 1130 1135 ATC CGT CCG GTT GTT TAT ATG TCC CGC TTA 3456 Ile Arg Pro Val Val Tyr Met Ser Arg Leu 1145 1150 TAT CTG CTA TGG Tyr Leu Leu Trp 1155 CTG GAG CAG Leu Glu Gin CAA TCA Gin Ser 1160 AAG AAA AGT Lys Lys Ser GAT GAT GGT AAA 3504 Asp Asp Gly Lys 1165 ACC ACG ATT TAT CAA TAT Thr Thr Ile Tyr Gin Tyr 1170 AAC TTA Asn Leu 1175 AAA CTG GCT Lys Leu Ala CAT ATT His Ile 1180 CGT TAC GAC 3552 Arg Tyr Asp GGT AGT Gly Ser 1185 TGG AAT ACA Trp Asn Thr CCA TTT Pro 
Phe 1190 ACT TTT GAT Thr Phe Asp GTG ACA Val Thr 1195 GAA AAG GTA Glu Lys Val AAA 3600 Lys 1200 AAT TAC ACG TCG Asn Tyr Thr Ser AGT ACT Ser Thr 1205 GAT GCT GCT GAA TCT TTA GGG Asp Ala Ala Glu Ser Leu Gly 1210 S. c S. S
S
S.
S..
S
TTG TAT TGT 3648 Leu Tyr Cys 1215 TAT TCG ATG 3696 Tyr Ser Met 1230 ACT GGT TAT Thr Gly Tyr CAA GGG Gin Gly 1220 GAA GAC ACT Glu Asp Thr CTA TTA Leu Leu 1225 GTT ATG TTC Val Met Phe CAG AGT AGT TAT Gin Ser Ser Tyr 1235 AGC TCC TAT ACC GAT AAT AAT GCG Ser Ser Tyr Thr Asp Asn Asn Ala 1240 CCG GTC ACT GGG 3744 Pro Val Thr Gly 1245 CTA TAT ATT Leu Tyr Ile 1250 TTC GCT GAT Phe Ala Asp ATG TCA Met Ser 1255 TCA GAC AAT ATG ACG AAT GCA CAA 3792 Ser Asp Asn Met Thr Asn Ala Gin 1260 .55555
S.
.5 35 GCA ACT AAC TAT TGG Ala Thr Asn Tyr Trp 1265 AAT AAC Asn Asn 1270 AGT TAT CCG Ser Tyr Pro CAA TTT Gin Phe 1275 GAT ACT GTG Asp Thr Val ATG 3840 Met 1280 GCA GAT CCG GAT Ala Asp Pro Asp AGC GAC AAT AAA AAA GTC ATA ACC AGA AGA GTT AAT 3888 Ser Asp Asn Lys Lys Val Ile Thr Arg Arg Val Asn 1285 1290 1295 AAC CGT TAT Asn Arg Tyr GCG GAG Ala Glu 1300 GAT TAT GAA ATT CCT TCC TCT GTG Asp Tyr Glu Ile Pro Ser Ser Val 1305 ACA AGT AAC 3936 Thr Ser Asn 1310 AGT AAT TAT TCT Ser Asn Tyr Ser 1315 AGT GTT CCT AAT Ser Val Pro Asn 1330 TGG GGT GAT CAC AGT TTA ACC ATG CTT TAT GGT GGT 3984 Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly 1320 1325 ATT ACT TTT GAA Ile Thr Phe Glu 1335 TCG GCG GCA Ser Ala Ala GAA GAT TTA AGG CTA 4032 Glu Asp Leu Arg Leu 1340 TCT ACC Ser Thr 1345 AAT ATG GCA TTG AGT ATT ATT CAT Asn Met Ala Leu Ser Ile Ile His 1350 AAT GGA Asn Gly 1355 TAT GCG GGA ACC 4080 Tyr Ala Gly Thr 1360 CGC CGT ATA CAA Arg Arg Ile Gin TGT AAT Cys Asn 1365 CTT ATG AAA Leu Met Lys CAA TAC Gin Tyr 1370 GCT TCA TTA Ala Ser Leu GGT GAT 4128 Gly Asp 1375 AAA TTT ATA Lys Phe Ile ATT TAT GAT Ile Tyr Asp 1380 TCA TCA TTT GAT GAT Ser Ser Phe Asp Asp 1385 GCA AAC CGT TTT AAT 4176 Ala Asn Arg Phe Asn 1390 CTG GTG CCA TTG TTT AAA TTC GGA AAA GAC GAG AAC Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn TCA GAT GAT AGT 4224 Ser Asp Asp Ser 1405 1395 1400 ATT TGT ATA TAT AAT GAA AAC CCT TCC TCT GAA GAT AAG AAG TGG TAT 4272 -158- Ile Cys Ile Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr 1410 1415 1420 TTT TCT TCG AAA GAT GAC AAT AAA ACA GCG GAT TAT AAT GGT GGA ACT 4320 Phe'Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr 1425 1430 1435 1440 CAA TGT ATA GAT GCT GGA ACC AGT AAC AAA GAT TTT TAT TAT AAT CTC 4368 Gin Cys Ile Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu 1445 1450 1455 CAG GAG ATT GAA GTA ATT AGT GTT ACT GGT GGG TAT TGG TCG AGT TAT 4416 Gin Glu Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr 1460 1465 1470 AAA ATA TCC AAC CCG ATT AAT ATC AAT ACG GGC ATT 
GAT AGT GCT AAA 4464 Lys Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys 1475 1480 1485 GTA AAA GTC ACC GTA AAA GCG GGT GGT GAC GAT CAA ATC TTT ACT GCT 4512 Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gin Ile Phe Thr Ala .1490 1495 1500 GAT AAT AGT ACC TAT GTT CCT CAG CAA CCG GCA CCC AGT TTT GAG GAG 4560 25 Asp Asn Ser Thr Tyr Val Pro Gin Gin Pro Ala Pro Ser Phe Glu Glu 1505 1510 1515 1520 ATG ATT TAT CAG TTC AAT AAC CTG ACA ATA GAT TGT AAG AAT TTA AAT 4608 Met Ile Tyr Gin Phe Asn Asn Leu Thr Ile Asp Cys Lys Asn Leu Asn 1525 1530 1535 TTC ATC GAC AAT CAG GCA CAT ATT GAG ATT GAT TTC ACC GCT ACG GCA 4656 Phe Ile Asp Asn Gin Ala His Ile Glu Ile Asp Phe Thr Ala Thr Ala 1540 1545 1550 CAA GAT GGC CGA TTC TTG GGT GCA GAA ACT TTT ATT ATC CCG GTA ACT 4704 Gin Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe Ile Ile Pro Val Thr 1555 1560 1565 *40 AAA AAA GTT CTC GGT ACT GAG AAC GTG ATT GCG TTA TAT AGC GAA AAT 4752 Lys Lys Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn 1570 1575 1580 AAC GGT GTT CAA TAT ATG CAA ATT GGC GCA TAT CGT ACC CGT TTG AAT 4800 45 Asn Gly Val Gin Tyr Met Gin Ile Gly Ala Tyr Arg Thr Arg Leu Asn 1585 1590 1595 1600 ACG TTA TTC GCT CAA CAG TTG GTT AGC CGT GCT AAT CGT GGC ATT GAT 4848 Thr Leu Phe Ala Gin Gin Leu Val Ser Arg Ala Asn Arg Gly Ile Asp 1605 1610 1615 GCA GTG CTC AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAA TTA GGA 4896 Ala Val Leu Ser Met Glu Thr Gin Asn Ile Gin Glu Pro Gin Leu Gly 1620 1625 1630 GCG GGC ACA TAT GTG CAG CTT GTG TTG GAT AAA TAT GAT GAG TCT ATT 4944 Ala Gly Thr Tyr Val Gin Leu Val Leu Asp Lys Tyr Asp Glu Ser Ile 1635 1640 1645 CAT GGC ACT AAT AAA AGC TTT GCT ATT GAA TAT GTT GAT ATA TTT AAA 4992 His Gly Thr Asn Lys Ser Phe Ala Ile Glu Tyr Val Asp Ile Phe Lys 1650 1655 1660 GAG AAC GAT AGT TTT GTG ATT TAT CAA GGA GAA CTT AGC GAA ACA ACT 5040 Glu Asn Asp Ser Phe Val Ile Tyr Gin Gly Glu Leu Ser Glu Thr Ser 1665 1670 1675 1680 CAA ACT GTT GTG AAA GTT TTC TTA TCC TAT TTT ATA GAG GCG ACT GGA 5088 Gin Thr Val Val Lys Val Phe 
Leu Ser Tyr Phe Ile Glu Ala Thr Gly 1685 1690 1695 -159- AAT AAG AAC CAC TTA TGG GTA CGT GCT AAA TAC CAA AAG GAA ACG ACT 5136 Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gin Lys Glu Thr Thr 1700 1705 1710 GAT AAG ATC TTG TTC GAC CGT ACT GAT GAG AAA GAT CCG CAC GGT TGG 5184 Asp Lys Ile Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp 1715 1720 1725 TTT CTC AGC GAC GAT CAC AAG ACC TTT AGT GGT CTC TCT TCC GCA CAG 5232 Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gin 1730 1735 1740 GCA TTA AAG AAC GAC AGT GAA CCG ATG GAT TTC TCT GGC GCC AAT GCT 5280 Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala 1745 1750 1755 1760 CTC TAT TTC TGG GAA CTG TTC TAT TAC ACG CCG ATG ATG ATG GCT CAT 5328 Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His 1765 1770 1775 CGT TTG TTG CAG GAA CAG AAT TTT GAT GCG GCG AAC CAT TGG TTC CGT 5376 Arg Leu Leu G1m Glu Gin Asn Phe Asp Ala Ala Asn His Trp Phe Arg 1780 1785 1790 TAT GTC TGG AGT CCA TCC GGT TAT ATC GTT GAT GGT AAA ATT GCT ATC 5424 Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp Gly Lys Ile Ala Ile S1795 1800 1805 TAC CAC TGG AAC GTG CGA CCG CTG GAA GAA GAC ACC AGT TGG AAT GCA 5472 Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala 1810 1815 1820 CAA CAA CTG GAC TCC ACC GAT CCA GAT GCT GTA GCC CAA GAT GAT CCG 5520 Gin Gin Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gin Asp Asp Pro 35 1825 1830 1835 1840 ATG CAC TAC AAG GTG GCT ACC TTT ATG GCG ACG TTG GAT CTG CTA ATG 5568 Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met GCC CGT GGT GAT GCT GCT TAC CGC CAG TTA GAG CGT GAT ACG TTG GCT 5616 Ala Arg Gly Asp Ala Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Ala 1860 1865 1870 GAA GCT AAA ATG TGG TAT ACA CAG GCG CTT AAT CTG TTG GGT GAT GAG 5664 Glu Ala Lys Met Trp Tyr Thr Gin Ala Leu Asn Leu Leu Gly Asp Glu 1875 1880 1885 CCA CAA GTG ATG CTG AGT ACG ACT TGG GCT AAT CCA ACA TTG GGT AAT 5712 Pro Gin Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn 1890 1895 1900 GCT GCT TCA AAA ACC 
ACA CAG CAG GTT CGT CAG CAA GTG CTT ACC CAG 5760 Ala Ala Ser Lys Thr Thr Gin Gin Val Arg Gin Gin Val Leu Thr Gin 1905 1910 1915 1920 TTG CGT CTC AAT AGC AGG GTA AAA ACC CCG TTG CTA GGA ACA GCC AAT 5808 Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn 60 1925 1930 1935 TCC CTG ACC GCT TTA TTC CTG CCG CAG GAA AAT AGC AAG CTC AAA GGC 5856 Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn Ser Lys Leu Lys Gly 1940 1945 1950 TAC TGG CGG ACA CTG GCG CAG CGT ATG TTT AAT TTA CGT CAT AAT CTG 5904 Tyr Trp Arg Thr Leu Ala Gin Arg Met Phe Asn Leu Arg His Asn Leu 1955 1960 1965 TCG ATT GAC GGC CAG CCG CTC TCC TTG CCG CTG TAT GCT AAA CCG GCT 5952 Ser Ile Asp Gly Gin Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala 1970 1975 1980 -160- GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA GCT TCT CAA GGG GGA 6000 Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gin Gly Gly 1985 1990 1995 2000 GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC CGC TTC CCT CAA ATG 6048 Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His Arg Phe Pro Gin Met 2005 2010 2015 CTA GAA GGG GCA CGG GGC TTG GTT AAC CAG CTT ATA CAG TTC GGT AGT 6096 Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu Ile Gin Phe Gly Ser 2020 2025 2030 TCA CTA TTG GGG TAC AGT GAG CGT CAG GAT GCG GAA GCT ATG AGT CAA 6144 Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala Glu Ala Met Ser Gin 2035 2040 2045 CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG ACC AGT ATT CGT ATG 6192 Leu Leu Gin Thr Gin Ala Ser Giu Leu Ile Leu Thr Ser Ile Arg Met 2050 2055 2060 CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA AAA ACC GCC TTG CAA 6240 Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gn 2065 2070 2075 2080 GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC AGC TAT AGC CAA CTG 6288 Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp Ser Tyr Ser Gin Leu 2085 2090 2095 *00* ev 0 30 TAT GAG GAG AAC ATC AAC GCA GGT GAG CAG CGA GCG CTG GCG TTA CGC 6336 Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gin Arg Ala Leu Ala Leu Arg 2100 2105 2110 TCA GAA TCT GCT ATT GAG TCT CAG GGA GCG CAG ATT TCC CGT ATG 
GCA 6384 Ser Glu Ser Ala Ile Glu Ser Gin Gly Ala Gin Ile Ser Arg Met Ala 2115 2120 2125 GGC GCG GGT GTT GAT ATG GCA CCA AAT ATC TTC GGC CTG GCT GAT GGC 6432 Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe Gly Leu Ala Asp Gly 40 2130 2135 2140 GGC ATG CAT TAT GGT GCT ATT GCC TAT GCC ATC GCT GAC GGT ATT GAG 6480 Gly-Met His Tyr Gly Ala Ile Ala Tyr Ala Ile Ala Asp Gly Ile Glu 2145 2150 2155 2160 TTG AGT GCT TCT GCC AAG ATG GTT GAT GCG GAG AAA GTT GCT CAG TCG 6528 Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gin Ser 2165 2170 2175 GAA ATA TAT CGC CGT CGC CGT CAA GAA TGG AAA ATT CAG CGT GAC AAC 6576 Glu Ile Tyr Arg Arg Arg Arg Gin Glu Trp Lys Ile Gin Arg Asp Asn 2180 2185 2190 GCA CAA GCG GAG ATT AAC CAG TTA AAC GCG CAA CTG GAA TCA CTG TCT 6624 Ala Gin Ala Glu Ile Asn Gin Leu Asn Ala Gin Leu Glu Ser Leu Ser 2195 2200 2205 ATT CGC CGT GAA GCC GCT GAA ATG CAA AAA GAG TAC CTG AAA ACC CAG 6672 Ile Arg Arg Glu Ala Ala Glu Met Gin Lys Glu Tyr Leu Lys Thr Gin 2210 2215 2220 CAA GCT CAG GCG CAG GCA CAA CTT ACT TTC TTA AGA AGC AAA TTC AGT 6720 Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu Arg Ser Lys Phe Ser 2225 2230 2235 2240 AAT CAA GCG TTA TAT ACT TGG TTA CGA GGG CGT TTG TCA GGT ATT TAT 6768 Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly Ile Tyr 2245 2250 2255 TTC CAG TTC TAT GAC TTG GCC GTA TCA CGT TGC CTG ATG GCA GAG CAA 6816 Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gin -161- 2260 2265 2270 TCC TAT CAA TGG GAA GCT AAT GAT AAT TCC ATT AGC TTT GTC AAA CCG 6864 Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser Ile Ser Phe Val Lys Pro 2275 2280 2285 GGT GCA TGG CAA GGA ACT TAC GCC GGC TTA TTG TGT GGA GAA GCT TTG 6912 Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 2290 2295 2300 ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT CTG AAA TGG GAA TCT 6960 Ile Gin Asn Leu Ala Gin Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser 2305 2310 2315 2320 CGC GCT TTG GAA GTA GAA CGC ACG GTT TCA TTG GCA GTG GTT TAT GAT 7008 Arg Ala Leu Glu Val Glu Arg Thr Val Ser 
Leu Ala Val Val Tyr Asp
2325 2330 2335
TCA CTG GAA GGT AAT GAT CGT TTT AAT TTA GCG GAA CAA ATA CCT GCA 7056
Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gln Ile Pro Ala
2340 2345 2350
TTA TTG GAT AAG GGG GAG GGA ACA GCA GGA ACT AAA GAA AAT GGG TTA 7104
Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu
2355 2360 2365
TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GTC AAA TTG TCC GAC TTG 7152
Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu
2370 2375 2380
AAA CTG GGA ACG GAT TAT CCA GAC AGT ATC GTT GGT AGC AAC AAG GTT 7200
Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val
2385 2390 2395 2400
CGT CGT ATT AAG CAA ATC AGT GTT TCG CTA CCT GCA TTG GTT GGG CCT 7248
Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro
2405 2410 2415
TAT CAG GAT GTT CAG GCT ATG CTC AGC TAT GGT GGC AGT ACT CAA TTG 7296
Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly Gly Ser Thr Gln Leu
2420 2425 2430
CCG AAA GGT TGT TCA GCG TTG GCT GTG TCT CAT GGT ACC AAT GAT AGT 7344
Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser
2435 2440 2445
GGT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA TAC CTG CCA TTT GAA 7392
Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu
2450 2455 2460
GGT ATT GCT CTT GAT GAT CAG GGT ACA CTG AAT CTT CAA TTT CCG AAT 7440
Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn
2465 2470 2475 2480
GCT ACC GAC AAG CAG AAA GCA ATA TTG CAA ACT ATG AGC GAT ATT ATT 7488
Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile
2485 2490 2495
TTG CAT ATT CGT TAT ACC ATC CGT TAA 7515
Leu His Ile Arg Tyr Thr Ile Arg
2500 2505

INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 2504 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12 (TcbA protein):
Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu
1 5 10
Gln Leu Thr Cys Pro Ala Glu Ile Ala Leu Tyr Pro Phe Asp Thr Phe
20
25 Arg Giu Lys Thr Arg Gly Met Val Asn Trp Gly Giu Ala Lys Arg Ile 40 Tyr Giu Ile Ala Gin Ala Giu Gin Asp Arg Asn Leu Leu His Giu Lys 55 Arg Ile Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu 70 75 Gly Thr Arg Gin Met Leu Giy Phe Ile Gin Gly Tyr Ser Asp Leu Phe 90 Giy Asn Arg Aia Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met 100 105 110 Phe Ser Pro Ala Aia Tyr Leu Thr Giu Leu Tyr Arg Giu Ala Lys Asn 115 120 125 Leu His Asp Ser Ser Ser Ile Tyr Tyr Leu Asp Lys Arg Arg Pro Asp 130 135 140 Leu Aia Ser Leu Met Leu Ser Gin Lys Asn Met Asp Giu Glu Ile Ser 0 145 150 155 160 Thr Leu Ala Leu Ser Asn Giu Leu Cys Leu Ala Gly Ile Glu Thr Lys .165 170 175 Thr Giy Lys Ser Gin Asp Giu Vai Met Asp Met Leu Ser Thr Tyr Arg 180 185 190 Leu Ser Gly Giu Thr Pro Tyr His His Ala Tyr Giu Thr Val Arg Giu 5 ~200 205 Ile Val His Giu Arg Asp Pro Gly Phe Arg His Leu Ser Gi-n Ala1- Pro 210 215 220 Ile Val Ala Ala Lys Leu Asp Pro Vai Thr Leu Leu Giy Ile Ser Ser 225 230 235 240 His Ile Ser *Pro Giu Leu Tyr Asn Leu Leu Ile Giu Giu Ile Pro Giu 245 250 255 Lys Asp Giu Ala Aia Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp 260 265 270 Ile Thr Thr Ala Gin Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr 275 280 285 Giy Val Ser Pro Giu Asp Ile Ala Tyr Val Thr Thr Ser Leu Ser His 290 295 300 Val Gly Tyr Ser Ser Asp Ile Leu Val Ile Pro Leu Val Asp Gly Val 305 310 315 320 -163- Gly Lys Met Giu Val Val Arg Vai Thr 325 Thr Pro Ser Asp Thr Ser Tyr Leu Tyr Leu 370 Asn Pro 385 Thr Ile Arg Trp Asp Gin 25 Arg Leu 450 Vai Asp 465 Lys Vai Giu Thr Gly Asn Asn Giy 530 Asn -Pro 545 45 Lys Ala Gin Met Asn Leu Ile His 610 Gly Tyr 625 Lys Ile Lys Trp Thr Thr Thr Leu 690 Gin Ile 355 Gin Tyr Lys His Tyr 435 Leu Ser Tyr Aila Gin 515 Ile Asp Vali Leu Giu 595 Asn Gly Val1 Thr Leu 675 His Asn Tyr Tyr Asn Lys Asp Asp Met 390 Ser Asp 405 Gly Ser Pro Lys Ala Thr Asn Ser 470 Val Lys 485 Ile Leu Ser Gin Tyr Giu Asn Leu 550 Lys Arg 565 Ile Thr Leu Scr Thr Ile Thr Asri 630 Thr Leu 645 Thr Asp Pro Giu Lys Giu Giu Ser 360 5cr Ile 
Asp Asn Phe 440 Leu Lys Tyr Asn Giu 520 Scr Pro Phe Arg Leu 600 Giu Tyr Trp Phe Ser 680 Leu Leu 345 Asn Al a Asn As n Phe 425 Leu 5cr Ser Ile Ile 505 Gin Giu Asp Gin Lys 585 Tyr Leu Gin Ile Leu 665 Asn Ile Pro Gin Phe Gly Trp Thr 380 Lys Tyr 395 Leu Ser Ala Ala Lys Met Ala Thr 460 Thr Val 475 Arg Tyr Ile Ser Phe Asn Asn 5cr 540 Thr Giy 555 Asn Aia Asp Giy Vai 5cr Ile Leu 620 Thr Asp 635 Gin Trp, Thr Thr Thr Ala Giu Asp 700 Asn Tyr 335 Asp Asn Asp Phe Ala His Gin Ala 400 Leu Gin 415 Lys Ile Ala Ile Arg Ile Leu Asn 480 5cr Giu 495 Aia Vai Pro Leu Leu Pro Gin Arg 560 Leu Tyr 575 Lys Asn Ala Gin Ile Cys Leu Ala 640 Thr Gin 655 Tyr Ser Ser Ser Arg Ala -164- Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gin Glu Val 705 710 715 720 Ala Tyr Asp Thr Val Asp Lys Val Ile 755 Arg Arg Ile 770 Ser Ser Leu 785 Thr Leu Met Gin His Ala Val Thr Asp 835 Met Ala Ala 850 Thr Gin Ile 865 Ala Val Ser Ile Asp His Ala Asp His 915 Lys Ala Leu 930 Gly Val Arg 945 Gin Val Ser Gly Ile Gin Leu Ala Ser 995 Tyr Asn Lys 1010 Tyr Pro Glu 1025 Met Met Asp Asp Thr Val Leu Trp Ile Asp Gin Ile Gin Pro Ala Gin Ile 730 735 Trp Glu Glu Val Gin Thr Thr Pro Thr Ser Leu 745 750 Ala Gin Val Leu Ala Gin Leu Ser Leu Ile Tyr 760 765 Ser Glu Thr Glu Leu Ser Leu Ile Val Thr Gin 775 780 Ala Gly Lys Ser Ile Leu Asp His Gly Leu Leu 790 795 800 Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly 810 815 Ile Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr 825 830 Gin Ala Met Asn Lys Glu Glu Ser Leu Leu Gin 840 845 Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp 855 860 Ile Leu Gin Trp Leu Gin Met Ser Ser Ala Leu 870 875 880 Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly 890 895 Ala Ala Trp Gin Ala Ala Ala Ala Ala Leu Met 905 910 Gin Ala Gin Lys Lys Leu Asp Glu Thr Phe Ser 920 925 Tyr Tyr Ile Asn Ala Val Val Asp Ser Ala Ala 935 940 Asn Gly Leu Tyr Thr Tyr Leu Leu Ile Asp Asn 950 955 960 Val Ile Thr Ser Arg Ile Ala Glu Ala Ile Ala 970 975 Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gin 985 990 Ser Thr Arg Gin Phe Phe Thr Asp Trp Glu Arg 1000 
1005 Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr 1015 1020 Val Asp Pro Thr Gin Arg Ile Gly Gln Thr Lys 1030 1035 1040 Leu Gin Ser Ile Asn Gin Ser Gin Leu Asn Ala 1050 1055 Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gin 1065 1070 1045 Glu Asp 1060 Val Ala Asn Leu Lys Val lie Ser Ala Tyr His Asp Asn Val Asn Val -165- 1075 1080 1085 Asp Gin Gly Leu Thr Tyr Phe Ile Gly Ile Asp Gin Ala Ala Pro Gly 1090 1095 1100 Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys 1105 1110 1115 1120 Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val 1125 1130 1135 Asn Pro Trp Lys Asn Ile Ile Arg Pro Val Val Tyr MeW'Ser Arg Leu 1140 1145 1150 Tyr Leu Leu Trp Leu.Glu Gin Gin Ser Lys Lys Ser Asp Asp Gly Lys 1155 1160 1165 Thr Thr Ile Tyr Gin Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp 1170 1175 1180 Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys 1185 1190 1195 1200 Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys 25 1205 1210 1215 *Thr Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met 1220 1225 1230 Gln Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly 1235 1240 1245 Leu Tyr Ile Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gin S1250 1255 1260 35 Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gin Phe Asp Thr Val Met 1265 1270 1275 1280 Ala Asp Pro Asp Ser Asp Asn Lys Lys Val Ile Thr Arg Arg Val Asn 1285 1290 1295 Asn Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Thr Ser Asn 1300 1305 1310 Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly 1315 1320 1325 Ser Val Pro Asn Ile Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu 1330 1335 1340 Ser Thr Asn Met Ala Leu Ser Ile Ile His Asn Gly Tyr Ala Gly Thr 1345 1350 1355 1360 Arg Arg Ile Gin Cys Asn Leu Met Lys Gin Tyr Ala Ser Leu Gly Asp 1365 1370 1375 Lys Phe Ile Ile Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn 1380 1385 1390 Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser 1395 1400 1405 Ile Cys Ile Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp 
Tyr
1410 1415 1420
Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr
1425 1430 1435 1440
Gln Cys Ile Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu
1445 1450 1455
Gln Glu Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr
1460 1465 1470
Lys Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1475 1480 1485
Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gln Ile Phe Thr Ala
1490 1495 1500
Asp Asn Ser Thr Tyr Val Pro Gln Gln Pro Ala Pro Ser Phe Glu Glu
1505 1510 1515 1520
Met Ile Tyr Gln Phe Asn Asn Leu Thr Ile Asp Cys Lys Asn Leu Asn
1525 1530 1535
Phe Ile Asp Asn Gln Ala His Ile Glu Ile Asp Phe Thr Ala Thr Ala
1540 1545 1550
Gln Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe Ile Ile Pro Val Thr
1555 1560 1565
Lys Lys Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn
1570 1575 1580
Asn Gly Val Gln Tyr Met Gln Ile Gly Ala Tyr Arg Thr Arg Leu Asn
1585 1590 1595 1600
Thr Leu Phe Ala Gln Gln Leu Val Ser Arg Ala Asn Arg Gly Ile Asp
1605 1610 1615
Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly
1620 1625 1630
Ala Gly Thr Tyr Val Gln Leu Val Leu Asp Lys Tyr Asp Glu Ser Ile
1635 1640 1645
His Gly Thr Asn Lys Ser Phe Ala Ile Glu Tyr Val Asp Ile Phe Lys
1650 1655 1660
Glu Asn Asp Ser Phe Val Ile Tyr Gln Gly Glu Leu Ser Glu Thr Ser
1665 1670 1675 1680
Gln Thr Val Val Lys Val Phe Leu Ser Tyr Phe Ile Glu Ala Thr Gly
1685 1690 1695
Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gln Lys Glu Thr Thr
1700 1705 1710
Asp Lys Ile Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp
1715 1720 1725
Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gln
1730 1735 1740
Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala
1745 1750 1755 1760
Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His
1765 1770 1775
Arg Leu Leu Gln Glu Gln Asn Phe Asp Ala Ala Asn His Trp Phe Arg
1780 1785 1790
Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp Gly Lys Ile Ala Ile
1795 1800 1805
Tyr His Trp Asn Val Arg Pro Leu Glu Glu
Asp Thr Ser Trp Asn Ala
1810 1815 1820
Gln Gln Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gln Asp Asp Pro
1825 1830 1835 1840
Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met
1845 1850 1855
Ala Arg Gly Asp Ala Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Ala
1860 1865 1870
Glu Ala Lys Met Trp Tyr Thr Gln Ala Leu Asn Leu Leu Gly Asp Glu
1875 1880 1885
Pro Gln Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn
1890 1895 1900
Ala Ala Ser Lys Thr Thr Gln Gln Val Arg Gln Gln Val Leu Thr Gln
1905 1910 1915 1920
Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn
1925 1930 1935
Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn Ser Lys Leu Lys Gly
1940 1945 1950
Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn Leu Arg His Asn Leu
1955 1960 1965
Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala
1970 1975 1980
Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gln Gly Gly
1985 1990 1995 2000
Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His Arg Phe Pro Gln Met
2005 2010 2015
Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu Ile Gln Phe Gly Ser
2020 2025 2030
Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala Glu Ala Met Ser Gln
2035 2040 2045
Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu Thr Ser Ile Arg Met
2050 2055 2060
Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gln
2065 2070 2075 2080
Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp Ser Tyr Ser Gln Leu
2085 2090 2095
Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg Ala Leu Ala Leu Arg
2100 2105 2110
Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln Ile Ser Arg Met Ala
2115 2120 2125
Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe Gly Leu Ala Asp Gly
2130 2135 2140
Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile Ala Asp Gly Ile Glu
2145 2150 2155 2160
Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gln Ser
2165 2170 2175
Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg Asp Asn
2180 2185 2190
Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln Leu Glu Ser Leu Ser
2195 2200 2205
Ile
Arg Arg Glu Ala Ala Glu Met Gin Lys Glu Tyr Leu Lys Thr Gin -168- 2210 2215 2220 Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu Arg Ser Lys Phe Ser 2225 2230 2235 2240 Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly Ile Tyr 2245 2250 2255 Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gin 2260 2265 2270 Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser Ile Ser Phe Val Lys Pro 2275 2280 2285 Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 2290 2295 2300 Ile Gin Asn Leu Ala Gin Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser 2305 2310 2315 2320 Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp S2325 2330 2335 2 Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gin Ile Pro Ala 2340 2345 2350 Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu 2355 2360 2365 30 Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu 2370 2375 2380 Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val 5 2385 2390 2395 2400 Arg Arg Ile Lys Gin Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro 2405 2410 2415 Tyr Gin Asp Val Gin Ala Met Leu Ser Tyr Gly Gly Ser Thr Gin Leu 40 2420 2425 2430 Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser 2435 2440 2445 Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu 2450 2455 2460 .Gly Ile Ala Leu Asp Asp Gin Gly Thr Leu Asn Leu Gin Phe Pro Asn 2465 2470 2475 2480 Ala Thr Asp Lys Gin Lys Ala Ile Leu Gin Thr Met Ser Asp Ile Ile 2485 2490 2495 Leu His Ile Arg Tyr Thr Ile Arg 2500 2505 INFORMATION FOR SEQ ID NO:13: SEQUENCE CHARACTERISTICS: LENGTH: 12 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13 (TcdAii N-terminus) Leu Ile Gly Tyr Asn Asn Gin Phe Ser Gly Xaa Ala -169- 1 5 INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 12 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14 (TcdB N-terminus): 

Met Gln Asn Ser Gln Thr Phe Ser Val Gly Glu Leu
1               5

INFORMATION FOR SEQ ID NO:15:
SEQUENCE CHARACTERISTICS:
LENGTH: 14 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15 (TcaAii N-terminus):

Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr
1               5

INFORMATION FOR SEQ ID NO:16:
SEQUENCE CHARACTERISTICS:
LENGTH: 5 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16 (TcbA N-terminus):

Met Gln Asn Ser Leu
1

INFORMATION FOR SEQ ID NO:17:
SEQUENCE CHARACTERISTICS:
LENGTH: 10 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17 (TcdAii-PT111 internal peptide):

Ala Phe Asn Ile Asp Asp Val Ser Leu Phe
1               5

INFORMATION FOR SEQ ID NO:18:
SEQUENCE CHARACTERISTICS:
LENGTH: 16 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18 (TcdAii-PT79 internal peptide):

Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn
1               5                   10

INFORMATION FOR SEQ ID NO:19:
SEQUENCE CHARACTERISTICS:
LENGTH: 21 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19 (TcaBi-PT158 internal peptide):

Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile Gly Ser
1               5                   10
Leu Gln Leu Phe Ile

INFORMATION FOR SEQ ID NO:20:
SEQUENCE CHARACTERISTICS:
LENGTH: 12 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20 (TcaBi-PT108 internal peptide):

Met Tyr Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro
1               5

INFORMATION FOR SEQ ID NO:21:
SEQUENCE CHARACTERISTICS:
LENGTH: 26 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21
(TcbAii-PT103 internal peptide):

Gly Ile Asp Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro
1               5                   10
Gln Leu Gly Ala Gly Thr Tyr Val Gln Leu
            20

INFORMATION FOR SEQ ID NO:22:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22 (TcbAii-PT56 internal peptide):

Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1               5                   10

INFORMATION FOR SEQ ID NO:23:
SEQUENCE CHARACTERISTICS:
LENGTH: 13 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23 (TcbA-PT81(a) internal peptide):

Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys
1               5

INFORMATION FOR SEQ ID NO:24:
SEQUENCE CHARACTERISTICS:
LENGTH: 22 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24 (TcbAii-PT81(b) internal peptide):

Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly
1               5                   10
Val Gln Tyr Met Gln Ile

INFORMATION FOR SEQ ID NO:25:
SEQUENCE CHARACTERISTICS:
LENGTH: 6054 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..43
OTHER INFORMATION: /product= "end of TcaAiii"
(ix) FEATURE:
NAME/KEY: RBS
LOCATION: 51..58
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 65..3634
OTHER INFORMATION: /product= "TcaB"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

A GTA GCC CAA AAC TTA AGT GCC GCA ATC AGC AAT CGT CAG TAACCGGATA
  Val Ala Gln Asn Leu Ser Ala Ala Ile Ser Asn Arg Gln
AAGAAGGAAT TGATT ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA 100
                 Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu
GCG CGC CQT Ala Arg Arg is GAT GCA TTG GTT Asp Ala Leu Val CAT TAT ATT GCT His Tyr Ile Ala CAG GTG CCC Gin Val Pro 35 GCA GAT Ala Asp 30
CTG_,TTG
40 Leu L-eu TTA AAA GAG AGT Leu Lys Giu Ser CAG ACC GCG GAT Gin Thr Ala Asp CTG TAC GAA TAT Leu Tyr Giu Tyr 148 196 244 CTG GAT ACC Leu Asp Thr ATT AGC GAT CTG GTT ACT ACT TCA CCG Ile Ser Asp Leu Val Thr Thr Ser Pro *5
TCC GAA GCG ATT Ser Glu Ala Ile AGT CTG CAA TTG Ser Leu Gin Leu ATT CAT CGT Ile His Arg GGC TAT GAC Gly Tyr Asp GAA CAG TTT Giu Gin Phe ACG CTG GCA GAC Thr Leu Ala Asp GCA AAA CCC TAT Ala Lys Pro Tyr GCG ATA GAG Ala Ile Giu TTT GCC GAT Phe Ala Asp TAT AGC ACT Tyr Ser Thr TTA TAT AAC TGG Leu Tyr Asn Trp AGT rrr AAC CAC Ser Phe Asn His TGG GCT GGC AAG GAA CGG TTG AAA TTC TAT GCC GGG GAT TAT ATT GAT Trp Ala Gly Lys Giu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp 110 115 120
CCA
Pro 125 ACA TTG CGA TTG Thr Leu Arg Leu AAG ACC GAG ATA TTT ACC GCA TTI' GAA Lys Thr Giu Ile Phe Thr Ala Phe Giu 135 GGT ATT TCT CAA Gly Ile Ser Gin AAA TTA AAA AGT Lys Leu Lys Ser TTA GTC GAA TCT Leu Val Giu Ser AAA TTA Lys Leu is s CGT GAT TAT CTA ATT AGT TAT GAC ACT TTA GCC ACC CTT GAT TAT ATT -173- S. 0 t o I to..: 9 0 0* 5 Arg Asp ACT GCC Thr Ala ACA CAG Thr Gin 190 ACT GAT Thr Asp 205 ATT AAT Ile Asn TGG GAA Trp Giu GAT AAA 25 Asp Lys TAT AGC Tyr Ser 30 270 TAC AAT Tyr Asn 285 TCA CAA Ser Gin GTA CTT Val Leu ACG TTA Thr Leu GGA AGT Gly Ser 350 ATG TGT Met Cys 365 CTC TCT Leu Ser GAT GGA Asp Gly CTC CCT Leu Pro TCA CTA Ser Leu 430 Leu Ile Ser 160 CAA GGC AAA Gin Giy Lys GCA CCC TAT Ala Pro Tyr GGT AAG TTG Giy Lys Leu 210 GGG ATT AGT Giy Ile Ser 225 AAC AAG CTG Asn Lys Leu 240 GAT T= GTT Asp Phe Vai GCA TCA AAG Ala Ser Lys GTT GGA GCA Val Gly Ala 290 GGT TCT GAT Gly Ser Asp 305 TTT GAG AAT Phe Gin Asn 320 TAT GAG TCT Tyr Asp Ser AAT TTA TCG Asn Leu Ser GGA CAA AGT Gly Gin Ser 370 AAT ACA ATA Asn Thr Ile 385 CAA TTT ACA Gin Phe Thr 400 TAT GTA GAT Tyr Val Asp AAT TAT GAC Asn Tyr Asp Asp Thr 165 AAT AAA Asn Lys 180 T TT TAT Phe Tyr CGA GAT Pro Asp GCA TAT Ala Tyr ATC CGT Ile Arg 245 AAA AAC Lys Asn 260 AAA ATC Lys Ile GGA TGA Gly Ser GAG ATG Gin Met GGC GGA Gly Gly 325 AAG GTG Asn Val 340 AAG GAT Lys Asp AAT GAT Asn Asp TTC ACC Phe Thr CCT TGT Pro Ser 405 AAC GCG Asn Aia 420 GAG GGG Gin Gly Ala Thr ATC 'TG Ile Phe GGA AAA Arg Lys 200 TGG TGA Trp Ser 215 GGG CAT Gly His TTT ACT Phe Thr TGG GTG Trp Val GAA CTT Giu Leu 280 AGG CCG Ser Pro 295 ATT TCT Ile Ser ACT CCC Thr Pro AAG AAG Lys Asn GCC ACA Aia Thr 360 AAG TAG Asn Tyr 375 TAG GGG Tyr Gly TCT GCC Ser Ala TTA GAT Leu Asp TTT GGG Phe Giy 440 Leu Asp 170 TTT ATT Phe Ile 185 TTA ACT Leu Thr GAG TGG Giu Trp, GTC GAG Vai Giu ATC TCG Ile Ser 250 ATG AGT Met Ser 265 TCT TTT Ser Phe ACT GAA Thr Giu GAT GAT Asp Asp AGT ACT Ser Thr 330 CTA TGT Leu Ser 345 ACT AAA Thr Lys TGC AAT Gys Asn ACA TTG Thr Phe ATT GAT Ile Asp 
410 ATT AGG Ile Ser 425 GGA TGT Gly Ser Ile CGT 628 Arg GTC 676 Val GGA 724 Aila 220 TTC 772 Phe GAA 820 Giu GAT 868 Asp GAC 916 Asp GCT 964 Ala 300 ACT 1012 Thr GTG 1060 Val ACA 1108 Thr CGC 1156 Arg ACA 1204 Thr 380 TGA 1252 Ser GAG 1300 His GAT 1348 Asp GGG 1396 Pro -174- :*pow* S. *00 .0 .0 so.
0 6* 0 GTT OAT AAT TTC Val Asp Asn Phe 445 TTC CAT ATT CCG Phe His Ile Pro TAC GAA GAC GCG Tyr Glu Asp Ala 480 TAT CGC GAT GCT Tyr Arg Asp Ala 495 TAT TGO AAT GTG Tyr Trp Asn Val 510 CAG CCC GCC ACC Gin Pro Ala Thr 525 CAT TAC AAG CTG His Tyr Lys Leu CGA GGC GAC AGC Arg Gly Asp Ser 560 0CC AAA ATG TAC Ala Lys Met Tyr 35 575 GAT ATC CAT ACC Asp Ile His Thr 590 GCT GGC GCT ATT Ala Gly Ala Ile 605 ACG TTC OCT 0CC Thr Phe Ala Ala GGT GA TTC-TTO Gly Asp Phe Leu 640 AAA CTT GAG TTA Lys Leu Glu Leu 655 GGT CAA CCG'CTA Gly Gin Pro Leu 670 ACC CTG CAA COC Thr Leu Gin Arg 685 OCT GOT GOT CAA Ala Gly Gly Gin OAA COC GCC CGC Glu Arg Ala Arg 720 AOT GOT CCC TAT GGT ATT TAT CTA TG Ser Gly Pro Tyr Gly Ile Tyr Leu Trp 450 455 TTC CTT GTT ACG GTC COT ATO CAA ACC Phe Leu Val Thr Val Arg Met Gin Thr 465 470 GAC ACT TGO TAC AAA TAT ATT TTC COC Asp Thr Trp Tyr Lys Tyr Ile Phe Arg 485 AAT GGC CAG CTC ATT ATG OAT GOC AGT Asn Gly Gin Leu Ile Met Asp Gly Ser 500 505 ATO CCA TTO CAA CTG OAT ACC OCA TG Met Pro Leu Gin Leu Asp Thr Ala Trp 515 520 ACT OAT CCA OAT OTO ATC OCT ATO OCO Thr Asp Pro Asp Val Ile Ala Met Ala 530 535 GCG ATA TTC CTG CAT ACC C'rT OAT CTA Ala Ile Phe Leu His Thr Leu Asp Leu 545 550 OCT TAC COT CAA CTT GAA COC OAT ACT Ala Tyr Arg Gin Leu Glu Arg Asp Thr 565 TAC ATT CAG OCA CAA CAG CTA CTG GOA Tyr Ile Gin Ala Gin Gin Leu Leu Oiy 580 585 ACC AAT ACT TOG CCA AAT CCC ACC TTG Thr Asn Thr Trp Pro Asn Pro Thr Leu 595 600 0CC ACA CCG ACA TTC CTC AGT TCA CCO Ala Thr Pro Thr Phe Leu Ser Ser Pro 610 615 TGG CTA AGC OCA OOC OAT ACC OCA MAT Trp Leu Ser Ala Oly Asp Thr Ala Asn 625 630 CCA CCO TAC MAC OAT OTA CTA CTC GOT Pro Pro Tyr Asn Asp Val Leu Leu Oly '645 COC CTA TAC MAC CTO COC CAC MAT CTG Arg Leu Tyr Asn Leu Arg His Asn Leu 660 665 MAT CTG CCA CTG TAT 0CC ACO CCO OTA Asn Leu Pro Leu Tyr Ala Thr Pro Val 675 680 CAG CMA 0CC OGA 000 GAC GOT ACA GOC Gin Gin Ala Gly Oly Asp Gly Thr Oly 690 695 GOC ACT OTT CAG GOC TOO CGC TAT CCG Gly Ser Val Gin Gly Trp Arg Tyr Pro 705 710 TCT 0CC 
OTO AGT TTO TTO ACT CAG TTC Ser Ala Val Ser Leu Leu Thr Gin Phe 725 GMA ATC TTC 1444 Giu Ile Phe 460 GMA CMA COT 1492 Giu Oin Arg 475 AGO 0CC OOT 1540 Ser Ala Gly 490 AAA CCA COT 1588 Lys Pro Arg OAT ACC ACA 1636 Asp Thr Thr GAC CCG ATO 1684 Asp Pro Met 540 TTG ATT 0CC 1732 Leu Ile Ala 555 CTA GTC GMA 1780 Leu Val Glu 570 CCG CGC CCT 1828 Pro Arg Pro AOT MAA GMA 1876 Ser Lys Glu GAG OTG ATG 1924 Giu Vai Met 620 ATT GOC GAC 1972 Ile Gly Asp 635 TAC TGO GAT 2020 T-T~rp-Asp 650 AOT CTO GAT 2068 Ser Leu Asp GAC CCG MAA 2116 Asp Pro Lys AGT AGT CCG 2164 Ser Ser Pro 700 TTA TTO OTA 2212 Leu Leu Val 715 GOC MAC AOC 2260 Oiy Asn Ser 730 -175- TTA CAA ACA ACG TTA GAA CAT CAG GAT AAT GAA AAA ATO ACG ATA CTG 2308 Leu Gin Thr Thr Leu Giu His Gin Asp Asn Giu Lys Met Thr Ile Leu 735 740 745 TTG CAG ACT CAA CAG GAA GCC ATC CTG AAA CAT CAG CAC OAT ATA CAA 2356 Leu Gin Thr Gin Gin Giu Ala Ile Leu Lys His Gin His Asp Ile Gin 750 755 760 CAA AAT AAT CTA AAA OGA TTA CAA CAC AGC CTO ACC OCA TTA CAG OCT 2404 Gin Asn Asn Leu Lys Gly Leu Gin His Ser Leu Thr Aia Leu Gin Ala 765 770 775 780 AGC COT OAT GOC GAC ACA TTO CGG CAA AAA CAT TAC AOC GAC CTG ATT 2452 Ser Arg Asp Gly Asp Thr Leu Arg Gin Lys His Tyr Ser Asp Leu Ile 785 790 795 AAC GOT GOT CTA TCT GCG GCA OAA ATC 0CC GOT CTO ACA CTA COC AOC 2500 Asn Oly Oly Leu Ser Ala Aia Glu Ile Aia Gly Leu Thr Leu Arg Ser 800 805 810 ACC 0CC ATO ATT ACC AAT GOC OTT OCA ACG GGA TTO CTG ATT 0CC GOC 2548 Thr Ala Met Ile Thr Asn Oly Val Ala Thr Oly Leu Leu Ile Ala Oly 815 820 825 .i 25 OGA ATC 0CC AAC GCG GTA CCT AAC OTC TTC GGG CTG OCT AAC GOT OGA 2596 Oly Ile Aia Asn Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly 830 835 840 TCO GAA TOG OGA GCG CCA TTA ATT GOC TCC GGG CAA OCA ACC CAA OTT 2644 Ser Glu Trp Gly Ala Pro Leu Ile Gly Ser Gly Gin Ala Thr Gin Vai 845 850 855 860 GGC 0CC GOC ATC CAG OAT CAG AOC GCG GGC ATT TCA OAA GTG ACA OCA 2692 Oly Ala Oly Ile Gin Asp Gin Ser Ala Oly Ile Ser Giu Vai Thr Ala 865 870 875 GGC TAT CAG COT COT CAG GAA OAA TGG 
GCA TTO CAA CGG GAT ATT OCT 2740 Oly Tyr Gin Arg Arg Gin Glu Giu Trp Ala Leu Oln Arg Asp Ile Ala 40 880 885 890 Goooo: GAT AAC OAA ATA ACC CAA CTG GAT 0CC CAO ATA CAA AOC CTO CAA GAG 2788 Asp-Asn Glu Ile Thr Gin Leu Asp Ala Gin Ile Gin Ser Leu Gin Glu 895 900 905 45 o CAA ATC ACO ATG GCA CAA AAA CAO ATC ACO CTC TCT OAA ACC OAA CAA 2836 Gin Ile Thr Met Ala Gin Lys Gin Ile Thr Leu Ser Olu Thr Glu Oin 910 915 920 GCG AAT 0CC CAA GCG ATT TAT OAC CTO CAA ACC ACT COT TTT ACC GGG 2884 Ala Asn Ala Oln Ala Ile Tyr Asp Leu ln Thr Thr Arg Phe Thr Oly 925 930 935 940 CAG OCA CTO TAT AAC TGG ATG GCC GOT COT CTC TCC GCG CTC TAT TAC 2932 Oin Ala Leu Tyr Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr 945 950 955 CAA ATO TAT OAT TCC ACT CTO CCA ATC TOT CTC CAG CCA AAA 0CC OCA 2980 Oln Met Tyr Asp Ser Thr Leu Pro Ile Cys Leu Oin Pro Lys Ala Ala 960 965 970 TTA OTA CAG GAA TTA GGC GAG AAA GAG AOC OAC AOT CTT TTC CAG GTT 3028 Leu Val Oln Giu Leu Oly Glu Lys Giu Ser Asp Ser Leu Phe Oln Vai 975 980 985 CCG GTG TGG AAT OAT CTO TGG CAA GGG CTG TTA OCA OGA OAA GGT TTA 3076 Pro Val Trp Asn Asp Leu Trp Oln Oly Leu Leu Ala Oly Olu Gly Leu 990 995 1000 AOT TCA GAG CTA CAG AAA CTG OAT 0CC ATC TGG CTT OCA COT GOT GGT 3124 Ser Ser Olu Leu Gin Lys Leu Asp Ala Ile Trp Leu Ala Arg Oly Oly -176- 1005 1010 .1015 1020 ATT GGG CTA GAA GCC ATC CGC ACC GTG TCG CTG GAT ACC CTG TTT GGC 3172 Ile Gly Leu Glu Ala Ile Arg Thr Val Ser Leu Asp Thr Leu Phe Gly 1025 1030 1035 ACA GOG ACG TTA AGT GAA AAT ATC AAT AAA GTG CTT AAC 000 GAA ACG 3220 Thr Gly Thr Leu Ser GlU Asn Ile Asn Lys Val Leu Asn Gly Glu Thr 1040 1045 1050 GTA TCT CCA TCC GOT GOC GTC ACT CTG GCG CTG ACA 000 GAT ATC TTC 3268 Vai Ser Pro Ser Gly Gly Val Thr Leu Ala Leu Thr Giy Asp Ile Phe 1055 1060 1065 r7Ah kt~ ~jJ GATLU AGT CA1 CTA GGT TTG OAT AAC TCT TAC AAC 3316 Gin Ala Thr Leu Asp Leu Ser Gin Leu Gly Leu Asp Asn Ser Tyr Asn 1070 1075 1080 TTG GGT AAC GAG AAG AAA CGT COT ATT AAA CGT ATC GCC GTC ACC CTG 3364 Leu Oly Asn Giu Lys Lys Arg Arg Ile Lys 
Arg Ile Ala Val Thr Leu 1085 1090 1095 1100 CCA ACA CTT CTG 000 CCA TAT CAA GAT CTT OAA GCC ACA CTG OTA ATG 3412 Pro Thr Leu Leu Oly Pro Tyr Gin Asp Leu Giu Ala Thr Leu Val Met 1105 1110 1115 GOT GCG GAA ATC 0CC GCC TTA TCA CAC GGT GTG AAT GAC OGA GGC COG 3460 Oly Ala Giu Ile Ala Ala Leu Ser His Gly Vai Asn Asp Gly Gly Arg 1120 1125 1130 TTT OTT ACC GAC TTT AAC GAC AGC CGT TTT CTG CCT TTT GAA GOT CGA 3508 Phe Vai Thr Asp Phe Asn Asp Ser Ai-g Phe Leu Pro Phe Giu Gly Arg 1135 1140 1145 OAT GCA ACA ACC GOC ACA CTG GAG CTC AAT ATT ?I'C CAT OCG GOT AAA 3556 Asp Ala Thr Thr Gly Thr Leu Olu Leu Asn Ile Phe His Ala Gly Lys 1150 1155 1160 a 35 GAG OGA ACO CAA CAC GAG TTG OTC OCO AAT CTO AOT GAC ATC ATT GTO 3604 40 Giu Gly Thr Gin His Giu Leu Val Ala Asn Leu Ser Asp Ile Ile Val 1165 1170 1175 1180 CAT CTO AAT TAC ATC ATT CGA GAC OCG TAA ATTTCTTTTC TTTGTCGATT 3654 His Leu Asn Tyr Ile Ile Arg Asp Ala 1185 1190
ACAGGTCCCT
TCGATTACAA
CTGAATGCTG
GGCAGAGGGA
TTCOOCATCG
CCACAATACG
CTGAATGACC
TTOCCAATTT
ATCGAATACT
CCGOACGGGC
AATGACCAAC
GTCAGCTATC
CATCCCAATG
ATCAOOOOCC
CGCTGTCACT
CCOGCCCTGA
CGGCTCCTGG
GCTGGCAATO
GTAATGACGA
AAGGGCAACC
CCTATACCGT
GOCAACCTOC
ATCTACACAT
AAATCGCCCA
AATATCGAGC
TGTTATTAAG
TCCCAAAGGT
TGOAATGGCC
ATTATCGCTG
CGGTGTTATO
CACOTTCCTA
TGATATCCGT
GACCCOCTAT
CTCCGGTCA
OAGTACTTTA
GGCGOTGCTA
TCCCTATCTC
ATTTACAGCA
TCCATTAGCC
TCCCCACAAG
CAAGACGTTA
CAAGCCCGCC
GAAGGACGCG
TGCAGGATTC
TCAATGGCAT
TGCCATTACC
ACAGTGCAGG
GACGCACCCA
GCOAGGTCAT
AAACGCTGCA
AGATCCTGGA
CTTTCTGGCT
CTTGTCTGGC
TGACGCCAGC
ACGACAATGA
ACCAGAAOTA 3714 OOOAGAAGCA 3774 CCTrTCGACC 3834 TAATOGGCCT 3894 ACATGOCATT 3954 GAATATCGCC 4014 AOGCGTTACC 4074 ?1'TCAGTAAA 4134 OATATCGACA 4194 AAATCCGCAA 4254 CGGTOAACAT 4314 AAAAACCOCT 4374 CTTAGGGAA
ACCGCGCAGG
GTOOTTGCTO
OAAGAAACTG
CGAAGATGAJA
GCCCATTGTG
TTACCOCACA OCGCTATCTG
GTACAOGTOA
-177- ACTACAGGCA ACATCAAACC 4434 ACAAGCCAGC CTGTTCGTAC TCTGGTCTT GACCACGGTG GGTACAGCGC AATGGTCTGT GTGCGTACTC GCCG ATG GGAGAAGCCA GTACCAATGA AAAAACGCCA GCGTCACCAC AGGCCAGTCA CCCAGCCACC CCGACATGGC AACGCTTTGA GTTGATCTGC GGGGAGAAGG TATAAAGCTC CGCAACGTCA GCCCCACTGC CTACCCTACC GACGGCCAAC TGGATTGGGT 25 CCCGATGGAA AGTGGACGCA CCAAGCATCC AGTTCGCTGA 30 CCGAAAAGCG TGCGTCTATA CCCCAATCCA CAGGTATCAC TTCAGTGATA TGCTCGGTTC 35 ACCTGTTGGC CGAATCTAGG AGCCAGCCCG AAAATAGCTT 40 GGCACCACCG ACCTTATCTA GGTAATCAGT TTGATGCCCC ACTTGCCAAC TTCAAGTCGC 45 GTGCCACATA TCGCGCCACA TTGAATGTAA TGAACAATAA CAATTCTGGT TGGATGAAAA CTGCCGTTTC CAATGCATTT CGGCTCACCA GTGAAGTCAA
TGGATAACGC
AGCGCGTACC
ACGCCCGGAT
TCAACAAGTG
CGCCCCGGAA
GTTGATTAc
ACTAGAACTA
CGCACTAGAT
GTTGCCAGGT
GGAAGACGGA
CAATTTGCAG
TGTTACCGCC
CTTTACGCCA
CCTTACCGGG
TGCCAACCAG
CCTGCCTGTC
CGGTCAACAA
GCATGGCCGT
CAATCCCGAA
TGCGCAATCC
GTTGACATTA
CGATATTCAG
TCACTGGCGT
CCGGGGCGCA
ATTACAGCTC
GCTATGGTAT
CTACAGCCAC
ACCTCcCGCcA TCACTrrCATA
ATCTTC-TCTC
CTGATGTTTC
CTGGTTGGAC
ATCCGTCAAT
GCCTGGCAAC
AATTTTAACT
ATGCTGTATC
GACAGCAATG
GATAATGCCT
TCCGGTATTC
ATCAATGCCT
GCAGGCTTAT
CGAAACGGCT
ACAGGGACCG
CATCTGGTGG
TTCGGTCAAC
CGGCTGTTC
GGCTCTTGC
GCGTTGCCAG
GGATTAGGGA
TGTGACCTGT
CATCACACGC
ACCAAAGCAG
ACCGAAAT'rC
GGCGTCTGGG
CCGGAAGAGT
CCGTGCCAAC
GCTATGAATA
ACCGCACCGC
GCTAATACT
TAAGCCATGA
GG'FTGATCT
CGCAGCAACG
AAGATCGAGG
CCGTCACTA
CATTGATGGA
GCGGATACCA
TGCCCGTGGA
CTGATTTAGT
GGCGTAAAGG
ATGCCCGCAA
AAATCAAGGG
CACTAACTCT
TGGCGGATAT
TCATTATCT
GGCTGTTTCA
ATGGGATGCA
TGGT=rGAA
GCTCATGGCC
GGAATATGAC
ATCGGACGGG
GGAGAAAATC
TTATCAACTG
CGCTTGGTGG
CGACAAAATC
TATCAACGGA
TAGTCAGCAA
ATATTTTCAT
GTTGATCGGG
AGAAGATGTC
ACTGGTGGCT
TAATCGCGTC
GTCAGGATTT
CGACGGCTCC
CAACCAAAGT
4494 4554 4614 4674 4634 4794 4854 4865 4914 4974 5034 5094 5214 5274 5334 5394 5454 5514 5574 5634 5694 5754 5814 5874 5934 5994.
6054 AAGGCGTACA ATIGACAAC TAGCCAGCTT GATTCTGACT CACTGACCAA ACCCTGGTTG TACATTATCG TAGTTCCGCG GCAAATCTCC GGCTTGTTAT AGGATGAAAT CAGCGGCAAC ATGGTAAAGA GCGGGAATTC

INFORMATION FOR SEQ ID NO:26:
SEQUENCE CHARACTERISTICS:
LENGTH: 1189 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26 (TcaB protein):

Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp
1               5                   10
Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro Ala Asp Leu Lys
                                25                  30
Glu Thr Gly Thr Tyr Glu Leu 25 Gly 145 Ile 30 Gly Pro Lys Ile 225 Lys Phe Ser Gly Ser.
305 Gin Asp Leu Gin Thr 385 Ser Lys Ser Leu Asn Arg Asn 130 Lys Ser Lys Tyr Leu 210 Ser Leu Val Lys Ala 290 Asp Asn Ser Ser Ser 370 Ile Gin 35 Ile Ser Leu Gin Ala Asp Trp Asp 100 Leu Lys 115 Lys Thr Leu Lys Tyr Asp Asp Asn 180 Ala Phe 195 Lys Pro Glu Ala His Ile Tyr Lys 260 Lys Lys 275 Thr Gly Ala Gin Ala -Gly Gly Asn 340 Ser Lys 355 Tyr Asn Thr Asp Leu Ser Ser Phe Glu Ser Thr 165 Lys Tyr Asp Tyr Arg 245 Asn Ile Ser Met Gly 325 Val Asp Asp Ala Leu Phe 70 Ala Phe Tyr Ile Glu 150 Leu Thr Trp Gin Ser 230 Trp Ile Leu 4 Ser Asn 310 Ala Ile Tyr Asn Asp Val 55 Ile Lys Asn Ala Phe 135 Leu Ala Ile Arg Trp 215 Gly Phe rrp 31u Ser 295 lie rhr Lys Ala ksn 375 Asp 40 Thr His Pro His Gly 120 Thr Val Thr Phe Lys 200 Ser His Thr Val Leu 280 Pro Ser Pro Asn Thr 360 Tyr Leu Thr Arg Tyr Arg 105 Asp Ala Glu Leu Phe 185 Leu Glu Vai Ile Met 265 Ser rhr Asp Ser Leu 345 rhr :ys Tyr Glu Ser Pro Ala Ile 75 PheAl a 90 Tyr Ser Tyr Ile Phe Glu Ser Lys 155 Asp Tyr 170 Ile Gly Thr Leu Trp Arg Glu Pro 235 Ser Lys 250 Ser Ser Phe Thr Glu Val Asp Gly 315 Thr Gly 330 Ser Ser Lys Leu Asn Phe Tyr Leu Glu Asp Thr Asp Gin 140 Leu Ile Arg Val Ala 220 Phe Glu Asp Asp Ala 300 Thr Val rhr Arg rhr 380 Let 45 Ser Gly Glu Trp Pro 125 Gly Arg Thr Thr Thr 205 Ile Trp Asp Tyr Tyr 285 Ser Val Thr Gly Met 365 Leu I Leu Glu Tyr Gin Ala 110 Thr Ile Asp Ala Gin 190 Asp Asn Glu Lys Ser 270 Asn I Gin Leu I Leu C 3 Ser A 350 Cys H Ser I Lei Al Asr Phe Gly Leu Ser Tyr Cys 175 Asn ly Ala ksn Ile 255 Crp ~rg ry :le :ys la [is le a Asp i Ile Gly Leu Lys Arg Gln Leu 160 Glr.
Ala Gly Gly Asn 240 Asp Ala Val Gly Phe 320 Tyr Asn Gly Asn Ile Giu Phe Thr Ser Tyr Gly 390 Thr Phe Ser 395 Ser Asp Gly Lys -179- 0 0 o r Phe Thr Pro Val Asp Leu Tyr Asp Val 435 Ser Gly Pro 450 Phe Leu Val 465 Asp Thr Trp Asn Gly Gin Met Pro Leu 515 Thr Asp Pro 530 Ala Ile Phe 545 Ala Tyr Arg Tyr Ile Gin Thr Asn Thr 595 Ala Thr Pro 610 Trp, Leu Ser 625 Pro Pro Tyr Arg Leu Tyr Asn Leu Pro 675 Gin Gin Ala 690 Gly Ser Val 705 Ser Ala Val Leu Glu His Gin Glu Ala 755 His Leu Pro Asn 415 Leu Asn Ile Asp Asp 495 Asn Ala Lys Asp Met 575 His Ala Ala Phe Glu 655 Pro Gin Gly Ala Thr 735 Thr Asn L.;,Giy Leu Gin His Ser Leu Thr Ala Leu Gin Ala Ser Arg Asp Gly -180- 770 775 780 Asp Thr Leu Arg Gin Lys His Tyr Ser Asp Leu Ile Asn Gly Gly Leu 785 790 795 800 Ser Ala Ala Giu Ile Adla Giy Leu Thr Leu Arg Ser Thr Ala Met Ile 805 810 815 Thr Asn Gly Vai Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn 820 825 830 Ala Val Pro Asn Vai Phe Giy Leu Ala Asn Giy Giy Ser Glu Trp Gly 835 840 845 Ala Pro Leu Ile Giy Ser Gly Gin Aia Thr Gin Val Gly Ala Giy' Ile 850 855 860 Gin Asp Gin Ser Ala Giy Ile Ser Giu Val Thr Ala Giy Tyr Gin Arg 865 870 875 880 Arg Gin Giu Giu Trp Ala Leu Gin Arg Asp Ile Ala Asp Asn Glu Ile 885 890 895 Thr Gin Leu Asp Ala Gin Ile Gin Ser Leu Gin Giu Gin Ile Thr Met 900 905 910 Ala Gin Lys Gin Ile Thr Leu Ser Giu Thr Giu Gin Aia Asn Aia Gin 915 920 925 Ala Ile Tyr Asp Leu Gin Thr Thr Arg Phe Thr Gly Gin Ala Leu Tyr 930 935 940 Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gin Met Tyr Asp 945 950 955 960 Ser Thr Leu Pro Ile Cys Leu Gin Pro Lys Ala Ala Leu Vai Gin Giu 965 970 97S Leu Giy Giu Lys Glu Ser Asp Ser Leu Phe Gin Val Pro Val Trp Asn *:40 980 985 990 Asp Leu Trp Gin Giy Leu Leu Ala Giy Giu Gly Leu Ser Ser Giu Leu *995 1000 1005 Gin Lys Leu Asp Ala Ile Trp Leu Ala Arg Giy Gly Ile Gly Leu Giu .1010 1015 1020 Ala Ile Arg Thr Vai Ser Leu Asp Thr Leu Phe Gly Thr Giy Thr Leu 1025 1030 1035 1040 50 Ser Giu Asn Ile Asn Lys Vai Leu Asn Gly Giu Thr Val Ser Pro Ser 1045 1050 
1055 Gly Gly Val Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu 1060 1065 1070 Asp Leu Ser Gin Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Giu 1075 1080 1085 Lys Lys Arg Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu 1090 1095 1100 Gly Pro Tyr Gin Asp Leu Glu Ala Thr Leu Val Met Giy Ala Giu Ile 1105 1110 1115 1120 Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp *1125 1130 1135 Phe Asn Asp Ser Arg Phe Leu Pro Phe Giu Gly Arg Asp Ala Thr Thr 1140 1145 1150 -181- Gly Thr Leu Glu Leu Asn Ile Phe His Ala Gly Lys Giu Giy Thr Gin 1155 1160 1165 His Giu Leu Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr 1170 1175 1180 Ile Ile Arg Asp Ala 1185 1190 0 INFORMATION FOR SEQ ID NO:27: SEQUENCE CHARACTERISTICS: LENGTH: 1881 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genornic) (ix) FEATURE: 0 NAME/KEY: CDS LOCATION: 1. .1881 OTHER INFORMATION: tcaB 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27 (CcaB 1 coding region) .040 ATG TCT GAA TCT TTA Met Ser Giu Ser Leu 1 5 GCA TTG GTT GCT CAT Ala Leu Val Ala His GAG AGT ATC CAG ACC Giu Ser Ile Gin Thr ACC AAA ATT AGC GAT Thr Lys Ile Ser Asp 50 GGC AGT CTG CAA TTG Giy Ser Leu Gin Leu ACG CTG GCA GAC TCA Thr Leu Ala Asp Ser TAT AAC TGG GAT AGT Tyr Asn Trp Asp Ser 100 GAA CGG TTG AAA TTC Giu Arg Leu Lys Phe 115 TTG AAT AAG ACC GAG Leu Asn Lys Thr Giu 130 GGG AAA TTA AAA AGT Gly Lys Leu Lys Ser 145 TTT ACA CAA ACG TTG Phe Thr Gin Thr Leu AAA GAA GCG CGC CGT GAT Arg
GAT
Asp
TTG
Leu
GAA
Giu
TAT
Tyr
CAG
Gin
OCT
Ala 110
ACA
Thr
ATT
Ile
GAT
Asp
GCC
Asp
AAA
Lys
GAT
Asp
ATT
Ile
GGC
Gly
TTA
Leu
AAG
Lys
CGA
Arg
CAA
Gin
CTA
Leu 160
CAA
48 96 144 192 240 288 336- 384 432 480 528 ATT AGT TAT GAC ACT T1'A GCC ACC Ci-r Ile Ser Tyr Asp Leu Ala Thr Leu Asp Tyr 170 Ile Thr Ala Cys Gin 175 -182- GGC AAA GAT AAT AAA ACC ATC Gly Lys Asp Asn Lys Thr Ile 180 TTC 'rrT ATrGC Phe Phe Ile Gly CGT ACA CAG AAT GCA Arg Thr Gin Asn Ala 190 ccc Pro
AAG
Lys
ATT
Ile 225
AAG
Lys
TTT
Phe 25
TCA
Ser 30 GGA Gly
TCT
35 Ser.
305
CAG
Gin 40
GAC
Asp 45
TTA
Leu
CAA~
Gin ACA2 Thr: 385
TTTJ
Phe GTA C Val I
TATC
Tyr
TATI
Tyr Leu 210
AGT
Ser
CTG
Leu
GTT
Vai
AAG
Lys
GCA
Ala 290
GAT
Asp
A.AT
Asn
TCT
Ser rCG Ser k.GT Ser 370 kTA Ile kCA
;AT
~sp
YAC
*GCA
*Ala 195
AAA
Lys
GAG
Giu
CAC
His
TAT
Tyr
AAA
Lys 275
ACA
Thr
GCT
Ala
GCC
Ala
GGC
Gly
TCA
Ser 355
TAC
Tyr2
GAA
Giu
CCA
Pro
CTC
Leu GTT C Val C 435
TTT
Phe
CCA
Pro
GCA
Ala
ATC
Ile
AAA
Lys 260
AAA
Lys
GGA
Giy
CAG
Gin
GGC
Giy
A.AC
Asn 340
AAG
Lys kAT ksn
L'TC
Phe
:CT
?ro
UAC
~sn L20
~AG
;ln
TAT
Tyr
GAT
*Asp
*TAT
Tyr
CGT
Arg 245
AAC
Asn
ATC
Ile
TCA
Ser
ATG
Met
GGA
Gly 325
GTG
Val
GAT
Asp
GAT
Asp2
ACC
Thr
TCT(
SerC 405 GCG C Ala I
GGGC
Gly C
TGG
Trp
CAA
Gin
TCA
Ser 230
TGG
Trp
ATC
Ile
TTG
Leu
TCA
Ser
AAT
Asn 310
GCT
Ala
ATT
Ile
TAT
Tyr k.AT ksn
['CC
3er 390 3GT fly
:TA
~eu
AG
ln
CGA
4Arg
TGG
*Trp 215
*GGG
Gly
TTT
Phe
TGG
Trp
GAA
Glu
AGC
Ser 295
ATT
Ile
ACT
Thr
AAG
Lys
GCC
Ala
AAC
Asn 375
TAC
Tyr
TCT
Ser
TTAC
Leu TTT C Phe G 4
AAA
Lys 200
TCA
Ser
CAT
His
ACT
Thr
GTG
Val
CTT
Leu 280
CCG
Pro
TCT
Ser
CCC
Pro
AAC
Asn k.CA T'hr 360
['AC
I'yr 3GC fly
'AT
~sp
GC
ly 40 ImT Lei
GAC
Gil
GT(
Val
ATC
Ile
ATG
Met 265
TCT
Ser
ACT
Thr
GAT
Asp
AGT
Ser
CTA
Leu 345
ACT
Thr
TGC
Cys
ACA
Thr
ATT
Ile
ATT
Ile 425
GGA
Gly
ACT
u Thr 3 TGG i Trp
GAG
*Glu
TCG
Ser 250
IAGT
*Ser
TTT
*Phe
*GAA
Glu
GAT
Asp
ACT
Thr 330
TCT
Ser
AAA
Lys
AAT
Asn
TTC
Phe
GAT
Asp 410
AGC
Ser
TCT
Ser
CG;
Arg
CCT
Pro 235
AAA
Lys
AGC
Ser
ACT
Thr
GTA
Val
GGG
Gly 315
GGA
Gly
AGT
S er
TTA
Leu Phe
['CA
Ser 395
['TA
eu
:TC
jeu
AT
*sn GCA ATT AAT GCC GGG Ala Ile 220 TTC TGG Phe Trp GAA GAT Glu Asp GAT TAT Asp Tyr GAC TAC Asp Tyr 285 GCT TCA Ala Ser 300 ACT GTA Thr Val GTG ACG Val Thr ACA GGA Thr Gly CGC ATG Arg Met 365 ACA CTC Thr Leu 380 TCA GAT Ser Asp CAC CTC His Leu GAT TCA Asp Ser CCG. GTTC Pro Val 445 TTC TTC C Phe Phe I Asr G1J Giu
AAA
Lys
AGC
Ser 270
AAT
Asn
CAA
Gin
CTT
Leu
TA
Leu
AGT
Ser 350
TGT
rye
TCT
Ser
GA
Giy Ala
*ATA
Ile 255
TGG
Trp
AGA
Arg
TAT
Tyr
ATT
Ile
TGT
Cys 335
GCA
Ala
CAT
His
ATT
Ile
AAA
Lys Gly AAC 720 Asn 240 GAT 768 Asp GCA 816 Al a GTT 864 Val1 GGT 912 Gly TTT 960 Phe 3 2-G TAT 1008 Tyr AAT 1056 Asn GGA 1104 Gly AAT 1152 Asn CAA 1200 Gin 400 TAT 1248 TTA GTC ACT GAT GGC GGT Leu Val Thr Asp Gly Gly 205 Pro Asn Tyr 415 AAT 1296 Asn TTC 1344 Phe CCG 1392 Pro ACT GCT CCC TAT GGT ATT TAT CTA TG Ser Cly Pro Tyr Cly Ile Tyr Leu Trp GAA IleC -183- 'Tc Phe 465
CAC
Asp
AAT
Asn
ATG
Met
ACT
Thr
C
Ala 25 545
GCT
Ala
TAC
Tyr
ACC
Thr
GCC
40 Ala
TGG
Trp 45 625 450
CTT
Leu
ACT
Thr
GGC
Gly
CCA
Pro
GAT
Asp 530
ATA
Ile
TAC
Tyr
ATT
Ile
AAT
Asn
ACA
Thr 610
CTA
Leu GTT ACG Val Thr TGG TAC Trp Tyr CAG CTC Gin Leu 500 TTG CAA Leu Gin 515 CCA GAT Pro Asp 'rro CTG Phe Leu CGT CAA Arg Gin CAC GCA Gin Aia 580 ACT TGG Thr Trp 595 CCG ACA Pro Thr
AGO
Ser
GTC
Vai
AAA
Lys 485
ATT
Ile
CTG
Leu
GTG
Vai
CAT
Hi s
CTT
Leu 565
CAA
Gin
OCA
Pro
CT
Arg 470
TAT
Tyr
ATG
Met
GAT
Asp
ATC
Ile
ACC
Thr 550
GAA
Giu
CAC
Gin
AAT
Asn 455 460 ATG CAA ACC GAA CAA CT Met Gin Thr Giu Gin Arg 475 ATT TTO CCC AGO GCC GOT Ile Phe Arg Ser Ala Gly 490 CAT GCC AGT AAA OCA CGT Asp Gly Ser Lys Pro Arg 505 ACC GCA TGG GAT ACC ACA Thr Aia Trp Asp Thr Thr 520 GOT ATG CG GAO CCG ATG Ala Met Ala Asp Pro Met 535 540 OTT GAT OTA TTC ATT GCO Leu Asp Leu Leu Ile Aia 555 OGC GAT ACT CTA GTO GAA Arg Asp Thr Leu Vai Giu 570 OTA OTG GGA CCG CGC COT Leu Leu Giy Pro Arg Pro 585 COO ACC TTC AGT AAA GAA Pro Thr Leu Ser Lys Giu 600 TAO CAA GAO CG 1440 Tyr Giu Asp Ala 480 TAT CCC CAT GOT 1488 Tyr Arg Asp Ala 495 TAT TGG AAT GTG 1536 Tyr Trp Asn Val 510 CAG COO CCC ACC 1584 Gin Pro Ala Thr 525 CAT TAO AAG OTC 1632 His Tyr Lys Leu OGA CCC GAO AGO 1680 Arg Gly Asp Ser 560 CO AAA ATG TAO 1728 Ala Lys Met Tyr 575 CAT ATO CAT ACC 1776 Asp Ile His Thr 590 CT CCC GOT ATT 1824 Ala Cly Ala Ile 605 TTO OTO ACT TOA CC Phe Leu Ser Ser Pro GAG GTC ATC ACG Giu Val Met Thr 620 TTO GOT GCC 1872 Phe Ala Ala 1881 INFORMATION FOR SEQ ID NO:28: Wi SEQUENCE CHARACTERISTICS: LENGTH: 627 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECUL~E TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28 (TcaBi protein) Met Ser Giu Ala Leu Val Clu Ser Ile 35 Thr Lys Ile Ser Leu Phe Thr Gin Thr Leu Lys 10 Ala His Tyr Ile Ala Thr Gin Val 25 Gin Thr Ala Asp Asp Leu Tyr Giu 40 Ser Asp Leu Val Thr Thr Ser Pro 55 Ciu Ala Arg Arg Asp is Pro Ala Asp Leu Lys Tyr LeU Leu Leu Asp Leu Ser Giu Ala Ile -184- Gly Ser Leu Gin Leu Phe Ile His Arg AlaIle Glu Gly Tyr Asp Gly 65 70 75 Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Giu Gin Phe Leu 585 90 Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys 100 105 110 Giu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp Pro Thr Leu Arg 115 120 125 Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Giu Gin Gly Ile Ser Gin 135 140 Giy Lys Leu Lys Ser Giu Leu Val Giu Ser Lys Leu Arg Asp Tyr Leu 145 150 155 160 Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile Thr Ala Cys Gin 165 170 175 Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile 
Gly Arg Thr Gin Asn Ala -180 185 190 Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Vai Thr Asp Gly Giy *195 200 205 9Lys Leu Lys Pro Asp Gin Trp, Ser Giu Trp Arg Ala Ile Asn Ala Gly 210 215 220 I 0 le Ser Giu Ala Tyr Ser Giy His Val Giu Pro Phe Trp Giu Asa Asn 225 230 235 240 *Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu Asp Lys Ile Asp 245 250 255 9Phe Val Tyr Lys Asn Ile Trp Vai Met Ser Ser Asp Tyr Ser Trp Ala 260 265 270 .:40 Ser Lys Lys Lys Ile Leu Giu Leu Ser Phe Thr Asp Tyr Asn Arg Vai .275 280 285 *Gly-hla Thr Gly Ser Ser Ser Pro Thr Giu Val Ala Ser Gin Tyr Gly 290 295 300 45 *Ser Asp Ala Gin Met Asa Ile Ser Asp Asp Gly Thr Val Leu Ile Phe 305 310 315 320 Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr 325 330 335 Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr Gly Ser Ala Asa 340 345 350 Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly 355 360 365 Gin Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser Ile Asn 370 375 380 Th Ile Giu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gin 385 390 395 400 Phe Thr Pro Pro Ser Giy Ser Ala Ile Asp Leu His Leu Pro Asn Tyr 405 410 415 Val Asp Leu Asn Ala Leu Leu Asp Ile Ser Leu Asp Ser Leu Leu Asn 420 425 430 Tyr Asp Val Gin Gly Gin Phe Gly Giy Ser Asn Pro Val Asp Asn Phe 435 440 445 -185- Ser Phe- 465 Asp Asn Met Thr Ala 545 Al a 25 Tyr Thr 30 Ala Trp 625 Ile Asp Asp 495 Asn Ala Lys Asp Met 575 His Al a Ala *0 o .0 0 .0 0.00 0 0: INFORMATION FOR SEQ ID NO:29: SEQUENCE CHARACTERISTICS: LENGTH: 1689 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: NAME/KEY: CDS LOCATION: l. 
.1689 OTHER INFORMATION: tcaBi 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29 (tcaB 1 1 coding GCA GGC GAT ACC GCA AAT ATT OGC GAC GGT GAT TTC TTG ccA CCG TAC Ala Oly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu Pro Pro Tyr 1 5 10 AAC OAT GTA CTA CTC GGT TAC TOG OAT AAA CTT GAG TTA COC CTA TAC Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu Arg Leu Tyr 20 25 AAC CTO COC CAC AAT CTG AGT CTG OAT GOT CAA CCO CTA AAT CTG CCA Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gin Pro Leu Asn Leu Pro 40 CTO TAT 0CC ACO CCG OTA GAC cco AAA ACC CTO CAA cGc CAG CAA CC Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gin Arg Gin Gin Ala -186regaion) 48 96 144 192 000 GAC GOT ACA GGC AGT AGT CCG OCT Gly Asp Gly Thr Gly Ser Ser Pro Ala a a.
a CAG GGC TOG CGC TAT Gin Gly Trp Arg Tyr AGT TTG TTO ACT CAG Ser Leu Leu- Thr Gin 100 CAG OAT AAT OAA AAA Gin Asp Asn Giu Lys 115 ATC CTG AAA CAT CAG Ile Leu Lys His Gin 130 CAA CAC AOC CTG ACC Gin His Ser Leu Thr 145 25 COO CAA AAA CAT TAC Arg Gin Lys His Tyr 165 30 GAA ATC 0CC GOT CTG Oiu Ile Ala Gly Leu 180 OTT-GCA ACO OGA TTG 35 Val Aia Thr Oiy Leu 195 AAC GTC TTC 000 CTG Asn Val Phe Oly Leu 40 210 ATT GOC TCC 000 CAA Ile Oly Ser Oly Gin 225 45 AGC OCO GOC ATT TCA Ser Aia Giy Ile Ser 245 GAA TOO GCA TTO CAA Giu Trp Aia Leu Gin 260 OAT 0CC CAG ATA CAA Asp Ala Gin Ile Gin 275 CAG ATC ACG CTC TCT Gin Ile Thr Leu Ser 290 GAC CTO CAA ACC ACT Asp Leu Gin Thr Thr 305 0CC OOT COT CTC TCC Ala Oly Arg Leu Ser 325 CCA ATC TOT CTC CAG Pro Ile Cys Leu Gin
CCO
Pro
TTC
Phe
ATG
Met
CAC
His
OCA
Ala 150
AOC
Ser
ACA
Thr
CTO
Leu
OCT
Ala
OCA
Ala 230
GAA
Giu
CG
Arg
AGC
Ser
GAA
Oiu
COT
Arg 310
OCO
Al a TTA *rrO Leu Leu GOC AAC Oly Asn ACO ATA Thr Ile 120 OAT ATA Asp Ile 135 TTA CAG Leu Gin GAC CTO Asp Leu CTA COC Leu Arg ATT 0CC Ile Ala 200 AAC GOT Asn Gly 215 ACC CAA Thr Gin OTO ACA Val Thr OAT ATT Asp Ile CTO CAA Leu Gin 280 ACC GAA Thr Oiu 295 TTT ACC Phe Thr CTC TAT Leu Tyr OTA OAA Val Olu 90 AGC TTA Ser Leu 105 CTO TTG Leu Leu CAA CAA Gin Gin OCT AGO Ala Ser ATT AAC Ile Asn 170 AGC ACC Ser Thr 185 0CC OGA Gly Gly OGA TCO Oly Ser OTT GOC Val Gly OCA. 000 Ala Gly 250 OCT OAT Ala Asp 265 GAO CAA Giu Gin CAA OCO Gin Ala 000 CAG Gly Oln TAC CAA Tyr Gin 330
GT
Gly
COC
Arg
CAA
Gin
CAG
Gin
AAT
Asn
COT
Arg 155
GOT
Oly 0CC Ala
ATC
Ile
OAA
Oiu 0CC Ala 235
TAT
Tyr
AAC
Asn
ATC
Ile
AAT
Asn
OCA
Ala 315
ATG
Met 0CC COC TCT 0CC Ala Arg Ser Ala ACA ACG TTA OAA Thr Thr Leu Giu 110 ACT CAA CAG OAA Thr Gin Gin Giu 125 AAT CTA AAA GGA Asn Leu Lys Oly 140 OAT OOC GAC ACA Asp Oly Asp Thr OOT CTA TCT OCO Gly Leu Ser Ala 175 ATO ATT ACC AAT Met Ile Thr Asn 190 0CC AAC OCO OTA Ala Asn Ala Val 205 TOG GOA OCO CCA Trp Oly Ala Pro 220 GOC ATC CAG OAT Gly Ile Gin Asp CAG COT COT CAG Gin Arg Arg Gin 255 OAA ATA A~jC CAA Giu Ile Thr Ginr 270 ACO ATO OCA CAA Thr Met Ala Gin 285 0CC CAA OCO ATT Ala Gin Ala Ile 300 CTG TAT AAC TGO Leu Tyr Asn Trp TAT OAT TCC ACT Tyr Asp Ser Thr 335 OTO 288 Val CAT 336 His 0CC 384 Ala TTA 432 Leu TTG 480 Leu 160 GC' 528 Al a GOC 576 Gly CCT 624 Pro TTA 672 Leu CAO 720 Gin 240 GAA1 768 Giu CTG 816 Leu A.AA 864 Lys TAT 912 Tyr ATG 960 M1et 320 CTG 1008 Leu GOT CAA GOC AGT OTT Gly Gin Oly Ser Vai CCA AAA 0CC OCA TTA OTA CAG GAA TTA GOC GAG 1056 Pro Lys Ala Ala Leu Val Gin 0Th Leu Oly Glu -187- 340 345 AAA GAG AGC GAC AGT CTT TTC CAG GTT CCG p p..
p p. p p p p Lys Giu Ser 355 CAA GGG CTG Gin Giy Leu 370 GAT GCC ATC Asp Ala Ile 385 ACC GTG TCG Thr Val Ser ATC AAT AAA Ile Asn Lys ACT CTG GCG Thr Leu Ala 435 CAG CTA GGT Gin Leu Gly 450 30 CGT ATT AAA Arg Ile Lys 465 CAA GAT CTT Gin Asp Leu TCA CAC GGT Ser His Gly AGCCGT TTT Ser Arg Phe 45 515 GAG CTC AAT Glu Leu Asn 530 GTC GCG AAT Val Aia Asn 545 GAC GCG TAA Asp Ala Asp Ser TTA GCA Leu Ala TGG CTT Trp Leu CTG OAT Leu Asp 405 GTG CTT Vai Leu 420 CTG ACA Leu Thr TTG GAT Leu Asp CGT ATC Arg Ile GAA GCC Glu Ala 485 GTG AAT Vai Asn 500 CTG CCT Leu Pro ATT TTC Ile Phe Leu
GGA
Gly
GCA
Ala 390
ACC
Thr
AAC
Asn
GGG
Gly
AAC
Asn
GCC
Ala 470
ACA
Thr
GAC
Asp
TTT
Phe
CAT
His Phe
GAA
Glu 375
CGT
Arg
CTG
Leu
GGG
Gly
GAT
Asp
TCT
Ser 455
GTC
Val
CTG
Leu
GGA
Gly
GAA
Glu
GCG
Ala 535 Gin 360
GGT
Gly
GGT
Gly
TTT
Phe
GAA
Glu
ATC
Ile 440
TAC
Tyr
ACC
Thr
GTA
Val
GGC
Gly
GGT
Gly 520
GGT
Gly Val
TTA
Leu
GGT
Gly
GGC
Gly
ACG
Thr 425
TTC
Phe
AAC
Asn
CTG
Leu
ATG
Met
CGG
Arg 505
CGA
Arg
AAA
Lys
GTG
Val
TCA
Ser
GGG
Gly 395
GGG
Gly
TCT
Ser
GCA
Ala
GGT
Gly
ACA
Thr 475
GCO
Ala
GTT
Val
GCA
Ala
GGA
Gly
CTG
Leu 555 Trp Asn 365 GAG CTA Glu Leu 380 CTA OAA Leu Glu ACG TTA Thr Leu CCA TCC Pro Ser ACA CTG Thr Leu 445 AAC GAG Asn Glu 460 CTT CTG Leu Leu GAA ATC Glu Ile ACC GAC Thr Asp ACA ACC Thr Thr 525 ACG CAA Thr Gin 540 Asp
CAG
Gln
GCC
Ala
AGT
Ser
GGT
Gly 430
GAT
Asp
AAG
Lys
GGG
Gly
GCC
Ala
TTT
Phe 510
GGC
Gly
CAC
His Leu
AAA
Lys
ATC
Ile
GAA
Glu 415
GGC
Gly
TTG
Leu
AAA
Lys
CCA
Pro
GCC
Ala 495
AAC
Asn
ACA
Thr
GAG
Glu Trp CTG 1152 Leu CGC 1200 Arg 400 AAT 1248 Asn GTC 1296 Val AGT 1344 Ser CGT 1392 Arg TAT 1440 Tyr 480 TTA 1488 Leu GAC 1536 Asp CTG 1584 Leu TTG 1632 Leu CGA 1680 Arg
560
TGG AAT GAT CTG TGG 1104 CTG AGT GAC ATC Leu Ser Asp Ile 550 ATT GTG CAT Ile Val His AAT TAC ATC ATT Asn Tyr Ile Ile 1689
INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 562 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30 (TcaBii protein)
Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu Pro Pro Tyr
1 5 10
Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu Arg Leu Tyr
25 30
Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu Asn Leu Pro
35 40
Leu Tyr Ala Thr Pro Val
S 9Z Gly Gin Ser Gin 25 Ile 30 Gin 145 Arg Glu Val Asn 45 le 225 Ser Glu Asp Gin Asp 305 Ala Pro Lys( Gl) Leu Asp Leu 130 His Gin Ile Ala Val 210 Gly Ala Trp Ala Ile 290 Leu Gly Ile .lu Trp Leu Asn 115 Lys Ser Lys Ala Thr 195 Phe Ser Gly Ala Gin 275 Thr Gin Arg ys I Ser I 355 Arg Thr 100 Glu His Leu His Gly 180 Gly Gly Gly Ile Leu 260 Ile Leu rhr Leu eu 340 ksp Tyr 85 Gin Lys Gin Thr Tyr 165 Leu Leu Leu Gin Ser 245 Gin Gin Ser rhr Ser 325 Gln Ser 70 Pro Phe Met His Ala 150 Ser Thr Leu Ala Ala 230 Glu Arg Ser i Glu Arg I 310 Ala I Pro I Leu P Gly Asp Gly Thr Gly As 5 Se Lei G1 Th Asr 135 Leu Asp Leu Ile Asn 215 Thr Val Asp Leu hr 295 ?he ~eu ~ys 'he
P
5 Pro Lys Thr Leu Gin Arg Gin Gin Ala r Ser Pro Leu Val r Asn Ser 105 Ile Leu 120 Ile Gin i Gin Ala Leu Ile Arg Ser 185 Ala Gly 200 Gly Gly Gin Val Thr Ala 4 Ile Ala 265 Gin Glu 280 Glu Gin I Thr Gly C Tyr Tyr C 3 Ala Ala L 345 Gin Val P 360 Gly Ser Leu Leu Gin Thr Glr Gin Ser Asn 170 Thr Gly Ser Gly 3 iy 250 Asp 31n la ln ;In 30 leu Asn Arg 155 Gly Ala Ile Glu Ala 235 Tyr Asn Ile, Asn Ala 315 Asn 140 Asp Gly Met Ala Trp 220 Gly Gin Glu Thr Ala 300 Leu 125 Leu Gly Leu Ile Asn 205 Gly Ile Arg Ile Met 285 Gil Tyr 110 1 Gin Lys Asp Ser Thr 190 Ala Ala Gin Arg Thr 270 Ala C Ala Asn I Gi Gl Thr Ala 175 Asn Val Pro Asp Mln 255 31n ;in :le Irp 1 Ala Leu Leu 160 Ala Gly Pro Leu Gin 240 Glu Leu Lys Tyr Met 320 Met Tyr Asp Ser Thr Leu 335 Val Gin Giu Leu Gly Glu 350 'ro Vai Trp Asn Asp Leu Trp 365 -189i Gin Gly 370 Asp Ala 385 Thr Val Ile Asn Thr Leu Gin Leu 450 Arg Ile 465 Gin Asp Ser His Ser Arg 30 Giu Leu 530 35 Val Ala 545 Asp Ala Leu Ser Gly Ile Gly Thr 410 Thr Vai 425 Phe Gin Asn Leu Leu Pro Met Gly 490 Arg Phe 505 Arg Asp Lys Giu Ser Gly 395 Giy Ser Ala Gly Thr 475 Al a Val1 Al a Gly Leu 555 Leu Arg 400 Asn Val Ser Arg Tyr 480 Leu Asp Leu Leu Arg 560 0@ 0e S 0@9 S 9* 4 @4 9 5S@9
Asn Leu Ser Asp 550 Ile Ile Val His Asn Tyr Ile Ile
INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 4458 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..4458
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31 (tcaC gene):
ATG CAG GAT TCA CCA GAA GTA TCG ATT ACA ACG CTG TCA CTT CCC AAA
Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Leu Ser Leu Pro Lys
1 5 10
GGT GGC GGT GCT ATC AAT GGC ATG GGA GAA GCA CTG AAT GCT GCC GGC
Gly Gly Gly Ala Ile Asn Gly Met Gly Glu Ala Leu Asn Ala Ala Gly
25
CCT GAT GGA ATG GCC TCC CTA TCT CTG CCA TTA CCC CTT TCG ACC GGC
Pro Asp Gly Met Ala Ser Leu Ser Leu Pro Leu Pro Leu Ser Thr Gly
40
AGA GGG ACG GCT CCT GGA TTA TCG CTG ATT TAC AGC AAC AGT GCA GGT
Arg Gly Thr Ala Pro Gly Leu Ser Leu Ile Tyr Ser Asn Ser Ala Gly
AAT
Asn 65
CGA
Arg
CTA
Leu
CAA
Gln
CCA
Pro
TTC
Phe 145
GCT
Ala 30
AAA
Lys GCC Ala
AGC
Ser
AAA
Lys 45 225
AAC
Asn
GCA
Ala
GGT
Gly
ACA
Thr2
GGT
Gly I 305
CAC
His I so
GGG
Gly
CGC
Arg
TCC
Ser
CCT
Pro
ATT
Ile 130
AGT
Ser
TTC
Phe
ACC
Thr
CAG
Gln
TAT
Tyr 210
ACC
Thr
TAC
Tyr
CCT
Pro
GAG
Glu GCG Ala 290
TTT
Phe
CGC
Arg CCT TTC Pro Phe ACC CAA Thr Gln CCA CAA Pro Gln 100 GAT ATC Asp Ile 115 TCC TAT Ser Tyr AAA ATC Lys Ile TGG CTG Trp Leu GCG CAG Ala Gln 180 TGG TTG Trp Leu 195 CAA TAT Gln Tyr GCT CAT Ala His GGC AAC Gly Asn CCC GCA Pro Ala 260 CGC GAT Arg Asp 275 CAA TGG Gln Trp GAA GTG Glu Val ACC GCG Thr Ala GGC ATC Gly Ile 70 CAT GGC His Gly GGC GAG Gly Glu CGT CAA Arg Gln ACC GTG Thr Val GAA TAC Glu Tyr 150 ATA TCG Ile Ser 165 GCT TGT Ala Cys CTG GAA Leu Glu CGA GCC Arg Ala CCC AAT Pro Asn 230 ATC AAA Ile Lys 245 CCG GAA Pro Glu ACC TCA Thr Ser TCT GTA Ser Val CGT ACT Arg Thr 310 CTC ATG Leu Met 325 55
GGC
Gly
ATT
Ile
GTC
Val
GAC
Asp
ACC
Thr 135
TGG
Trp
ACA
Thr
CTG
Leu
GAA
Glu
GAA
Glu 215 GTT 2 Val
CCA
Pro
GAG
Glu
CTTC
Leu I CGC C Arg I 295 CGC C Arg GCC C Ula C
CCA
Pro
ATG
Met
GTT
Vai 120
CGC
Arg
CAA
Gin
CCG
Pro
GCA
Ala
ACT
Thr 200
GAT
Asp kCC Thr
:AA(
Unn rGG Vrp1
:AT
Iis !80
CG
?ro 2
GC
~rg I ;GA C ny c
CAA
Gln
AAT
Asn 105
AAA
Lys
TAT
Tyr
CCT
Pro
GAC
Asp
AAT
Asn 185
GTG
Val
GAA
Glu
GCA
Ala
GCC
Ala
CTG
Leu 265 ACC Thr GAT Asp TTA Leu
GAA
lu TGG CA; Trp Gir TGC GGj Cys GI) TAC GGI Tyr Gi) 90 ATC GCC ile Ala ACG CT Thr Leu CAA GCC Gin Ala GCC TCC Ala Ser 155 GGG CAT Gly His 170 CCG CAA Pro Gin ACG CCA Thr Pro GCC CAT Ala His CAG CGC Gin Arg 235 AGC CTG Ser Leu 250 TTT CAT Phe His GTG CCA Val Pro ATC TTC Ile Phe TGT CAA Cys Gin 315 GCC AGT Ala Ser 330 7 AAT GAC r Asn Asp CTG AAT Leu Asn CAA GGC Gin Gly 125 CGC CAG Arg Gin 140 GGT CAA Gly Gin CTA CAC Leu His AAT GAC *Asn Asp GCC GGT Ala Gly 205 TGT GAC Cys Asp 220 TAT CTG Tyr Leu TTC GTA Phe Vai CTG GTC Leu Val ACA TGG 4 Thr Trp 285 TCT CGC Ser Arg 300 CAA GTG Gin Val I ACC AAT Thr Asn 2
GA
As
GA
Asi 11(
GT.
Va
AT(
Ile
GA)
Glu
ATC
Ile
CAA
Gln 190
GAA
Glu
GAC
Asp
GTA
Val
CTG
Leu
TTT
Phe 270
GAT
Asp
TAT
Tyr
CTG
Leu
'AC
ksp C ACC p Th2
CA)
p Glr
ACC
i Thr
CTG
Leu
GGA
Gly
TTA
Leu 175
CAA
Gln
CAT
His
AAT
Asn
CAG
Gln
GAT
Asp 255
GAC
Asp
GCA
Ala
GAA
Glu
ATG
Met
GCC
Ala 335 3 TTC 288 Phe GGG -33-6 Gly TTG 384 Leu GAT 432 Asp CGC 480 Arg 160 GGG 528 Gly ATC 576 Ile GTC 624 Val GAA 672 Glu GTG 720 Val 240 AAC 768 Asn CAC 816 His GGT 864 Gly TAT 912 Tyr TTT 960 Phe 320 CCG 1008 Pro GTT ATG TCC ATT AGC Vai Met Ser Ile Ser GAA CTG GTT GGA CGC TTA ATA CTG GAA TAT GAC AAA AAC GCC AGC GTC 1056 -191- Glu Leu Val Gly Arg Leu Ile Leu Giu Tyr Asp Lys Asn Ala Ser Val 340 345 350 ACC ACG TTG ATT ACC ATC CGT CAA TTA AGC CAT GAA TCG GAC GGO AGO 1104 Thr Thr Leu Ile Thr Ile Arg Gin Leu Ser His Giu Ser Asp Gly Arg 355 360 365 CCA GTC ACC CAG CCA CCA CTA GAA CTA GCC TGG CAA CGG TTT GAT CTG 1152 Pro Val Thr Gin Pro Pro Leu Glu Leu Ala Trp Gin Arg Phe Asp Leu 370 375 380 GAG AAA ATC CCG ACA TGO CAA CGC TTT GAC GCA CTA GAT AAT TTT AAC 1200 Glu Lys Ile Pro Thr Trp Gin Arg Phe Asp Ala Leu Asp Asn Phe Asn 385 390 395 400 TCG CAG CAA COT TAT CAA CTO GTT OAT CTO CGG GGA GAA GGG TTO CCA 1248 Ser Gin Gin Arg Tyr Gin Leu Vai Asp Leu Arg Giy Glu Oly Leu Pro 405 410 415 GOT ATO CTO TAT CAA OAT COA GOC OCT TGG TGG TAT AAA GCT CCG CAA 1296 Oly Met Leu Tyr Gin Asp Arg Oly Ala Trp Trp Tyr Lys Ala Pro Gin 420 425 430 CGT CAG GAA GAC OGA GAC AOC AAT 0CC GTC ACT TAC GAC AAA ATC 0CC 1344 Arg Gin Olu Asp Oly Asp Ser Asn Ala Val Thr Tyr Asp Lys Ile Ala 435 440 445 CCA CTO CCT ACC CTA CCC AAT TTO CAG OAT AAT 0CC TCA TTG ATG GAT 1392 Pro Leu Pro Thr Leu Pro Asn Leu Gin Asp Asn Ala Ser Leu Met Asp 450 455 460 ATC AAC OGA GAC GOC CAA CTG GAT TOG OTT GTT ACC GCC TCC GOT ATT 1440 Ile Asn Oly Asp Gly Gin Leu Asp Trp Val Val Thr Ala Ser Gly Ile 465 470 475 480 CGC OGA TAC CAT AOT CAG CAA CCC OAT OGA AAO TGG ACG CAC TTT ACG 1488 Arg Gly Tyr His Ser Gin Gin Pro Asp Oly Lys Trp Thr His Phe Thr 485 490 495 CCA ATC AAT 0CC TTG CCC GTG GAA TAT TTT CAT CCA AOC ATC CAG TTC 1536 Pro Ile Asn Ala Leu Pro Val Giu Tyr Phe His Pro Ser Ile Gin Phe 500 505 510 OCT GAC CTT ACC GGG GCA GOC TTA TCT OAT TTA OTG TTO ATC GGG CCG 1584 Ala Asp Leu Thr Gly Ala Oly Leu Ser Asp Leu Val Leu Ile Oly Pro 
515 520 525 AAA AGC GTG COT CTA TAT 0CC AAC CAG COA AAC GOC TGG COT AAA OGA 1632 Lys Ser Val Arg Leu Tyr Ala As Gin Arg Asn Gly Trp Arg Lys Oly 530 535 540 OAA OAT GTC CCC CAA TCC ACA GOT ATC ACC CTO CCT OTC ACA GGG ACC 1680 Olu Asp Val Pro Gin Ser Thr Gly Ile Thr Leu Pro Val Thr Oly Thr 545 550 555 560 OAT 0CC COC AAA CTG GTG GCT TTC AOT OAT ATO CTC GOT TCC GOT CAA 1728 Asp Ala Arg Lys Leu Vai Ala Phe Ser Asp Met Leu Oly Ser Oly Gin 565 570 575 CAA CAT CTG GTG GAA ATC AAO GOT AAT COC GTC ACC TOT TOG CCO AAT 1776 Gin His Leu Val Giu Ile Lys Gly Asn Arg Val Thr Cys Trp Pro Asn 580 585 590 CTA GGG CAT GOC COT TTC GOT CAA CCA CTA ACT CTG TCA OGA TTT AGC 1824 Leu Oly His Gly Arg Phe Oly Gin Pro Leu Thr Leu Ser Gly Phe Ser 595 600 605 CAG CCC OAA AAT AGC TTC AAT CCC OAA CGG CTG TIT CTG GCG OAT ATC 1872 Gin Pro Glu Asn Ser Phe Asn Pro Giu Arg Leu Phe Leu Ala Asp Ile 610 615 620 -192- GAC GGC TCC GGC ACC ACC GAC CTT ATC TAT.GCG CAA TCC GGC TCT TTG 1920 Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gin Ser Gly Ser Leu 625 630 635 640
C. CTC AT1' TAT Leu Ile Tyr TTA GCG TTG Leu Ala Leu GTC GCC GAT Val Ala Asp 675 CCA CAT ATC Pro His Ile 690 CCC TGG TTG Pro Trp iLeu 705 25 CTA CAT TAT Leu His Tyr CTC ACC AAA 30 Leu Thr Lys CAT TTG CTA His Leu Leu 755 CTC ACC AGT Leu Thr Ser 770 40 CGG GAA TTC Arg Glu Phe 785 =T TCT CAC Phe Ser His AGC TGG TTT Ser Trp Phe GAA TAT TGG Glu Tyr Trp, 835 TAT ACC GTC Tyr Thr Vai 850 AAT GAG ACA Asn Giu Thr 865 CTA CGC ACT Leu Arg Thr CCT TAT ACC Pro Tyr Thr
AAT
Asn
T
Phe 665
ATA
Ile
CGT
Arg
AAT
Asn
TTC
Phe
GCT
Ala 745
CAG
Gln
CAC
His
ATC
Ile
CAG
Gln
GAA
Glu 825
GCT
Ala
CAG
Gln
ACG
Thr
GAC
Asp
TAT
Tyr 905 CAG TTT Gin Phe 650 GAC AAC Asp Asn GCC AGC Ala Ser TGT GAC Cys Asp AAC CGG Asn Arg 715 TGG TTG Trp Leu 730 TGT TAT Cys Tyr GAT GAA Asp Giu GGC GTC Giy Vai AAA CAG Lys Gin 795 GCG GCA Ala Ala 810 GTA GAC Val Asp TAT AGC Tyr Ser ACA GAC Thr Asp CGA GCG Arg Ala 875 GGA ACA Gly Thr 890 CAG GTA Gin Val GAT GCC CCG TTG Asp Ala Pro Leu 655 ACT TGC CAA CTI' Thr Cys Gin Leu 670 TTG ATT CTG ACT Leu Ile Leu Thr 685 CTG TCA CTG ACC Leu Ser Leu Thr 700 GGC GCA CAT CAC Gly Ala His His GAT GAA AAA TTA Asp Giu Lys Leu 735 CTG CCG TTT CCA Leu Pro Phe Pro 750 ATC AGC GGC AAC Ile Ser Gly Asn 765 TGG GAT GGT AAA Trp Asp Gly Lys 780 ACA GAT ACC ACA Thr Asp Thr Thr CCG TCG CTG AGT Pro Ser Leu Ser 815 AGC CAA TTA GCT Ser Gin Leu Ala 830 GGA TTT GAA ACC Gly Phe Giu Thr 845 CAA GCA TTT ACC Gin Ala Phe Thr 860 CTT AAA GGC CAA Leu Lys Gly Gin GAT MAG CMA ACA Asp Lys Gin Thr 895 CGC TCT ATT CCC ACA 1968 Thr CMA 2016 Gin GTO 2064 Val AAA 2112 Lys ACG 2160 Thr 720 CAG 2208 Gin ATG 2256 Met CGG 2304 Arg GAG 2352 Giu ACG 2400 Thr 800 ATT 2448 Ile ACG 2496 Thr CGT 254-4- Arg CCC 2592 Pro CTG 2640 Leu 880 GTG 2688 Val1 GTA 2736 AGT GMA TCG CGC Ser Glu Ser Arg Arg Ser Ile Pro Vai 910 -193- AAT AAA GAA ACT GAA TTA TCT GCC TGG GTG ACT GCT ATT GAA AAT CGC 2784 Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala Ile Glu Asn Arg 915 920 925 AGC TAC CAC TAT GAA CGT ATC ATC ACT GAC CCA CAG TTC AGC CAG AGT 2832 Ser Tyr His Tyr Glu Arg Ile Ile Thr Asp Pro Gin Phe Ser Gin Ser 930 935 940 ATC AAG TTG CAA CAC GAT ATC TTT GGT CAA TCA CTG CAA AGT GTC GAT 2880 Ile Lys Leu Gin His Asp Ile Phe Gly Gin Ser Leu Gin Ser Val Asp 945 950 955 960 ATT GCC TGG CCG CGC CGC GAA AAA CCA GCA GTG AAT CCC TAC CCG CCT 2928 Ile Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro 965 970 975 ACC CTG CCG GAA ACG CTA TTT GAC AGC AGC TAT GAT GAT CAA CAA CAA 2976 Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gin Gin Gin 980 985 990 CTA TTA CGT CTG GTG AGA CAA AAA AAT AGC TGG CAT CAC CTG ACT GAT 3024 Leu Leu Arg Leu 
Val Arg Gin Lys Asn Ser Trp His His Leu Thr Asp 995 1000 1005 25 GGG GAA AAC TGG CGA TTA GGT TTA CCG AAT GCA CAA CGC CGT GAT GTT 3072 Gly Glu Asn Trp Arg Leu Gly Leu Pro Asn Ala Gin Arg Arg Asp Val ***1010 1015 1020 30 TAT ACT TAT GAC CGG AGC AAA ATT CCA ACC GAA GGG ATT TCC CTT GAA 3120 Tyr Thr Tyr Asp Arg Ser Lys Ile Pro Thr Glu Gly Ile Ser Leu Glu 1025 1030 1035 1040 ATC TTG CTG AAA GAT GAT GGC CTG CTA GCA GAT GAA AAA GCG GCC GTT 3168 Ile Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Val 1045 1050 1055 TAT CTG GGA CAA CAA CAG ACG TTT TAC ACC GCC GGT CAA GCG GAA GTC 3216 Tyr Leu Gly Gin Gin Gin Thr Phe Tyr Thr Ala Gly Gin Ala Glu Val 1060 1065 1070 ACT CTA GAA AAA CCC ACG TTA CAA GCA CTG GTC GCG TTC CAA GAA ACC 3264 Thr Leu Glu Lys Pro Thr Leu Gin Ala Leu Val Ala Phe Gin Glu Thr 451075 1080 1085 GCC ATG ATG GAC GAT ACC TCA TTA CAG GCG TAT GAA GGC GTG ATT GAA 3312 Ala Met Met Asp Asp Thr Ser Leu Gin Ala Tyr Glu Gly Val Ile Glu 1090 1095 1100 GAG CAA GAG TTG AAT ACC GCG CTG ACA CAG GCC GGT TAT CAG CAA GTC 3360 Glu Gin Glu Leu Asn Thr Ala Leu Thr Gin Ala Gly Tyr Gin Gin Val 1105 1110 1115 1120 GCG CGG TTG TTT AAT ACC AGA TCA GAA AGC CCG GTA TGG GCG GCA CGG 3408 Ala Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Val Trp Ala Ala Arg 1125 1130 1135 CAA GGT TAT-ACC GAT TAC GGT GAC GCC GCA CAG TTC TGG CGG CCT CAG 3456 Gin Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gin Phe Trp Arg Pro Gin 1140 1145 1150 GCT CAG CGT AAC TCG TTG CTG ACA GGG AAA ACC ACA CTG ACC TGG GAT 3504 Ala Gin Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp 1155 1160 1165 ACC CAT CAT TGT GTA ATA ATA CAG ACT CAA GAT GCC GCT GGA TTA ACG 3552 Thr His His Cys Val Ile Ile Gin Thr Gin Asp Ala Ala Gly Leu Thr 1170 1175 1180 ACG CAA GCC CAT TAC GAT TAT CGT TTC CTT ACA CCG GTA CAA CTG ACA 3600 Thr Gin Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gin Leu Thr -194- 1185 1190 AAT GAT AAT CAA CAT An A R- G1 4 1195 1200 GAT ATT sn Te ATT GTG ACT CTG GAC GCG CTA GGT CGC 3648 s Ie sn n n s Ile val Thr Leu Asp Ala Leu 
Gly Arg 1205 1210 1215 GTA ACC ACC AGC CGG TTC TGG GGC ACA GAG GCA GGA CAA GCC GCA GGC 3696 Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gin Ala Ala Gly 1220 1225 1230 TAT TCC AAC CAG CCC TTC ACA CCA CCG GAC TCC GTA GAT AAA GCG CTG 3744 Tyr Ser Asn Gin Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu 1235 1240 1245 GCA TTA ACC GGC GCA CTC CCT GTT GCC CAA TGT TTA GTC TAT GCC GTT 3792 Ala Leu Thr Gly Ala Leu Pro Val Ala Gin Cys Leu Val Tyr Ala Val 1250 1255 1260 GAT AGC TGG ATG CCG TCG TTA TCT TTG TCT CAG CTT TCT CAG..TCA CAA 3840 Asp Ser Trp Met Pro Ser Leu Ser Leu Ser Gin Leu Ser Gin Ser Gin 1265 1270 1275 1280 GAA GAG GCA GAA GCG CTA TGG GCG CAA CTG CGT GCC GCT CAT ATG ATT 3888 Glu Glu Ala Glu Ala Leu Trp Ala Gin Leu Arg Ala Ala His Met Ile 25 1285 1290 1295 ACC GAA GAT GGG AAA GTG TGT GCG TTA AGC GGG AAA CGA GGA ACA AGC 3936 Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser 1300 1305 1310 CAT CAG AAC CTG ACG ATT CAA CTT ATT TCG CTA TTG GCA AGT ATT CCC 3984 His Gin Asn Leu Thr Ile Gin Leu Ile Ser Leu Leu Ala Ser Ile Pro 1315 1320 1325 35 CGT TTA CCG CCA CAT GTA CTG GGG ATC ACC ACT GAT CGC TAT GAT AGC 4032 Arg Leu Pro Pro His Val Leu Gly Ile Thr Thr Asp Arg Tyr Asp Ser 1330 1335 1340 GAT CCG CAA CAG CAG CAC CAA CAG ACG GTG AGC TTT AGT GAC GGT TTT 4080 40 Asp Pro Gin Gin Gin His Gin Gin Thr Val Ser Phe Ser Asp Gly Phe 1345 1350 1355 1360 GGC CGG TTA CTC CAG AGT TCA GCT CGT CAT GAG TCA GGT GAT GCC TGG 4128 Gly Arg Leu Leu Gin Ser Ser Ala Arg His Glu Ser Gly Asp Ala Trp 1365 1370 1375 CAA CGT AAA GAG GAT GGC GGG CTG GTC GTG GAT GCA AAT GGC GTT CTG 4176 Gin Arg Lys Glu Asp Gly Gly Leu Val Val Asp Ala Asn Gly Val Leu 1380 1385 1390 GTC AGT GCC CCT ACA GAC ACC CGA TGG GCC GTT TCC GGT CGC ACA GAA 4224 Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr Glu 1395 1400 1405 TAT GAC GAC AAA GGC CAA CCT GTG CGT ACT TAT CAA CCC TAT TTT CTA 4272 Tyr Asp Asp Lys Gly Gin Pro Val Arg Thr Tyr Gin Pro Tyr Phe Leu 1410 1415 1420 AAT GAC TGG CGT TAC GTT AGT GAT 
GAC AGC GCA CGA GAT GAC CTG TTT 4320 Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe 1425 1430 1435 1440 GCC GAT ACC CAC CTT TAT GAT CCA TTG GGA CGG GAA TAC AAA GTC ATC 4368 Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val Ile 1445 1450 1455 ACT GCT AAG AAA TAT TTG CGA GAA AAG CTG TAC ACC CCG TGG TTT ATT 4416 Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe Ile 1460 1465 1470 GTC AGT GAG GAT GAA AAC GAT ACA GCA TCA AGA ACC CCA TAG 4458 -195- Val Ser Glu Asp Giu Asn Asp Thr Ala Ser.Arg Tbr Pro 1475 1480 1485 INFORMATION FOR SEQ ID NO:32: SEQUENCE CHARACTERISTICS: LENGTH: 1485 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32 (TcaC protein) Met Gin Asp Ser Pro Glu Val Ser Ile Thr Thr Leu Ser Leu 00.0 Gly Pro Arg Asn 65 Arg Leu Gin Pro Phe 145 Ala Lys Ala Ser Lys 225 Asn Ala Gly Leu Pro Ser Val Asn Leu Gin Arg 140 Gly Leu Asn Ala cys 220 Tyr Phe Leu Thr Ala Ser Ser Ser Asp Atp 110 Val Ile Giu Ile Gin 190 Giu Asp Val Leu Phe 270 Asp Lys Gly Gly Gly Ser Phe Gly Leu Asp Arg 160 Gly Ile Val Glu Val 240 Asn His Gly Thr Ala Gin Trp Ser Val Arg Pro Asp Ile Phe Ser Arg Tyr Glu- Tyr -196- 9 .9 9* a 9 *9 290 Gly Phe 305 His Arg Giu Leu Thr Thr Pro Val 370 Glu Lys 385 Ser Gin 25 Giy Met Arg Gin 30 Pro Leu 450 Ile Asn 465 35 Arg Gly Pro Ile 40 Ala Asp Lys Ser 530 Giu Asp 545 Asp Ala Gin His Leu Giy Gin Pro 610 Asp Giy 625 Leu Ile Leu Ala Arg Ala Ile Arg Leu 375 Gin Leu Arg Ser Asn 455 Leu Gin Val Gly Al a 535 Thr Al a Lys Gly Asn 615 Asp Ser Val 300 Gin Thr Lys Giu Gin 380 Leu Gly Tyr Tyr Aia 460 Thr Trp Pro Val Giy 540 Pro Leu Thr Leu Phe 620 Gin Asp Val Asn Asn Ser 365 Arg Asp Giu Lys Asp 445 Ser Aia Thr Ser Leu 525 Trp Val Gly Cys Ser 605 Leu Ser Ala Leu Asp Al a 350 Asp Phe Asn Giy Aila 430- Lys Leu Ser His Ile 510 Ile Arg Thr Ser Trp 590 Giy Ala Giy Pro Met Ala 335 Ser Giy Asp Phe Leu 415 Pro Ile Met Giy Phe 495 Gin Giy Lys Gly Giy 575 Pro Phe Asp Ser Phe 320 Pro Vai Arg Leu 
Asn 400 Pro Gin Al a Asp Ile 480 Thr Phe Pro Gly Thr 560 Gin Asn Ser Ile Leu 640 655 Asp Asn Thr Cys Gin Leu Gin 670 -197- '.2 Val Ala Asp 675 Pro- His Ile 690 Pro Trp Leu 705 Leu His Tyr Leu Tbr Lys His Leu Leu 755 Leu Thr Ser 770 Arg Glu Phe 785 25 Phe Ser His Ser Trp Phe 30 Giu Tyr Trp 835 Tyr Thr Vai 35 850 Asn Giu Thr 865 Leu Arg Thr Pro Tyr Thr Asn Lys Giu 915 Ser Tyr His 930 Ile Lys Leu 945 Ile Ala Trp Thr Leu Pro Leu Leu Arg 995 Gly Giu Asn 1010 Tyr Thr Tyr 1025 Ile Gin Gly Leu His His Trp Arg 695 Val Met Asn Asn 710 Ser Ala Gin Phe Lys Ser Pro Ala 745 Thr Giu Ile Gin 760 Asn Tyr Ser His 775 Phe Gly Cys Ile 790 Ala Pro Giu Gin Gly Met Asp Giu 825 Asp Thr Gin Ala 840 His Thr Asn Gin 855 Asn Trp Leu Thr 870 Tyr Gly Leu Asp Giu Ser Arg Tyr 905 Leu Ser Ala Trp 920 Arg Ile Ile Thr 935 Asp Ile Phe Gly 950 Arg Giu Lys Pro Leu Phe Asp Ser 985 Arg Gin Lys Asn 1000 Leu Gly Leu Pro 1015 Ser Lys Ile Pro 1030 Asp Arg 715 Leu Tyr Giu Val Gin 795 Ala Asp Ser Asp Ala 875 Thr Val Thr Pro Ser 955 Val Tyr Trp Ala Ser Leu Gin Leu 830 Phe Giu 845 Ala Phe Lys Gly Lys Gin Ser Ile 910 Ile Giu 925 Phe Ser Gin Ser Pro Tyr Asp Gin 990 His Leu 1005 Arg Arg Gly Ile Ala Ser Leu Ile Leu Thr Val 680 685 Thr Lys His Thr 720 Leu Gin 735 Pro Met Asn Arg Lys Giu Thr Thr 800 Ser Ile 815 Ala Thr Thr Arg Thr Pro Gin Leu 880 Thr Val 895 Pro Val Asn Arg Gin Ser Val Asp 960 Pro Pro 975 Gin Gin Thr Asp Asp Vai 1020 Giu Gly Ile Ser Leu Glu 1035 1040 Ile Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Giu Lys Ala 1045 1050 -198- Ala Val 1055 Tyr Leu Gly Gin Gin Gin Thr Phe Tyr Thr *Ala Gly Gin Ala Giu Val 1060 1065 1070 Thr Leu Giu Lys Pro Thr Leu Gin Ala Leu Vai Ala Phe Gin Giu Thr 1075 1080 1085 Ala Met Met Asp Asp Thr Ser Leu Gin Ala Tyr Giu Gly Val Ile Giu 1090 1095 1100 Giu Gin Giu Leu Asn Thr Ala Leu Thr Gin Ala Gly Tyr Gin Gin Val 1105 1110 1115 1120 Ala Arg Leu Phe Asn Thr Arg Ser Giu Ser Pro Val Trp Ala Ala Arg 1125 1130 1135 Gin Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gin Phe Trp Arg Pro Gin 1140 1145 
1150 Ala Gin Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp 1155 1160 1165 ~Thr His His Cys Val Ile Ile Gin Thr Gin Asp Ala Ala Gly Leu Thr 1170 1175 1180 :Thr Gin Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gin Leu Thr 1185 1190 1195 1200 Asp Ile Asn Asp Asn Gin His Ile Val Thr Leu Asp Ala Leu Gly Arg 1205 1210 1215 *Val Thr Thr Ser Arg Phe Trp Gly Thr Giu Ala Gly Gin Ala Ala Gly 1220 1225 1230 Tyr Ser Asn Gin Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu .1235 1240 1245 Ala Leu Thr Gly Ala Leu Pro Val Ala Gin Cys Leu Val Tyr Ala Val 1250 1255 1260 40 ***Asp Ser Trp met Pro Ser Leu Ser Leu Ser Gin Leu Ser Gin Ser Gin 1265 1270 1275 1280 *Glu Giu Ala Giu Ala Leu Trp Ala Gin Leu Arg Ala Ala His Met Ile 1285 1290 1295 .Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser 1300 1305 1310 His Gin Asn Leu Thr Ile Gin Leu Ile Ser Leu Leu Ala Ser Ile Pro 1315 1320 1325 Arg Leu Pro Pro His Val Leu Giy Ile Thr Thr Asp Arg Tyr Asp Ser 1330 1335 1340 Asp Pro Gin Gin Gin His Gin Gin Thr Val Ser Phe Ser Asp Gly Phe 1345 1350 1355 1360 Gly Arg Leu Leu Gin Ser Ser Ala Arg His Giu Ser Gly Asp Ala Trp 1365 1370 1375 Gin Arg Lys Giu Asp Gly Gly Leu Val Val Asp Ala Asn Gly Val Leu 1380 1385 1390 Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr Glu 1395 1400 1405 Tyr Asp Asp Lys Gly Gin Pro Val Arg Thr Tyr Gin Pro Tyr Phe Leu 1410 1415 1420 Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe -199- 1425 1430 .1435 1440 Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val Ile 51445 1450 1455 Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe Ile 1460 1465 1470 Val Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro 0 1475 1480 1485 INFORMATION FOR SEQ ID NO:33: SEQUENCE CHARACTERISTICS: LENGTH: 3288 base pairs TYPE: nucleic acid STRANDEDNESS: doujble TOPOLOGY: linear 0 (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33 (tcaA gene): ATG GTG 25 Met Val ACT GTT ATG CAA AAT AAA ATA TCA TTT TTA TCA Thr Val Met Gin 
Asn Lys Ile Ser Phe Leu Ser GGT ACA TCC Gly Thr Ser
I
GAA CAG CCC Glu Gln Pro 30 TCA ATC AGC Ser Ile Ser 35 35 AAA GAG GCT Lys Glu Ala 50 40 CTO AAA TCC Leu Lys Ser AAA 000 CTG 45 Lys Gly Leu GAT GCT TTG Asp Ala Leu ATG AAC CGT Met Asn Arg 115 TTT TCA CCO Phe Ser Pro 130 CTG CAT AAA Leu His Lys 145 CTO AAG GAT Leu Lys Asp TCC CTT GAT Ser Leu Asp
GT
Gly
GTT
Val 40
CGT
Arg
TGG
Trp
CTA
Leu
GGC
Gly
GCT
Ala 120
TCC
Ser
TTG
Leu
GAA
Glu
GTG
Val 10
CAA
Gln
TCC
Ser
GCO
Ala rro Leu
TCC
Ser 90
GAT
Asp
GCT
Ala
CTC
Leu
ATT
Ile
ACO
Thr 170
CAA
Gln
AAC
Asn
GTT
Val
COO
Arg
CGT
Arg 75
AAC
Asn 000 Gly 0CC Ala
TAC
Tyr
GAT
Asp 155
ATG
Met
AAA
Lys
TTT
Phe
ACC
Thr 45
CGT
Arg
GAG
Glu
TCT
Ser
TTC
Phe
ATT
Ile 125 o'rr Val
GAT
Asp 30
CTO
Leu
GCG
Ala
CCO
Pro
OTO
Val
AGC
Ser 110
CAA
Gln
GCT
Ala
ATC
Ile
CCC
Pro
GAA
Glu
GTT
Val
CTT
Leu
GAT
Asp
TCC
Ser
AAA
Lys
GCT
Ala
GTC
Val 175
GAT
Asp
GCA
Ala
GTT
Val
AAT
Asn
ATT
Ile
CAA
Gln
TTA
Leu
CTA
Leu
GAT
Asp
GAT
Asp 160
ACT
Thr
ATT
Ile 48 96 144 192 240 288 336 384 432 480 528 576 Asn Arg Arg AAT AAA GAG Asn Lys Giu GGC GOT A Gly Gly Lys -2 00a. a a a a eat.
ACT
Thr
CAT
His
AAC
Asn 225
GAA
Glu
GT
Gly
GTG
Val 25
GG
Gly 30
GCC
Ala 305 35 AAA Lys
GGT
Gly
CCG
Pro
ACT
Thr
GAG
Glu 385
GTA
Val
GO
01W
GC
Gly
ATT
Ile
GAG
Glu
CTG
Leu 210
GT
Gly
CAA
Gln
AAT
Asn
GA
Gly
AAC
Asn 290
GCC
Ala
CAG
Gln
AAT
Asn
GAT
Asp
CAG
Gln 370
AAT
Asn
GAT
Asp
ACA
Thr
ACT
Thr
CCA
Pro 450
CTG
Leu 195
TCG
Ser
GTO
Val
ACC
Thr
CA
Gln
TTC
Phe 275
GGC
Gly GCC Ala
TGT
Cys
GTT
Val
AA
Lys 355
CCT
Pro
AA
Lys
AGT
Ser
AAA
Lys
CCG
Pro 435 TCG Ser 180
TCC
Ser
CA
Gln
TGG
Trp
AGT
Ser
GAT
Asp 260
AGC
Ser
ATT
Ile
GCA
Ala
TAC
Tyr
CTO
Leu 340
GAC
Asp
TTG
Leu
GAT
Asp
TCC
Ser
GAC
Asp 420
ACA
Thr
CCG
Pro GGC GCA Gly Ala ATC GAT Ile Asp AAT ACT Asn Thr 230 AAT ACO Asn Thr 245 ACA TTT Thr Phe 000 CA Gly Gln GTC GGC Val Gly ATA GTC Ile Val 310 TAC CTC Tyr Leu 325 ACC GGC Thr Gly GGT ATT Gly Ile CCA AGC Pro Ser CAG TAC Gln Tyr 390 GGA CAG Gly Gln 405 AAG 000 Lys Gly AAC CCT Asn Pro CCA GCC Pro Ala
TTC
Phe
TCC
Ser 215
TTG
Leu
AAT
Asn
TTT
Phe
CCT
Pro
GCA
Ala
GCA
Ala 295
GTC
Val
TGT
Cys
TTT
Phe
TTC
Phe 375
TAT
Tyr
TCA
Ser
CTG
Leu
GAT
Asp
CGC
Arg 155
TTC
Phe 200
GCT
Ala
ACA
Thr
ACA
Thr
TCC
Ser
ATG
Met 280
CA
Gln
CCO
Pro
GCT
Ala
TTC
Phe
GCT
Ala 360
CAT
His
CTO
Leu
AAT
Asn
TTA
Leu GAT Asp 440
OA
Glu
CCA
Pro
TTA
Leu
GAT
Asp
CGC
Arg
GGA
Gly 265
GTT
Val
TTG
Leu TTG Leu
CCC
Pro
TTA
Leu 345
CAG
Gln
CTG
Leu AAA Lys TGG Trp TTA Leu 125
GTG
Val
ACA
Thr
ATG
Met
TCG
Ser
ACC
Thr
AAA
Lys 250
AAC
Asn
TAC
Tyr
ATT
Ile
AA
Lys
GAT
Asp 330
AGA
Arg
GTA
Val
CCO
Pro
ACA
Thr
AA
Lys 410
ACC
Thr
ATT
Ile
CTO
Leu
ACO
Thr
GCA
Ala
ACO
Thr 235
CTO
Leu
ACT
Thr
CTG
Leu
GCA
Ala
CTC
Leu 315
GGT
Gly
GGC
Gly GCC Ala
GTC
Val
GAG
Glu 395
AAC
Asn
TTT
Phe
CCT
Pro
TCA
Ser 1
TTA
Leu
CA
Gln 220
GCA
Ala
TTC
Phe
TTT
Phe
TCA
Ser
GT
Gly 300
ACT
Thr
ACA
Thr
AAC
Asn
AAC
Asn
ACA
Thr 380
CAG
GCG
Ala
TGC
Cys
C
Pro :TG I ~eu 9
CCT
Pro 205 GCC Ala Gln
GCT
Ala
TAT
Tyr
CAG
Gln 285
AAT
Asn
TGG
Trp
ACO
Thr
AGC
Ser
AAA
Lys 365
CTO
Leu
GGT
Gly
:TO
Leu kGC Ser
XCT
145
~CG
rhr 190 *TAT OAC Tyr Asp AGA ACO Arg Thr CO OTT Ala Val 0CC CA Ala Oin 255 TTC AAA Phe Lys 270 TAC ACC Tyr Thr CCA GAC Pro Asp TCA ATO Ser Met ATO OGA Met Gly 335 CCA ACT Pro Thr 350 TCA GOC Ser Gly GAA CAC Giu His TAT ATC Tyr Ile OTT ATC Val Ile 415 OAT AOC Asp Ser 430 ATC PAT C Ile Asn I CCG GTC Pro Val S
GAT
Asp
CTG
Leu
TCA
Ser 240
GAT
Asp
OCO
Ala
AGC
Ser
CA
Gln
GCA
Ala 320
GAC
Asp
AAC
Asn
AGT
Ser
AGC
Ser
ACG
Thr 400 AAT Asn
TCA
Ser
GAT
Asp
AGT
Ser 624 672 720 768 816 864 912 960 1008 1056 1104 1152 1200 1248 1296 1344 1392 TAT CAA TTG ATG ACC AAT CCO GCA CCO ACA GAA GAT GAT ATT ACC AAC 1440
S S Tyr Gin Leu Met 465 CAT TAT GGT TTT His Tyr Gly Phe AGC GAG TTG ACC Ser Giu Leu Thr 500 ACC CGG TTA AGC Thr Arg Leu Ser 515 TAC AGT CAA AGC Tyr Ser Gin Ser 530 TTT GGG GAA ACC Phe Gly Glu Thr 545 CTG AAC AGC ACA 25 Leu Asn Ser Thr CAG ACT GAT GGC Gin Thr Asp Gly 580 'ITA GCC GOT CGC Leu Ala Gly Arg 595 CTA TCA TTT GAA Leu Ser Phe Glu 610 40 GTG CCG GAC CAC Val Pro Asp His 625 GCA CTG GCA GAG Ala Leu Ala Giu AAT ACC T= GCG Asn -Thftr-4e-A~a 660 CAG ACA CCC AGT Gin Thr Pro Ser 675 CAT GTC ATT
GCG
His Val Ile Ala 690 GAT GAG TTA GCC Asp Giu Leu Ala 705 GAA CTG CTC CGT Glu Leu Leu Arg Thr Asn 470 AAC GGC Asn Gly 485 AGC AAA Ser Lys TTC AAT Phe Asn AGC ATT Ser Ile ACC CCA Thr Pro 550 CTG GCA Leu Ala 565 AAG, AGC Lys Ser GCT GAA Ala Giu GAA TTG Glu Leu CAC GAC His Asp 630 TAT GTC Tyr Val 645 ACC TTC Thr Phe TTC TAT Phe Tyr CTA GGT Leu Gly GCC ATA Ala Ile 710 ATT GGT Ile Gly 725 Pro Ala OCT AGC Ala Ser CTG AAT Leu Asn CAG TTA Gin Leu 520 GAT GCG Asp Ala 535 ACC CGC Thr Arg GAC GCG Asp Ala CTA AAT Leu Asn AAG CTG Lys Leu 600 GAC TGG Asp Trp 615 AAA ATT Lys Ile AGC CTA Ser Leu ATT AGT Ile Ser GAA ACC Giu Thr 680 ACA GAG Thr Glu 695 TGC TGC Cys Cys CGC TAT Arg Tyr Thr Glu 475 CGG GCT Arg Al a 490- ATC GAT Ile Asp GAT TTG Asp Leu GCA GCC Ala Ala AAT GTC Asn Val 555 GAT GOT Asp Gly 570 ACT GAC Thr Asp CGT TTA Arg Leu ATT 0CC Ile Ala CTG OAT Leu Asp 635 CAG CGC Gin Arg 650 GTA AAT Val Asn TTC CGC Phe Arg AAA TAT Lys Tyr GCA TTG Ala Leu 715 TTC GGT Phe Gly 730 TTro TAT Leu Tyr Asp Ile Thr Asn 480 CCA TTG TCA ACC Pro Leu Ser Thr 495 TTC TOT GAG AAG Phe Cys Glu Lys 510 OCT CAG CAA TCT Ala Gin Gin Ser 525 CGC TAT GTT CGT Arg Tyr Val Arg GOT GCC GCT TAT Gly Ala Ala Tyr 560 TAT CTG TOG ATT Tyr Leu Trp Ile 575 ACG GTA GTC GCC Thr Val Val Ala 590 TCC CAG ACC GO Ser Gin Thr Gly 605 GCC AGT COT AGT Ala Ser Arg Ser CCG GTC CTT OAA Pro Val Leu Glu 640 000 CTT GAT 0CC Gly Leu Asp Ala 655 TAT ACG CCA OAT Tyr Thr Pro Asp 670 0CC GAC GGT AAT Ala Asp Gly Asn 685 GAA AAT GAG CAG Giu Asn Giu Gin GTC ACC AGT OAT Val Thr Ser Asp 720 OCA GGC AGT TTT Ala Gly Ser Phe 735 TTC GGC 0CC ATT Phe Gly Ala Ile 750 1488 1536 1584 1632 1680 1728 1776 1824 1872 1920 1968 2016 2064 2112 2160 2208 2256 ACC TTG OAT GAA TAT ACC GCC AGT Thr Leu Asp Giu Tyr Thr Ala Ser 740 -202- CCC CGT TTG Pro Arg Leu 755 TTT GGG CTG ACA TTT Phe Gly Leu Thr Phe 760 GCC CAA *GCC GAA ATT Ala Gin Ala Giu Ile 765 TTA TGG CGT Leu Trp Arg p at a a
CTG
Leu
AAA
Lys 785
GAT
Asp
GTA
Val
TTG
Leu 25 ACA Thr
GT
30 Gly 865
GAG
Glu 35
TGG
Trp 40
TCC
Ser
CAG
Gln
CAA
Gln 945
GTT
Val
CAG
Gln
TTC
Phe CTo Leu ATO GAA GOC Met Giu Gly 770 TCC CTG CAA Ser Leu Gin TOG ATO TCO Trp Met Ser AGT ACO CAA Ser Thr Gin 820 OAA AAC OTT Giu Asn Vai 835 ATG OAT TCG Met Asp Ser 850 TTC OGC ATT Phe Gly Ile MAA ATC ACA Lys Ile Thr CAT OAT ATT His Asp Ile 900 TTA CAA ACC Leu Gin Thr 915 CTA OTO TTA Leu Vai Leu 930 TTA CTO ACA Leu Leu Thr CCT GTA CCC Pro Val Pro TOG GAA ACT Trp Giu-Thr 980 GAT CAA TTA Asp Gin Leu 995 ATC 0CC ACA Ile Aia Thr 1010 OAT ATC TTA TTO CAA CAG TTA GOT CAG GCA Asp Ile Leu Leu Gin Gin Leu Giy Gin Ala 775 780 OCT ATT TTA COC COT ACC GAG CAG OTG CTO Ala Ile Leu Arg Arg Thr Oiu Gin Vai Leu 795 800 AAT CTA AGT CTG-XCT TAT CTO CAA 000 ATO Asn Leu Ser Leu Thr Tyr Leu Gin Gly met 810 815 OGT ACC 0CC ACC OCT GAG ATO TTC AAT TTC Gly Thr Aia Thr Ala Giu Met Phe Asn Phe 825 830 AGC OTO AAT AOT CAA OCT 0CC ACT AAA GAA Ser Vai Asn Ser Gin Ala Ala Thr Lys Giu 840 845 CAG CAG AAA OTO CTG COG GCO CTA AGC 0CC Gin Gin Lys Vai Leu Arg Ala Leu Ser Ala 855 860 AAT OTO ATG GOT ATC GTC ACC TTC TGG CTG Asn Val Met Gly Ile Val Thr Phe Trp Leu 875 880 AOT GAT AAT CCT TTT ACA TTO GCA AAC TAC Ser Asp Asn Pro Phe Thr Leu-Aia Asn Tyr 890 895 CTG TTT AGC CAT GAC AAT 0CC ACG TTA GAO Leu Phe Ser His Asp Asn Aia Thr Leu Giu 905 910 TCT CTO GTA ATT OCT ACT CAG CAA CTT AGC Ser Leu Vai Ile Ala Thr Oin Gin Leu Ser 920 925 AAA TOO CTO AOC CTO ACC GAG CAG GAT CTG Lys Trp Leu Ser Leu Thr Oiu Gin Asp Leu 935 940 CCC GAA COT TTA ATC AAC GOC ATC ACO AAT Pro Giu Arg Leu Ile Asn Oiy Ile Thr Asn 955 960 GAG CTA TTA CTC ACO CTA TCA COT TTT AAG Oiu Leu Leu Leu Thr Leu Ser Arg Phe Lys 970 975 ACC OTT TCC COT OAT OAA OCG ATO COC TOT Thr Val Ser Arg Asp Giu Ala met Arg Cys 985 990 AAT OAT ATO ACO ACT GAA MAT OCA GOT TCA Asn Asp Met Thr Thr Oiu Asn Ala Oly Ser 1000 1005 GAO ATO OAT AAA GOT ACO OGA OCO CMA OTT Giu Met Asp Lys Oly Thr Gly Ala Gin Vai 10i5 1020 2304 2352 2400 2448 2496 2544 2592 2640 2688 2736 2784 2832 2880 2928 2976 3024 3072 3120 MAT ACC TTO CTA TTA Asn Thr Leu Leu Leu 1025
GT
Oiy 1030 Oiu Asn Asn -2 TOG CCG MAA AOT TTT ACC TCT Trp Pro Lys Ser Phe Thr Ser 1035 104( 03- 0 CTC TGG CAA C= CTG ACC TGG TTA CGC GTC GGG CAA AGA CTG AAT GTC 3168 Leu Trp Gin Leu Leu Thr Trp Leu Arg Val Gly Gin Arg Leu Asn Val 1045 1050 1055 GGT AGT ACC ACT CTG GGC AAT CTG TTG TCC ATG ATG CAA GCA GAC CCT 3216 Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met Gin Ala Asp Pro 1060 1065 1070 GCT GCC GAG AGT AGC GCT TTA TTG GCA TCA GTA GCC CAA AAC TTA AGT 3264 Ala Ala Giu Ser Ser Ala Leu Leu Ala Ser Val Ala Gin Asn Leu Ser 1075 1080 1085 GCC GCA ATC AGC AAT CGT CAG TAA 3288 Ala Ala Ile Ser Asn Arg Gin se 1090 1095 INFORMATION FOR SEQ ID NO:34: SEQUENCE CHARACTERISTICS: LENGTH: 1095 amino acids TYPE: amino acids TOPOLOGY: linear 25 (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34 (TcaA protein): Features From To Description 254 267 SEQ ID NO:iS 30 254 492 TcaAii peptide 00.
0 0 0.~ 0006 0o 00 06* on 00 060 Met 1 Glu Ser 40 Lys Leu Lys Asp Met Phe Leu 145 Leu Ser Ser Phe Leu Ser Gly Thr Ser 10 Gin Asn Val Phe Asp Ile Ala Ser Val Pro Thr Leu Pro Val Ala Arg Gin Arg Ala Giu Asn Leu Arg Gin Giu Pro Val Ile 75 Ser Asn Val Ser Val Leu Gin Asp Gly Asp Phe Ser Asp Leu 110 Ala Ala Ser Ile Gin Ser Leu 125 Leu Tyr Arg Val Ala Lys Asp 140 Ile Asp Asn Arg Arg Ala ASP 155 160 Thr Met Asn Lys Giu Val Thr 170 175 Gin Lys Gly Gly Lys Asp Ile 190 Thr Giu Leu Ser Gly Ala Phe Phe Pro Met Thr Leu Pro Tyr Asp Asp -204- 00.0 0 000e His Asn 225 Glu Gly Val Gly Ala 305 Lys Gly Pro Thr Giu 385 Val Gly Gly Ile Tyr 465 His Ser Thr Tyr Phe 545 Leu 195 Leu Ser 210 Gly Val Gin Thr Asn Gin Gly Phe 275 Asn Gly 290 Ala Ala Gin Cys Asn Val Asp Lys 355 Gin Pro 370 Asn Lys Asp Ser Thr Lys Thr Pro 435 Pro Ser 450 Gin Leu Tyr Gly Giu Leu Arg Leu 515 Ser Gln 530 Gly Giu Asn Ser Ala Thr Thr Ser Met 280 Gin Pro Al a Phe Ala 360 His Leu Asn Leu Asp 440 Giu Ala Ser Asn Leu 520 Ala Arg Ala Leu ser Asp Thr Arg Lys 250 Giy Asn 265 Val Tyr Leu Ile Leu Lys Pro Asp 330 Leu Arg 345 Gin Val Leu Pro Lys Thr Trp, Lys 410 Leu Thr 425 Val Ile Thr Leu Pro Thr Leu Arg 490 Ser Ile 505 Met Asp Lys Ala Val Asn Ala Asp 570 -205- 205 Aia Arg Thr Gin Ala Vai Ala Aia Gin 255 Tyr Phe Lys 270 Gin Tyr Thr 285 Asn Pro Asp Trp Ser Met Thr Met Gly 335 Ser Pro Thr 350 Lys Ser Gly 365 Leu Glu His Gly Tyr Ile Leu Val Ile 415 Ser Asp Ser 430 Ala Ile Asn 445 Thr Pro Val Asp Ile Thr Pro Leu Ser W4 495 Phe Cys Giu 510 Ala Gin Gin 525 Arg Tyr Val Gly Ala Ala Tyr Leu Trp 575 Leu Ser 240 Asp Aia Ser Gin Al a 320 Asp Asn Ser Ser Thr 400 Asn Ser Asp Ser Asn 480 Thr Lys Ser Arg Tyr 560 Ile Gin Thr Asp Gly Lys Ser Leu Asn Phe Thr. 
Asp Asp Thr Val Val Ala 580 585 590 Leu Ala Leu Ser 610 Val Pro 625 Ala Leu Asn Thr Gin Thr His Val 690 25 Asp Giu 705 Giu Leu Thr Leu Pro Arg Leu Met 770 40 Lys Ser 785 Asp Trp Val Ser Leu Giu Thr Met 850 Gly Phe 865 Giu Lys Trp His Ser Leu Gin Leu 930 Gly Arg 595 Phe Giu Asp His Ala Giu Phe Ala 660 Pro Ser 675 Ile Ala Leu Ala Leu Arg Asp Giu 740 Leu Phe 755 GJlu Giy Leu Gin Met Ser Thr Gin 820 Asn Val 835 Asp Ser Gly Ile Ile-Thr Asp Ile 900 Gin Thr 915 Val Leu Arg Leu Ile Ala Leu Asp 635 Gin-Arg 650 Val Asn Phe Arg Lys Tyr Ala Leu 715 Phe Gly 730 Leu Tyr Gin Ala Leu Gin Arg Arg 795 Leu Thr 810 Thr Ala 5cr Gin Val Leu Gly Ile 875 Pro Phe 890 His Asp Ile Ala Ser Leu Thr Gly Arg Ser Leu Giu 1640 Asp Ala 655 Pro Asp Gly Asn Giu Gin Ser Asp 720 Ser Phe 735 Ala Ile Trp Arg Gin Ala Val Leu 800 Gly Met 815 Asn Phe Lys Giu Ser Ala Trp Leu 880 Asn Tyr 895 Leu Giu Leu Ser Asp Leu Gin Leu Leu Thr Thr 945 Pro Giu Arg Leu Ile Asn Gly Ile Thr Asn -2 06- Val Pro Val Pro Asn Pro Giu Leu Leu Leu *Thr Leu .Sqe. 
Arg Phe Lys 965 970 975 Gin Trp Glu Thr Gin Val Thr Val Ser Arg Asp GlU Ala Met Arg Cys 980 985 990 Phe Asp Gin Leu Asn Ala Asn Asp Met Thr Thr Giu Asn Ala Gly Ser 995 1000 1005 Leu Ile Ala Thr Leu Tyr Glu Met Asp Lys Gly Thr Gly Ala Gin Vai 1010 1015 1020 Asn Thr Leu Leu Leu Gly Glu Asn Asn Trp Pro Lys Ser Phe Thr Ser 1025 1030 1035 1040 Leu Trp Gin Leu Leu Thr Trp Leu Arg Val Gly Gin Arg Leu Asn Val 1045 1050 1055 Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met Gin Ala Asp Pro 1060 1065 1070 Ala Ala Giu Ser Ser Ala Leu Leu Ala Ser Val Ala Gin Asn Leu Ser 1075 1080 1085 25 Ala Ala Ile Ser Asn Arg Gin oes 1090 1095 30 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 603 amino acids TYPE: amino acid 35 TOPOLOGY: linear 40 Pro 1 Phe 45 Ala Arg Gly Tyr Thr Ser Ala Pro 145 (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: Leu Ser Thr Ser Glu Leu Thr Ser 5 Cys Giu Lys Thr Arg Leu Ser Phe 25 Gin Gin Ser Tyr Ser Gin Ser Ser 35 40 Tyr Val Arg Phe Gly Glu Thr Thr 50 55 Ala Ala Tyr Leu Asn Ser Thr Leu 70 Leu Trp Ile Gin Thr Asp Gly Lys Val Val Ala Leu Ala Gly Arg Ala 100 105 Gin Thr Gly Leu Ser Phe Giu Giu 115 120 Ser Arg Ser Val Pro Asp His His 130 135 Val Leu Glu Ala Leu Ala Glu Tyr 150
SEQ
Lys 10 Asn Ile Pro Ala Ser 90 Glu Lieu Asp VTal ID NO:35 CTca.Aiii protein) Leu Asn Ser Ile Asp Thr Gin Leu met Asp Leu Thr Asp Ala Lys Ala Ala Ser Thr Arg Val Asn Val Tyr Asp Ala Ala Asp Gly Gin 75 Leu Asn Phe Thr Asp Asp Lys Leu Val Arg Leu Ser 110 Asp Trp Leu Ile Ala Asn 125 Lys Ile Val Leu Asp Lys 140 Ser Leu Lys Gin Arg Tyr 155 160 -2 07- Gly Leu Asp Ala Asn Thr Phe Ala Thr Phe.Ile 165 170 Ser Ala Val Asn Pro 175 .o Tyr Thr Ala Asp Glu Asn 210 Val Thr 225 Ala Gly Phe Gly Ile Leu Xxx Gly 290 Glu Gin 305 Leu Gln Met Phe Xxx Thr Ala Leu 370 Thr Phe 385 Leu Ala Ala Thr Gln Gln Giu Gln 450 Gly Ile 465 Ser Arg Ala Met Asn Ala Gly Ala 530 Pro Gly 195 Giu Ser A-rg Ala Trp 275 Gin Val1 Gly Asn Lys 355 Ser Trp, Asn Leu Leu 435 Asp Thr Phe Arg Giy 515 Gin His Asp Giu Thr 245 Pro Leu Lys Asp Val 325 Leu Thr Gly Giu Trp 405 Ser Gin Gin Val Gin 485 Phe Leu Pro Ile Leu 215 Leu Asp Leu Giu Leu 295 Met Thr Asn Asp Gly 375 Ile Asp Gin Val Leu 455 Val Giu Gin Ala Phe 185 Leu Ala I le Tyr Gly 265 Gly Pro Pro Trp Cys 345 Ala Lys Ile Gin Asp 425 Ile Thr Asn Gin Asn 505 Leu Giu Thr Cys Arg 235 Al a Thr Asp Ala Asn 315 Gly Ser Gin Asn Arg 395 Leu Ser Lys Pro Giu 475 Thr Asn Glu Arg Ser Tyr Ala Leu Gly Gly Asn 240 Tyr Arg 255 Ala Giu Gin Gin Arg Thr Thr Tyr 320 Ala Glu 335 Gin Ala Leu Arg Ile Val Phe Thr 400 Asp Asn 415 Ala Thr Leu Thr Ile Asn Thr Leu 480 Asp Giu 495 Thr Giu Gly Thr Gin Val Asn Thr Leu Leu Leu Gly GlU Asn Asn Trp Pro Lys 535 540 -2 08- Ser Phe Thr Ser Leu Trp Gin Leu ILeu Thr-Trp Leu Arg 545 550 555 Val Gly Gin 560 Arg'Leu Asn Val Gly Ser Thr Thr Leu Gly Asn 565 570 Leu Leu Ser Met Met Gin Ala Asp Pro Ala Ala Giu Ser Ser Ala Leu Leu Ala Ser Val Ala 10580 585 590 Gin Asn Leu Ser Ala Ala Ile Ser Asn Arg Gin 595 600 INFORMATION FOR SEQ ID NO:36: Wi SEQUENCE CHARACTERISTICS: LENGTH: 2557 base pairs TYPE: nucleic acid TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36 (tcdA internal fragment):
GAATTCGGCT
ACCATGATAA
TTGGAAAATT
TTGCCGTAGG
TGATCAGAAA
AGCTATTTAT
TGCTGGATAC
ATGTCATGGC
CGGTACTCCT
TCTGGGACTG
AACATATCGT
GCATCAACGA
CTGGAGCAGC
GGGTGAACGC
TAACGGCAGA
GTATTCAAGC
GTTGGACATC
GCCCCACAGG
CCGACCTATG
ACAGGCTAAT
ACTATATCCG
AATACTTACT
CCATTGCCAG
ATTCGGGGGT
GCACTTGGGC
TGCGTATCGG
TAAACGCCGA
CTAATCTTAA
ATTTTATCGG
TGCGTTTAAT
TAAAGATGGA
ACTGGCAGAT
TGAAGGAAAA
ACTCAATACT
CATGACCTCC
CGTCTACCAC
GCCCTATATT
TTGGGCAGAT
GTTGAATACT
TCAGTATTGT
AAACGCCTTC
GCCCGCGCAT
ACTAGGCGAA
ACAACTGGCT
ACAAAATCAT
TATCAATACT
GCGTTTCCGC
CCCAGTGGGA
ACATTACAAC
TCAAGTCGCC
GATTGATAAT
TATTCAACTG
TATCAGCCGC
GGGTGTTTCT
ACAAACCAAA
TACCGTCGAA
AGTTATTAGC
ACTCAGTGAA
ATTGATGATG
AAAATTAAAA
ATTCATCAAT
ACTAATTTAT
ATTACCAGCT
ACCAGCTATA
GGTTTACAAG
GCGGCCACCT
AAGTTACAGC
AAGTATACGC
CAGGCTCTGG
TCTCGCTCTT
ATAACCTAAA
TAACCATTGA
CCGCTATCAG
GGCTACATAC
ACAAAACGCT
GTTTTGATAA
TGCAATTATC
CCGGCGACGG
CGGGTTCATC
CACAATTGGA
CGTCTATTTG
TGACAAAACC
GATGCCCTTT
CACTGATTAT
AAAGCGTCCT CGGTGCTAGC GATGCCATGA ATCTTGATGC
CCGCCTGCTT
GAATCTTTCC
TGAACTGGAT
TGATAAGCAA
ACAGAAGTGG
AACGCCTGAA
AGACAAAGCA
ATCGGAAAAT
CGCAATGACA
GGAAGCCGTA
AATGGTTTAC
AGAGATGTTT
GCTGACACGT
GGCATTTGAA
TAATTTGCTG
TCCAGAAAAT
CGCACAACAA
TTCAATCAAT
CCGCCGGGTT
CAGTGCCGCA
AAGCCGTGAT
AACCACCCGG
AAATGTGGAA
CAAATACAAT
AAACTATATT
ATCCGTCAGC
GACATCGTTT
TAACGATCAA
TTGGCGCAGT
TGAATGGCAT
AAAATTACCG
AATTTATATA
TTATTACTGA
TTGGCTACCC
AGTGTATTCC
ATTAAGAATT
GATTTGCTAC
GTCGCCCACT
GCAGAGGGAN
GAAACGCAGG
CATTCCACCG
GGCGCTGCAA
TTTGCGGATT
GCTAACTCGT
TTGCAAGCCA
GCGTTCTCCT
TTGAAATGTC
GAAAGAGACA
GAATTCAACA
TTAAGCACCT
GACTTGTATC
ATCGCCGAAG
GAAAATGCCA
AAACGCTACA
GATCCGACCA
CAAAGCCAAT
GAACAAGTGG
GGGCTGACCT
GTCGATCACA
AAAATTGATT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800
CAACATCTTC
ATCCTGCAAT
TTTGGTCGG
AAACGCGGCA
GCTTTTCTGG
AAGGCAGCGG
CAGGTTTCTG
TACGTCAACC
CAATTCTTTA
CAATTAGTTT
ATGATGGACG
GATGCCTTTA
GCATATCACG
ACTGATGCCG
CCCCAGTAAC
GGGTTAATGT
CTGGATTATA
GGCGTATTAA
ATGAATCTCG
CGGCTATTA
CGGCAATAAA
GGGCATTGGA
TCGACTGGGA
ACTACCCGGA
CATTACTGCA
TGTCTTATCT
ATAATATTAA
GTGAATATTA
GTAAATTCAA CGACGGTAAA TTCGCGGCTA ATGCCTGGAG -209-
GTCCAATTAA
TGCTCTGGTT
AAACTGAAAC
GGAATACGCC
ATAGAGCGCC
TTTATAACCA
TCTTTGCTGA
ATAGCTATCA
ATTATGAGAT
TCAGCATGGT
TAAAAATTTA
GCAATCAATG
CCAGCCTGGG
CCCTTATAAA
GGAACAAAAG
GGATTATCGT
AATCACCTTT
CGGACTCTAT
ACAAGACACA
TATGGCATCC
ACAATTTGAT
TCCTTCTTCG
ATATAACGGA
TATTTCACCA
CAATTTGATG
CGTTAATCCG
AGCACTATCC
GAGATCACCA
TATGAACTAA
GATGTCAATA
TGTGCCGGTT
CTAGATAGTT
AAAGATATGA
ACCAATAATG
GTAAGTAGCC
GATATTCCAA
AAATTAAGAA
AATAAATATG
AATAATAAGC
GTCCAGTGAT
AACAGACAGG
AATTGGCGCA
AAAAAATATC
ATCAAGGTGA
ATAAAAACGC
CCCCAGAACA
TCAGAAGAGT
GTAAAGACTA
CTATCAATTA
TTATTCATAA
GCAAACTAGG
CGAATTC
ATATAAATCC
AAATAGTAAA
TATCCGCTAT
CGAGCTAAAA
AGATACGTTG
TTCAATGCAA
GAGCAATGTT
GAATAACCGC
TGGTTGGGGA
CAAAGCCGCA
TGGATATGAA
TGATAAATTT
CGCCTGTATC
GATGGCTATC
GATGGCACTT
CTGGAAAAAA
CTGGTGATGT
GGACTATATA
TATCGGGATA
TATGCAGAGG
GATTATTACC
TCAAGTGATT
GGACAGAAGC
ATTGTGTATA
1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2557

INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 845 amino acids
TYPE: amino acids
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (partial)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37 (TcdA internal peptide):
Ala Phe Asn Ile Asp Val Ser Leu
Arg Leu Leu Lys Ile Thr 30 Asp His Asp Asn Lys Asp Gly Lys Ile 20 25 Lys Asn Asn Leu Lys Asn Leu Gin Leu Thr Ser Asn Leu 35 Tyr Ile Gly Lys Leu Ala Asp Ile Ile Asp 0Th Leu Aspo Leu Leu Leu Ile Ala Val Gly 55 60 Glu Gly Lys Thr 0.
Asn Leu 40 65 Ser Ala Ile Asp Lys Gin Leu Thr Leu Ile Arg Leu Asn Thr Ile Ser Trp Leu His Gin Lys Trp Ser Val Phe Gin Leu Phe Giu Ile Lys 115 Asp Lys Asp 130 Met Thr Ser Thr Tyr Asn Lys Thr Leu Thr Pro 110 Gin Giy Phe Asn Leu Leu Asp Val Tyr His Gly Lys Ala Asp Leu His Val Met Pro Tyr Ile Ala Ala Thr 145 Leu Gin Leu Ser Giu Asn Val His Ser Val Leu Trp Ala Asp Lys Leu Gin Pro Giy Asp Gly 165 170 Ala Met Thr Ala Giu Giy 175 Phe Trp, Asp Trp, Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu Ala 180 185 190 Val Glu Thr Gin Giu His Ile Vai Gin Tyr Cys Gin Aia Leu Ala Gin -210a a a a. a.
a Leu ILeu 225 Pro Trp Giu Asp His 305 Ile 25 Arg 30 Asn Ile Phe 385 Gin Gin 45 Arg Leu Phe 465 Gly Met Ser Tyr Tyr 545 Tyr His Ser Thr Gly Ile Asn Giu Asn Ala Phe Arg 215 220 Lys Pro Giu Met Phe Gly Ala Ala Thr Giy Ala Ala 230 235 240 Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala Asp 245 250 255 Leu Gly Glu Lys Ala Ser Ser Vai Leu Ala Ala Phe 265 270 Leu Thr Ala Giu Gin Leu Ala Asp Ala Met Asn Leu 280 285 Leu Leu Gin Ala Ser Ile Gin Ala Gin Asn His Gin 295 300 Val Thr Pro Glu Asn Ala Vhe Ser Cys Trp Thr Ser 310 315 320 Leu Gin Trp, Vai Asn Val Ala Gin Gin Leu Lys Cys 325 330 335 Arg Phe Arg Phe Gly Arg Ala Gly Leu Tyr Ser Ile 345 350 Thr Asp Leu Cys Pro Vai Gly Lys Arg Gly Arg Arg 360 365 Val Glu Phe Asn Asn Arg Leu Ile His Tyr Asn Ala 375 380 Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg 390 395 400 Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr 405 410 415 Ile Asp Asn Gin Vai Ser Ala Ala Ile Lys Thr Thr 425 430 Ala Ile Ala Ser Ile Gin Leu Tyr Val Asn Arg Ala 440 445 Giu Giu Asn Ala Asn Ser Giy Val Ile Ser Arg Gin 455 460 Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp, Ala 470 475 480 Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr 485 490 495 Gin Thr Lys Met Met Asp Ala Leu Leu Gin Ser Val 505 510 Leu Asn Ala Asp Thr Val Giu Asp Ala Phe Met Ser 520 525 Phe Giu Gin Val Ala Asn Leu Lys Val Ile Ser Ala 535 540 Ile Asn Asn Asp Gin Gly Leu Thr Tyr Phe Ile Gly 550 555 560 Asp Ala Giy Giu Tyr Tyr Trp Arg Ser Val Asp His 565 570 575 Leu Ser Giu Thr -211- Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 580 585 590 *1 0**e His Lys Val Ile 610 Ile Thr 625 Asp Tyr Tr-p Asn Lys Leu Gly Giu 690 25 Asp Ser 705 Met Ala 30 Asn Ser Arg Tyr Asp Tyr 770 Ile Pro 785 Ile Ser Arg Asn Phe Ile Ile Asp 595 Tyr Lys Lys Gin Arg Tyr Thr Pro 660 Giu Lays 675 Asp Thr Tyr Lys Ser Lys Tyr Gin 740 Ala Giu 755 Giy Trp Thr Ile Pro Lys Gin Cys 820 Val Tyr 835 Pro Ile Arg Leu 615 Gly Asn 630 Leu Lys Thr Phe Arg Ala Leu Vai 695 Ala Ser 710 Met Thr Phe Asp Tyr Giu Asp Tyr 775 Tyr Lys 790 
Arg Ile Leu Met Ser Leu Asn 600 Tyr Ser Leu Asp Pro 680 Met Met Pro Thr Ile 760 Tyr Ala Ile Asn Gly 840 Lys Trp Gly 635 Ile Lys Tyr Asn Leu 715 Ser Val Ser Met Ser 795 Gly Gly Ser Leu 620 Tyr Arg Lys Cys Gln 700 Tyr Asn Arg Val Val 780 Asp Tyr Lys Thr 605 Glu Gln Tyr Ile Ala 685 Gln Ile Val Arg Ser 765 Tyr Leu Glu Leu Ile Gln Thr Asp Ser 670 Gly Asp Phe Tyr Val 750 Ser Asn Lys Gly Gly 830 Arg Lys Glu Gly 655 Glu Tyr Thr Ala Arg 735 Asn Arg Gly Ile Gln 815 Asp Pro Glu Thr 640 Thr Leu Gln Leu Asp 720 Asp Asn Lys Asp Tyr 800 Lys Lys Asn Pro Asn Asn

INFORMATION FOR SEQ ID NO:38:
SEQUENCE CHARACTERISTICS:
LENGTH: 16 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38 (TcdAii-pK71 internal peptide):
Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly
1 5 10
Lys

INFORMATION FOR SEQ ID NO:39:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39 (TcdAii-pK44 internal peptide):
Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala
1 5 10
Ile Ser Pro Ala Lys

INFORMATION FOR SEQ ID NO:40:
SEQUENCE CHARACTERISTICS:
LENGTH: 11 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40 (TcbAiii N-terminus):
Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln
1 5

INFORMATION FOR SEQ ID NO:41:
SEQUENCE CHARACTERISTICS:
LENGTH: 14 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41 (TcdAiii N-terminus):
Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln
1 5

INFORMATION FOR SEQ ID NO:42:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42 (TcdA-pk57 internal peptide):
Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Glu Val Tyr
1 5 10
Ala Gly Leu Glu

INFORMATION FOR SEQ ID NO:43:
SEQUENCE CHARACTERISTICS:
LENGTH: 11 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43 (TcdAiii-pK20 internal peptide):
Ile Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys
1 5

INFORMATION FOR SEQ ID NO:44:
SEQUENCE CHARACTERISTICS:
LENGTH: 16 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
Asp Asp Ser Gly Asp Asp Asp Lys Val Thr Asn Thr Asp Ile His Arg
1 5 10

INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 13 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID
Asp Val Xaa Gly Ser Glu Lys Ala Asn Glu Lys Leu Lys
1 5

INFORMATION FOR SEQ ID NO:46:
SEQUENCE CHARACTERISTICS:
LENGTH: 7551 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46 (tcdA):
ATG AAC GAG TCT GTA AAA Met Asn Glu Ser Val Lys GAG ATA Glu Ile CCT GAT GTA TTA AAA AGC CAG TGT Pro Asp Val Leu Lys Ser 1
GGT
Gly
CGC
25 Arg
TAT
Tyr
CGT
Arg 35
GCC
Ala 40 AGC Ser
TTC
45 Phe
TTA
Leu
CTC
Leu 145
ACA
Thr
TCT
Ser
CGT
Arg
GAA
Glu
AAT
Asn
CAA
Gln 35
GAT
Asp
CTC
Leu
CTC
Leu
AGA
Arg
CCC
Pro 115
GCA
Ala
TCA
Ser
TCT
Ser
CTG
Leu
TCC
Ser 195
ATC
Ile ACA GAT Thr Asp GAG CAC Glu His CAG GCA Gln Ala 55 GCC AAT Ala Asn 70 AAT GCT Asn Ala CAA TAT Gln Tyr TAT TTG Tyr Leu TCC GTT Ser Val 135 CTC AGT Leu Ser 150 AAT GAG Asn Glu TAT ACT Tyr Thr ACG CCT Thr Pro CAA GAT Gln Asp 10 AGC CAC Ser His 25 TCC TGG Ser Trp AAG GAT Lys Asp CAA TTA Gln Leu CTG ATA Leu Ile GCG CCG Ala Pro 105 GAA CTT Glu Leu TAT CTG Tyr Leu CAA AAT Gln Asn TTA TTG Leu Leu 170 GTG ATG Val Met 185 CAT GAT His Asp GGA CTT Gly Leu
AC
Ser
TCC
Ser
AAT
Asn
CAA
Gln 75
GCC
Gly
GGT
Gly
TAT
Tyr
GAT
Asp
ATG
Met 155
GAA
Glu
GAA
Glu
GCT
Ala
GAG
TCT
Ser
GAA
Glu
CGC
Arg
AAT
Asn
TAT
Tyr
ACC
Thr
CGT
Arg
ACC
Thr 140
GAT
Asp
AGC
Ser
ATG
Met
TAT
Tyr
CAA
TTT
Phe
ACA
Thr
CTC
Leu
GCG
Ala
AAC
Asn
GTT
Val
GAA
Glu 125
CC
Arg
ATA
Ile
ATT
Ile
CTC
Leu
GAA
Glu 205
CTC
AAT
Asn
CAC
His
TAT
Tyr
GTG
Val
AAT
Asn
TCT
Ser 110
GCA
Ala
CGC
Arg
GAA
Glu
AAA
Lys
TCC
Ser 190
AAT
Asn
AAT
*Gin Cys GMA TTT *Giu Phe GAC TT *a Asp Leu CMA C Giu Ala CAT CTT His Leu CMA TTT Gin Phe TCC ATO Ser Met CCC MAT Arg Asn CCA CAT Pro Asp TTA TCC Leu Ser 160 ACT GMA Thr Giu 175 ACT TTC Thr Phe CTG CC? Val Arg GCA TCA1 48 96 144 192 240 288 336 384 432 480 528 576 624 672 Giu Gin Leu Asn Ala Ser -2 210 215
CCG
Pro 225
GCT
Ala
GAA
Glu
CCG
Pro
AGC
Ser
CAA
Gln 25 305
AGT
Ser 30
AAT
Asn 35 TAT Tyr
TCC
40 Ser
CCT
Pro 385
GAT
Asp
GGT
Gly
CAA
Gln
GCG
Ala
AAT
Asn 465
ACT
Thr
GCA
Ala
TCA
Ser
GGT
Gly
GCC
Ala
GAT
Asp 290
CAG
Gln
GAT
Asp
GCT
Ala
CGG
Arg
ATC
Ile 370
CAA
Gln
ATC
Ile
TCT
Ser
TAC
Tyr
ACA
Thr 450
CTA
Leu
AAA
Lys
ATT
Ile
ATC
Ile
AAT
Asn
TCA
Ser 275
GAA
Glu
GAA
Glu
GGC
Gly
TAT
Tyr
TTA
Leu 355
AAG
Lys
GTC
Val
AGT
Ser
TGG
Trp
TCT
Ser 435
GAA
Glu
CAA
Gln
TAT
Tyr
GCC
Ala
TCG
Ser
GCT
Ala 260
TTG
Leu
GAA
Glu
TAT
Tyr
ACG
Thr
CAA
Gln 340
GAT
Asp
TTA
Leu
AAT
Asn
CAA
Gln
GCA
Ala 420
TTT
Phe
TTG
Leu
CTG
Leu
TAT
Tyr
GGG
Gly
CCT
Pro 245
GAG
Glu
GCT
Ala
CTT
Leu
AGT
Ser
GTT
Val 325
ATG
Met
TAT
Tyr
AAT
Asn
ATA
Ile
CCT
Pro 405
TAT
Tyr
CTG
Leu
TCA
Ser
GAT
Asp
ATG
Met 485
TTG
Leu 230
GAG
Glu
GAA
Glu
ATG
Met
AGT
Ser
AAT
Asn 310
AAG
Lys
GAT
Asp
AAA
Lys
GAT
Asp
GAA
Glu 390
TTT
Phe
GCC
Ala
CTA
Leu
CCC
Pro ATC Ile 470
CAG
Gin ATG CAT Met His CTA TTT Leu Phe CTT TAT Leu Tyr CCC GAA Pro Glu 280 CAC TTT Gin Phe 295 PAC CAA Asn Gin GTA TAT Val Tyr CTG GAG Val Glu TTC AAA Phe Lys 360 AAA AGA Lys Arg 375 TAC TCC Tyr Ser GAA ATT Glu Ile CCC GCA Ala Ala AAA CTT Lys Leu 440 ACG ATT Thr Ile 455 AAC ACA Asn Thr CGT TAT Arg Tyr PAT ATI Asn Ile 250 AAG AAP Lys Lys 265 TAC CTT Tyr Leu ATT GGT Ile Cly CTT ATT Leu Ile CGG ATC Arg Ile 330 CTA TTT Leu Phe 345 PAT TTT Asn Phe GAA CTT Clu Leu GCA AAT Ala Asn GGC CTG Gly Leu 410 AAA TTT Lys Phe 425 AAC AAG Asn Lys CTG GAA Leu Glu GAC GTA Asp Val GCT ATT kla Ile 490 CTG ACG GAG Leu Thr Clu PAT TTT GGT Asn Phe Glv AAA CGT TAT Lys Arg Tyr 285 AAA GCC AGC Lys Ala Ser 300 ACT CCG GTA Thr Pro Val 315 ACC CGC GAA Thr Arg Clu CCC TTC CGT Pro Phe Gly TAT PAT GCC Tyr Asn Ala 365 GTT CGA ACT Val Arg Thr 380 ATC ACA TTA Ile Thr Leu 395 ACA CGA GTA Thr Arg Val ACC GTT GAA Thr Val Glu GCT ATT CGT Ala Ile Arg 445 GGC ATT GTG Gly Ile Val 460 TTA GGT AAA Leu Gly Lys 475 CAT CCT GAA His Ala Glu CAA GCC TCC CTA TTG Gin Ala Ser Leu Leu GGT ATT PAC Gly Ile Asn 240 GAG ATT ACC Glu Ile Thr 255 PAT ATC CAA Asn Ile Glu 270 TAT PAT TTA Tyr Asn Leu PAT TTT GGT Asn Phe Gly GTC PAC AGC Vai Asn Ser 320 TAT ACA ACC Tyr Thr Thr 335 GGT GAG PAT Cly Clu Asn 350 TCT TAT TTA Ser Tyr Leu GAA GGC CCT Glu Gly Ala PAT ACC GCT Asn Thr Ala 400 CTT CCT TCC Leu Pro Ser 415 GAG TAT PAC Giu Tyr Asn 430 CTA TCA CGT Leu Ser Arg CGC AGT GTT Arg Ser Val GTT TTT CTG Val Phe Leu 480 ACT GCC CTG rhr Ala Leu 495 768 816 864 912 960 1008 1056 1104 1152 1200 1248 1296 1344 1392 1440 1488 1536 ATA CTA TCC PAC GCG CCT ATT TCA CPA CGT -216- TCA TAT CAT PAT CPA CCT Ile Leu Cys Asn Ala Pro Ile 500 AGC CAA TTT GAT CGC CTG TTT Ser Gin Phe Asp Arg Leu Phe 515 TTT TCT ACC GGC GAT GAG GAG Phe Ser Thr Gly Asp Glu Glu 530 535 GAT TGG CGA AAA ACC ATA CTT Asp Trp Arg Lys Thr Ile Leu 545 550 TCG CTC TTC CGC CTG CTT AAA Ser Leu Phe Arg Leu Leu Lys 565 AAA ATT AAA AAT AAC CTA AAG Lys Ile Lys Asn Asn Leu Lys 580 TTA CTG GCA 
GAT ATT CAT CAA 25 Leu Leu Ala Asp Ile His Gin 595 CTG ATT GCC GTA GGT GAA GGA Leu Ile Ala Val Gly Glu Gly 610 615 AAG CAA TTG GCT ACC CTG ATC Lys Gin Leu Ala Thr Leu Ile 625 630 CTA CAT ACA CAG AAG TGG AGT Leu His Thr Gin Lys Trp Ser 645 ACC AGC TAT AAC AAA ACG CTA Thr Ser Tyr Asn Lys Thr Leu 660 ACC GTC TAC CAC GGT TTA CAA Thr Val Tyr His Gly Leu Gin 675 CTA CAT GTC ATG GCG CCC TAT Leu His Val Met Ala Pro Tyr 690 695 GAA AAT GTC GCC CAC TCG GTA Glu Asn Val Ala His Ser Val 705 710 GGC GAC GGC GCA ATG ACA GCA Gly Asp Gly Ala Met Thr Ala 725 AAG TAT ACG CCG GGT TCA TCG Lys Tyr Thr Pro Gly Ser Ser 740 GTT CAG TAT TGT CAG GCT CTG Val Gin Tyr Cys Gin Ala Leu 755 ACC GGC ATC AAC GAA AAC GCC Thr Gly Ile Asn Glu Asn Ala 770 775 Ser Gin Arg 505 AAT ACG CCA Asn Thr Pro 520 ATT GAT TTA Ile Asp Leu AAG CGT GCA Lys Arg Ala ATT ACC GAC Ile Thr Asp 570 AAT CTT TCC Asn Leu Ser 585 TTA ACC ATT Leu Thr Ile 600 AAA ACT AAT Lys Thr Asn AGA AAA CTC Arg Lys Leu GTA TTC CAG Val Phe Gin 650 ACG CCT GAA Thr Pro Glu 665 GGT TTT GAT Gly Phe Asp 680 ATT GCG GCC Ile Ala Ala CTC CTT TGG Leu Leu Trp GAA AAA TTC Glu Lys Phe 730 GAA GCC GTA Glu Ala Val 745 GCA CAA TTG Ala Gin Leu 760 TTC CGT CTA Phe Arg Leu 1584 1632 1680 1728 1776 1824 1872 1920 1968 2016 2064 2112 2160 2208 2256 2304 2352 -217- ATG TTT GGC GCT Met Phe Gly Ala 785 CTC ATT ATG CTG GCA ACT Ala Thr GGA GCA GCG Gly Ala Ala
CCC.CCG
CAT GAT GCC CTT Pro Ala His Asp Ala Leu *o
C
Leu Ile Met AAA GCG TCC Lys Ala Ser GAA CAA CTG Giu Gin Leu 835 GCC ACT ATT Ala Ser Ile 850 GAA AAT GCG Glu Asn Ala 865 25 GTT A-AT GTC Val Asn Val TTG G-C GGG 30 Leu Vai Gly GCC CAG TGC Ala Gin Trp 35 915 CAA CAG GCT Gin Gin Ala 930 CCA TTA AC Ala Leu Ser 945 ATT A-AA AC Ile Lys Ser GTT TCT GCG Val Ser Ala ATT CAA CTC Ile Gin Leu 995 PAT TCG GGG Asn Ser Gly 1010 PAT AAA CGC Asn Lys Arq 1025 CCG GAA AAC Pro G.lu Asn ATG GAC GCA Met Aso Ala Leu
TCG
Ser 820
GCT
Ala
CA
Gin
TTC
Phe
GCA
Ala
CTG
Leu 900
GAA
Glu
AAT
Asn
ACC
Thr
CGT
Arg Ala 980
TACC
Tyr GTT I VLalI rAC I Tyr S TAT Tyr I ACA CCT TTT CC CAT Thr Arg Phe Ala Asp 805 CTC CTA CC CCA TTT Val Leu Ala Ala Phe 825 CAT CCC ATG PAT CTT Asp Ala Met Asn Leu 840 GCA CPA PAT CAT CPA Ala Gin Asn His Gin 855 TCC TCT TCC ACA TCT Ser Cys Trp Thr Ser 870 CPA CPA TTG PAT GTC Gin Gin Leu Asn Val 885 CAT TAT ATT CPA TCA Asp Tyr Ile Gin Ser 905 AAC CC GCA CCC CTA A.sn Ala Ala Cly Val 920 ACA TTA CAC CCT TTT Thr Leu His Ala Phe 935 TAC TAT ATC CCT CPA ryr Tyr Ile Arg Gin 950 GAT CAC TTC TAT CPA ksp Asp Leu Tyr Gin 965 %.TA AAA ACC ACC CCC Ile Lys Thr Thr Arg 985 3TC PAC CCC CCA TTGC lai Asn Arq Ala Leu C 1000 ~TC AGC CCC CPA TTC T Ele Ser Arg Gin Phe P 1015 GC ACT TCG C CCT G ~er Thr Trp Ala Cly V~ 1030 ~TT CAT CCC ACC ATC C le Asp Pro Thr Met A .0451 PAC CCA CT) Asn Ala Lei CPA CCT PAC TCC TTA Giu Ala Asn Ser Leu 830 GATCGCT PAT TTC CTC Asp Ala Asn Leu Leu 845 CAT CTT CCC CCA CTA His Leu Pro Pro Val 860 ATC PAT ACT ATC CTC Ile Asn Thr Ile Leu 875 CCC CCA CAC CCC CTT Ala Pro Gin Gly Val 890 ATC AAA GAG ACA CCC Met Lys Clu Thr Pro 910 TTA ACC CCC CCC TTC Leu Thr Ala Cly Leu 925 CTG CAT CPA TCT CC Leu Asp Ciu Ser Arg 940 GTC CCC PAC GCA CC Val Ala Lys Ala Ala 955 TAC TTA CTC ATT CAT Tyr Leu Leu Ile Asp 1 9709 kTC CCC GPA CCC ATT C Ile Ala Glu Ala IleP 990 APA PAT CTC CPA CPA ;iu Asn Val Clu Giu A 1005 'TT ATC GAC TCC CAC ~he Ile Asp Trp Asp L 1020 ;TT TCT CPA TTA CTT T ~al Ser Gin Leu Val T 1035 CT ATC CCA CPA ACCA ~rg Ile Gly Gin Thr L 0501 Cly Ciu 815 ACC GCA Thr Ala TTG CPA Leu Gin ACT CCA Thr Pro CPA TCC Cmn Trp 880 TCC GCT Ser Ala 895 ACC TAT Thr Tyr BLAT TCA Asn Ser ACT CC Ser Ala 3 CC CCT kla Ala 960 AT CAC .sn Gin ~CC ACT ~la Ser LAT GCC ~sn Ala AA TAC .ys Tyr AC TAC yr Tyr 1040 PA ATC ys Met 2400 2448 2496 2544 2592 2640 2688 2736 2784 2832 2880 2928 2976 3024 3072 3120 3168 321f; ['TA CTG CPA TCC CTC .Leu 1060 Leu Gin Ser Val AGC CAA ACC CPA TTA PAC CCC CAT Ser GIn Ser Gin Leu Asn Ala Asp 1065 1070 -218- ACC GTC GAA GAT GCC TTT ATG TCT TAT CTG ACA TCG -TT. 
GAA CAA GTG 3264 Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gin Val 1075 1080 1085 GCT AAT CTT AAA GTT ATT AGC GCA TAT CAC GAT AAT ATT AAT AAC GAT 3312 Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Ile Asn Asn Asp 1090 1095 1100 CAA GGG CTG ACC TAT TTT ATC GGA CTC AGT GAA ACT GAT GCC GGT GAA 3360 Gin Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala Gly Glu 1105 1110 1115 1120 TAT TAT TGG CGC AGT GTC GAT CAC AGT AAA TTC AAC GAC GGT AAA TTC 3408 Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe 1125 1130 1135 GCG GCT AAT GCC TGG AGT GAA TGG CAT AAA ATT GAT TGT CCA ATT AAC 3456 Ala Ala Asn Ala Trp Ser Glu Trp His Lys lie Asp Cys Pro- Ile Asn 1140 1145 1150 CCT TAT AAA AGC-ACT ATC CGT CCA GTG ATA TAT AAA TCC CGC CTG TAT 3504 Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr 1155 1160 1165 CTG CTC TGG TTG GAA CAA AAG GAG ATC ACC AAA CAG ACA GGA AAT AGT 3552 Leu Leu Trp Leu Glu Gin Lys Glu Ile Thr Lys Gin Thr Gly Asn Ser 1170 1175 1180 30 AAA GAT GGC TAT CAA ACT GAA ACG GAT TAT CGT TAT GAA CTA AAA TTG 3600 Lys Asp Gly Tyr Gin Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu 1185 1190 1195 1200 GCG CAT ATC CGC TAT GAT GGC ACT TGG AAT ACG CCA ATC ACC TTT GAT 3648 g 35 Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe Asp 1205 1210 1215 GTC AAT AAA AAA ATA TCC GAG CTA AAA CTG GAA AAA AAT AGA GCG CCC 3696 Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 1220 1225 1230 GGA CTC TAT TGT GCC GGT TAT CAA GGT GAA GAT ACG TTG CTG GTG ATG 3744 Gly Leu Tyr Cys Ala Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met 1235 1240 1245 TTT TAT AAC CAA CAA GAC ACA CTA GAT AGT TAT AAA AAC GCT TCA ATG 3792 Phe Tyr Asn Gin Gin Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met 1250 1255 1260 CAA GGA CTA TAT ATC TTT GCT GAT ATG GCA TCC AAA GAT ATG ACC CCA 3840 Gin Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro 1265 1270 1275 1280 GAA CAG AGC AAT GTT TAT CGG GAT AAT AGC TAT CAA CAA TTT GAT ACC 3888 Glu Gin Ser Asn Val Tyr Arg Asp 
Asn Ser Tyr Gin Gin Phe Asp Thr 1285 1290 1295 AAT AAT GTC AGA AGA GTG AAT AAC CGC TAT GCA GAG GAT TAT GAG ATT 3936 Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile 1300 1305 1310 CCT TCC TCG GTA AGT AGC CGT AAA GAC TAT GGT TGG GGA GAT TAT TAC 3984 Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 1315 1320 1325 CTC AGC ATG GTA TAT AAC GGA GAT ATT CCA ACT ATC AAT TAC AAA GCC 4032 Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala 1330 1335 1340 GCA TCA AGT GAT TTA AAA ATC TAT ATC TCA CCA AAA TTA AGA ATT ATT 4080 Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile -219- 1345 1350 1355 1360 CAT AAT GGA TAT GAA GGA CAG AAG CGC AAT CAA TGC AAT CTG ATG AAT 4128 His Asn Gly Tyr Glu Gly Gin Lys Arg Asn Gin Cys Asn Leu Met Asn 1365 1370 1375 AAA TAT GGC AAA CTA GGT GAT AAA TTT ATT GTT TAT ACT AGC TTG GGG 4176 Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly 1380 1385 1390 GTC AAT CCA AAT AAC TCG TCA AAT AAG CTC ATG TTT TAC CCC GTC TAT 4224 Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 1395 1400 1405 CAA TAT AGC GGA AAC ACC AGT GGA CTC AAT CAA GGG AGA CTA CTA TTC 4272 Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gin Gly Arg Leu Leu Phe 1410 1415 1420 CAC CGT GAC ACC ACT TAT CCA TCT AAA GTA GAA GCT TGG ATT CCT GGA 4320 His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile Pro Gly 1425 1430 1435 1440 GCA AAA CGT TCT CTA ACC AAC CAA AAT GCC GCC ATT GGT GAT GAT TAT 4368 Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala Ala Ile Gly Asp Asp Tyr 25 1445 1450 1455 GCT ACA GAC TCT CTG AAT AAA CCG GAT GAT CTT AAG CAA TAT ATC TTT 4416 Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gin Tyr Ile Phe 3 1460 1465 1470 ATG ACT GAC AGT AAA GGG ACT GCT ACT GAT GTC TCA GGC CCA GTA GAG 4464 Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu 1475 1480 1485 S 35 ATT AAT ACT GCA ATT TCT CCA GCA AAA GTT CAG ATA ATA GTC AAA GCG 4512 l* le Asn Thr Ala Ile Ser Pro Ala Lys Val Gin Ile Ile Val Lys Ala 1490 1495 1500 
GGT GGC AAG GAG CAA ACT TTT ACC GCA GAT AAA GAT GTC TCC ATT CAG 4560 Gly Gly Lys Glu Gin Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gin 1505 1510 1515 1520 CCA TCA CCT AGC TTT GAT GAA ATG AAT TAT CAA TTT AAT GCC CTT GAA 4608 Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gin Phe Asn Ala Leu Glu 1525 1530 1535 ATA GAC GGT TT GGT CTG AAT TTT ATT AAC AAC TCA GCC AGT ATT GAT 4656 Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile Asp 1540 1545 1550 GTT ACT TTT ACC GCA TTT GCG GAG GAT GGC CGC AAA CTG GGT TAT GAA 4704 Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu 1555 1560 1565 AGT TTC AGT ATT CCT GTT ACC CTC AAG GTA AGT ACC GAT AAT GCC CTG 4752 Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu 1570 1575 1580 ACC CTG CAC CAT AAT GAA AAT GGT GCG CAA TAT ATG CAA TGG CAA TCC 4800 Thr Leu His His Asn Glu Asn Gly Ala Gin Tyr Met Gin Trp Gin Ser 1585 1590 1595 1600 TAT CGT ACC CGC CTG AAT ACT CTA TTT GCC CGC CAG TTG GTT GCA CGC 4848 Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gin Leu Val Ala Arg 1605 1610 1615 GCC ACC ACC GGA ATC GAT ACA ATT CTG AGT ATG GAA ACT CAG AAT ATT 4896 Ala Thr Thr CGly Ile Asp Thr Ile Leu Ser Met Glu Thr Gin Asn Ile 1620 1625 1630 CAG GPA CCG CAG TTA GGC AAA GGT TTC TAT GCT ACG TTC GTG ATA CCT 4944 -220- Gln Glu Pro Gin Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Prc 1635 1640 1645 CCC TAT AAC CTA TCA ACT CAT GGT GAT GAA CGT TGG TTT AAG CTT TAT 4992 Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr 1650 1655 1660 ATC AAA CAT GTT GTT GAT AAT AAT TCA CAT ATT ATC TAT TCA GGC CAG 5040 Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gin 1665 1670 1675 1680 CTA ACA GAT ACA AAT ATA AAC ATC ACA TTA TTT ATT CCT CTT GAT GAT 5088 Leu Thr Asp Thr Asn lie Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp 15 1685 1690 1695 GTC CCA TTG AAT CAA GAT TAT CAC GCC AAG GTT TAT ATG ACC TTC AAG 5136 Val Pro Leu Asn Gin Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys 1700 1705 1710 AAA TCA CCA TCA GAT GGT ACC TGG TGG GGC CCT CAC TTT 
GTT AGA GAT 5184 Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp 1715 1720 1725 2 GAT AAA GGA ATA GTA ACA ATA AAC CCT AAA TCC ATT TTG ACC CAT TTT 5232 Asp Lys Gly Ile Val Thr lie Asn Pro Lys Ser Ile Leu Thr His Phe 1730 1735 1740 GAG AGC GTC AAT GTC CTG AAT AAT ATT AGT AGC GAA CCA ATG GAT TTC 5280 Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe 1745 1750 1755 1760 AGC GGC GCT AAC AGC CTC TAT TTC TGG GAA CTG TTC TAC TAT ACC CCG 5328 Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 1765 1770 1775 ATG CTG GTT GCT CAA CGT TTG CTG CAT GAA CAG AAC TTC GAT GAA GCC- 5376 Met Leu Val Ala Gin Arg Leu Leu His Glu Gin Asn Phe Asp Glu Ala 1780 1785 1790 AAC CGT TGG CTG AAA TAT GTC TGG AGT CCA TCC GGT TAT ATT GTC CAC 5424 Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr lie Val His 1795 1800 1805 GGC CAG ATT CAG AAC TAC CAG TGG AAC GTC CGC CCG TTA CTG GAA GAC 5472 Gly Gin lie Gin Asn Tyr Gin Trp Asn Val Arg Pro Leu Leu Glu Aso 1810 1815 1820 ACC AGT TGG AAC AGT GAT CCT TTG GAT TCC GTC GAT CCT GAC GCG GTA 5520 ***Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val 1825 1830 1835 1840 GCA CAG CAC GAT CCA ATG CAC TAC AAA GTT TCA ACT TTT ATG CGT ACC 5568 Ala Gin His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr 1845 1850 1855 TTG GAT CTA TTG ATA GCA CGC GGC GAC CAT GCT TAT CGC CAA CTG GAA 5616 Leu Asp Leu Leu lie Ala Arg Gly Asp His Ala Tyr Arg Gin Leu Glu 1860 1865 1870 CGA GAT ACA CTC AAC GAA GCG AAG ATG TGG TAT ATG CAA GCG CTG CAT 5664 Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gin Ala Leu His 1875 1880 1885 CTA TTA GGT GAC AAA CCT TAT CTA CCG CTG AGT ACG ACA TGG AGT GAT 5712 Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp 1890 1895 1900 CCA CGA CTA GAC AGA GCC GCG GAT ATC ACT ACC CAA AAT GCT CAC GAC 5760 Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gin Asn Ala His Aso 1905 1910 1915 1920 -221- AGC GCA ATA GTC GCT CTG CGG CAG AAT ATA.CCT ACA CCG GCA CCT TTA 5808 Ser Ala Ile Val Ala Leu Arg Gin 
Asn Ile Pro Thr Pro Ala Pro Leu 1925 1930 1935 TCA TTG CGC AGC GCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC 5856 Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gin Ile 1940 1945 1950 AAT GAA GTG ATG ATG AAT TAC TGG CAG ACA TTA GCT CAG AGA GTA TAC 5904 Asn Glu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr 1955 1960 1965 AAT CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA 5952 Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gin Pro Leu Tyr Leu Pro 1970 1975 1980 ATC TAT GCC ACA CCG GCC GAT CCG AAA GCG TTA CTC AGC GCC GCC GTT 6000 Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val 1985 1990 1995 2000 GCC ACT TCT CAA GGT GGA GGC AAG CTA CCG GAA TCA TTT ATG TCC CTG 6048 Ala Thr Ser Gln-Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu 2005 2010 2015 :25 TGG CGT TTC CCG CAC ATG CTG GAA AAT GCG CGC GGC ATG GTT AGC CAG 6096 Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gir 2020 2025 2030 V CTC ACC CAG TTC GGC TCC ACG TTA CAA AAT ATT ATC GAA CGT CAG GAC 6144 Leu Thr Gin Phe Gly Ser Thr Leu Gin Asn Ile Ile Glu Arg Gin Asp .2035 2040 2045 GCG GAA GCG CTC AAT GCG TTA TTA CAA AAT CAG GCC GCC GAG CTG ATA 6192 Ala Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu Leu Ile 35 2050 2055 2060 TTG ACT AAC CTG AGC ATT CAG GAC AAA ACC ATT GAA GAA TTG GAT GCC 6240 Leu Thr Asn Leu Ser Ile Gin Asp Lys Thr Ile Glu Glu Leu Asp Ala 2065 2070 2075 2080 GAG AAA ACG GTG TTG GAA AAA TCC AAA GCG GGA GCA CAA TCG CGC TTT 6288 Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gin Ser Arg Phe 2085 2090 2095 GAT AGC TAC GGC AAA CTG TAC GAT GAG AAT ATC AAC GCC GGT GAA AAC 6336 Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn 2100 2105 2110 CAA GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT 6384 Gin Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val 2115 2120 2125 CAG GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC 6432 Gin Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile 2130 2135 2140 TTC GGC 
TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG 6480 Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala 2145 2150 2155 2160 ACA GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG 6528 Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala 2165 2170 2175 GAT AAA ATT AGC CAA TCT GAA ACC TAC CGT CGT CGC CGT CAG GAG TGG 6576 Asp Lys Ile Ser Gin Ser Glu Thr Tyr Arg Arg Arg Arg Gin Glu Tro 2180 2185 2190 GAG ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT 6624 Glu Ile Gin Arg Asn Asn Ala Glu Ala Glu Leu Lys Gin Ile Asp Ala 2195 2200 2205 -222- CAG CTC AAA TCA CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA 6672 Gin Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gin Lys 2210 2215 2220 ACC AGT CTG AAA ACC CAA CAA GAA CAG ACC CAA TCT CAA TTG GCC TTC 6720 Thr Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu Ala Phe 2225 2230 2235 2240 CTG CAA CGT AAG TTC AGC AAT CAG GCG TTA TAC AAC TGG CTG CGT GGT 6768 Leu Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu Arg Gly 2245 2250 2255 CGA CTG GCG GCG ATT TAC TTC CAG TTC TAC GAT TTG GCC GTC GCG CGT 6816 Arg Leu Ala Ala Ile Tyr Phe Gin Phe Tyr Asp Leu Ala Val Ala Arg 2260 2265 2270 TGC CTG ATG GCA GAA CAA GCT TAC CGT TGG GAA CTC AAT GAT GAC TCT 6864 Cys Leu Met Ala Glu Gin Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser 2275 2280 2285 GCC CGC TTC ATT AAA CCG GGC GCC TGG CAG GGA ACC TAT GCC GGT CTG 6912 Ala Arg Phe Ile Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu 25 2290 2295 2300 25 CTT GCA GGT GAA ACC TTG ATG CTG AGT CTG GCA CAA ATG GAA GAC GCT 6960 Leu Ala Gly Giu Thr Leu Met Leu Ser Leu Ala Gin Met Glu Asp Ala 2305 2310 2315 2320 30 CAT CTG AAA CGC GAT AAA CGC GCA TTA GAG GTT GAA CGC ACA GTA TCG 7008 His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser 2325 2330 2335 CTG GCC GAA GTT TAT GCA GGA TTA CCA AAA GAT AAC GGT CCA TTT TCC 7056 Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser 2340 2345 2350 CTG GCT CAG GAA ATT GAC AAG CTG GTG AGT CAA GGT TCA 
GGC AGT GCC 7104 Leu Ala Gin Glu Ile Asp Lys Leu Val Ser Gin Gly Ser Gly Ser Ala 2355 2360 2365 GGC AGT GGT AAT AAT AAT TTG GCG TTC GGC GCC GGC ACG GAC ACT AAA 7152 Gly-Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Giy Thr Asp Thr Lys 2370 2375 2380 ACC TCT TTG CAG GCA TCA GTT TCA TTC GCT GAT TTG AAA ATT CGT GAA 7200 Thr Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu 2385 2390 2395 2400 GAT TAC CCG GCA TCG CTT GGC AAA ATT CGA CGT ATC AAA CAG ATC AGC 7248 Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gin Ile Ser 2405 2410 2415 GTG AGT TTG CCC GGG CTA GTG GGA CCG TAT GAG GAT GTA GAG GGA ATA 7296 Val Thr Len Pro Ala Leu Len Gly Pro Tyr Gin Asp Val Gin Ala Ile 2420 2425 2430 TTG TCT TAC GGC GAT AAA'GCC GGA TTA GCT AAC GGC TGT GAA GCG CTG 7344 Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Gys Glu Ala Leu 2435 2440 2445 GCA GTT TCT CAC GGT ATG AAT GAC AGC GGC CAA TTC CAG CTC GAT TTC 7392 Ala Val Ser His Gly Met Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe 2450 2455 2460 AAC GAT GGC AAA TTC CTG CCA TTC GAA GGC ATC GCC ATT GAT CAA GGC 7440 Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gin Gly 2465 2470 2475 2480 ACG CTG ACA CTG AGC TTC CCA AAT GCA TCT ATG CCG GAG AAA GGT AAA 7488 Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys -223- 2485 2490 2495 CAA GCC ACT ATG TTA AAA ACC CTG AAC GAT ATC ATT TTG CAT ATT CGC 7536 Gin Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg 2500 2505 2510 TAC ACC ATT AAA TAA 7551 Tyr Thr Ile Lys 2516 INFORMATION FOR SEQ ID NO:47: SEQUENCE CHARACTERISTICS: LENGTH: 2516 amino acids TYPE: amino acids STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47 (TcdA):
Met 1 Gly Arg Tyr Arg 65 Ala Ser Phe Leu Leu 145 Features Peptide Peptide Fragment Fragment Fragment Fragment Fragment Fragment Fragment Fragment Peptide Fragment Fragment Glu Ser Val 5 Asn Cys Leu Gin Val Ser Asp Ala Gin Leu Lys Arg Leu Ala Pro Arg Ala Ser 100 Pro Ala Ala 115 Ala Ser Asp Ser Met Ala 89 284 554 1080 1385 1478 1620 1938 1938 2327 2398 Glu Ile Asp Ile His Leu 40 Ala Gin Asn Pro Ala Glu Tyr Val Leu Thr 120 Val Tyr 135 Ser Gin 100 299 563 1092 1400 1497 1642 1948 2516 2345 2408 Pro I Ser 1 25 Ser 1 Lys I Gin I Leu Ala E 105 Glu I Tyr I Gin P From To 1 2516 89 1937 Description TcdA proteins TcdAii peptide TcdAii N-terminus (SEQ (SEQ ID NO:38) (SEQ ID NO:17) (SEQ ID NO:23; 12/13) (SEQ ID (SEQ ID (SEQ ID (SEQ ID TcdAiii (SEQ ID (SEQ ID Val Leu Ser Ser Ser Glu Asn Arg i Gin Asn 75 SGly Tyr Gly Thr Tyr Arg Asp Thr 140 Met Asp 155 NO: 18) NO:39) NO:21; NO:41) peptide NO: 42) NO:43) Lys Ser Phe Asn Thr His Leu Tyr Ala Val Asn Asn Val Ser 110 Glu Ala 125 Arg Arg Ile Glu 19/23) ID NO:13) Cys Phe Leu Ala Leu Phe Met Asn Asp Ser 160 Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys Thr Glu -224- Ser Lys Leu Giu Asn Tyr Thr Lys Val 180 185 Arg Pro Ser Gly Ala Thr Pro Tyr His 195 200 Glu Val Ile Gin Leu Gin Asp Pro Gly 210 215 Pro Ala Ile Ala Gly Leu Met His Gin 225 230 Ala Ser Ile Ser Pro Giu Leu Phe Asn 245 Giu Gly Asn Ala Giu Giu Leu Tyr Lys 260 265 Pro Ala Ser Leu Ala Met Pro Giu Tyr 275 280 Ser Asp Giu Glu Leu Ser Gin Phe Ile 25 290 295 Gin Gin Giu Tyr Ser Asn Asn Gin Leu 305 310 30 Ser Asp Gly Thr Val Lys Val Tyr Arg 325 Asn Ala Tyr Gin Met Asp Val Giu Leu 340 345 Tyr Arg Leu Asp Tyr Lys Phe Lys Asn 355 360 Ser Ile Lys Leu Asn Asp Lys Arq Giu 370 375 Pro Gin Val Asn Ile Giu Tyr Ser Ala 385 390 Asp Ile Ser Gin Pro Phe Giu Ile Gly 405 Gly Ser Trp Ala Tyr Ala Ala Ala Lys 425 Gin Tyr Ser Phe Leu Leu Lys Leu Asn 435 440 Ala Thr Giu Leu Ser Pro Thr Ile Leu 450 455 Asn Leu Gin*-Leu Asp Ile Asn Thr Asp 465 470 Thr Lys Tyr Tyr Met Gin Arg Tyr Ala 485 Ile Leu Cys 
Asn Ala Pro Ile Ser Gin 65500 505 Ser Gin Phe Asp Arg Leu Phe Asn Thr 515 520 Phe Ser Thr Gly Asp Giu Giu Ile Asp 530 S135 175 Ser Thr Phe 190 Asn Val Arq Asn Ala Ser Gly Ile Asn 240 Giu Ile Thr 255 Asn Ile Giu 270 Tyr Asn Leu Asn Phe Gly Val Asn Ser 320 Tyr Thr Thr 335 Gly Giu Asn 350 Ser Tyr Leu Giu Gly Ala Asn Thr Ala 400 Leu Pro Ser 415 Glu Tyr Asn 430 E-ef -SvrArg Arg Ser Val Val Phe Leu 480 Thr Ala Leu 495 Asn Gin Pro 510 Gly Gin Tyr Ser Thr Gly -22 Tru Arg Lys Thr Leu Lys Arg Ala Phe Asn Ile Asp Asp Val 555 560
Ser Leu Lys Ile Leu Leu Leu Ile 610 Lys G I 625 Leu His Thr Ser 25 Thr Val Leu H is 690 Giu Asn 705 Gly Asp 35 Lys Tyr 40 Val Gin Thr---G-ly 770 Met Phe 785 Leu lie Lys Ala Giu Gin Ala Ser 850 Giu Asn 865 Val Asn Leu Val Phe Lys Ala 595 Al a Leu Thr Tyr Tyr 675 Val1 ValI Gi y Thr Tyr 755 Ile Gi y Met Ser Leu 835 Ile Al a Val1 Gly Arg Asn 580 Asp ValI Al a Gin Asn 660 His Met Al a Al a Pro 740 Cys As n Ala Leu Ser 820 Ala Gin Phe Al a Leu 900 Leu 565 Asn Ile Gi y Thr Lys 645 Lys Gly Ala His Met 725 Gi y Gin Giu Ala Thr 805 Val Asp Al a Ser Gin 885 Asp Leu Leu His Giu Leu 630 T rp Thr Le u Pro Ser 710 Thr Ser Al a As n Thr 790 Arg Leu Al a Gin Cys 870 Gin Tyr Lys Lys Gin Gi y 615 Ile Ser Leu Gin T yr 695 ValI Al a Ser Leu Ala 775 Gly Phe Ala Met As n 855 Trp Le u Ile Ile Asn Leu 600 Lys Arg Val1 Thr Gi y 680 Ile Leu Giu Giu Al a 760 Phe Ala Ala Ala Asn 840 His Thr As n Gin Thr Leu 585 Th r Thr Lys Phe Pro 665 Phe Ala Leu Lys Ala 745 Gin Arg Ala Asp Phe 825 Leu Gin Ser Val Se r 905 Asp 570 Ser Ile Asn Le u Gin 650 Giu Asp Al a T rp Phe 730 Val1 Leu Leu Pro Trp 810 Giu Asp His Ile Al a 890 Met His Asn Asp Leu As n 635 Leu Ile Lys Thr Al a 715 Trp Glu Giu Phe Aia 795 Val Al a Ala Leu Asn 875 Pro Lys Asp Leu Giu Ser 620 Thr Phe Lys Asp Le u 700 Asp Asp Thr Met Val 780 His Asn Asn Asn Pro 860 Thr Gin Glu Asn Tyr Leu 605 Ala.
Ile Ile Asn Lys 685 Gin Lys Trp Gin Val 765 Thr Asp Ala Ser Leu 845 Pro Ile Gly Thr Ala Gin Trp 915 Giu Asn Ala Ala Val Leu Thr Ala Leu Asn Ser -226- Gln Gin Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala 930 935 940 Ala Leu Ser Thr Tyr Tyr Ile Arg Gin Val Ala Lys Ala Ala Ala Ala 945 950 955 960 Ile Lys Ser Arg Asp Asp Leu Tyr Gin Tyr Leu Leu Ile Asp Asn Gin 965 970 975 Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser 980 985 990 Ile Gin Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala 995 1000 1005 Asn Ser Gly Val Ile Ser Arg Gin Phe Phe Ile Asp Trp Asp Lys Tyr 1010 1015 1020 Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gin Leu Val Tyr Tyr 1025 1030 1035 1040 Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gin Thr Lys Met 1045 1050 1055 Met Asp Ala Leu Leu Gin Ser Val Ser Gin Ser Gin Leu Asn Ala Asp 1060 1065 1070 Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gin Val 30 1075 1080 1085 Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Ile Asn Asn Asp 1090 1095 1100 35 Gin Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala Gly Glu 1105 1110 1115 1120 Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe 1125 1130 1135 Ala Ala Asn Ala Trp Ser Glu Trp His Lys Ile Asp Cys Pro Ile Asn 1140 1145 1150 S* Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr 1155 1160 1165 Leu Leu Trp Leu Glu Gin Lys Glu Ile Thr Lys Gin Thr Gly Asn Ser 1170 1175 1180 1 50 Lys Asp Gly Tyr Gin Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu 1185 1190 1195 1200 Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe Asp 1205 1210 1215 Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 1220 1225 1230 Gly Leu Tyr Cys Ala Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met 1235 1240 1245 Phe Tyr Asn Gin Gin Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met 1250 1255 1260 Gin Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro 1265 1270 1275 1280 Glu Gin Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gin Gin Phe Asp Thr 1285 1290 1295 Asn Asn 
Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile -227- 1300 1305 11 1310 Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 1315 1320 1325 Leu Ser Met Val Tyr Asn Gly Asp Ile Pro-Thr Ile Asn Tyr Lys Ala 1330 1335 1340 Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile 1345 1350 1355 1360 His Asn Gly Tyr Giu Gly Gin Lys Arq Asn Gin Cys Asn Leu Met Asn 1365 1370 1375 Lys Tyr Giy Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly 1380 1385 1390 Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 1395 1400 1405 Gin Tyr Ser Giy Asn Thr Ser Giy Leu Asn Gin Gly Arg Leu Leu Phe 1410 1415 1420 Arg Asp Thr Thr Tyr Pro Ser Lys Vai Giu Ala Trp Ile Pro Gly @0:25 .1425 1430 1435 1440 *Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala Ala le Gly Asp Asp Tyr .1445 1450 145 30 Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gin Tyr Ile Phe **1460 1465 1470 Met Thr Asp Ser Lys Giy Thr Ala Thr Asp Val Ser Gly Pro Vai Giu 1475 1480 1485 0le Asn Thr Ala Ile Ser Pro Ala Lys Val Gin Ile Ile Val Lys Ala 1490 1495 1500 Gly Gly Lys Giu Gin Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gin 1505 1510 1515 1520 .Pro Ser Pro Ser Phe Asp Giu Met Asn Tyr Gin Phe Asn Ala Leu Giu S1525 1530 1535 Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile ASD *1540 1545 1550 Val Thr Phe Thr Ala Phe Ala Giu Asp Gly Arg Lys Leu Gly Tyr Glu Soso -55- 1560 1565 50 Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn-A-r.- Leu 1570 1575 1580 Thr Leu His His Asn Giu Asn Gly Ala Gin Tyr Met Gin Trp Gin Ser 1585 1590 1595 1600 Tyr Arg Thr-Arg Leu Asn Thr Leu Phe Ala Arg Gin Leu Val Ala Arg 1605 1610 1615 Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gin Asn Ile 1620 1625 1630 Gin Giu Pro Gin Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro 1635 1640 1645 Pro Tyr Asn Leu Ser Thr His Gly Asp Giu Arg Trp Phe Lys Leu Tyr 1650 1655 1660 Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gin 1665 1670 1675 1680 -228- Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile 
Pro Leu Asp Asp 1685 1690 1695 Val Pro Leu Asn Gin Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys 1700 1705 1710 Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp 1715 1720 1725 Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe 1730 1735 1740 Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe 1745 1750 1755 1760 Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 1765 1770 1775 Met Leu Val Ala Gin Arg Leu Leu His Glu Gin Asn Phe Asp Glu Ala 1780 1785 1790 Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val His 1795 1800 1805 25 Gly Gin Ile Gin Asn Tyr Gin Trp Asn Val Arg Pro Leu Leu Glu Asp 1810 1815 1820 Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val 1825 1830 1835 1840 Ala Gin His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr 1845 1850 1855 Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gin Leu Glu 1860 1865 1870 Arg Asp Thr Leu Asn Glu Ala Lvs Met Trp Tyr Met Gin Ala Leu His 1875 1880 1885 Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp 1890 1895 1900 Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gin Asn Ala His Asp 1905 1910 1915 1920 Ser Ala Ile Val Ala Leu Arg Gin Asn Ile Pro Thr Pro Ala Pro Leu 1925 1930 1935 Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gin Ile 50 1940 1945 1950 Asn Glu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr 1955 1960 1965 Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gin Pro Leu Tyr Leu Pro 1970 1975 1980 Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val 1985 1990 1995 2000 Ala Thr Ser Gin Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu 2005 2010 2015 Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gin 2020 2025 2030 Leu Thr Gin Phe Gly Ser Thr Leu Gin Asn Ile Ile Glu Arg Gln Asp 2035 2040 2045 Ala Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu Leu Ile 2050 2055 2060 -229- Leu Thr Asn Leu Ser Ile Gin Asp Lys Thr Ile Glu Glu Leu Asp Ala 2065 2070 2075 2080 Glu Lys Thr Val Leu Glu Lys Ser Lys 
Ala Gly Ala Gin Ser Arg Phe 2085 2090 2095 Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn 2100 2105 2110 Gin Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val 2115 2120 2125 Gin Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile 2130 2135 2140 Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala 2145 2150 2155 2160 Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala 2165 2170 2175 Asp Lys Ile Ser Gin Ser Glu Thr Tyr Arg Arg Arg Arg Gin Glu Trp 2180 2185 2190 25 Glu Ile Gin Arg Asn Asn Ala Glu Ala Glu Leu Lys Gin Ile Asp Ala 2195 2200 2205 Gin Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gin Lys 30 2210 2215 2220 Thr Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu Ala Phe 2225 2230 2235 2240 Leu Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu Arg Gly 2245 2250 2255 Arg Leu Ala Ala Ile Tyr Phe Gin Phe Tyr Asp Leu Ala Val Ala Arg 2260 2265 2270 Cys Leu Met Ala Glu Gin Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser 2275 2280 2285 Ala Arg Phe Ile Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu 2290 2295 2300 Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gin Met Glu Asp Ala 2305 2310 2315 2320 50 His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser 2325 2330 2335 Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser 2340 2345 2350 Leu Ala Gin Glu Ile Asp Lys Leu Val Ser Gin Gly Ser Gly Ser Ala 2355 2360 2365 Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys 2370 2375 2380 Thr Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu 2385 2390 2395 2400 Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gin Ile Ser 2405 2410 2415 Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gin Asp Val Gin Ala Ile 2420 2425 2430 Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu -230- 2435 2440 2445 Ala Val Ser His Gly Met Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe 2450 2455 2460 Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gin Gly 2465 2470 2475 2480 Thr Leu Thr Leu Ser 
Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys 2485 2490 2495 Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg 2500 2505 2510 Tyr Thr Ile Lys 2516 INFORMATION FOR SEQ ID NO:48: SEQUENCE CHARACTERISTICS: LENGTH: 5547 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48 (tcdAii coding region) CTG ATA Leu Ile 1 GCG CCG Ala Pro GAA CTT Glu Leu TAT CTG Tyr Leu 50 CAA AAT Gln Asn 65 TTA TTG Leu Leu GTG ATG Val Met CAT GAT His Asp GGA CTT Gly Leu 130 GGC TAT AAC Gly Tyr Asn 5 GGT ACC GTT Gly Thr Val 20 TAT CGT GAA Tyr Arg Glu GAT ACC CGC Asp Thr Arg ATG GAT ATA Met Asp Ile GAA AGC ATT Glu Ser Ile GAA ATG CTC Glu Met Leu 100 GCT TAT GAA Ala Tyr Glu 115 GAG CAA CTC Glu Gln Leu
AAT
Asn
TCT
Ser
GCA
Ala
CGC
Arg
GAA
Glu 70
AAA
Lys
TCC
Ser
AAT
Asn
AAT
Asn
CAA
Gln
TCC
Ser
CGC
Arg
CCA
Pro 55
TTA
Leu
ACT
Thr
ACT
Thr
GTG
Val
GCA
Ala 135
TTT
Phe
ATG
Met
AAT
Asn 40
GAT
Asp
TCC
Ser
GAA
Glu
TTC
Phe
CGT
Arg 120
TCA
Ser AGC GGT Ser Gly 10 TTC TCC Phe Ser TTA CAC Leu His CTC AAA Leu Lys ACA CTC Thr Leu TCT AAA Ser Lys 90 CGT CCT Arg Pro 105 GAA GTT Glu Val CCG GCA Pro Ala GCT TCA Ala Ser AGA GCC AGT CAA TAT GTT 48 Ala Ser GCC GCT Ala Ala AGT GAC Ser Asp ATG GCG Met Ala TTG TCC Leu Ser GAA AAC Glu Asn GGC GCA Gly Ala CAG CTA Gln Leu 125 GCC GGG Ala Gly 140 TCG CCT Ser Pro Tyr
TTG
Leu
GTT
Val
AGT
Ser
GAG
Glu
ACT
Thr
CCT
Pro
GAT
Asp
ATG
Met
CTA
Leu Val ACT 96 Thr TAT 144 Tyr CAG 192 Gln CTG 240 Leu AAA 288 Lys TAT 336 Tyr CCT 384 Pro CAT 432 His TTT 480 Phe 160 GCC TCC CTA TTG GGT ATT AAC Ala Ser Leu Leu Ile Asn AAT ATT CTG ACG GAG GAG ATT Asn Ile Leu Thr Glu Glu Ile
ACC
Thr GAA GGT Glu 4
AAG
Lys
TAC
Tyr
ATT
Ile
CTT
Leu 225
CGG
Arg
CTA
Leu
AAT
Asn
GAA
Glu
GCA
Ala 305
GGC
Gly
AAA
Lys
AAC
Asn
CTG
Leu
GAC
Asp 385
GCT
Ala
CAA
Gln AAA AAT Lys Asn CTT AAA Leu Lys 195 GGT AAA Gly Lys 210 ATT ACT Ile Thr ATC ACC Ile Thr TTT CCC Phe Pro TTT TAT Phe Tyr 275 CTT GTT Leu Val 290 AAT ATC Asn Ile CTG ACA Leu Thr TTT ACC Phe Thr AAG GCT Lys Ala 355 GAA GGC Glu Gly 370 GTA TTA Val Leu ATT CAT Ile His CGT TCA Arg Ser
GGT
Gly
TAT
Tyr
AGC
Ser
GTA
Val
GAA
Glu 245
GGT
Gly
GCC
Ala
ACT
Thr
TTA
Leu
GTA
Val 325
GAA
Glu
CGT
Arg
GTG
Val
AAA
Lys
GAA
Glu 405
GAT
Asp
ATC
Ile
AAT
Asn
TTT
Phe 215
AAC
Asn
ACA
Thr
GAG
Glu
TAT
Tyr
GGC
Gly 295
ACC
Thr
CCT
Pro
TAT
Tyr
TCA
Ser
AGT
Ser 375
TTT
Phe
GCC
Ala
CAA
Gln
GAA
Glu
TTA
Leu 200
GGT
Gly
AGC
Ser
ACC
Thr
AAT
Asn
TTA
Leu 280
GCT
Ala
GCT
Ala
TCC
Ser
AAC
Asn
CGT
Arg 360
GTT
Val
CTG
Leu
CTG
Leu
CCT
Pro
CCG
Pro 185
AGC
Ser
CAA
Gln
AGT
Ser
AAT
Asn
TAT
Tyr 265
TCC
Ser
CCT
Pro
GAT
Asp
GGT
Gly
CAA
Gln 345
GCG
Ala
AAT
Asn
ACT
Thr
ATA
Ile
AGC
Ser 425 Gly 170
GCC
Ala
GAT
Asp
CAG
Gln
GAT
Asp
GCT
Ala 250
CGG
Arg
ATC
Ile
CAA
Gln
ATC
Ile
TCT
Ser 330
TAC
Tyr
ACA
Thr
CTA
Leu
AAA
Lys
CTA
Leu 410
CAA
Gln
TCT
Ser
TCA
Ser
GAA
Glu
GAA
Glu
GGC
Gly 235
TAT
Tyr
TTA
Leu
AAG
Lys
GTC
Val
AGT
Ser 315
TGG
Trp
TCT
Ser
GAA
Glu
CAA
Gln
TAT
Tyr 395
TGC
Cys
TTT
Phe
ACC
TTG
Leu
GAA
Glu
TAT
Tyr 220
ACG
Thr
CAA
Gln
GAT
Asp
TTA
Leu
AAT
Asn 300
CAA
Gln
GCA
Ala
TTT
Phe
TTG
Leu
CTG
Leu 380
TAT
Tyr
AAC
Asn
GAT
Asp
GGC
GCT
Ala
CTT
Leu 205
AGT
Ser
GTT
Val
ATG
Met
TAT
Tyr
AAT
Asn 285
ATA
Ile
CCT
Pro
TAT
Tyr
CTG
Leu
TCA
Ser 365
GAT
Asp
ATG
Met
GCG
Ala
CGC
Arg
GAT
ATG
Met 190
AGT
Ser
AAT
Asn
AAG
Lys
GAT
Asp
AAA
Lys 270
GAT
Asp
GAA
Glu
TTT
Phe
GCC
Ala
CTA
Leu 350
CCC
Pro
ATC
Ile
CAG
Gln
CCT
Pro
CTG
Leu 430
GAG
CCC
Pro
CAG
Gln
AAC
Asn
GTA
Val
GTG
Val 255
TTC
Phe
AMA
Lys
TAC
Tyr
GAA
Glu
GCC
Ala 335
AAA
Lys
ACG
Thr
AAC
Asn
CGT
Arg
ATT
Ile 415
TTT
Phe
GAG
GAA 576 Glu TTT 624 Phe CAA 672 Gln TAT 720 Tyr 240 GAG 768 Glu AAA 816 Lys AGA 864 Arg TCC 912 Ser ATT 960 Ile 320 GCA 1008 Ala CTT 1056 Leu ATT 1104 Ile ACA 1152 Thr TAT 1200 Tyr 400 TCA 1248 Ser AAT 1296 Asn ATT 1344 AAT GCT GAG GAA CTT TAT 528 Asn Ala Glu Glu Leu Tyr 175 ACG CCA Thr Pro TTA CTG AAC GGA CAA TAT TTT Leu Leu Asn Gly Gln Tyr Phe Thr Gly Asp Glu Glu Ile 435 GAT TTA AAT TCA GGT Asp Leu Asn Ser Gly 450 440 AGC ACC GGC GAT TGG CGA AAA Ser Thr Gly Asp Trp Arg Lys 455 460 ATA CTT AAG 1392 Ile Leu Lys CGT GCA TTT AAT ATT GAT GAT GTC TCG CTC TTC CGC CTG CTT AA.
Arg 465
ACC
Thr
CTT
Leu
ACC
Thr
ACT
Thr
AAA
Lys 545
TTC
Phe
CCT
Pro
TTT
Phe
GCG
Ala
CTT
Leu 625
AAA
Lys GCC C Ala CAA T GinIL CGT C Arg L 6 GCG C Ala P 705 Alz
GAC
Asp
TCC
Ser
ATI
Ile
AAT
Asn 530
CTC
Leu
CAG
Gln
GAA
Glu
GAT
Asp
GCC
Ala 510
TGG
Trp
TTC
Phe
GTA
Val
:TG
Leu
TA
Leu 90
:CC
'ro a Phe
CAT
His
AAT
Asn
GAT
Asp 515
TTA
Leu
AAT
Asn
CTA
Leu
ATT
Ile
AAA
Lys 595
ACC
Thr
GCA
Ala
TGG
Trp
GAA
Glu
GAA
Glu 675
TTT
Phe
GCG
Ala Asr
GA
AsF
TTT
Leu 500
GAA
Glu
TCC
Ser
ACT
Thr
TTT
Phe
AAG
Lys 580
GAC
Asp
TTG
Leu
GAT
Asp
GAC
Asp
ACG
Thr 660
ATG
Met
GTG
Val
CAT
iis i Ile Asr 47C PAT AAP Asn Lys 485 TAT ATT Tyr Ile CTG GAT Leu Asp GCT ATC Ala Ile ATT ACC Ile Thr 550 ATC ATG Ile Met 565 PAT TTG Asn Leu AAA GCA Lys Ala CAA TTA Gin Leu PAG TTA Lys Leu 630 TGG TTG Trp Leu 645 CAG GAA Gin Glu GTT TAC Val Tyr ACA AAA Thr Lys GAT GCC Asp Ala 710 GAT GGA Asp Gly GGA AAA Gly Lys TTA TTA Leu Leu 520 AGT GAT Ser Asp 535 AGC TGG Ser Trp ACC TCC Thr Ser CTG GAT Leu Asp GAT TTG Asp Leu 600 TCA TCG Ser Ser 615 CAG CCC Gin Pro PAT ACT Asn Thr CAT ATC His Ile CAT TCC I His Ser 680 CCA GAG Pro Glu M 695 CTT TCA C Leu Ser L AAA AT] Lys
TTA
Leu 505
CTG
Leu
AG
Lys
CTA
Leu
ACC
Thr
ACC
Thr 585
CTA
Leu
GAA
Glu
GGC
Gly kAG Lys 3TT lai 665 ~cc 'hr
LTG
let I TG I ~eu Ile 490
CTG
Leu
ATT
Ile
CAA
Gln
CAT
His
AGC
Ser 570
GTC
Vai
CAT
His
AAT
Asn
GAC
Asp
TAT
ryr 650
CAG
Gln
GGC
Gly TTT Phe
~TT
:le SDLJ val oer Let 1 Phe Arg Leu Leu 475 AAA PAT PAC CTA Lys Asn Asn Leu GCA GAT ATT CAT Ala Asp Ile His 510 GCC GTA GGT GAA Ala Val Gly Glu 525 TTG GCT ACC CTG Leu Ala Thr Leu 540 ACA CAG AAG TGG Thr Gin Lys Trp 555 TAT PAC AAA ACG Tyr Asn Lys Thr I TAC CAC GGT TTA Tyr His Giy Leu 590 GTC ATG GCG CCC q Val Met Ala Pro 1 605 GTC GCC CAC TCG G Vai Ala His Ser V 620 GGC GCA ATG ACA G Gly Ala Met Thr P 635 ACG CCG GGT TCA T Thr Pro Gly Ser S 6 TAT TGT CAG GCT C Tyr Cys Gin Ala L 670 ATC PAC GPA PAC G Ile Asn Glu Asn A 685 GGC GCT GCA ACT G Gly Ala Ala Thr G 700 ATG CTG ACA CGT T Met Leu Thr Arg P1 715 Ly
AA
Ly.
49 CAj Gli
GGI
Gi
ATC
Ile
AGI
Ser
CTA
Leu 575
CAA
Gln
AT
Tyr
GTA
Val
CA
la
CG
er
TG
eu
CC
la
GA
ly
TT
he A ATT 1440 s Ile 480 G AAT 1488 S Asn 4 TTA 1536 n Leu AAA 1584 I Lys AGA 1632 Arg GTA 1680 Val 560 ACO 1728 Thr GGT 1776 Gly ATT 1824 Ile CTC 1872 Leu GAA 1920 Glu 640 GAA 1968 Glu GCA 2016 Ala TTC 2064 Phe GCA 2112 Ala GCG 2160 Ala 720 GAT TGG GTG AAC GCA CTA GGC GAA AAA GCG TCC TCG GTG CTA GCG GCA 2208 Asp Trp Val Asn Ala Leu Gly Glu Lys 725 Ala Ser 730 Ser Val Leu Ala Ala 735
TTT
Phe
CTT
Leu
CAA
Gln
TCT
Ser 785
GTC
Val
TCA
Ser
GTA
Val
TTT
Phe
CAA
Gln 865 CAA Gln
CGG
Arg
TTG
Leu
TTC
Phe
GGT
Gly 945
ATG
Met
AGC
Ser
TAT
Ty GCA GAA Ala Giu 745 CAA GCC Gin Ala 760 CCA GAA Pro Giu TGG GTT Trp Val GCT TTG Ala Leu TAT GCC Tyr Ala 825 TCA CAA Ser Gin 840 GCC GCA Ala Ala GCT ATT Ala Ile CAG GTT Gin Val AGT ATT Ser Ile 905 GCC AAT Ala Asn 920 TAC AAT Tyr Asn TAC CCG Tyr Pro ATG ATG Met Met CAT ACC Asp Thr 985 GTG GCT Val Ala 1000 GCT GAT Ala Asp CAA GCA Gin Ala 765 TTC TCC Phe Ser 780 GCA CAA Ala Gin CTG GAT Leu Asp GAA AAC Glu Asn AAT ACA Asn Thr 845 ACC TAC Thr Tyr 860 CGT GAT Arg Asp GCA ATA Ala Ile- TAC GTC Tyr Val GTT ATC Val Ile 925 TAC AGC Tyr Ser 940 TAT ATT Tyr Ile TTA CTG Leu Leu CAT GCC Asp Ala ATG AAT 2256 Met Asn AAT CAT 2304 Asn His TGG ACA 2352 Trp Thr TTG AAT 2400 Leu Asn 800 ATT CAA 2448 Ile Gin 815 GCA GGC 2496 Ala Gly CAC GCT 2544 His1 Ala ATC CGT 2592 Ile Arg TTG TAT 2640 Leu Tyr 880 ACC ACC 2688 Thr Thr 895 CGG GCA 2736 Arg Ala CGC CAA 2784' Arg Gin TGG GCG 2832 Trp Ala CCG ACC 2880 Pro Thr 960 TCC GTC 2928 Ser Val 975 ATG TCT 2976 Met Ser AAT CTT AAA GTT ATT AGC GCA 3024 Asn Leu Lys Val Ile Ser Ala 1005 -2 34- TAT CAC GAT Tyr His Asp 1010 AAT ATT AAT AAC GAT Asn Ile Asn Asn Asp 1015 CAA GGG .CTG ACC TAT Gin Gly Leu Thr Tyr 1020 TTT ATC GGA 3072 Phe Ile Gly CTC AGT Leu Ser 1025 GAA ACT GAT Glu Thr Asp GCC GGT Ala Gly 1030 GAA TAT TAT Glu Tyr Tyr TGG CGC Trp Arg 1035 AGT GTC GAT Ser Val Asp CAC 3120 His 1040 AGT AAA TTC AAC Ser Lys Phe Asn GAC GGT Asp Gly 1045 AAA TTC GCG Lys Phe Ala GCT AAT Ala Asn 1050 GCC TGG AGT Ala Trp Ser GAA TGG 3168 Glu Trp 1055 CAT AAA ATT His Lys Ile GAT TGT Asp Cys 1060 CCA ATT AAC Pro Ile Asn CCT TAT AAA Pro Tyr Lys 1065 AGC ACT ATC CGT CCA 3216 Ser Thr Ile Arg Pro 1070 GTG ATA TAT AAA Val Ile Tyr Lys 1075 TCC CGC CTG Ser Arg Leu TAT CTG Tyr Leu 1080 CTC TGG TTG Leu Trp Leu GAA CAA AAG GAG 3264 Glu Gin Lys Glu 1085 ATC ACC AAA Ile Thr Lys 1090 CAG ACA GGA Gin Thr Gly AAT AGT Asn Ser 1095 AAA GAT GGC Lys Asp Gly TAT CAA Tyr Gin 1100 ACT GAA ACG 3312 Thr Glu Thr 25 GAT TAT Aso Tyr 1105 CGT TAT GAA Arg Tyr Glu CTA AAA TTG Leu Lys Leu 1110 GCG 
CAT ATC CGC TAT GAT GGC Ala His Ile Arg Tyr Asp Gly 1115 ACT 3360 Thr 1120 TGG AAT ACG CCA Trp Asn Thr Pro ATC ACC Ile Thr 1125 TTT GAT GTC Phe Asp Val AAT AAA Asn Lys 1130 AAA ATA TCC Lys Ile Ser GAG CTA 3408 Glu Leu 1135 AAA CTG GAA Lys Leu Glu AAA AAT Lys Asn 1140 AGA GCG CCC Arg Ala Pro GGA CTC Gly Leu 1145 TAT TGT GCC Tyr Cys Ala GGT TAT CAA 3456 Gly Tyr Gin *1150 r r r GGT GAA GAT ACG Gly Glu Asp Thr 1155 GAT AGT TAT AAA Asp Ser Tyr Lys -1170 TTG CTG GTG Leu Leu Val ATG TTT Met Phe 1160 TAT AAC CAA CAA GAC ACA CTA 3504 Tyr Asn Gin Gin Asp Thr Leu 1165 AAC GCT Asn Ala TCA ATG Ser Met 1175 CAA GGA CTA Gin Gly Leu TAT ATC Tyr Ile 1180 TTT GCT GAT 3552 Phe Ala Asp ATG GCA Met Ala 1185 TCC A-AA GAT Ser Lys Asp ATG ACC Met Thr 1190 CCA GAA CAG Pro Glu Gin AGC AAT Ser Asn 1195 GTT TAT CGG Val Tyr Arg GAT 3600 Asp 120C AAT AGC TAT CAA 50 Asn Ser Tyr Gin CAA TTT Gin Phe 1205 GAT ACC AAT Asp Thr Asn AAT GTC Asn Val 1210 AGA AGA GTG AAT AAC 3648 Arg Arg Val Asn Asn 1215 CGC TAT GCA Arg Tyr Ala GAG GAT Glu Asp 1220 TAT GAG ATT Tyr Glu Ile CCT TCC Pro Ser 1225 TCG GTA AGT Ser Val Ser AGC CGT AAA 3696 Ser Arg Lys 1230 GAC TAT GGT TGG GGA GAT TAT Asp Tyr Gly Trp Gly Asp Tyr 1235 TAC CTC Tyr Leu 1240 AGC ATG GTA Ser Met Val TAT AAC GGA GAT 3744 Tyr Asn Gly Asp 1245 ATT CCA ACT Ile Pro Thr 1250 ATC AAT TAC Ile Asn Tyr AAA GCC GCA TCA AGT Lys Ala Ala Ser Ser 1255 GAT TTA Asp Leu 1260 AAA ATC TAT 3792 Lys Ile Tyr GGA CAG AAG 3840 Gly Gin Lys 1280 ATC TCA Ile Ser 1265 CCA AAA TTA Pro Lys Leu AGA ATT Arg Ile 1270 ATT CAT AAT Ile His Asn GGA TAT GAA Gly Tyr Glu 1275 CGC AAT CAA TGC AAT CTG ATG AAT AAA Arg Asn Gin Cys Asn Leu Met Asn Lys 1285 TAT GGC AAA CTA GGT GAT AAA 3888 Tyr Gly Lys Leu Gly Asp Lys 1290 1295 -235- TTT ATT GTT TAT ACT AGC TTG GGG GTC AAT CCA AAT AAC TCG TCA AAT 3936 Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 51300 1305 1310 AAG CTC ATG TTT TAC CCC GTC TAT CAA TAT AGC GGA AAC ACC AGT GGA 3984 Lys Leu Met Phe Tyr Pro Val Tyr Gin Tyr Ser Gly 
Asn Thr Ser Gly 1315 1320 1325 CTC AAT CAA GGG AGA CTA CTA TTC CAC CGT GAC ACC ACT TAT CCA TCT 4032 Leu Asn Gin Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser 1330 1335 1340 AAA GTA GAA GCT TGG ATT CCT GGA GCA AAA CGT TCT CTA ACC AAC CAA 4080 Lys Val Glu Ala Trp Ile Pro Gly Ala Lys Arg Ser Leu Thr Asn Gin 1345 1350 1355 1360 AAT GCC GCC ATT GGT GAT GAT TAT GCT ACA GAC TCT CTG AAT AAA CCG 4128 Asn Ala Ala Ile Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro 1365 1370 1375 GAT GAT CTT AAG CAA TAT ATC TTT ATG ACT GAC AGT AAA GGG ACT GCT 4176 Asp Asp Leu Lys Gin Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala 1380 1385 1390 ACT GAT GTC TCA GGC CCA GTA GAG ATT AAT ACT GCA ATT TCT CCA GCA 4224 Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala Ile Ser Pro Ala 1395 1400 1405 AAA GTT CAG ATA ATA GTC AAA GCG GGT GGC AAG GAG CAA ACT TTT ACC 4272 Lys Val Gin Ile Ile Val Lys Ala Gly Gly Lys Glu Gin Thr Phe Thr 1410 1415 1420 GCA GAT AAA GAT GTC TCC ATT CAG CCA TCA CCT AGC TTT GAT GAA ATG 4320 Ala Asp Lys Asp Val Ser Ile Gin Pro Ser Pro Ser Phe Aso Glu Met S1425 1430 1435 1AA40 AAT TAT CAA TTT AAT GCC CTT GAA ATA GAC GGT TCT GGT CTG AAT TTT 4368 Asn Tyr Gin Phe Asn Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe 1445 1450 1455 ATT AAC AAC TCA GCC AGT ATT GAT GTT ACT TTT ACC GCA TTT GCG GAG 4416 Ile Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Giu 1460 1465 1470 GAT GGC CGC AAA CTG GGT TAT GAA AGT TTC AGT ATT CCT GTT ACC CTC 4464 1475 1480 1485 AAG GTA AGT ACC GAT AAT GCC CTG ACC CTG CAC CAT AAT GAA AAT GGT 4512 Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly 1490 1495 1500 GCG CAA TAT ATG CAA TGG CAA TCC TAT CGT ACC CGC CTG AAT ACT CTA 4560 Ala Gin Tyr Met Gin Trp Gin Ser Tyr Arg Thr Arg Leu Asn Thr Leu 1505 1510 1515 1520 TTT GCC CGC CAG TTG GTT GCA CGC GCC ACC ACC GGA ATC GAT ACA ATT 4608 Phe Ala Arg Gin Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile 1525 1530 1535 CTG AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAG TTA GGC AAA GGT 4656 Leu Ser Met Glu Thr 
Gin Asn Ile Gin Glu Pro Gin Leu Gly Lys Gly 65 1540 1545 1550 TTC TAT GCT ACG TTC GTG ATA CCT CCC TAT AAC CTA TCA ACT CAT GGT 4704 Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly 1555 1560 1565 GAT GAA CGT TGG TTT AAG CTT TAT ATC AAA CAT GTT GTT GAT AAT AAT 4752 Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn -236- 1570 TCA CAT ATT Ser His Ile 1585 ACA TTA TTT Thr Leu Phe GCC AAG GTT Ala Lys Val
ATC
Ile
ATT
Ile
TAT
Tyr 1620 1575 TAT TCA GGC Tyr Ser Gly 1590 CCT CTT GAT Pro Leu Asp 1605 ATG ACC TTC Met Thr Phe
CAG
Gln
GAT
Asp
AAG
Lys 1580 CTA ACA GAT ACA AAT ATA AAC ATC 4800 Leu Thr Asp Thr Asn Ile Asn Ile 1595 1600 GTC CCA TTG AAT CAA GAT TAT CAC 4848 Val Pro Leu Asn Gln Asp Tyr His 1610 1615 AAA TCA CCA TCA GAT GGT ACC TGG 4896 Lys Ser Pro Ser Asp Gly Thr Trp 1625 1630 TGG GGC CCT CAC TTT Trp Gly Pro His Phe 1635 GTT AGA GAT GAT Val Arg Asp Asp 1640
AAA
Lys GGA ATA GTA ACA Gly Ile Val Thr 1645
ATA
lie AAC 4944 Asn *e e CCT AAA TCC ATT TTG ACC CAT TTT GAG AGC GTC AAT GTC CTG AAT AAT 4992 Pro Lys Ser Ile Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn 1650 1655 1660 ATT AGT AGC GAA CCA ATG GAT TTC AGC GGC GCT AAC AGC CTC TAT TTC 5040 Ile Ser Ser Glu Pro Met Asp Phe Set Gly Ala Asn Ser Leu Tyr Phe 1665 1670 1675 1680 TGG GAA CTG TTC TAC TAT ACC CCG ATG CTG GTT GCT CAA CGT TTG CTG 5088 Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gin Arg Leu Leu 1685 1690 1695 CAT GAA CAG AAC TTC GAT GAA GCC AAC CGT TGG CTG AAA TAT GTC TGG 5136 His Glu Gin Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Tro 1700 1705 1710 35 AGT CCA TCC GGT TAT ATT GTC CAC GGC CAG ATT CAG AAC TAC CAG TGG 5184 Ser Pro Ser Gly Tyr Ile Val His Gly Gin Ile Gin Asn Tyr Gin Trp 1715 1720 1725 AAC GTC CGC CCG TTA CTG GAA GAC ACC AGT TGG AAC AGT GAT CCT TTG 5232 Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu 1730 1735 1740 GAT TCC GTC GAT CCT GAC GCG GTA GCA CAG CAC GAT CCA ATG CAC TAC 5280 Asp Ser Val Asp Pro Asp Ala Val Ala Gin His Asp Pro Met His Tyr 1745 1750 1755 1760 AAA GTT TCA ACT TTT ATG CGT ACC TTG GAT CTA TTG ATA GCA CGC GGC 5328 Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly 1765 1770 1775 GAC CAT GCT TAT CGC CAA CTG GAA CGA GAT ACA CTC AAC G7-C MC-AAG 5376 Asp His Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys 1780 1785 1790 ATG TGG TAT ATG CAA GCG CTG CAT CTA TTA GGT GAC AAA CCT TAT CTA 5424 Met Trp Tyr Met Gin Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu 1795 1800 1805 CCG CTG AGT ACG ACA TGG AGT GAT CCA CGA CTA GAC AGA GCC GCG GAT 5472 Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp 1810 1815 1820 ATC ACT ACC CAA Ile Thr Thr Gin 1825 AAT ATA CCT ACA Asn Ile Pro Thr AAT GCT CAC GAC Asn Ala His Asp 1830 CCG GCA CCT TTA Pro Ala Pro Leu 1845
AGC
Ser
TCA
Ser 1849 GCA ATA Ala Ile 1835 5547 GTC GCT CTG CGG Val Ala Leu Arg CAG 5520 Gln 1840 INFORMATION FOR SEQ ID NO:49: SEQUENCE CHARACTERISTICS: LENGTH: 1849 amino acids TYPE: amino acids STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49 (TcdAii) Features From Peptide 1 To Description 1849 TcdAii peptide Fragment Fragment Fragment Fragment Fragment Fragment Fragment Leu Ile Gly Tyr Ala Pro Glu Leu Tyr Leu Gln Asn
65 35 Leu Leu Val Met: His Asc Gly L eu 130 Gin Ala 145 Asn Ile Lys Lys Tyr Leu Ile Gly 210 Leu I!= 225 Arg I I 1 196 466 993 1297 1390 1532 Asn Asn 5 Val Ser Giu Ala Arg Arg Ile Giu 70 Ile Lys Leu Ser Giu Asn Leu Asn Leu Gly 150 Giu Glu 165 Gly Asn Tyr Tyr Ser Asn Val Val 230 Giu Tyr 245 12 211 475 1004 1312 1409 1554 Gin Phe Ser Met Arg Asn Pro Asp 55 Leu Ser Thr Giu Thr Phe Val Arg 120 Ala Ser 135 Ile Asn Ile Thr Ile Glu Asn Leu 200 Phe Gly 215 Asn Ser Thr Thr Gly Arg Ala Ser TcdAii N-terminus (SEQ ID NO:13) (SEQ ID NO:38) (SEQ ID NO:17) (SEQ ID NO:23; 12/13) (SEQ ID NO:18) (SEQ ID NO:39) (SEQ ID NO:21; 19/23) Gin Tyr Val Le u ValI Ser Giu Thr Pro Asp Met Leu Leu 175 Pro Gin As n Vali Val1 255 Thr Tyr Gin Le u Lys T yr Pro His Phe 160 Tyr Giu Phe Gin Tyr 240 Giu -2 38- Gly Gly Giu Asn Tyr 265 Ala Ser Tyr Leu Ser 280 Arg. Leu Ile Lys
A
Gl u Ala 305 Gly Lys Asn Leu 25 Asp 385 Al a 30 Gin Thr AspI Arg 465 Thr Leu S Thr I Thr A Lys L 545 Phe G Pro G Phe A Ala A.
6 Le 29 As Le~ Pht Ly~ GiL 370 ValI Ile Ara Pro ,eu 150 l a Lsp ;er le .sn 30 eu in iu sp la 10 U Val Arg Thr Giu Ala Pro Gin Val Asn Ile Giu u Ile Thr Thr *Ala 355 2Gly Leu His Ser Leu 435 Asn Phe His Asn AspC 515 Leu S Asn TI Leu P Ile L 5 Lys A 595 Thr L 300 Thr Leu Ar Va 34( Ile Ile Ci y Ala Tyr 420 Le u S er Disn %sp ,eu 500 1iu e r h r h e 'ys 80 .sp g Vai 325 Arg Val Lys Gi u 405 Asp Asn Giy Ile Asn 485 Tyr Leu Ala Ile 'I Ile N' 565 Asn L Lys A As n 310 Leu Giu Leu Thr Ala Asp Ile Ser G1 Arc Val 390 Thr As n Ci y Ser Asp 470 Lys Ile %sp Ile h r ~50 e t e u I a ISez 375 Phe Ala Gin Gin Thr 455 Asp Asp Gi y Leu Ser 535 Ser Thr Leu Asp DSer Gly Ser 330 r Asn Gin Tyr 345 Arg Ala Thr 360 Val Asn Leu Leu Thr Lys Leu Ilie Leu 410 Pro Ser Gin 425 Tyr Phe Ser 440 Gly Asp Trp Val Ser Leu Gly Lys Ile 490 Lys Leu Leu 505 Leu Leu Ile 520 Asp Lys Gin I Trp Leu His TI Ser Thr Ser TI 570 Asp Thr Vai TI 585 Leu Leu His V 600 315 Trp Ser Giu Gin Tyr 395 Cys Phe Thr Arg Phe 475 Lys I %ila klia V .eu A~ 5 ~hr G ~55 'yr A 'yr H al M Al P h Let Let 38( Tyi As r Asp Lys 460 krg ts n s p ~a i 1i a 40 i n s n i s e t n Pro a Tyr Leu Ser 365 *t Asp *Met Ala Arg *Asp 445 Thr LeuI Asn I le H- Gly G 525 Thr L Lys T Lys T Gly L 5 Ala P 605 Ph Al Le' 35' Pri il Gir Prc Le u 430 Glu Ile Leu ~eu is 1 u e u rp hr eu ro a Al 33 Ly 0 *Th.
Asi IArc Ile 415 IPhe Gi u Le u Lys Lys 495 Gin Gly Ile Ser Leu 575 Gin Tyr Ile 320 a Ala Leu Ile Thr I Tyr 400 Ser Asn Ile Lys Ile 480 Asn Leu Lys Arg Val- 560 Thr Gly Ile Tyr Ser eu Gin Leu Ser Giu Asn Val Ala His Ser 620 Val Leu Leu 625 Trp Ala Asp Lys Gin Pro Giy Asp Gly .635 -239- Ala Met Thr Ala Lys Phe Trp Asp Ala Val Giu Thr 660 Gin Leu Glu Met 675 Arg Leu Phe-Val 690 Ala Pro Ala His 705 Asp Tro Val Asn Phe Giu Ala Asn 740 Leu Asp Ala Asn 755 25 Gin His Leu Pro 7710 Ser Ile Asn Thr 785 30 Val Ala Pro Gin Ser Met Lys Glu 35 820 Val Leu Thr Ala 835 Phe Leu Asp Glu 850 Gin Val Ala Lys 865 Gin Tyr Leu Leu 900 Leu Giu Asn Val 915 Phe Phe Ile Asp 930 Giy Val Ser Gin 945 6.Met Arg Ile Gly Ser Gin Ser Gin 980 Tyr Leu Thr Ser TrD 645 Gin VIai Thr Asp Ala 725 Ser Leu Pro Ile Gly 805 Thr Gi y Ser Ala Ile 885 Ala Glu Trp Leu Gin 965 Leu Phe Leu Asn Thr Lys Thr Pro Gly Ser Glu Tyr Lys Ala 710 Leu Leu Leu Val1 Leu 790 ValI Pro Leu Arg Ala 870 Asp Ile Giu Asp ValI 950 Thr As n Giu His His Pro 695 Le u Gly Thr Leu Thr 775 Gin Ser Thr Asn Ser 855 Al a Asn Ala Asn Lys 935 Tyr Lys Al a Gin Ile Ser 680 Giu Ser Giu Al a Gin 760 Pro T rp Ala T yr Ser 840 Al a Al a Gin Ser Ala 920 Tyr Tyr Met Asp Val 1000 Va.
66~ Thj Met LeL Lys GIl 745 Ala Giu ValI Leu Al a 825 Gin Al a Ile Val Ile 905 Asn Pisn Pro Mtet rhr 985 AlIa L Gin Tyr Cys rGly Ile Asn Phe Gly Ala 700 Ile Met Leu 715 Ala Ser Ser 730 Gin Leu Ala *Ser Ile Gin *Asn Ala Phe 780 Asn Vai Ala 795 Val Gly Leu 810 Gin. Trp Glu Gin Ala Asn Leu Ser Thr 860 Lys Ser Arg 875 Ser Ala Ala 890 Gin Leu Tyr Ser Gly Val Lys Arg Tyr 940 Giu Asn Tyr 955 Asp Ala Leu 970 Val Giu Asp Asn Leu Lys Gin Ala Leu Ala 670 Giu Asn Ala Phe 685 Ala Thr Gly Ala Thr Arg Phe Ala 720 Val Leu Ala Ala 735 Asp Ala Met Asn 750 Ala Gin Asn His 765 Ser Cys Trp Thr Gin Gin Leu Asn 800 Asp Tyr Ile Gin 815 Asn Ala Ala Gly 830 Thr Leu His Ala 845 Tyr Tyr Ile Arg Asp Asp Leu Tyr 880 Ile Lys Thr Thr 895 Val Asn Arq Ala 910 Ile Ser Arg Gin 925 Ser Thr Trp Ala Ile Asp Pro Thr 960 Leu Gin Ser Val 975 Alia Phe Met Ser 990 Val Ile Ser Ala Ser Glu 655 Tyr His Asp Asn Ile Asn Asn Asp Gin Giy Leu Thr Tyr Phe le Gly 1010 1015 1020 -24 0- Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His 1025 1030 1035 1040 Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 1045 1050 1055 His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro 1060 1065 1070 Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin Lys Glu 1075 1080 1085 Ile Thr Lys Gin Thr Gly Asn Ser Lys Asp Gly Tyr Gin Thr Glu Thr 1090 1095 1100 Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr 1105 1110 1115 1120 Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu 1125 1130 1135 Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gin 1140 1145 1150 25 Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gin Gin Asp Thr Leu 1155 1160 1165 Asp Ser Tyr Lys Asn Ala Ser-Met Gin Gly Leu Tyr Ile Phe Ala Asp 30 1170 1175 1180 Met Ala Ser Lys Asp Met Thr Pro Glu Gin Ser Asn Val Tyr Arg Asp 1185 1190 1195 1200 Asn Ser Tyr Gin Gin Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn 1205 1210 1215 Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys 1220 1225 1230 Asp 
Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp 1235 1240 1245 Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr 1250 1255 1260 SIle Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gin Lys 1265 1270 1275 1280 50 Arg Asn Gin Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys 1285 1290 1295 Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 1300 1305 1310 Lys Leu Met Phe Tyr Pro Val Tyr Gin Tyr Ser Gly Asn Thr Ser Gly 1315 1320 1325 Leu Asn Gin Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser 1330 1335 1340 Lys Val Glu Ala Trp Ile Pro Gly Ala Lys Arg Ser Leu Thr Asn Gin 1345 1350 1355 1360 Asn Ala Ala Ile Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro 1365 1370 1375 Asp Asp Leu Lys Gin Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala 1380 1385 1390 Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala Ile Ser Pro Ala -241- 1395 1400 1405 Lys Val Gin Ile Ile Val Lys Ala Gly Gly Lys Glu Gin Thr Phe Thr 1410 1415 1420 Ala Asp Lys Asp Val Ser Ile Gin Pro Ser Pro Ser Phe Asp Glu Met 1425 1430 1435 1440 Asn Tyr Gin Phe Asn Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe 1445 1450 1455 Ile Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu 1460 1465 1470 Asp Gly Arg Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu 1475 1480 1485 Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly 1490 1495 1500 Ala Gin Tyr Met Gin Trp Gin Ser Tyr Arg Thr Arg Leu Asn Thr Leu 1505 1510 1515 1520 Phe Ala Arg Gin Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile 25 1525 1530 1535 0Leu Ser Met Glu Thr Gin Asn Ile Gin Glu Pro Gin Leu Gly Lys Gly 1540 1545 1550 30 Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly 1555 1560 1565 Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn 1570 1575 1580 -35 Ser His Ile Ile Tyr Ser Gly Gin Leu Thr Asp Thr Asn Ile Asn Ile 1585 1590 1595 1600 Thr Leu Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gin Asp Tyr His 1605 1610 1615 S* Ala Lys Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly 
Thr Trp 1620 1625 1630 4 Trp Gly Pro His Phe Val Arg Asp Asp Lys Gly Ile Val Thr Ile Asn 1635 1640 1645 Pro Lys Ser Ile Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn 50 1650 1655 1660 .i 50 Ile Ser Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe 1665 1670 1675 1680 Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gin Arg Leu Leu 1685 1690 1695 His Glu Gin Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp 1700 1705 1710 Ser Pro Ser Gly Tyr Ile Val His Gly Gin Ile Gin Asn Tyr Gin Trp 1715 1720 1725 Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu 1730 1735 1740 Asp Ser Val Asp Pro Asp Ala Val Ala Gin His Asp Pro Met His Tyr 1745 1750 1755 1760 Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly 1765 1770 1775 -242- Asp His Ala Tvr Arg Gln Leu Glu Arg Asp.Thr Leu Asn Glu Ala Lys 1 80 1785 1790 Met Trp Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu 1795 1800 1805 Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Aso Arg Ala Ala Asp 1810 1815 18 0 Ile Thr Thr Gin Asn Ala His Asp Ser Ala Ile Val Ala Leu Arg Gln 1825 1830 1835 1840 Asn Ile Pro Thr- Pro Ala Pro Leu Ser 1845 1849 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1740 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50 (tccLAiii coding region):: r r r o r TTG CGC AGC GCT AAT ACC CTG ACT GAT Leu 1
GAA
Glu
CTG
Leu
TAT
Tyr
ACT
Thr 65
CGT
Arg
ACC
Thr
GAA
Glu
ACT
Thr
AAA
Lys 145 Thr
TAC
Tyr
TCT
Ser
GAT
Asp
GGC
Gly 70
CTG
Leu
ACG
Thr
TTA
Leu
CAG
Gln
AAA
Lys 150 Leu Thr Asp TGG CAG ACA Trp Gln Thr ATC GAC GGC Ile Asp Gly CCG AAA GCG Pro Lys Ala 55 AAG CTA CCG Lys Leu Pro GAA AAT GCG Glu Asn Ala TTA CAA AAT Leu Gln Asn 105 TTA CAA AAT Leu Gln Asn 120 GAC AAA ACC Asp Lys Thr 135 TCC AAA GCG Ser Lys Ala TTC CTG CCG Phe Leu Pro GCT CAG AGA Ala Gln Arg CCG TTA TAT Pro Leu Tyr CTC AGC GCC Leu Ser Ala TCA TTT ATG Ser Phe Met 75 GGC ATG GTT Gly Met Val ATC GAA CGT Ile Glu Arg GCC GCC GAG Ala Ala Glu 125 GAA GAA TTG Glu Glu Leu 140 GCA CAA TCG Ala Gln Ser 155 AAT 48 Asn AAT 96 Asn ATC 144 Ile GCC 192 Ala TGG 240 Trp CTC 288 Leu GCG 336 Ala TTG 384 Leu GAG 432 Glu GAT 480 Asp 160 AGC TAC GGC AAA CTG TAC GAT GAG AAT ATC AAC GCC GGT GAA AAC CAA 528 Ser
GCC
Ala
GCA
Ala
GGC
Gly
GGT
Gly 225
AAA
Lys
ATC
Ile
CTC
Leu
AGT
Ser
CAA
Gln 305
CTG
Leu
CTG
Leu
CGC
Arg
GCA
Ala
CTG
Leu 385
GC
Ala
GCT
Ala Tyr
ATG
Met
TCC
Ser
TTT
Phe 210
TAT
Tyr
ATT
Ile
CAG
Gln
A
Lys
CTG
Leu 290
CGT
Arg
GG
Ala
ATG
Met
TTC
Phe
GGT
Gly 370
AAA
Lys
GAA
Glu
CAG
Gln Gly
ACG
Thr
CGT
Arg 195
GCC
Ala
GTG
Val
AGC
Ser
CGG
Arg
TCA
Ser 275
AAA
Lys
AAG
Lys
GCG
Ala
GCA
Ala
ATT
Ile 355
GAA
Glu
CGC
Arg
GTT
Val
GAA
Glu Lys
CTA
Leu 180
CTG
Leu
GGT
Gly
ATG
Met
CAA
Gln
AAT
Asn 260
CTC
Leu
ACC
Thr
TTC
Phe
ATT
Ile
GAA
Glu 340
A
Lys
ACC
Thr
GAT
Asp
TAT
Tyr
ATT
Ile 420 Leu 165
CGA
Arg
GCC
Ala
GGC
Gly
GAA
Glu
TCT
Ser 245
AAT
Asn
GCT
Ala
CAA
Gln
AGC
Ser
TAC
Tyr 325
CAA
Gln
CCG
Pro
TTG
Leu
A
Lys
GCA
Ala 405
GAC
Asp Tyr
GCG
Ala
GGT
Gly
GGC
Gly
TTC
Phe 230
GAA
Glu
GCC
Ala
GTA
Val
CAA
Gln
AAT
Asn 310
TTC
Phe
GCT
Ala
GGC
Gly
ATG
Met
CGC
Arg 390
GGA
Gly
AAG
Lys Asp
TCC
Ser
GCG
Ala
AGC
Ser 215
TCC
Ser
ACC
Thr
GAA
Glu
CGC
Arg
GAA
Glu 295
CAG
Gln
CAG
Gln
TAC
Tyr
GCG
Ala
CTG
Leu 375
GCA
Ala
TTA
Leu
CTG
Leu Glu
GCC
Ala
GG
Ala 200
CGT
Arg
GCG
Ala
TAC
Tyr
GG
Ala
CGC
Arg 280
CAG
Gln
GCG
Ala
TTC
Phe
CGT
Arg
TGG
Trp 360
AGT
Ser
TTA
Leu
CCA
Pro
GTG
Val Asn
GCC
Ala 185
GCT
Ala
TGG
Trp
AAT
Asn
CGT
Arg
GAA
Glu 265
GAA
Glu
ACC
Thr
TTA
Leu
TAC
Tyr
TGG
Trp 345
CAG
Gln
CTG
Leu
GAG
Glu
A
Lys
AGT
Ser 425
GGC
Ile 170
GGG
Gly
GAT
Asp
GGG
Gly
GTT
Val
CGT
Arg 250
TTG
Leu GCC Ala
CAA
Gln
TAC
Tyr
GAT
Asp 330
GAA
Glu
GGA
Gly
GCA
Ala
GTT
Val
GAT
Asp 410
CAA
Gln
GCG
Ala Asn
CTT
Leu
CTG
Leu
GGT
ATG
Met 235
GG
Arg
AAG
Lys
GCC
Ala
TCT
Ser
AAC
Asn 315
TTG
Leu
CTC
Leu Thr Gln Glu 395 AAC GGT GGC Gly Ala
ACC
Thr
GTG
Val
ATC
Ile 220
AAC
Asn
CGT
Arg
CAA
Gln
GTA
Val
CAA
Gln 300
TGG
Trp
GCC
Ala
AAT
Asn
TAT
Tyr
ATG
Met 380
GG
Arg
GGT
Gly
TCA
Ser
AG
Th r Gly Giu Asn 175 -ACG GGA GTT *Thr Aia Val 190 GGT MAC ATG *Pro Asn Ile 205 OCT GAG GCG Ala Glu Ala ACC GMA GCG Thr Glu Ala GAG GAG TGG Gin Glu Trp 255 ATG GAT GGT Ile Asp Ala 270 TTG GAG A Leu Gin Lys 285 TTG GGC TTC Leu Ala Phe CTG CGT GGT Leu Arg Gly GTC GGG GGT Vai Ala Arg 335 GAT GAG TCT Asp Asp Ser 350 GCG GOT CTG Ala Gly Leu 365 GMA GAG GGT Glu Asp Ala AGA GTA TCG Thr Val Ser GCA TTT TGG Pro Phe Ser 415 GGC AGT GGC C Gly Ser Ala C 430 GAG ACT MAA I Asp Thr Lys T 445 Gin GAG 576 Gin TTC 624 Phe ACA 672 Thr GAT 720 Asp 240 GAG 768 GI u GAG 816 Gin ACC 864 Thr CTG 912 Leu CGA 960 PArg 320 TGC 1008 Cys GCC 1056 Alia CTT 1104 Leu CAT 1152 His 'TG 1200 Leu 400 'TG 1248 eu G C 1296 ~CC 1344 ~hr AGT GGT MAT MAT Ser Giy Asn Asn 435 MAT TTG GCG TTG Asn Leu Ala Phe Giy 440 -24 4- TCT TTG Ser Leu 450 TAC C CG Tyr Pro 465 ACT TTG Thr Leu TCT TAO Ser Tyr GTT TOT Vai Ser GAT CCCC Asp Gly 530 OTG ACA LeuTr 545 GOC ACT 30 Ala Thr ACC ATT Thr Ile
CAG
Gln
GCA
Ala 000 Pro
GGC
Gly
CAC
His 515
AA
Lys
CTG
Leu
ATG
Met Lys 579
GCA
Ala
TCG
Ser
CG
Ala
GAT
Asp 500
GGT
Gly
TTC
Phe
AGC
Ser
TTA
Leu
TAA
TCA CTT Ser Val OTT GCC Leu Gly 470 OTA OTC Leu Leu 485 AAA CO Lys Ala ATC AAT Met Asn
CTGCOCA
Leu Pro T TC OCA Phe Pro 550 AAA ACC Lys Thr 565 1740 TOA TTC Ser Phe 455 AAA ATT Lys Ile GGA CCC Cly Pro CGA TTA Cly Leu GAO AGO Asp Ser 520 TTO GAA Phe Ciu 535 AAT GOA Asn Ala 0T7G AAC Leu Asn GOT CAT Ala Asp OGA CT Arg Arg TAT CAC Tyr Gin 490 GOT AAO Ala Asn 505 CCC CAA Cly Gin CCC ATO Cly Ilie TOT ATC Ser Met CAT ATO Asp Ilie 570
TTG
Leu
ATO
Ile 475
GAT
Asp
GC
Cl y
TTO
Phe
CO
Al a
CG
Pro 555
ATT
Ile AAA ATT CT CAA CAT 1392 Lys Ile Arg Clu Asp 460 AAA CAG ATO AGO GTC 1440 Lys Gin Ile Ser Val 480 GTA CAG GOA ATA TTC 1488 Val Gin Ala Ile Leu 495 TCT GAA CG CTG GCA 1536 Cys Clu Ala Leu Ala 510 CAG OTO CAT TTO AAO 1584 Gin Leu Asp Phe Asn 525 ATT CAT CAA CCC ACC 1632 Ile Asp Gin Cly Thr 540 GAG AAA GCT AAA CAA 1680 Glu Lys Cly Lys Gin 560 TTGCOAT ATT CCC TAO 1728 Leu His Ile Arg Tyr 575 INFORMATION FOR SEQ ID NO:51: SEQUENCE
CHARACTERISTICS:
LENGTH: 579 amino acids TYPE: amino acids STRANDEDNESS: single TOPOLOGY: linear 45 (iJi) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:51 Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu 50 1 5 10 Clu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Cmn 25 Leu Arg His Asn Leu Ser Ile Asp Cly Gin Pro Leu 40 Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser 55 60 Thr Ser Gin Cly Cly Cly Lys Leu Pro Glu Ser Phe 70 75 Arg Phe Pro His Met Leu Clu Asn Ala Arg Gly Met 85 90 Thr G0'n Phe Cly Ser Thr Leu Gin Asn Ile Ile Glu 100 105 (TcdAiii): Pro Gin Ile Arg Val Tyr Tyr Leu Pro Ala Ala Val Met Ser Leu Val Ser Gin Arg Gin Asp 110 -24 Glu Ala Leu 115 As Ala Leu Leu Asn GlnAla Ala Glu Leu Ile Leu Thr Lys 145 Ser Ala Ala Gly Gly 225 25 Lys Ile Leu Ser Gin 305 Leu Leu I 415 Arg Ala Leu 385 Alal Ala Ser Ser Tyr I 465 Thr I Asn 130 Thr Tyr Met Ser Phe 210 Tyr Ile Gin Lys Leu 290 Arg Ala Mlet Phe Gly 370 Lys lu Gin 1ly Leu 450 Pro .'eu Leu Val Gly Thr Arg 195 Ala Va1 Ser Arg Ser 275 Lys Lys Ala Ala Ile 355 Glu Arg 2 Val Glu Asn I 435 Gin 1 Ala Pro I Ser Leu Lys Leu 180 Leu Gly Me t- Gin Asn 260 Leu Thr Phe Ile lu 340 Lys rhr ksp Tyr Ile 420 ksn la 3er la Ile Glu Leu 165 Arg Ala Gly Glu Ser 245 Asn Ala Gin Ser Tyr 325 Gin Pro Leu Lys Ala 405 Asp Asn Ser Leu Leu 485 Gin I Lys 150 Tyr Ala Gly Gly Phe 230 Glu Ala Val Gin Asn 310 Phe 4 Ala Gly 2 Met Arg I 390 Gly Lys I Leu I Val S 4 Gly I 470 Leu G Asp 135 Ser Asp Ser Ala Ser 215 Ser Thr Glu Arg Glu 295 Sin Gin ryr Ala Leu 375 kla Leu .eu la er 55 ~ys ;ly Lyc Lys Glu Ala Ala 200 Arg Ala Tyr Ala Arg 280 Gin Ala Phe Arg Trp 360 Ser Leu Pro Val Phe 440 Phe Ile Pro 3 Thr 3 Ala Asn Ala 185 Ala Trp Asn Arg Glu 265 Glu Thr Leu Tyr Trp 345 Gin Leu I Glu Lys I Ser C 425 Gly I Ala ;Z Arg I Ile Giy Ile 170 Gly Asp Gly Val Arg 250 Leu Ala Gin Tyr Asp 330 lu Gly Uia /al ksp 410 3in la ~sp rg Glu Ala 155 Asn Leu Leu Ala Met 235 Arg Lys Ala Ser.
Asn 315 Leu Leu Thr Gin Glu I 395 Asn Gly I Gly I Leu 1 4 Ile I 475 Git 14( Gir Ala Thr Val Ile 220 Asn Arg Gin VaI Gin 300 Trp Ala Asn ryr M1et 380 krg ;ly 3er rhr ys 60 ,ys u Leu I Ser Gly Thr Pro 205 Ala Thr Gin Ile Leu 285 Leu Leu Val Asp 2 Ala 365 Glu I Thr Pro I Gly S 4 Asp I 445 Ile A Gin I AsD Arg Glu Ala 190 Asn Gl-u Glu Glu Asp 270 Gin Ala Arg Ala Asp 350 Gly ksp lal ?he jer 30 'hr Lrg le Ala Phe Asr 175 Va1 Ile Ala Ala Trp 255 Ala Lys Phe Gly Arg 335 Ser Leu Ala Ser Ser 415 Ala Lys Glu Ser Glu Aso 160 Gin Gin Phe Thr Asp 240 Glu G1n Thr Leu Ara 320 Cys Ala Leu His Leu 400 Leu Gly Thr Asp VaI 480 Tyr Gin Asp 490 Val Gin Ala Ile Leu 495 -246- Ser Val Asp Leu 545 Ala Thr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala 500 505 510 His Gly Met Asn Asp Ser Gly Gln Phe Gin Leu Asp Phe Asn 515 520 525 Lys Phe.Leu Pro Phe Glu Gly Ile Ala Ile Asp Gin Gly Thr 535 540 Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys Gin 550 555 560 Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg Tyr 565 570 575 Lys 579 .00 .o 0 .0 INFORMATION FOR SEQ ID NO:52: SEQUENCE CHARACTERISTICS: LENGTH: 5532 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52 (tcbAii coding region): TTT ATA CAA GGT TAT AGT GAT CTG TTT Phe Ile Gin Gly Tyr Ser Asp Leu Phe 35 1 5 GCC GCG CCG GGC TCG GTT GCA TCG ATG Ala Ala Pro Gly Ser Val Ala Ser Met 25 ACG.GAA TTG TAC CGT GAA GCC AAA AAC Thr Glu Leu Tyr Arg Glu Ala Lys Asn 40 TAT TAC CTA GAT AAA CGT CGC CCG GAT Tyr Tyr Leu Asp Lys Arg Arg Pro Asp 55 CAG AAA AAT ATG GAT GAG GAA ATT TCA Gin Lys Asn Met Asp Glu Glu Ile Ser 70 TTG TGC CTT GCC GGG ATC GAA ACA AAA Leu Cys Leu Ala Gly Ile Glu Thr Lys 85 GTG ATG GAT ATG TTG TCA ACT TAT CGT Val Met Asp Met Leu Ser Thr Tyr Arg 100 105 CAT CAC GCT TAT GAA ACT GTT CGT GAA His His Ala Tyr Glu Thr Val Arg Glu 115 120 GGA TTT CGT CAT TTG TCA CAG GCA CCC Gly Phe Arg His Leu Ser Gin Ala Pro 130 135 AAT CGT GCT GAT AAC TAT 48 
Asn Arg Ala Asp Asn Tyr TCA CCG GCG GCT TAT TTG 96 Ser Pro Ala Ala Tyr Leu CAT GAC AGC AGC TCA ATT 144 His Asp Ser Ser Ser Ile GCA AGC TTA ATG CTC AGC 192 Ala Ser Leu Met Leu Ser CTG GCT CTC TCT AAT GAA 240 Leu Ala Leu Ser Asn Glu 75 GGA AAA TCA CAA GAT GAA 288 Gly Lys Ser Gin Asp Glu AGT GGA GAG ACA CCT TAT 336 Ser Gly Glu Thr Pro Tyr 110 GTT CAT GAA CGT GAT CCA 384 Val His Glu Ara Asp Pro 125 GTT GCT GCT AAG CTC GAT 432 Val Ala Ala Lys Leu Asp 140 CCT GTG ACT TTG TTG GGT ATT AGC TCC CAT ATT TCG CCA GAA CTG TAT 480 -247- Pro Val Thr Leu Leu Gly Ile Ser Ser His Ile Ser Pro Glu Leu Tyr 145 150 155 160 AAC TTG CTG ATT GAG GAG ATC CCG GAA AAA GAT GAA GCC GCG CTT GAT 528 Asn Leu Leu Ile Glu Glu Ile Pro Glu Lys Asp Glu Ala Ala Leu Asp 165 170 175 ACG CITT TAT AAA ACA AAC TTT GGC GAT ATT ACT ACT GCT CAG TTA ATG 576 Thr Leu Tyr Lys Thr Asn Phe Gly Asp Ile Thr Thr Ala Gin Leu Met 180 185 190 TCC CCA AGT TAT CTG GCC CGG TAT TAT GGC GTC TCA CCG GAA GAT ATT 624 Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr Gly Val Ser Pro Glu Asp Ile 195 200 205 GCC TAC GTG ACG ACT TCA TTA TCA CAT GTT GGA TAT AGC AGT GAT ATT 672 Ala Tyr Val Thr Thr Ser Leu Ser His Val Gly Tyr Ser Ser Asp Ile 210 215 220 CTG GTT ATT CCG TTG GTC GAT GGT GTG GGT AAG ATG GAA GTA GTT CGT 720 Leu Val Ile Pro Leu Val Asp Gly Val Gly Lys Met Glu Val Val Arg 225 230 235 240 GTT ACC CGA ACA CCA TCG GAT AAT TAT ACC AGT CAG ACG AAT TAT ATT 768 Val Thr Arg Thr Pro Ser Asp Asn Tyr Thr Ser Gin Thr Asn Tyr Ile 245 250 255 GAG CTG TAT CCA CAG GGT GGC GAC AAT TAT TTG ATC AAA TAC AAT CTA 816 Glu Leu Tyr Pro Gin Gly Gly Asp Asn Tyr Leu Ile Lys Tyr Asn Leu 260 265 270 SAGC AT AGT TTT GGT TTG GAT GAT TTT TAT CTG CAA TAT AAA GAT GGT 864 Ser Asn Ser Phe Gly Leu Asp Asp Phe Tyr Leu Gin Tyr Lys Asp Gly 275 280 285 TCC GCT GAT TGG ACT GAG ATT GCC CAT AAT CCC TAT CCT GAT ATG GTC 912 Ser Ala Asp Trp Thr Glu Ile Ala His Asn Pro Tyr Pro Asp Met Val 290 295 300 ATA AAT CAA AAG TAT GAA TCA CAG GCG ACA ATC AAA CGT AGT GAC TCT 960 Ile Asn Gin Lys Tyr 
Glu Ser Gin Ala Thr Ile Lys Arg Ser Asp Ser 305 310 315 320 GAC AAT ATA CTC AGT ATA GGG TTA CAA AGA TGG CAT AGC GGT AGT TAT 1008 45 Asp Asn Ile Leu Ser Ile Gly Leu Gin Arg Trp His Ser Gly Ser Tyr 325 330 335 AAT T-T GCC GCC GCC AAT TTT AAA ATT GAC CAA TAC TCC CCG AAA GCT 1056 Asn Phe Ala Ala Ala Asn Phe Lys Ile Asp Gin Tyr Ser Pro Lys Ala 50 340 345 350 TTC CTG CTT AAA ATG AAT AAG GCT ATT CGG TTG CTC AAA GCT ACC GGC 1104 Phe Leu Leu Lys Met Asn Lys Ala lie Arg Leu Leu Lys Ala Thr Gly 355 360 365 CTC TCT TTT GCT ACG TTG GAG CGT ATT GTT GAT AGT GTT AAT AGC ACC 1152 Leu Ser Phe Ala Thr Leu Glu Arg Ile Val Asp Ser Val Asn Ser Thr 370 375 380 AAA TCC ATC ACG GTT GAG GTA TTA AAC AAG GTT TAT CGG GTA AAA TTC 1200 Lys Ser Ile Thr Val Glu Val Leu Asn Lys Val Tyr Arg Val Lys Phe 385 390 395 400 TAT ATT GAT CGT TAT GGC ATC AGT GAA GAG ACA GCC GCT ATT TTG GCT 1248 Tyr Ile Asp Arg Tyr Gly Ile Ser Glu Glu Thr Ala Ala Ile Leu Ala 405 410 415 AAT .TT AAT ATC TCT CAG CAA GCT GTT GGC AAT CAG CTT AGC CAG TTT 1296 Asn lie Asn Ile Ser Gin Gin Ala Val Gly Asn Gin Leu Ser Gin Phe 420 425 430 -248- GAG CAA CTA TTT Glu Gin Leu Phe 435 AAT CAC CCG CCG CTC Asn His Pro Pro Leu 440 AAT GGT ATT Asn Gly Ile CGC TAT GAA ATC 1344 Arg Tyr Glu Ile 445 AGT GAG GAC AAC Ser Glu Asp Asn 450 CCA GAC AGT ACC Pro Asp Ser Thr 465 TTT CAG GTT AAC Phe Gin Val Asn CGT AAA GAA GAC Arg Lys Glu Asp 500 CTG TAT TTG GTT Leu Tyr Leu Val 515 GAA TTG AAC ATT Glu Leu Asn Ile 530 TAT CAG ATT ACC 30 Tyr Gin Ile Thr 545 TGG ATC ACT CAA Trp Ile Thr Gin 35 TTT CTG ATG ACC Phe Leu Met Thr 580 AGC AAT CTG ACG Ser Asn Leu Thr 595 CTG ATT GGG GAA Leu Ile Gly Glu 610 GCT -aS-GA-C TG Ala Leu His Leu 625 ATA GAC CAG ATT 55 Ile Asp Gin Ile GAA GTG CAA ACA Glu Val Gin Thr 660 GTG CTG GCA CAA Val Leu Ala Gin 675 ACG GAA CTG TCA Thr Glu Leu Ser 690
AAA
Lys
GAT
Asp 470
AGT
Ser
GTT
Val
TTG
Leu
TTG
Leu
GAT
Asp 550
TTG
Leu
GCC
Ala
ACG
Thr
CTG
Leu
TCT
Ser 630
CCG
Pro
CCA
Pro
AGC
Ser
ATC
Ile
CAC
His 710 CTT CCT Leu Pro CAA CGC Gln Arg TTG TAT Leu Tyr AAA AAT Lys Asn 505 GCC CAG Ala Gln 520 ATT TGT Ile Cys TTA GCC Leu Ala ACC CAA Thr Gln TAC AGC Tyr Ser 585 TCT TCA Ser Ser 600 AGA GCA Arg Ala GAA GTT Glu Val CAA ATA Gln Ile AGC TTG Ser Leu 665 ATC TAT Ile Tyr 680 ACT CAA Thr Gln AAT CCT Asn Pro AAG GCG Lys Ala 475 CAG ATG Gln Met 490 AAC TTA Asn Leu ATT CAT Ile His GGC TAT Gly Tyr AAA ATA Lys Ile 555 AAA TGG Lys Trp 570 ACC ACT Thr Thr ACT TTG Thr Leu ATG GCG Met Ala GCG TAT Ala Tyr 635 ACT GTT Thr Val 650 AAG GTG Lys Val CGT CGT Arg Arg TCT TCT Ser Ser ACC CTG GAT CTG Asp Leu 460 GTT TTA Val Leu TTA TTG Leu Leu GAG AAT Glu Asn AAC CTG Asn Leu 525 GGC GAC Gly Asp 540 GTG GAA Val Glu ACA GTT Thr Val TTA ACG Leu Thr CAT GGC His Gly 605 CCT TGC Pro Cys 620 GAC CTG Asp Leu GAT GGG Asp Gly ATT ACC Ile Thr ATT GGG Ile Gly 685 CTG CTA Leu Leu 700 ATG GCC Met Ala AAC CTT AAA 1392 Asn Leu Lys AAA CGC GCG 1440 Lys Arg Ala 480 ATC ACT GAT 1488 Ile Thr Asp 495 TTG TCT GAT 1536 Leu Ser Asp 510 ACT ATT GCT 1584 Thr Ile Ala ACC AAC ATT 1632 Thr Asn Ile ACA TTG TTG 1680 Thr Leu Leu 560 ACC GAC CTG 1728 Thr Asp Leu 575 CCA GAA ATT 1776 Pro Glu Ile 590 AAA GAG AGT 1824 Lys Glu Ser TTC ACT TCG 1872 Phe Thr Ser CTG TTG TGG 1920 Leu Leu Trp 640 TTT TGG GAA 1968 Phe Trp Glu 655 TTT GCT CAG 2016 Phe Ala Gln 670 TTA AGT GAA 2064 Leu Ser Glu GTG GCA GGC 2112 Val Ala Gly TTG GAA GGT 2160 Leu Glu Gly 720
AAA
Lys 705 AGC ATA CTG GAT Ser Ile Leu Asp GGT CTG TTA Gly Leu Leu Thr Leu 715 TTT CAT ACC TGG Phe His Thr Trp GTT AAT GGC TTG GGG CAA CAT Val Asn Gly Leu Gly Gln His GCC TCC TTG ATA TTG 2208 Ala Ser Leu Ile Leu 725 GCG GCG Ala Ala ATG AAT Met Asn AAG GAT Lys Asp 770 CAA TGG Gln Trp 785 GCA GGG Ala Gly TGG CAA Trp Gln CAG AAA Gln Lys ATT AAT Ile Asn 850 TTA TAT Leu Tyr 865 ACT TCA Thr Ser CGG GCT Arg Ala CGT CAG Arg Gln TGG GCT Trp Ala 930 CCC ACT Pro Thr 945 TCC ATC Ser Ile AAA ACT Lys Thr
TTG
Leu
AAG
Lys 755
CTA
Leu
TTA
Leu
ATG
Met
GCT
Ala
AAA
Lys 835
GCT
Ala
ACC
Thr
CGT
Arg
TTA
Leu
TTC
Phe 915
GGT
Gly
CAG
Gln
AAC
Asn
TAT
Tyr
AAA
Lys 740
GAG
Glu
ACA
Thr
CAG
Gln
ATG
Met
GCG
Ala 820
CTG
Leu
GTT
Val
TAT
Tyr
ATT
Ile
AAC
Asn 900
TTC
Phe
GTC
Val
CGC
Arg
CAG
Gln
TTG
Leu 980
GAC
Asp
GAA
Glu
AAA
Lys
ATG
Met
GCC
Ala 805
GCG
Ala
GAT
Asp
GTC
Val
TTG
Leu
GCA
Ala 885
CGA
Arg
ACT
Thr
TCT
Ser
ATT
Ile
AGC
Ser 965
ACC
Thr
GGA
Gly
TCT
Ser
CTG
Leu
TCT
Ser 790
CTG
Leu
GCT
Ala
GAG
Glu
GAT
Asp
CTG
Leu 870
GAA
Glu
GAT
Asp
GAC
Asp
GAA
Glu
GGG
Gly 950
CAG
Gln
AGC
Ser GCC TTG Ala Leu CTC CTA Leu Leu 760 ACC AGT Thr Ser 775 TCG GCC Ser Ala AAA TAT Lys Tyr GCG CTG Ala Leu ACG TTC Thr Phe 840 AGT GCT Ser Ala 855 ATT GAT Ile Asp GCT ATC Ala Ile GAA GGT Glu Gly TGG GAA Trp Glu 920 CTG GTC Leu Val 935 CAA ACC Gln Thr CTA AAT Leu Asn TTT GAG Phe Glu ACA GTT Thr Val 745 CAA ATG Gln Met TGG ACA Trp Thr TTG GCG Leu Ala GGG ATA Gly Ile 810 ATG GCT Met Ala 825 AGT AAG Ser Lys GCT GGA Ala Gly AAT CAG Asn Gln GCC GGT Ala Gly 890 CAG CTT Gln Leu 905 CGT TAC Arg Tyr TAT TAT Tyr Tyr AAA ATG Lys Met GCG GAT Ala Asp 970 CAG GTA Gln Val 985
ACC
Thr
GCA
Ala
CAG
Gln
GTT
Val 795
GAT
Asp
GAT
Asp
GCA
Ala
GTA
Val
GTT
Val 875
ATT
Ile
GCA
Ala
AAT
Asn
CCA
Pro
ATG
Met 955
ACG
Thr
GCA
Ala
GAT
Asp
GCT
Ala
ATT
Ile 780
TCT
Ser
CAT
His
CAT
His
TTA
Leu
CGT
Arg 860
TCT
Ser
CAA
Gln
TCG
Ser
AAA
Lys
GAA
Glu 940
GAT
Asp
GTG
Val
AAT
Asn
GTA
Val
AAT
Asn 765
GAC
Asp
CCA
Pro
AAC
Asn
GCT
Ala
TGT
Cys 845
GAT
Asp
GCC
Ala
CTG
Leu
GAC
Asp
CGT
Arg 925
AAC
Asn
GCG
Ala
GAA
Glu
CTG
Leu GCT 2256 Ala GAG 2304 Glu CTG 2352 Leu CTG 2400 Leu 800 GCC 2448 Ala GCA 2496 Ala TAT 2544 Tyr GGT 2592 Gly ATC 2640 Ile 880 AAC 2688 Asn ACC 2736 Thr ACT 2784 Thr GAT 2832 Asp CAA 2880 Gin 960 TTC 2928 Phe ATT 2976 Ile AGT GCT TAC CAC GAT AAT GTG AAT GTG GAT Ser Ala Tyr His Asp Asn Val Asn Val Asp CAA GGA TTA ACT TAT TTT 3024 Gin Gly Leu Thr Tyr Phe -250- 995 1000 1005 ATC GGT ATC GAC CAA GCA GCT CCG GGT ACG TAT TAC TGG CGT AGT GTT 3072 Ile G-ly Ile Asp Gin Ala Ala Pro Gly Thr Tyr Tyr Trp Arg Ser Val 1010 1015 1020 GAT CAC AGC AAA TGT GAA AAT GGC AAG TTT GCC GCT AAT GCT TGG GGT 3120 Asp His Ser Lys Cys Glu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly 1025 1030 1035 1040 GAG TGG AAT AAA ATT ACC TGT GCT GTC AAT CCT TGG AAA AAT ATC ATC 3168 Glu Trp Asn Lys Ile Thr Cys Ala Val Asn Pro Trp Lys Asn Ile Ile 1045 1050 1055 CGT CCG GTT GTT TAT ATG TCC CGC TTA TAT CTG CTA TGG CTG GAG CAG 3216 Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin 1060 1065 1070 CAA TCA AAG AAA AGT GAT GAT GGT AAA ACC ACG ATT TAT CAA TAT AAC 3264 Gin Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr Ile Tyr Gin Tyr Asn 1075 1080 1085 TTA AAA CTG GCT CAT ATT CGT TAC GAC GGT AGT TGG AAT ACA CCA TTT 3312 Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe 1090 1095 1100 ACT TTT GAT GTG ACA GAA AAG GTA AAA AAT TAC ACG TCG AGT ACT GAT 3360 Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp 1105 1110 1115 1120 GCT GCT GAA TCT TTA GGG TTG TAT TGT ACT GGT TAT CAA GGG GAA GAC 3408 Ala Ala Glu Ser Leu Gly Leu Tyr Cys Thr Gly Tyr Gin Gly Glu Asp 1125 1130 1135 9999 9 ACT CTA TTA GTT ATG TTC TAT TCG ATG CAG AGT AGT TAT AGC tCC TAT 3456 Thr Leu Leu Val Met Phe Tyr Ser Met Gin Ser Ser Tyr Ser Ser Tyr 1140 1145 1150 ACC GAT AAT AAT GCG CCG GTC ACT GGG CTA TAT ATT TTC GCT GAT ATG 3504 Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr Ile Phe Ala Asp Met 1155 1160 1165 TCA TCA GAC AAT ATG ACG AAT GCA CAA GCA ACT AAC TAT TGG AAT AAC 3552 Ser Ser Asp Asn Met Thr Asn Ala Gin Ala Thr Asn Tyr Trp 
Asn Asn 1170 1175 1180 AGT TAT CCG CAA TTT GAT ACT GTG ATG GCA GAT CCG GAT AGC GAC AAT 3600 Ser Tyr Pro Gin Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn 1185 1190 1195 1200 50 AAA AAA GTC ATA ACC AGA AGA GTT AAT AAC CGT TAT GCG GAG GAT TAT 3648 Lys Lys Val Ile Thr Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr 1205 1210 1215 GAA ATT CCT TCC TCT GTG ACA AGT AAC AGT AAT TAT TCT TGG GGT GAT 3696 Glu Ile Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp 1220 1225 1230 CAC AGT TTA ACC ATG CTT TAT GGT GGT AGT GTT CCT AAT ATT ACT TTT 3744 His Ser Leu Thr Met Leu Tyr Gly Gly Ser Val Pro Asn Ile Thr Phe 1235 1240 1245 GAA TCG GCG GCA GAA GAT TTA AGG CTA TCT ACC AAT ATG GCA TTG AGT-3792 Glu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser 1250 1255 1260 ATT ATT CAT AAT GGA TAT GCG GGA ACC CGC CGT ATA CAA TGT AAT CTT 3840 Ile Ile His Asn Gly Tyr Ala Gly Thr Arg Arg Ile Gin Cys Asn Leu 1265 1270 1275 1280 ATG AAA CAA TAC GCT TCA TTA GGT GAT AAA TTT ATA ATT TAT GAT TCA 3888 -251- Met Lys Gin Tyr Ala Ser Leu Gly Asp Lys Phe Ile Ile Tyr Asp Ser 1285 1290 1295 TCA TTT GAT GAT GCA AAC CGT TTT AAT CTG GTG CCA TTG TTT AAA TTC 3936 Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe 1300 1305 1310 GGA AAA GAC GAG AAC TCA GAT GAT AGT ATT TGT ATA TAT AAT GAA AAC 3984 Gly Lys Asp Glu Asn Ser Asp Asp Ser Ile Cys Ile Tyr Asn Glu Asn 1315 1320 1325 CCT TCC TCT GAA GAT AAG AAG TGG TAT TTT TCT TCG AAA GAT GAC AAT 4032 Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn 1330 1335 1340 AAA ACA GCG GAT TAT AAT GGT GGA ACT CAA TGT ATA GAT GCT GGA ACC 4080 Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cys Ile Asp Ala Gly Thr 1345 1350 1355 1360 AGT AAC AAA GAT TTT TAT TAT AAT CTC CAG GAG ATT GAA GTA ATT AGT 4128 Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gin Glu Ile Glu Val Ile Ser 1365 1370 1375 GTT ACT GGT GGG TAT TGG TCG AGT TAT AAA ATA TCC AAC CCG ATT AAT 4176 Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys Ile Ser Asn Pro Ile Asn 1380 1385 1390 ATC AAT ACG GGC ATT GAT AGT 
GCT AAA GTA AAA GTC ACC GTA AAA GCG 4224 30 Ile Asn Thr Gly Ile Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 30 1395 1400 1405 GGT GGT GAC GAT CAA ATC TTT ACT GCT GAT AAT AGT ACC TAT GTT CCT 4272 Gly Gly Asp Asp Gin Ile Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 1410 1415 1420 CAG CAA CCG GCA CCC AGT TTT GAG GAG ATG ATT TAT CAG TTC AAT AAC 4320 too*: Gin Gin Pro Ala Pro Ser Phe Glu Glu Met Ile Tyr Gin Phe Asn Asn 1425 1430 1435 1440 CTG ACA ATA GAT TGT AAG AAT TTA AAT TTC ATC GAC AAT CAG GCA CAT 4368 Leu Thr Ile Asp Cys Lys Asn Leu Asn Phe Ile Asp Asn Gin Ala His e C1445 1450 1455 ATT GAG ATT GAT TTC ACC GCT ACG GCA CAA GAT GGC CGA TTC TTG GGT 4416 45 Ile Glu Ile Asp Phe Thr Ala Thr Ala Gin Asp Gly Arg Phe Leu Gly 1460 1465 1470 GCA GAA ACT TTT ATT ATC CCG GTA ACT AAA AAA GTT CTC GGT ACT GAG 4464 Ala--Efr-- e Ile Ile Pro Val Thr Lys Lys Val Leu Gly Thr Clu 50 1475 1480 1485 eeot AAC GTG ATT GCG TTA TAT AGC GAA AAT AAC GGT GTT CAA TAT ATG CAA 4512 Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly Val Gin Tyr Met Gin 1490 1495 1500 ATT GGC GCA TAT CGT ACC CGT TTG AAT ACG TTA TTC GCT CAA CAG TTG 4560 ie Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gin Gin Leu 1505 1510 1515 1520 GTT AGC CGT GCT AAT CGT GGC ATT GAT GCA GTG CTC AGT ATG GAA ACT 4608 Val Ser Arg Ala Asn Arg Gly Ile Asp Ala Val Leu Ser Met Glu Thr 1525 1530 1535 CAG AAT ATT CAG GAA CCG CAA TTA GGA GCG GGC ACA TAT GTG CAG CTT 4656 Gin Asn Ile Gin Glu Pro Gin Leu Gly Ala Gly Thr Tyr Val Gin Leu 1540 1545 1550 GTG TTG GAT AAA TAT GAT GAG TCT ATT CAT GGC ACT AAT AAA AGC TTT 4704 Val Leu Asp Lys Tyr Asp Glu Ser Ile His Gly Thr Asn Lys Ser Phe 1555 1560 1565 -252- GCT ATT GAA TAT GTT GAT ATA TTT AAA GAG AAC GAT AGT TTT GTG ATT 4752 Ala Ile Glu Tyr Val Asp Ile Phe Lys Glu Asn Asp Ser Phe Val Ile 1570 1575 1580 TAT CAA GGA GAA CTT AGC GAA ACA AGT CAA ACT GTT GTG AAA GTT TTC 4800 Tyr Gin Gly Glu Leu Ser Glu Thr Ser Gin Thr Val Val Lys Val Phe 1585 1590 1595 1600 TTA TCC TAT TTT ATA GAG GCG ACT GGA AAT AAG AAC CAC TTA TGG 
GTA 4848 Leu Ser Tyr Phe Ile Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val 1605 1610 1615 CGT GCT AAA TAC CAA AAG GAA ACG ACT GAT-AAG ATC TTG TTC GAC CGT 4896 Arg Ala Lys Tyr Gin Lys Glu Thr Thr Asp Lys Ile Leu Phe Asp Arg 1620 1625 1630 ACT GAT GAG AAA GAT CCG CAC GGT TGG TTT CTC AGC GAC GAT CAC AAG 4944 Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys 1635 1640 1645 ACC TTT AGT GGT CTC TCT TCC GCA CAG GCA TTA AAG AAC GAC AGT GAA 4992 Thr Phe Ser Gly Leu Ser Ser Ala Gin Ala Leu Lys Asn Asp Ser Glu 1650 1655 1660 25 CCG ATG GAT TTC TCT GGC GCC AAT GCT CTC TAT TTC TGG GAA CTG TTC 5040 Pro Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe 1665 1670 1675 1680 TAT TAC ACG CCG ATG ATG ATG GCT CAT CGT TTG TTG CAG GAA CAG AAT 5088 Tyr Tyr Thr Pro Met Met Met Ala His Arg Leu Leu Gin Glu Gin Asn 1685 1690 1695 TTT GAT GCG GCG AAC CAT TGG TTC CGT TAT GTC TGG AGT CCA TCC GGT 5136 Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly 1700 1705 1710 TAT ATC GTT GAT GGT AAA ATT GCT ATC TAC CAC TGG AAC GTG CGA CCG 5184 Tyr Ile Val Asp Gly Lys Ile Ala Ile Tyr His Trp Asn Val Arg Pro 1715 1720 1725 40 CTG GAA GAA GAC ACC AGT TGG AAT GCA CAA CAA CTG GAC TCC ACC GAT 5232 Leu Glu Glu Asp Thr Ser Trp Asn Ala Gin Gin Leu Asp Ser Thr Asp 1730 1735 1740 CCA GAT GCT GTA GCC CAA GAT GAT CCG ATG CAC TAC AAG GTG GCT ACC 5280 Pro Asp Ala Val Ala Gin Asp Asp Pro Met His Tyr Lys Val Ala Thr 1745 1750 1755 1760 TTT ATG GCG ACG TTG GAT CTG CTA ATG GCC CGT GGT GAT GCT GCT TAC 5328 Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr 1765 1770 1775 CGC CAG TTA GAG CGT GAT ACG TTG GCT GAA GCT AAA ATG TGG TAT ACA 5376 Arg Gin Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr Thr 1780 1785 1790 CAG GCG CTT AAT CTG TTG GGT GAT GAG CCA CAA GTG ATG CTG AGT ACG 5424 Gin Ala Leu-Asn Leu Leu Gly Asp Glu Pro Gin Val Met Leu Ser Thr 1795 1800 1805 ACT TGG GCT AAT CCA ACA TTG GGT AAT GCT GCT TCA AAA ACC ACA CAG 5472 Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser 
Lys Thr Thr Gin 1810 1815 1820 CAG GTT CGT CAG CAA GTG CTT ACC CAG TTG CGT CTC AAT AGC AGG GTA 5520 Gin Val Arg Gin Gin Val Leu Thr Gin Leu Arg Leu Asn Ser Arg Val 1825 1830 1835 1840 AAA ACC CCG TTG 5532 Lys Thr Pro Leu 1844 -253- INFORMATION FOR SEQ ID NO:53: SEQUENCE CHARACTERISTICS: LENGTH: 1844 amino acids TYPE: amino acids STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEG ID Ni Features From To Peptide 1 1844 Fragment 1 11 Description TcbAii peptide a a Phe 1 Al a Thr Tyr Gin 65 Leu Val1 His 45 Gly Pro 145 As n Thr Ser Ala Leu 225 Val Fragment Fragment Fragment Fragment Gin Gly Tyr 5 Pro Gly Ser 20 Leu Tyr Arg 35 Leu Asp Lys Asn Met Asp Leu Ala Gly 85 Asp Met Leu 100 Ala Tyr Giu 115 Arg His Leu Thr Leu Leu Leu Ile Giu 165 Tyr Lys Thr 180 Ser Tyr Leu 195 Val Thr Thr Ile Pro Leu Arg Thr Pro 245 978 1387 1484 1527 Ser Val Giu Arg Giu 70 Ile Ser Thr Ser Gi y 150 Gi u Asn Ala Ser Val 230 Ser 990 1401 1505 1552 Leu iSer iLys 40 Pro le iThr *Tyr *Arg 120 iAla Ser Pro Gly Tyr 200 t Ser Gly Asn
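Features From To Description
Peptide 1 1844 TcbAii peptide
Fragment 1 11 (SEQ ID NO:1)
Fragment 978 990 (SEQ ID NO:23)
Fragment 1387 1401 (SEQ ID NO:22)
Fragment 1484 1505 (SEQ ID NO:24)
Fragment 1527 1552 (SEQ ID NO:21)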
Gly Asn 10 Phe Ser Leu His Leu Ala Thr Leu 75 Thr Gly 90 Leu Ser Ile Val Ile Val His Ile 155 Lys Asp 170 Ile Thr Gly Val Vai Gly Gly Lys 23.5 Thr Ser 250 I D NO: 1) ID NO:23) ID NO:22) ID NO:24) ID NO:21) Arg Ala Asp Asn Pro Ala Ala Tyr Asp Ser Ser Ser Ser Leu Met Leu Ala Leu Ser Asn Lys Ser Gin Asp Gly Giu Thr Pro 110 His Giu Arq Asp 125 Ala Ala Lys Leu 140 Ser Pro Giu Leu Giu Ala Ala Leu 175 Thr Ala Gin Leu 190 Ser Pro Giu Asp 205 Tyr Ser Ser Asp 220 Met Giu Val Val Gin Thr Asn Tyr 255 Tyr Leu Ile Ser Glu Giu Tyr Pro Asp Tyr 160 Asp Met Ile Ile Arg 240 Ile -254- Glu Leu Tyr Ser Ser Ile 305 Asp Asn Phe Leu Lys 385 Tyr 30 Asn Glu Ser Pro 465 Phe Arg Leu Glu Tyr 545 Trp Phe Ser Leu Ser 275 Asp Gin Ile Ala Leu 355 Phe Ile Asp Asn Leu 435 Asp Ser Val Glu Leu 515 Asn Ile Thr Met Leu 595 Gly Pro 260 Phe Trp Lys Leu Ala 340 Lys Ala Thr Arg lle 420 Phe Asn Thr Asn Asp 500 Val Ile Thr Gin Thr 580 Thr Glu Gin Gly Gly Asp Asn Tyr.Leu 265 Gly Thr Tyr Ser 325 Ala Met Thr Val Tyr 405 Ser Asn Ser Gly Ala 485 Gly Ser Leu Asp Trp 565 Thr Ala Asp Asp Ile 295 Ser Gly Phe Lys Glu 375 Val Ile Gin Pro His 455 Asp Glu Ile Leu Val 535 Asn Lys Thr Leu Lys 615 Asp 280 Ala Gin Leu Lys Ala 360 Arg Leu Ser Ala Pro 440 Leu Gin Leu Lys Ala 520 Ile Leu Thr Tyr Ser 600 Arg Phe Tyr Leu His Ala Gin Ile 345 Ile Ile Asn Glu Val 425 Leu Pro Arg Tyr Asn 505 Gin Cys Ala Gin Ser 585 Ser Ala Val Asn Thr Arg 330 Asp Arg Val Lys Glu 410 Gly Asn Asn Lys Gin 490 Asn Ile Gly Lys Lys 570 Thr Thr Met Ala Pro Ile 315 Trp Gin Leu Asp Val 395 Thr Asn Gly Pro Ala 475 Met Leu His Tyr Ile 555 Trp Thr Leu Ala Tyr 635 Gin Tyr 300 Lys His Tyr Leu Ser 380 Tyr Ala Gin Ile Asp 460 Val Leu Glu Asn Gly 540 Val Thr Leu His Pro 620 Asp Tyr 285 Pro Arg Ser Ser Lys 365 Val Arg Ala Leu Arg 445 Leu Leu Leu Asn Leu 525 Asp Glu Val Thr Gly 605 Cys Leu Lys Asp Ser Gly Pro 350 Ala Asn Val Ile Ser 430 Tyr Asn Lys Ile Leu 510 Thr Thr Thr Thr Pro 590 Lys Phe Leu SAsp Gly SMet Val Asp Ser 320 Ser Tyr 335 Lys Ala Thr Gly Ser Thr Lys Phe 400 
Leu Ala 415 Gin Phe Glu Ile Leu Lys Arg Ala 480 Thr Asp 495 Ser Asp Ile Ala Asn Ile Leu Leu 560 Asp Leu 575 Glu Ile Glu Ser Thr Ser Leu Trp 640 Ile Lys Tyr Asn Leu 270 Ala Leu His Leu Thr Ser Gin Glu -255- Ile Asp Gin Ile Gin Pro Ala Gin Ile Thr Val Asp Gly Phe 645 a.
Glu Val Thr Lys 705 Phe Ala Met Lys Gin 30 785 Al a 35 Trp Gin 40 Ile Leu 45 865 Thr Arg Arg Trp Pro 945 Ser Lys Ser Ile Gin Thr 660 Ala Gin 675 Leu Ser Ile Leu Thr Trp Leu Lys 740 Lys Giu 755 Leu Thr Leu Gin Met Met Ala Ala 820 Lys Leu 835 Ala Val Thr Tyr Arg Ile Leu Asn 900 Phe Phe 915 Gly Val Gin Arg Asn. Gin Tyr Leu 980 Tyr His 995 Ile Asp Pro Ser Ile His 710 Asn Gly Ser Leu Ser 790 Leu Al a Giu Asp Leu 870 Giu Asp Asp Giu Gi y 950 Gin Ser Asn Ala Ala *Leu Tyr Giy Leu Met 825 Phe Ser 840 Ala Ala Asp Aso Ile Ala Gly Gin 905 Giu Arg 920 Val Tyr Thr Lys Asn Ala Giu Gin 985 Asn Val 1000 Pro Gly 650 Lys Val Arg Arg Ser Ser Thr Leu 715 Gin His 730 Val Thr Met Ala Thr Gin Ala Val 795 Ile Asp 810 Ala Asp Lys Ala Gly Val Gin Val 875 Gly Ile 890 Leu Ala Tyr Asn Tyr Pro Met Met 955 Asp Thr 970 Vai Ala Asp Gin Thr Phe 670 Gly Leu 685 Leu Val Ala Leu Ser Leu Val Ala 750 Asn Gln 765 Asp A Ia Pro Leu Asn Tyr Ala Asn 830 Cys Asn 845 Asp Arg Ala Asp Leu Tyr Asp Val 910 Arg Tyr 925 Asn Tyr Ala Leu Giu Asp Leu Lys 990 Leu Thr 1005 Al a Ser Ala Giu Ile 735 Gin ValI Ile Asp Al a 815 Gin T yr Asn.
ValI Val 895 Ser Ser Val1 Leu Ala 975 Val Tyr Trp Giu 655 Gin Glu Gly- Gi y 720 Leu Al a Giu Le u Leu 800 Al a Ala Tyr Gi y Ile 880 As n Thr Thr Asp Gin 960 Phe Ile Phe Thr Tyr Tyr Trp Arg Ser Vai -2 56- 1010 1015 1020 Asp His Ser Lys Cys Giu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly 1025 1030 1035 1040 Giu Trp Asn Lys Ile Thr Cys Ala Val Asn Pro Trp Lys Asn Ile Ile 1045 1050 1055 Arg Pro Val Vai Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Giu Gin 1060 1065 1070 Gin Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr Ile Tyr Gin Tyr Asn 1075 1080 1085 Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe 1090 1095 1100 Thr Phe Asp Val Thr Giu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp 1105 1110 1115 -1120 Ala Ala Giu Ser Leu Gly Leu Tyr Cys Thr Giy Tyr Gin Giy Giu Asp -1125 1130 1135 5 Thr Leu Leu Vai Met Phe Tyr Ser Met Gin Ser Ser Tyr Ser Ser Tyr 1140 1145 1150 .Thr Asp Asn Asn Aia Pro Val Thr Gly Leu Tyr Ile Phe Aia Asp Met 1155 1160 1165 Ser Ser Asp Asn Met Thr Asn Ala Gin Ala Thr Asn Tyr Trp Asn Asn *.*1170 1175 1180 Ser Tyr Pro Gin Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn 1185 1190 1195 1200 *Lys Lys Vai Ile Thr Arg Arq Val Asn Asn Arg Tyr Aia Glu Asp Tyr 1205 1210 1215 ***Glu Ile Pro Ser Ser Vai Thr Ser Asn Ser Asn Tyr Ser Trp Giy Asp 1220 1225 1230 His Ser Leu Thr Met Leu Tyr Giy Gly Ser Val Pro Asn Ile Thr Phe 1235 1240 1245 Giu Ser Ala Ala Giu Asp Leu Arq Leu Ser Thr Asn Met Ala Leu Ser 1250 1255 1260 Ile Ile His Asn Gly Tyr Ala Gly Thr Arg Arg Ile Gin Cys Asn Leu 1265 1270 1275 1280 Met Lys Gin Tyr Ala Ser Leu Gly Asp Lys Phe Ile Ile Tyr Asp Ser 1285 1290 1295 Ser Phe Asp Asp Ala Asn Arq Phe Asn Leu Val Pro Leu Phe Lys Phe 1300 1305 1310 Gly Lys Asp Giu Asn Ser Asp Asp Ser le Cys Ile Tyr Asn Giu Asn 1315 1320 1325 Pro Ser Ser Giu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn 1330 1335 1340 Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cys Ile Asp Ala Gly Thr 1345 1350 1355 1360 Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gin Giu Ile Giu Val Ile Ser 1365 1370 1375 Val 
Thr Gly Gly Tyr Trp Ser Ser Tyr Lys Ile Ser Asn Pro Ile Asn 1380 1385 1390 -257- Ile Asn Thr Gly Ile Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 1395 1400 1405 Gly Gly Asp Asp Gin Ile Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 1410 1415 1420 Gln Gin Pro Ala Pro Ser Phe Glu Glu Met Ile Tyr Gin Phe Asn Asn 1425 1430 1435 1440 Leu Thr Ile Asp Cys Lys Asn Leu Asn Phe Ile Asp Asn Gin Ala His 1445 1450 1455 Ile Giu Ile Asp Phe Thr Ala Thr Ala Gin Asp Gly Arg Phe Leu Gly 1460 1465 1470 Ala Glu Thr Phe Ile Ile Pro Val Thr Lys Lys Val Leu Gly Thr Glu 1475 1480 1485 Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly Val Gin Tyr Met Gin 1490 1495 1500 Ile Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gin Gin Leu 1505 1510 1515 1520 25 Val Ser Arg Ala Asn Arg Gly Ile Asp Ala Val Leu Ser Met Glu Thr 1525 1530 1535 Gin Asn Ile Gin Glu Pro Gin Leu Gly Ala Gly Thr Tyr Val Gin Leu 1540 1545 1550 Val Leu Asp Lys Tyr Asp Glu Ser Ile His Gly Thr Asn Lys Ser Phe 1555 1560 1565 Ala Ile Glu Tyr Val Asp Ile Phe Lys Glu Asn.Asp Ser Phe Val Ile S• 35 1570 1575 1580 Tyr Gin Gly Glu Leu Ser Glu Thr Ser Gin Thr Val Val Lys Val Phe 1585 1590 1595 1600 40 Leu Ser Tyr Phe Ile Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val 1605 1610 1615 SArg Ala Lys Tyr Gin Lys Glu Thr Thr Asp Lys Ile Leu Phe Asp Arg 1620 1625 1630 Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys 1635 1640 1645 Thr Phe Ser Gly Leu Ser Ser Ala Gin Ala Leu Lys Asn Asp Ser Glu 1650 1655 1660 Pro Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe 1665 1670 1675 1680 Tyr Tyr Thr Pro Met Met Met Ala His Arg Leu Leu Gin Glu Gin Asn 1685 1690 1695 Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly 1700 1705 1710 Tyr Ile Val Asp Gly Lys Ile Ala Ile Tyr His Trp Asn Val Arg Pro 1715 1720 1725 Leu Glu Glu Asp Thr Ser Trp Asn Ala Gin Gin Leu Asp Ser Thr Asp 1730 1735 1740 Pro Asp Ala Val Ala Gin Asp Asp Pro Met His Tyr Lys Val Ala Thr 1745 1750 1755 1760 Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr 
1765 1770 1775 -258- Arg Gin Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr Thr 1780 1785 1790 Gin Ala Leu Asn Leu Leu Gly Asp Glu Pro Gin Val Met Leu Ser Thr 1795 1800 1805 Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gin 1810 1815 1820 Gin Val Arg Gin Gin Val Leu Thr Gin Leu Arg Leu Asn Ser Arg Val 1825 1830 1835 1840 Lys Thr Pro Leu 1844 INFORMATION FOR SEQ ID NO:54: SEQUENCE CHARACTERISTICS: LENGTH: 1722 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54 (tcbAiii coding region): CTA GGA ACA GCC AAT TCC CTG ACC GCT TTA TTC CTG CCG CAG GAA AAT 48 Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn 1 5 10 AGC AAG CTC AAA GGC TAC TGG CGG ACA CTG GCG CAG CGT ATG TTT AAT 96 Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gin Arg Met Phe Asnri 25 TTA CGT CAT AAT CTG TCG ATT GAC GGC CAG CCG CTC TCC TTG CCG CTG 144 Leu Arg His Asn Leu Ser Ile Asp Gly Gin Pro Leu Ser Leu Pro Leu 35 40 TAT GCT AAA CCG GCT GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA 192 Tyr Ala Lys Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser 50 55 45 GCT TCT CAA GGG GGA GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC 240 Ala Ser Gin Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His 70 75 CGC TTC CCT CAA ATG CTA GAA GGG GCA CGG GGC TTG GTT AAC CAG CTT 288 Arg Phe Pro Gin Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu 90 ATA CAG TTC GGT AGT TCA CTA TTG GGG TAC AGT GAG CGT CAG GAT GCG 336 Ile Gin Phe Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala 100 105 110 GAA GCT ATG AGT CAA CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG 384 Glu Ala Met Ser Gin Leu Leu Gin Thr Gin Ala Ser Glu Leu Ile Leu 115 120 125 ACC AGT ATT CGT ATG CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA 432 Thr Ser Ile Arg Met Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu 130 135 140 AAA ACC GCC TTG CAA GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC 480 Lys Thr Ala Leu Gin Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp 145 150 
155 160 -259- AGC TAT AGC Ser Tyr Ser CAA CTG TAT GAG GAG AAG ATC AAC Gin Leu Tyr Giu Glu Asn Ile Asn 165 170 GCA GGT GAG CAG CGA 528 Ala gly Glu Gin Arg 175 a.
GCGG
Ala
ATT
Ile
GGC
Gly
GCT
Ala 225
AAA
Lys
ATT
Ile
CTG
Leu
TAC
Tyr
AGA
Arg 305
TTG
Leu
CTG
Leu
AGC
Ser
TGT
Cys
CTG
Leu 385
GCA
Ala
GAA
Glu
:TG
Leu
TCC
Ser
CTG
Leu 210
GAC
Asp
GTT
Val
CAG
Gln
GAA
Glu
CTG
Leu 290
AGC
Ser
TCA
Ser
ATG
Met
TTT
Phe
GGA
Gly 370
AAA
Lys
GTG
Val
CA-
Gln
GCG
Ala
CGT
Arg 195
GGT
Ala
GGT
Gly
GCT
Ala
CGT
Arg
TCA
Ser 275
A
Lys
A
Lys
GGT
Gly
GCA
Ala
GTC
Val 355
GAA
Glu
TGG
Trp
GTT
Val
~ATP
Sle TTA CGC 'I Leu Arg I 180 ATG GCAC Met AlaC GAT GGC Asp GlyC ATT GAG Ile Giu CAG TCG GlnLSer 245 GAC AAC Asp Asn 260 CTG TCT Leu Ser ACC GAG Thr Gin TTC AGT Phe Ser ATT TAT Ile Tyr 325 GAG CAA Glu Gin 340 AAA CG Lys Pro *GGT TTG Ala Leu GAA TCT *Giu Ser TAT GAT *Tyr Asp 405 CCT GGA *Pro Ala 420
CA
Ser
GGC
Gly
;,C
Gly
TTG
Leu 230
GAA
Glu
.,CA
Ala
ATT
Ile
CAA
Gln
AAT
Asn 310
TTC
Phe
TC
Ser
GGT
Gly
ATA
Ile
CGC
Arg 390
TCA
Ser
TTA
Leu
GAA
Glu
GCG
Ala
ATG
Met 215
AGT
Ser
ATA
Ile
CAA
Gln
CC
Arg
GCT
Ala 295
CAA
Gln
GAG
Gln
TAT
Tyr
GCA
Ala
CAA
Gln 375
GGT
Ala
GTG
Leu
TTG
Leu
['CT
3er 3GT 200
CAT
H~is Ala
TAT
Tyr
GCG
Ala
CT
Arg 280
GAG
Gln
GG
Ala
TTG
Phe
CAA
Gln
TGG
Trp 360
AAT
Asn
TTG
Leu
GAA
Glu
GAT
Asp
GGT
Ala 185
GTTC
Val
TATC
Tyr
TGT
Ser
CGG
Arg
GAG
Glu 265
GAA
Glu
GG
Ala
TTA
Leu
TAT
Tyr
TGG
Trp 345
CAA
Gln
GTG
Leu
GAA
Glu
GGT
Gly
AAG
Lys 425
TT
Ile
GAT
Asp
GGT
Gly
GCC
Ala
CGT
Arg 250
ATT
Ile
GCC
Ala
CAG
Gln
TAT
Tyr
GAG
Asp 330
GAA
Glu
GGA
Gly
GCA
Ala
GTA
Val
AAI
Asn 410
GGG
Gly
GAG
Glu
ATG
Met
GGT
Ala
AAG
Lys 235
GGC
Arg
AAC
Asn
GCT
Ala
GCA
Ala
AGT
Ser 315
TTG
Leu
GCT
Ala
ACT
Thr
CAGA
Gln
GAA
Glu 395
GAT
Asp
GAG
Glu
['CT
Ser
GCA
Ala
ATT
Ile 220
ATG
Met
CGT
Arg
CAG
Gln
GAA
Glu
CAA
Gln 300
TGG
Trp
GCC
Ala
AAT
Asn
TAG
Tyr
ATG
Met 380
GC
Arg
GT
Arg
GGA
Gly
GAG
Gln
GGA
Pro 205
GGG
Ala
GTT
Val
CAA
Gln
TTA
Leu
ATG
Met 285
CTT
Leu
TTA
Leu
GTA
Val
GAT
Asp
GCC
Ala 365
GAA
Glu
AG
Thr
TTT
Phe
ACA
Thr GGA GG Gly Ala 190 MAT ATC Asn Ile TAT GCG Tyr Ala GAT GCG Asp Ala GMA TGG Ciu Trp 255 MAC GCG Asn Ala 270 CMA AMA Gin Lys ACT TTC Thr Phe CGA GGG Arg Gly TGA CGT Ser Arg 335 MAT TGG Asn Ser 350 GGG TTA Cly Leu GAG GGA Glu Ala OTT TGA Val Ser MAT TTA Asn Leu 415 GCA GGA Ala Cly 430 GAG 576 Gin TTC 624 Phe ATC 672 Ile GAG 720 Giu 240 AAA 768 Lys CMA 816 Gin GAG 864 Giu TTA 912 Leu CGT 960 Arg 320 TG 1008 Gys ATT 1056 Ile TTG 1104 Leu TAT 1152 Tyr TTG 1200 Leu 400 GCG 1248 Ala ACT 1296 Thr AAA GMA MT Lys Glu Asn 435 GGG TTA TGA TTG Gly Leu Ser Leu GGT AAT GOT ATC CTG TCA Ala Asn Ala Ile Leu Ser 440 445 -260- GCT TCG GTC 1344 Ala Ser Val AAA TTG TCC GAC TTG AAA CTG GGA ACG GAT TAT CCA GAC Lys Leu 450 GGT AGC Gly Ser 465 GCA TTG Ala Leu GGC ACT Gly Ser GGT ACC Gly Thr TAC CTG Tyr Leu 530 CTT CAA Leu Gin 545 30 ATG AGC Met Ser Ser Asp Leu Lys Leu Gly 455 Thr Asp Tyr Pro Asp 460 ACT ATC GTT 1392 Ser Ile Val TCG CTA CCT 1440 Ser Leu Pro 480 AGC TAT GGT 1488 Ser Tyr Gly 495 GTG TCT CAT 1536 Val Ser His 510 GAC GGC AAA 1584 Asp Gly Lys ACA CTG AAT 1632 Thr Leu Asn TTG CAA ACT 1680 Leu Gin Thr 560 TAA 1722 4 4* 4 4* .4 -4 35 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 573 amino acids TYPE: amino acids 40 STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:55 (TcbAj.j): Leu Giy Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gin Giu Asn 10 Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gin Arg-det.t2bqeAsn 20 25 Leu Arg His Asn Leu Ser Ile Asp Gly Gin Pro Leu Ser Leu Pro Leu 40 Tyr Ala Lys Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser 55 Ala Ser Gin Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His 70 75 Arg Phe Pro Gin Met Leu Giu Gly Ala Arg Gly Leu Val Asn Gin Len 90 Ile Gin Phe Gly Ser Ser Leu Leu Gly Tyr Ser Giu Arg Gin Asp Ala 100 105 110 Giu Ala Met Ser Gin Leu Len Gin Thr Gin Ala Ser Gin Len Ile Leu 115 120 125 -261-
Thr Ser 130 Lys Thr 145 Ser Tyr Ala Leu Ile Ser Gly Leu 210 Ala Asp 225 Lys Val Ile Gin Leu Glu Tyr Leu 290 Arg Ser 305 Leu Ser 40 Leu Met Ser.Pbe Cys Gly 370 Leu Lys 385 Ala Val Glu Gin Lys Glu Lys Leu 450 Gly Ser 465 Ala Leu Gly Ser Ile Ala Ser Ala Arg 195 Ala Gly Ala Arg Ser 275 Lys Lys Gly Ala Val 355 Glu Trp Val Ile Asn 435 Ser Asn Val Thr Met Gin Asp Asn 135 Gin Val Ser Leu 150 Leu Tyr Glu Glu 165 Arg Ser Glu Ser Ala Gly Ala Gly 200 Gly Gly Met His 215 Glu Leu Ser Ala 230 Ser Glu Ile Tyr 245 Asn Ala Gin Ala Ser Ile Arg Arg 280 Gin Gin Ala Gin 295 Ser Asn Gin Ala 310 Tyr Phe Gin Phe 325 Gin Ser Tyr Gin Pro Gly Ala Trp 360 Leu Ile Gln Asn 375 Ser Arg Ala Leu 390 Asp Ser Leu Glu 405 Ala Leu Leu Asp Leu Ser Leu Ala 440 Leu Lys Leu Gly 455 Val Arg Arg Ile 470 Pro Tyr Gin Asp 485 Leu Pro Lys Gly Leu Ala Glu Leu Asp Ser Glu 140 Gly Val Gin Gin Arg Phe Asp 155 160 Ile Asn Ala Gly Glu Gin Arg 170 175 Ile Glu Ser Gin Gly Ala Gin 190 Asp Met Ala Pro Asn Ile Phe 205 Gly Ala Ile Ala Tyr Ala Ile 220 Ala Lys Met Val Asp Ala Glu 235 240 Arg Arg Arg Gin Glu Trp Lys 250 255 Ile Asn Gin Leu Asn"Ala Gin 270 Ala Ala Glu Met Gin Lys Glu 285 Gin Ala Gin Leu Thr Phe Leu 300 Tyr Ser Trp Leu Arg Gly Arg 315 320 Asp Leu Ala Val Ser Arg Cys 330 335 Glu Ala Asn Asp Asn Ser Ile 350 Gly Thr Tyr Ala Gly Leu Leu 365 Ala Gin Met Glu Glu Ala Tyr 380 Val Glu Arg Thr Val Ser Leu 395 400 Asn Asp Arg Phe Asn Leu Ala 410 415 Gly Glu Gly Thr Ala Gly Thr 430 Ala Ile Leu Ser Ala Ser Val 445 Asp Tyr Pro Asp Ser Ile Val 460 Gin Ile Ser Val Ser Leu Pro 475 480 Gin Ala Met Leu Ser Tyr Gly 490 495 Ser Ala Leu Ala Val Ser His 510 -262- Gly Thr Asn Asp Ser Gly Gin Phe Gin LeuAsp Phe Asn Asp Gly Lys 515 520 525 Tyr Leu Pro Phe Glu Gly Ile Ala Leu Asp Asp Gin Gly Thr Leu Asn 530 535 540 Leu Gin Phe Pro Asn Ala Thr Asp Lys Gin Lys Ala Ile Leu Gin Thr S545 550 555 560 Met Ser Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg 565 570 573 INFORMATION FOR SEQ ID NO:56 SEQUENCE CHARACTERISTICS: LENGTH: 2994 base pairs 
TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56 (tccA) 1 ATG AAT CAA CTC GCC AGT CCC CTG ATT TCC CGC ACC GAA GAG ATC CAC 48 1 Met Asn Gin Leu Ala Ser Pro Leu Ile Ser Arg Thr Glu Glu Ile His 16 30 49 AAC TTA CCC GGT AAA TTG ACC GAT CTT GGT TAT ACC TCA GTG TTT GAT 96 17 Asn Leu Pro Gly Lys Leu Thr Asp Leu Gly Tyr Thr Ser Val Phe Asp 32 97 GTG GTA CGT ATG CCG CGT GAG CGT TTT ATT CGT GAG CAT CGT GCT GAT 144 35 33 Val Val Arg Met Pro Arg Glu Arg Phe Ile Arg Glu His Arg Ala Asp 48 145 CTC GGG CGC AGT GCT GAA AAA ATG TAT GAC CTG GCA GTG GGC TAT GCT 192 49 Leu Gly Arg Ser Ala Glu Lys Met Tyr Asp Leu Ala Val Gly Tyr Ala 64 40 193 CAT CAG GTG TTA CAC CAT TTT CGC CGT AAT TCT CTT AGT GAA GCT GTT 240 His Gin Val Leu His His Phe Arg Arg Asn Ser Leu Ser Glu Ala Val 241 CAG TTT GGC TTG AGA AGT CCG TTC TCC GTA TCA GGC CCG GAT TAC GCC 288 81 Gin Phe Gly Leu Arg Ser Pro Phe Ser Val Ser Gly Pro Asp Tyr Ala 96 289 AAT CAG TTT CTT GAT GCA AAC ACG GGT TGG AAA GAT AAA GCA CCA AGT 336 97 Asn Gin Phe Leu Asp Ala Asn Thr Gly Trp Lys Asp Lys Ala Pro Ser .112 337 GGA TCA CCG GAA GCC AAT GAT GCG CCG GTA GCC TAT CTG ACT CAT ATT 384 113 Gly Ser Pro Glu Ala Asn Asp Ala Pro Val Ala Tyr Leu Thr His Ile 128 385 TAT CAA TTG GCC CTT GAA CAG GAA AAG AAT GGC GCC ACT ACC ATT ATG 432 129 Tyr Gin Leu Ala Leu Glu Gin Glu Lys Asi Gly Ala Thr Thr Ile Met 144 433 AAT ACG CTG GCG GAG CGT CGC CCC GAT CTG GGT GCT TTG TTA ATT AAT 480 145 Asn Thr Leu Ala Glu Arg Arg Pro Asp Leu Gly Ala Leu Leu Ile Asn 160 481 GAT AAA GCA ATC AAT GAG GTG ATA CCG CAA TTG CAG TTG GTC AAT GAA 528 161 Asp Lys Ala Ile Asn Glu Val Ile Pro Gin Leu Gin Leu Val Asn Glu 176 -263- 529 ATT 177 Ile CTG TCC AAA GCT ATT CAG AAG AAA CTG AGT TTG ACT GAT CTG GAA Leu Ser Lys Ala Ile Gin Lys Lys Leu Ser Leu Thr Asp Leu Giu 577 193 625 209 673 225 721 241 769 257 25 817 273 865 289 913 305 961 1008
GCG
Ala
TAT
Tyr
ACT
Thr
AAC
Asn
GCT
Ala
CAG
Gln
TAT
Tyr
ATG
Met
TGT
GTA AAC Val Asn CAT TAT His Tyr ACG TTG Thr Leu TTC TGG Phe Trp TTG ACC Leu Thr AAA ATC Lys Ile GGT GAC Gly Asp ACT GAT Thr Asp TCA ACT GCC AGA Ala Arg GGT CAT Gly His CAA GAT Gin Asp GCA ACA Ala Thr CGA CTG Arg Leu ATT ACG Ile Thr AGT TCG Ser Ser CGA ACA Arg Thr GTC GGA
TCC
Ser
CAG
Gln
ACT
Thr
AAA
Lys
ATC
Ile
ACT
Thr
ACT
Thr
TTG
Leu
TCT
ACC
Thr
CAG
Gln
CCA
Pro
AAA
Lys
GCG
Ala
GGT
Gly
AAT
Asn
GTA
Val
GTT
CGT
Arg
ACA
Thr
CAG
Gln
CTG
Leu
ACT
Ser
CAG
Gln
AGT
Ser
CCC
Pro
GTT
TAC
Tyr
GCT
Ala
ACG
Thr
AGC
Ser
CAG
Gln
GAT
Asp
TTC
Phe
CAG
Gln
AAG
CCG
Pro
CAA
Gln
CTG
Leu
GAT
Asp Phe
TTC
Phe
AGC
Ser
GTA
Val
TCT
AAT
Asn
TCG
Ser
CAT
Asp
ACG
Thr
TCC
Ser
TAT
Tyr
GAC
Asp
GAA
Glu
GAT
AAT
Asn
GTA
Val
CTG
Leu
ACT
Thr
CCA
Pro
CAG
Gln
ATG
Met
CTG
Leu
AAT
CTG
Leu
TTG
Leu
CCC
Pro
GCC
Ala
GAG
Glu
CTT
Leu
ACC
Thr
ATG
Met
GTG
CCG
Pro
GGT
Gly
CAA
Gln
AGT
Ser
CAG
Gln
AAC
Asn
ATA
Ile
TTG
Leu
AGT
576 192 624 208 672 224 720 240 768 256 816 272 864 288 912 304 960 320 336 352 368 384 400 416 432 321 Cys Ser Thr Val Gly Giy Ser Thr Val Val Lys Ser Asp Asn Val Ser 1009 TCT GGT 1056 45 337 Ser Gly 1057 CAT GCC 1104 1105 GCG CAT 1152 369 Ala His 1153 CGT ATT 1200 385 Arg Ile 1201 GAG GAT 1248 401 Giu Asp 1249 AAT ACC 1296 417 Asn Thr C TAT Ala Tyr CTG AGT Leu Ser CTC ACA Leu Thr AAA TGG Lys Trp CCT ATG Ala Met ACG CTG Thr Leu GGC GCC CC Gly Ala Arg CGC ACT GGT Arg Ser Cly GAT GAC AAG Asp Asp Lys CTG AAT CTG Leu Asn Leu GAT C GAA Asp Ala Ciu CGT ATG TTG Arg Met Leu TTT ATT Phe Ile GCG GAG Ala Glu TTG GAC Leu Asp CCT TAT Pro Tyr ACA GGA Thr Gly GCA GTG Gly Val -264- 1297 TTC AAA CAT TAT CAG GCG AAG TAT GGT GTT AGC GCT AAA CAA TTT GCT 1344 433 Phe Lys His Tyr Gin Ala Lys Tyr Gly Val Ser Ala Lys Gin Phe Ala
*5*5 1345 1392 449 1393 1440 465 1441 1488 481 1489 1536 497 1537 1584 513 1585 1632 529 1633 1680 545 1681 1728 561 1729 1776 577 1777 1824 593 1825 1872 609 1873 1920 625 1921 1968 641 GGC TGG Gly Trp 'rrr TTA Phe Leu GTG ATA Val Ile GGG GCG Gly Ala CAG TTC Gin Phe ACG CAA Thr Gln CGT CTG Arg Leu TGT GCC Cys Ala CAA TTG Gin Leu CTG GCG Leu Ala CAA TGG Gin Trp TTG AGT Leu Ser AAC TTT Asn Phe CTG CGC Leu Arg GAC CAA Asp Gin GAT AAT Asp Asn CGT GTT Arg Val CTG TTA Leu Leu AGC ACA Ser Thr GCT AAT Ala Asn TTG GTT Leu Val GCA GGG Ala Gly GCG GAT Ala Asp CAA CAA Gin Gin GAC AAC Asp Asn ATC CGT Ile Arg GTG GCC Val Ala TTr AAC Phe Asn GAT T Asp Phe CAT ATC His Ile GCG GAT Ala Asp AAC TGT Asn Cys GCG CGC Ala Arg CGA TTA Arg Leu CCC ACA Pro Thr CTG AGT Leu Ser CAC GAT His Asp ATT TCT Ile Ser GTG TG Val Trp,
CCG
Pro
TCC
Ser
GTC
Val
AGC
Ser
AAT
Asn
AAT
Asn
ACA
Thr
GAT
Asp
ATC
Ile Leu
TTA
Leu
ACC
Thr
CAG
Gln
T
Phe
GTC
Val
TAT
Tyr
ACG
Thr
ATT
Ile
CTG
Leu
TTG
Leu
GCA
Ala
ACG
Thr
CTG
Leu
GAA
Glu
TCG
Ser
AAC
Asn
GCC
Ala
GGC
Gly
ACA
Thr
GCA
Ala
GCC
Ala
TTT
Phe
GG
Gly
GT
Gly
GTA
Val
CAA
Gln
TTT
Phe
CAG
Gln
CTA
Leu
ATT
Ile
ACC
Thr
TTG
Leu
CTG
Leu
CGT
Arg
GTG
Val
ATA
Ile
ACA
Thr
CCA
Pro
GCG
Ala
TCA
Ser
GC
Gly
GGC
Gly
ACA
Thr
TTT
Phe
ACC
Thr
GGC
Gly
CAA
Gln
GTG
Val
AAT
Asn
GC
Gly
CAA
Gln
CTA
Leu
GCA
Ala
ACT
Thr
AGT
Ser
CCG
Pro
GAT
Asp
ACC
Thr
CTC
Leu
CAG
Gln
TCA
Ser
CCA
Pro
ATC
Ile
AAA
Lys
AGT
Ser
CTG
Leu
GAC
Asp
ACG
Thr GCA ACG CCG Ala Thr Pro ACA CCG-TTT Thr Pro Phe GGO GGC OAT Gly Oly Asp AAT CAT CGT Asn His Arg 000 AAT GTC Gly Asn Val GCT TTC TAC Ala Phe Tyr GAG TCT TTC Giu Ser Phe GTC TGG CAG Val Trp Gin GAT TCC CCG Asp Ser Pro GCG ATT GCT Aia Ile Ala CTT TTG CTG, Leu Leu Leu GAT CPA TTG Asp Gin Leu TIT GTG GGT Phe Val Gly 448 464 480 496 512 528 544 560 576 592 608 624 640 656 -265- 1969 OCA ACA TTG TTG TCC CGC AGT GOG GCA CCA TTA GTC GAT ACC AAC GGC 2016 657 Ala Thr Leu Leu Ser Arg Ser Gly Ala Pro Leu Val Asp Thr Asn Gly 0 00 *0
0* a 00@@ *0eS 0 2017 2064 673 2065 2112 689 2113 2160 705 2161 2208 721 2209 2256 737 2257 2304 753 2305 2352 769 2353 2400 785 2401 2448 801 2449 2496 817 2497 2544 833 2545 2592 849 2593 2640 865
CAC
His
ATC
Ile
GCA
Ala
GCA
Ala
CAG
Gln
TCA
Ser
TGG
Trp
GAT
Asp
TCC
Ser
TTG
Leu
GAT
Asp
TTG
Leu
GCT
Ala
GAT
Asp
ACG
Thr
ATC
Ile
GGC
Gly
CTG
Leu
TTG
Leu
ATT
Ile
TTG
Leu
CTG
Leu
ATC
Ile
CTC
Leu TTT GCT CTG Phe Ala Leu CTG GTG ACT Leu Val Thr ACA CAA AGC Thr Gin Ser ACT AAT ACG Thr Asn Thr AGT CTG TTG Ser Leu Leu TTG TTG CGC Leu Leu Arg TGG GCA TTG Trp Ala Leu TAT CTG CGT Tyr Leu Arg CAA TTC ACG Gin Phe Thr GCC TAT TTT Ala Tyr Phe ATG CTT TAT Metjeu Tyr GAA GCT GGT Glu Ala Gly AAT OCT ACC Asn Ala Thr GCA GOT AAT Ala Gly Asn GGC ATA CAA Gly Ile Gin OAT OAA GAT Asp Giu Asp CAG GTA CAG Gin Val Gin ACT CTG AAC Thr Leu Asn GGA CAA ACA Gly Gin Thr 0CC GTT AAG Ala Vai Lys CGT OAA GTG Arg Giu Val CCT OCA ATG Pro Ala Met TCC GCA GAA Ser Ala Glu AGC TGT TAT Ser Cys Tyr OAA GAT OAT Glu Asp Asp AGT CCO Ser Pro AGT OTT Ser Val AAG AAG Lys Lys AAA ACT Lys Thr GTG AGT Val Ser ACC TAC Thr Tyr ACT 0CC Thr Ala OTA CGC Val Arg GTG CAA Val Gin ACA GTG Thr Val ACC OAT Ser Asp OTA CTO Val Leu
CTT
Leu
ATA
Ile
CTG
Leu
CAA
Gln
CAG
Gln
CAG
Gln
GCC
Ala
CGC
Arg
ACC
Thr
ACC
Thr
TTA
Leu
GCC
Ala 672 688 704 720 736 752 768 784 800 816 832 848 864 TAC TTA COC ACA OCT Tyr Leu Arg Thr Ala ACA CCO TTO AOC CAA TCT OAT OCT Thr Pro Leu Ser Gin Ser Asp Ala -266- 2641 GCA CAG ACG TTG GCA ACG CTA TTG GGT TGG GAG GTT AAC GAG TTG CAA 2688 881 Ala Gin Thr Leu Ala Thr Leu Leu Gly Trp Glu Val Asn Glu Leu Gin 896 2689 2736 897 2737 2784 913 2785 2832 929 2833 2880 945 2881 2934 961 GCC GCT TGG TCG Ala Ala Trp Ser GAT GCG CTT CTG Asp Ala Leu Leu GTT ACA CAG CAA Val Thr Gin Gin ACC CTT TGG CAA Thr Leu Trp Gin GTA TTG GGC GGG ATT GCC AAA ACC ACA CCG CAA CTG Val Leu Gly Gly Ile Ala Lys Thr Thr Pro Gin Leu CGT TTG CAA CAG GCA CAG AAC CAA ACT GGT CTT GGC Arg Leu Gin Gin Ala Gin Asn Gin Thr Gly Leu Gly CAG CAA GGC TAT CTC CTG AGT CGT GAC AGT GAT TAT Gin Gin Gly Tyr Leu Leu Ser Arg Asp Ser Asp Tyr AGC ACC GGT CAG GCG CTG GTG GCT GGC GTA TCC CAT Ser Thr Gly Gin Ala Leu Val Ala Gly Val Ser His GTC AAG GGC AGT AAC TGA GCATGGCAGA GCTCACTACC TGAGTGGATT TGATTT Val Lys Gly Ser Asn End 2935 TTCCGTATGG CCTAATGAGG CTATTTCTAA ACCGCCATTT AAGTAAGGCA GATAATTATG 2994 35 INFORMATION FOR SEQ ID NO:57 SEQUENCE CHARACTERISTICS: LENGTH: 965 amino acids TYPE: amino acid TOPOLOGY: linear ii) MOLECULE TYPE: protein xi) SEQUENCE DESCRIPTION: SEQ ID NO:57 (TccA peptide) Features From To Description 1 10 SEQ ID NO:8 1 17 50 33 49 65 81 97 113 129 145 161 177 His Asp Asp Ala Val Ala Ser Ile Met Asn Glu Ser Lys Ala Lys Leu Ser Leu Thr Asp Leu Glu -267- 193 Ala Val Asn Ala Arg Leu Ser Thr Thr' Arg Tyr Pro Asn Asn Leu Pro 209 225 241 257 273 289 305 321 337 353 369 385 401 417 433 449 35 465 481 40 497 513 529 545 561 577 593 609 625 641 657 673 689 705 721 737 Tyr Thr Asn Ala Gin Ty r Met Cys Ser His Al a Arg Giu Asn Phe Gly Phe Val Gly Gin Thr Arg Cys Gin Leu Gin Leu Asn Ala His Ile Ala Ala Gin His Tyr Thr Leu Phe Trp Leu Thr Lys Ile Gly Asp Thr Asp Ser Thr Gly Asp Ala Gly His Phe Ile Asn Asp Ile Thr Ala Lys His Trp Leu Leu Asp Ile Asp Ala Arg Phe Leu Gin Ser Leu Ala Ala Leu Leu Ala Ala Ala Trp Gin Ser 
Asp Phe-Iie Thr Leu Ala Ile Asp Lys Thr Val Ile Thr Giy Val Gly Gin Ala Arg Ile Ser Arg Val Thr Lys Ala Arg Asp Leu Tyr Arg Gin Asn Val Leu Thr Asn Val Giy Asp Gin Asn Arg Leu Asp Val Val Thr Ala His Asp Thr Leu Thr Ser Thr Gly Thr Pro Leu Thr Leu Ser Gin Val Val Gin Lys Leu Leu Leu Asp Lys Ile Gin Pro Gin Ser Trp Gly Asn Leu Val Gin Ile Al a Gin Glu Leu Ser Gly Al a Glu Thr Val Leu Met Al a Val Phe Asp His Ala Asn Al a Arg Pro Leu His Ile Val Arg Phe Leu Thr Thr Ser Gin Thr Lys Ile Thr Thr Leu Ser Thr Ala Val Arg Val Asn Lys Ala Asn Phe Ile Asp Cys Arg Leu Thr Ser Asp Ser Trp Ser Ala Vai Gin Asn Leu Ile Leu Gly Met Val Val1 Thr Thr Pro Ile Asn Leu Thr Asp Tyr Pro Ser Val Ser Asn Asn rhr Asp Ile Lieu Leu rhr Gly Leu rhr Ser Thr Lieu Gin Thr Pro Gin Lys Leu Ala Ser Gly Gin Asn Ser Val Pro Val Val Phe Ala Thr Leu Asn Leu Gin Lys Ser Ala Asn Thr Gly Val Phe Ala Val Gly Tyr Thr Thr Ala Ile Ala Leu Phe Leu Gly Ala Gly Thr Val Leu Gin Giu Phe Ser Gin Asn Leu Ala Pro Leu Ser Asp Ala Leu Ser Leu Asn Ala Gin Al a Thr Ser Gin Asp Phe Gin Lys Tyr Ser Thr Trp Met Leu Ser Ile Thr Leu Leu Arg Val1 Ile Thr Pro Ala Ser Giy
G
1 y Leu Alia Gly Asp rhr Gin Ser Leu Asp Asp Thr Phe Ser Phe Tyr Ser Asp Val Glu Ser Asp Giy Ara Arg Ser Asp Asp Leu Asn Asp Ala Arg Met Ala Lys Thr Pro Phe Asp Thr Thr Giy Leu Gin Gin Val Ser Asn Pro Gly Ile Gin Lys Leu Ser Ala Leu Thr Asp Ser Thr Val Asp Gly Asn Ile Gin Giu Asp Val Gin Leu Asn Val Lei Thz Pro Gin Met Leu Asn Arg Gly Lys Leu Glu Leu Gin Al a Thr Gly Asn Gly Ala Glu Val Asp Ala Leu Asp Phe rhr Ser Ser Lys Lys Val Leu Pro Ala Giu Leu *Thr Met *Val Phe Al a Leu Pro Thr Gly Phe Thr Pro Gly His Asn Phe Ser Trp Ser Ile Leu Gin Val Asn Pro Val Lys Thr C Ser C Giy Gin Se r Gin Asn Ile Leu Ser Ile Giu Asp Tyr Gly Val Ala Pro Phe Asp Arg Val1 Tyr Phe Gin Pro kl a Ieu Leu Ile ,eu in in 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608 624 640 656 672 688 704 720 736 752 -2 68- 753 769 785 801.
817 833 849 865 881 897 913 929 945 961 Gly Gin Ala-V- Arg Glu Pro Ala Ser Ala Ser Cys Glu Asp Leu Ser Glu Val Lys Thr Asn Gin Ser Arg Val Ala Ser Asn 965 INFORMATION FOR SEQ ID NO:58 SEQUENCE CHARACTERISTICS: LENGTH: 4932 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) 1 ATG 1 Met SEQUENCE DESCRIPTION: SEQ ID NO:58 (tccB) TTA TCG ACA ATG GAA AAA CAA CTG AAT GAA TCC CAG CGT GAT GCG Leu Ser Thr Met Giu Lys Gin Leu Asn Giu Ser Gin Arg Asp Ala 4 9 TTG GTG ACT GGC TAT ATG AAT TrT GTG GCG CCG ACG TTG AAA GGC GTC 96 17 Leu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Gly Val 32 97 AGT GGT CAG CCG GTG ACG GTG GAA GAT TTA TAC GAA TAT TTG CTG ATT 144 5G 33 Ser Gly Gin Pro Val Thr Val Glu Asp Leu Tyr Glu Tyr Leu Leu Ile 48 145 49 193 241 81 289 97 GTG GCT Val Ala ATA CAG Ile Gin CAG GCG Gin Ala CAA TAT Gin Tyr GAG ACG AGT Glu Thr Ser ACT CGT CTG Thr Arg Leu TCT ACA GCT Ser Thr Ala GCT GCG GGG Ala Ala Gly 337 TAC GCT GAA AAC TAT ATT TCA CCC ATC ACC CGG CAG GAA AAA AGC CAT 384 113 Tyr Ala Glu Asn Tyr Ile Ser Pro Ile Thr Arg Gin Giu Lys Ser His 128 -269- 385 TAT TTC TCG GAG CTG GAG ACG ACT TTA AAT CAG AAT CGA CTC GAT CCG 432 129 Tyr Phe Ser Glu Leu Glu Thr Thr Leu Asn Gin Asn Arg Leu Asp Pro 144 433 GAT CGT GTG CAG GAT GCT GTT TTG GCG TAT CTC AAT GAG TTT GAG GCA 480 145 Asp Arg Val Gin Asp Ala Val Leu Ala Tyr Leu Asn Glu Phe Glu Ala 160 481 GTG AGT AAT CTA TAT GTG CTC AGT GGT TAT ATT AAT CAG GAT AAA TTT 528 161 Val Ser Asn Leu Tyr Val Leu Ser Gly Tyr Ile Asn Gin Asp Lys Phe 176 529 GAC CAA GCT ATC TAC TAC TTT ATT GGT CGC ACT ACC ACT AAA CCG TAT 576 177 Asp Gin Ala Ile Tyr Tyr Phe Ile Gly Arg Thr Thr Thr Lys Pro Tyr 192 577 CGC TAC TAC TGG CGT CAG ATG GAT TTG AGT AAG AAC CGT CAA GAT CCG 624 193 Arg Tyr Tyr Trp Arg Gin Met Asp Leu Ser Lys Asn Arg Gin Asp Pro 208 625 GCA GGG AAT CCG GTG ACG CCA AAT TGC TGG AAT GAT TGG CAG GAA ATC 672 209 Ala Gly Asn Pro Val Thr Pro Asn Cys Trp Asn Asp Trp Gin Glu Ile 224 673 ACT TTG 
CCG CTG TCT GGT GAT ACG GTG CTG GAG CAT ACA GTT CGC CCG 720 225 Thr Leu Pro Leu Ser Gly Asp Thr Val Leu Glu His Thr Val Arg Pro 240 721 GTA TTT TAT AAT GAT CGA CTA TAT GTG GCT TGG GTT GAG CGT GAC CCG 768 241 Val Phe Tyr Asn Asp Arg Leu Tyr Val Ala Trp Val Glu Arg Asp Pro 256 769 GCA GTA CAG AAG GAT GCT GAC GGT AAA AAC ATC GGT AAA ACC CAT GCC 816 257 Ala Val Gin Lys Asp Ala Asp Gly Lys Asn Ile Gly Lys Thr His Ala 272 817 TAC AAC ATA AAG TTT GGT TAT AAA CGT TAT GAT GAT ACT TGG ACA GCG 864 273 Tyr Asn Ile Lys Phe Gly Tyr Lys Arg Tyr Asp Asp Thr Trp Thr Ala 288 865 CCG AAT ACG ACC ACG TTA ATG ACA CAA CAA GCA GGG GAA AGT TCA GAA 912 289 Pro Asn Thr Thr Thr Leu Met Thr Gln Gin Ala Gly Glu Ser Ser Glu 304 913 ACA CAG CGA TCC AGC CTG CTG ATT GAT GAA TCT AGC ACC ACA TTG CGC 960 **305 Thr Gin Arg Ser Ser Leu Leu Ile Asp Glu Ser Ser Thr Thr Leu Arg 320 961 CAA GTT AAT CTG TTG GCT ACC ACC GAT TTT AGT ATC GAT CCG ACG GAG 150 loo8 321 Gin Val Asn Leu Leu Ala Thr Thr Asp Phe Ser Ile Asp Pro Thr Glu 336 1009 GAA ACG GAC AGT AAC CCG TAT GGC CGC CTA ATG TTG GGG GTG TTT GTC os1056 337 Glu Thr Asp Ser Asn Pro Tyr Gly Arg Leu Met Leu Gly Val Phe Val 352 1057 CGT CAA TTT GAA GGT GAT GGG GCC AAT AGA AAA AAT AAA CCC GTT GTT 1104 353 Arg Gin Phe Glu Gly Asp Gly Ala Asn Arg Lys Asn Lys Pro Val Val 368 1105 TAT GGT TAT CTC TAT TGT GAC TCA GCT TTC AAT CGT CAT GTT CTC AGG 1152 369 Tyr Gly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg His Val Leu Arg 384 1153 CCG TTA AGT AAG AAC TTT TTG TTC AGT ACT TAC CGT GAT GAA ACG GAT 1200 385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Glu Thr Asp 400 -270- 1201 1248 401 1249 1296 417 1297 1344 433 1345 1392 449 1393 1440 465 1441 1488 481 1489 1536 35 .497 1537 1584 40 513 1585 1632 529 1633 1680 545 1681 1728 561 1729 1776 577 1777 1824 593 1825 1872
GGT
Gly
ACT
Thr
GTA
Val
TAT
Tyr
GGG
Gly
GAT
Asp
TAT
Tyr
ATC
Ile
ACG
Thr
CTG
Leu
TG
Trp
ACC
Thr
AAT
Asn
ACT
TAC GAT Tyr Asp OAT CCC Asp Pro GGC ACT Gly Thr CAT ATA His Ile OGA TAT Gly Tyr TGG TCA Trp Ser TAC ACC Tyr Thr GOT GGA Gly Gly GAG OGA Oiu Oly GTO AAG Val Lys AAT CTG Asn Leu ATA AAT Ile Asn CCA GAT Pro Asp OAT AAC AAA AAG Lys Lys GAA AAT Olu Asn ACT 000 Thr Gly CAA ACC Gin Thr AAC OAT Asn Asp OGA AAT Gly Asn TTT CAT Phe His TCT OTT Ser Vai TOO OCT Trp Ala GOC AOT Oly Ser TAT ATT Tyr Ile TTT OCT Phe Ala TOO CCA Trp Pro CGC -AAA TAT GTA ATT Tyr Vai Ile ACA OGA TGG Thr Oly Trp 0CC TAT OTO Ala Tyr Val ACA ACT AAT Thr Thr Asn CTT GTA TAT Leu Val Tyr OAA GOT TTT Oiu Gly Phe AAT OCA ATA Asn Aia Ile CCT AAT OGA Pro Asn Giy ATT OCT CCC Ile Ala Pro TAT ATC OCT Tyr Ile Ala CCA OAT GOT Pro Asp Giy ATT GOT CTT Ile Oly Leu ACA CTA ACC Thr Leu Thr TTC TAT CAG 416 432 448 464 480 496 512 528 544 560 576 592 608 624 609 Thr Ile Lys Asn Phe Ser Lys Ile Aia Asp Asn Arg Lys Phe Tyr Gin -27 1- 1873 GAA ATC AAT GCT GAG ACG GCG GAT GGA CGC AAC CTG TTT AAA CGT TAC 1920
5* S S S *5 *5 625 1921 1968 641 1969 2016 657 2017 2064 673 2065 2112 689 2113 2160 705 2161 2208 721 2209 2256 737 2257 2304 753 2305 2352 769 s0 2353 2400 785 2401 2448 801 2449 2496 817 2497 2544 833 Glu
AGT
Ser
TAT
Tyr
CTA
Leu
GGG
Gly
GTT
Val
TCC
Ser
ACC
Thr
ACT
Thr
ACC
Thr
AAA
Lys
GAA
Glu
GAA
Glu
CCG
Pro Ile
ACT
Thr
ACT
Thr
CAG
Gin
AAA
Lys
GCG
Ala
CGT
Arg
TCA
Ser
TTG
Leu
TTG
Leu
AGC
Ser
TTG
Leu
CAG
Gin
GCG
Ala Asn
CAA
Gin
TTG
Leu
GTT
Val1
AAA
Lys
TTG
Leu
TAC
Tyr
TCA
Ser
ATT
Ile
CAG
Gin
GAA
G lu
TTC
Phe
CAA
Gin
ATG
Met Ala
ACT
Thr
TCT
Ser
TGT
Cys
GGG
Gly
CAA
Gin
GAT
Asp
TTA
Leu
GAG
Glu
GCA
Ala
CCA
Pro
TTT
Phe
TTT
Phe
AAA
Lys Glu
TTC
Phe
GAG
Glu
TTG
Leu
GCT
Ala
GAT
Asp
AGT
Ser
CCC
Pro
AAG
Lys
GAT
Asp
ATG
Met
CAC
His
TCG
Ser
AAC
Asn Thr
GGA
Gly
GCG
Al a
AAT
Asn
TAT
Tyr
AGC
Ser
AAA
Lys
GCG
Al a
GCT
Al a
CCT
Pro
GAC
Asp
CTG
Leu
CCG
Pro
AAG
Lys CTT ACC Leu Thr GAT TTC Asp Phe GTC GTG Val Val TCT TGG Ser Trp AAA GCT Lys Ala CGT GGT Arg Gly AAA ACC Lys Thr AAT CTG Asn Leu TCT CTG Ser Leu TTT AAT Phe Asn CCG TTT Pro Phe GCA CAA Ala Gin CCA CAC Pro His ACT TAT Thr Tyr CCG GAC Pro Asp TAT GAC Tyr Asp TGG TTT Trp Phe ATT CCT Ile Pro TAT CTG Tyr Leu ACC ACC Thr Thr AGT TTG Ser Leu TTA GTG Leu Val GGT CTC Gly Leu ACA CGC Thr Arg CAT TAC His Tyr GCT TAT Ala Tyr
ACA
Thr
AAC
Asn
CCG
Pro
GTC
Val1
TTA
Leu
TTC
Phe
GTG
Val
GAT
Asp
GAC
Asp
TTC
Phe
GCC
Al a
TTT
Phe
AAT
Asn
ACT
Thr
TAC
Tyr
TCA
Ser
TAT
Tyr
GTT
Val1
TGG
Trp
CGT
Arg
TAC
Tyr
GGC
Gly
TGG
Trp
AAC
Asn
GAC
Asp
GTA
Val1 Ala Asp Gly Arg Asn Leu Phe Lys Arg Tyr -272- 2545 2592 849 2593 2640 865 2641 2688 881 CGT CCG TTG GTT GAA GGA AAC AGC GAT TTG TCA CGT CAT TTG GAC GAT Arg Pro Leu Val Giu Gly Asn Ser Asp Leu Ser Arg His Leu Asp Asp 864 TCT ATA GAC CCA GAT ACT CAA Ser Ile Asp Pro Asp Thr Gin AAA GCG GTG TTT ATT GCC TAT Lys Ala Vai Phe Ile Ala Tyr GCT TAT GCT CAT CCG GTG ATA TAC CAG Ala Tyr Ala His Pro Val Ile Tyr Gin GTC AGT AAC CTG Val Ser Asn Leu ATT GCT CAG GGA GAT Ile Ala Gin Gly Asp 4# 6 0* **b
2689 2736 897 2737 2784 913 2785 2832 929 2833 2880 945 2881 2928 961 2929 2976 977- 2977 3024 993 1008 3025 3072 1009 1024 3073 3120 1025 1040 3121 3168 1041 1056 3169 3216 ACC GCT TTA CCC Thr Ala Leu Pro GCA TTG CCG Ala Leu Pro GGC CGC AAT GTC AGC TAC TTG AAA CTG Gly Arg Asn Val Ser Tyr Leu Lys Leu GCA GAT A-AT GGC TAC TTT A-AT GAA CCG CTC A-AT GTT CTG ATG TTG TCT Ala Asp Asn Gly Tyr Phe A-sn Glu Pro Leu Asn Val Leu Met Leu Ser CAC TGG GAT ACG TTG GAT OCA CGG TTA TAC A-AT CTG CGT CAT AAC CTG His Trp Asp Thr Leu Asp Ala Arg Leu Tyr A-sn Leu A-rg His Asn Leu A-CC GTT GAT GGC AAG CCG CTT TCG CTG CCG CTG TAT GCT GCG CCT GTT Thr Val Asp Gly Lys Pro Leu Ser Leu Pro Leu Tyr Ala Ala Pro Val GAT CCG GTA GCG TTG TTG GCT CAG CGT GCT CAG TCC GGC ACG TTG ACG Asp Pro Val Ala Leu Leu Ala Gin A-rg Ala Gin Ser Gly Thr Leu Thr A-TG TGG TAT CGC CA-A TTG ACT CGT GAC GGT CTG ACT CAG GCC CGT GTC Met Trp Tyr Arg Gin Leu Thr Arg Asp Giy Leu Thr Gin Ala Arg Vai TAT TAC AAT CTG GCC GCT GA-A TIG CTA GGG CCT CGT CCG GAT GTA TCG Tyr Tyr Asn Leu Ala Ala Giu Leu Leu Gly Pro Arg Pro Asp Val Ser CTG AGT AGC A-TT TGG ACG CCG CAA ACC CTG GAT ACC TTA GCA GCC GGG Leu Ser Ser Ile Trp Thr Pro Gin Thr Leu Asp Thr Leu Ala Ala Gly CAA AAA GCG GTT TTA CGT GAT TTT GAG CA-C CAG TTG GCT AAT AGT GAT Gin Lys Ala Val Leu A-rg Asp Phe Giu His Gin Leu Ala Asn Ser Asp A-AT GGC Asn Gly GTC AGT GGC GCC ATG TTG A-CG GTG CCG CCA TAC CGT TTC AGC Val Ser Gly Ala Met Leu Thr Val Pro Pro Tyr Arg Phe Ser OCT ATG TTG CCG CGA GCT TA-C AGC GCC GTG GGT ACG TTG ACC AGT TTT -27 3- 1057 Ala Met Leu Pro Arg Ala Tyr Ser Ala Val Gly Thr Leu Thr Ser Phe 1072 3217 GGT CAG AAC CTG CTT AGT TTG TTG GAA CGT AGC GAA CGA GCC TGT CAA 3264 1073 Gly Gin Asn Leu Leu Ser Leu Leu Glu Arg Ser Glu Arg Ala Cys Gin 1088 3265 GAA GAG TTG GCG CAA CAG CAA CTG TTG GAT ATG TCC AGC TAT GCC ATC 3312 1089 Glu Glu Leu Ala Gin Gin Gin Leu Leu Asp Met Ser Ser Tyr Ala Ile 1104 3313 ACG TTG CAA CAA CAG GCG CTG GAT GGA TTG GCG GCA GAT CGT CTG GCG 3360 1105 Thr Leu 
Gin Gin Gin Ala Leu Asp Gly Leu Ala Ala Asp Arg Leu Ala 1120 3361 CTG CTA GCT AGT CAG GCT ACG GCA CAA CAG CGT CAT GAC CAT TAT TAC 3408 25 1121 Leu Leu Ala Ser Gin Ala Thr Ala Gin Gin Arg His Asp His Tyr Tyr 1136 3409 ACT CTG TAT CAG AAC AAC ATC TCC AGT GCG GAA CAA CTG GTG ATG GAC 30 3456 1137 Thr Leu Tyr Gin Asn Asn Ile Ser Ser Ala Glu Gin Leu Val Met Asp 1152 35 3457 ACC CAA ACG TCA GCA CAA TCC CTG ATT TCT TCT TCC ACT GGT GTA CAA 3504 1153 Thr Gin Thr Ser Ala .Gn Ser Leu Ile Ser Ser Ser Thr Gly Val Gin 1168 3505 ACT GCC AGT GGG GCA CTG AAA GTG ATC CCG AAT ATC TTT GGT TTG GCT 3552 1169 Thr Ala Ser Gly Ala Leu Lys Val Ile Pro Asn Ile Phe Gly Leu Ala 1184 3553 GAT GGC GGC TCG CGC TAT GAA GGA GTA ACG GAA GCG ATT GCC ATC GGG 3600 1185 Asp Gly Gly Ser Arg Tyr Glu Gly Val Thr Glu Ala Ile Ala Ile Gly 1200 3601 TTA ATG GCT GCC GGA CAA GCC ACC AGC GTG GTG GCC GAG CGT CTG GCA 3648 1201 Leu Met Ala Ala Gly Gin Ala Thr Ser Val Val Ala Glu Arg Leu Ala 1216 3649 ACC ACG GAG AAT TAC CGC CGC CGC CGT GAA GAG TGG CAA ATC CAA TAC 3696 1217 Thr Thr Glu Asn Tyr Arg Arg Arg Arg Glu Glu Trp Gin Ile Gin Tyr- 1232 3697 CAG CAG GCA CAG TCT GAG GTC GAC GCA TTA CAG AAA CAG TTG GAT GCG 3744 1233 Gin Gin Ala Gin Ser Glu Val Asp Ala Leu Gin Lys Gin Leu Asp Ala 1248 -274- 3745 3792 1249 1264 CTG GCA GTG CGC GAG AAA GCA GCT CAA ACT TCC CTG CAA CAG GCG AAG Leu Ala Val Arg Glu Lys Ala Ala Gin Thr Ser Leu Gin Gin Ala Lys 3793 3840 1265 1280 3841 3888 1281 1296 3889 3936 1297 1312 25 3937 3984 1313 1328 3985 4032 1329 1344 4033 4080 1345 40 1360 4081 4128 1361 1376 GCA CAG CAG GTA CAA ATT CGG ACC ATG Ala Gin Gin Val Gin Ile Arg Thr Met TTC ACC CAG GCG ACT CTG TAC CAG TGG CTG AGT GGT CAA TTA TCC GCG Phe Thr Gin Ala Thr Leu Tyr Gin Trp Leu Ser Gly Gin Leu Ser Ala TTG TAT TAT CAA GCG TAT GAT GCC GTG GTT GCT CTC TGC CTC TCC GCC Leu Tyr Tyr Gin Ala Tyr Asp Ala Val Val Ala Leu Cys Leu Ser Ala CAA GCT TGC TGG CAG TAT GAA TTG GGT GAT TAC GCT ACC ACT TTT ATC Gin Ala Cys Trp Gin Tyr Glu Leu Gly Asp Tyr Ala Thr 
Thr Phe Ile CAG ACC GGT ACC TGG AAC GAC CAT TAC CGT GGT TTG CAA GTG GGG GAG Gin Thr Gly Thr Trp Asn Asp His Tyr Arg Gly Leu Gin Val Gly Glu ACA CTG CAA CTC AAT TTG CAT CAG ATG GAA GCG GCC TAT TTA GTT CGT Thr Leu Gin Leu Asn Leu His Gin Met Glu Ala Ala Tyr Leu Val Arg CTG ACT TAC TTA ACT ACT CGT Leu Thr Tyr Leu Thr Thr Arg CAC GAA CGC His Glu Arg 4129=~ G-GG GAT 4176 1377 Leu Gly Asp 1392 4177 4224 1393 1408 4225 4272 1409 1424 4273 4320 1425 1440 TTT CCA TTA Phe Pro Leu TTG CGC CAG Leu Arg Gin CCG TAT CAA Pro Tyr Gin CGT CTT AAT GTG ATC CGT ACT GTG TCG CTC AAA AGC CTA Arg Leu Asn Val Ile Arg Thr Val Ser Leu Lys Ser Leu GAT GGT TTT GGT AAG TTA AAA ACC GAA GGC AAA GTC GAC Asp Gly Phe Gly Lys Leu Lys Thr GlU-ly-Gys Val Asp AGC GAA AAG CTG TTT GAC AAC GAC TAT CCG GGG CAC TAT Ser Glu Lys Leu Phe Asp Asn Asp Tyr Pro Gly His Tyr ATT AAA ACT GTG TCA GTG ACG TTG CCG ACG TTA GTC GGG Ile Lys Thr Val Ser Val Thr Leu Pro Thr Leu Val Gly AAC GTG AAG GCA ACG CTC ACT CAG ACC AGC AGC AGT ATA Asn Val Lys Ala Thr Leu Thr Gin Thr Ser Ser Ser Ile -275- 4321 TTG TTA GCA GCA GAT ATC AAT GGT GTT AAA CGT CTC AAT GAT CCG ACA 4368 1441 1456 4369 4416 1457 1472 4417 4464 1473 1488 4465 4512 1489 1504 4513 4560 1505 1520 4561 4608 1521 1536 4609 4656 40 1537 1552 GTC- ACC Val Thr AAT GAT Asn Asp TCA TTT Ser Phe CGT TCT Arg Ser ATG CAG Met Gin CAT TAT His Tyr
CTG
Leu
GGT
Gly
CG
Gly
GAT
Asp
GCA
Ala
GCC
Ala Leu Leu Ala Ala Asp Ile Asn Oly Val Lys Arg Leu Asn Asp Pro Thr 465 45 155 470 477 482 (2) 7 GGC GCC ACT TTC OCA AAC CAG GTC AAG AAA ACA CTC TCT TAA CATTAACT 3 Gly Ala Ser Phe Ala Asn Gin Val Lys Lys Thr Leu Ser End 9 TAACTAATCC CTCCCACTCT OTTCGCCAGA GTGGGAGAAG GTTTGTCATA TCTAAAAT( 0 ATCTTCAT CTTTCTCCAT TTCATTGGAA GGGAACCTGT AAAACAAATA AGOAATAT( 9 TATG INFORMATION FOR SEQ ID NO:59 SEQUENCE CHARACTERISTICS: (A).LENGTH: 1565 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59 (TccB peptide) Features From To Description 1 Met Leu Ser Thr Met Oiu Lys Gin Leu Asn Ciu Ser Gin Arg Asp Ala -276- IT 4708 1565 CA 4768 GA 4828 4932
17 32 33 48 49 64 81 96 97 112 113 128 129 144 145 160 161 176 177 192 35 193 208 209 224 225 240 241 256 257 272 273 288 289 304 305 320 321 336 337 352 353 368 369 384 Leu Ser Asp Ile Pro Asn Tyr Tyr Asp Val Asp Arg Ala Thr Val Ala Tyr Pro Thr Gin Glu Arg Tyr Val Gly Pro Ala Gly Asp Al a Phe Arg Ser Gin Tyr Gly Leu Phe Val Asn Asn Gin Val Thr Gin Gly Thr Gin Glu Ser Arg As n Glu Ser Val Asn Al a Tyr Asn Pro Tyr Gin Ile Thr Arg Asn Asp Phe Tyr Gly Pro Val Ile Gin Gin Asn Giu Gin Leu Ile Trp Pro Leu Asn Lys Lys Thr Ser Leu Ser Giu Leu Tyr Val Al a Gin Al a Tyr Tyr Leu Asp Tyr Tyr Arg Val Ser Asp Asp Phe Thr Ser Leu Asn Gly Tyr Met Thr Asp Gin Met Ala Ile Giu Al a Val1 Tyr Gin Thr Gly Arg Al a Gly Leu Leu Al a Pro Asp Cys Asn Val Glu Tyr Glu- Ile Ser Thr Val Leu Phe Met Pro Asp Leu Asp Tyr Met Leu Thr Tyr Gly Asp Phe Val Giu Asp Val Glu Met Thr Rro, Ser Trp Ala Pro Ile Thr Leu Leu Ala Ser Gly Ile Gly Asp Leu Asn Cys Thr Val Tyr Val Gly Lys Lys Arg Thr Gin Ile Asp Thr Asp Gly Arg Ala Asn Ser Ala Thr Giu Arg Vai Asn Al a Gin Asn As n Asn Thr Asn Asp His Val Gly Asp Gly Ser Ile Leu Asn Arg Lys Gly Val Leu Leu Ile Ala Gin Ala Gly Ser-Giu Trp Arg Asp Val Arg Asn Lys Ser His Leu Asp Pro Phe Glu Ala Asp Lys Phe Lys Pro Tyr Gin Asp Pro Gin Glu Ile Val Arg Pro Arg Asp Pro Thr His Ala Trp Thr Ala Ser Ser Giu Thr Leu Arg Pro Thr Glu Val Phe Val Pro Val Val Val Leu Arg -277- 385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Giu Thr Asp 400 401 416 417 432 433 448 449 464 465 480 481 496 497 512 25 513 528 529 544 545 560 561 35 576 577 592 40 593 608 609 624 625 640 641 656 657 672 673 688 689 704 705 720 721 736 737 752 753 768 Gly Thr Val Tyr Gly Asp Tyr Ile Thr Leu Trp, Thr Asn Thr Giu Ser Tyr Leu Gly Val Ser Thr Thr Gin Lys Ser Ile Asp Ser Leu Asn Trp, Leu Giu Val1 Lys Ile Ile Thr Thr Gin Lys Al a Arg Ser Leu Asn Val Lys Asp Phe Lys Asp Tyr Ala Asp Gly Leu Leu Lys Asn Gin Leu Val Lys Leu Tyr Ser Ile Ser Val Val1 Gin Ile Ser Tyr Tyr Leu Thr Giu Leu Glu Asn Aia Thr Ser 
Cys Gly Gin Asp Leu Giu Leu Gin Thr Gly Asp Asp Asp Gly Asn Arg Gly Tyr His Asp Pro Ser Glu Gin Leu His Thr Pro Asp Trp Ser Vai Phe Ser Giu Thr Phe Giy Giu Ala Leu Asn Ala Tyr Asp Ser Ser Lys Pro Ala Al a Thr Lys Thr Thr Phe Asn Tyr Ile Val Gly Asp Thr Ile Asp Thr Phe Val Trp Ala Gly Thr Val Giu Gin Leu Phe Thr Tyr Giy Asn Thr Tyr Lys Ser Ala Gly Ser Ser Trp Vai Pro Leu Arg Tyr Asp Gly His Gly Trp Tyr Gly Glu Val Asn Ile Pro Asp Arg Gly Thr Asp Ser Asp Vai Leu Pro Thr Ile Tyr Se r Thr Gly Gly Lys Leu Asn Asp As n Asn Ala Asp His Lys Ala Gin Asn Glu Thr Gin Asn Gly Phe Ser Trp Gly Tyr Phe Trp Arg Leu Thr Pro Tyr Trp Ile Tyr Thr Asn Gly Thr Asp Asn His Val Aia Ser Ile Al a Pro Lys Phe Tyr Asp Asp Phe Pro Leu Thr *Thr Gly Ala Tyr Thr Thr Leu Val Glu Gly Asn Ala Pro Asn Ile Ala Tyr Ile Pro Asp Ile Gly Thr Leu Phe Tyr Lys Arg Ser Thr Lys Asn Arg Pro Asn Val Arg Leu Asp Phe Phe Val Trp Vai Asn Tyr Phe Ile Gly Pro Aila Gly Leu rhr 3i n Tyr ['hr T'yr Ser ['yr lai Crp krg Asp Lys Lys Tyr Vai Ile Lys Ala Asn Leu Gly Leu Asp Ser Leu Leu Asp Tyr -278- S
*555 769 784 785 800 801 816 817 832 833 848 849 864 865 880 881 896 897 912 913 928 929 944 35 945 960 961 976 977 992 993 45 1008 1009 1024 1025 1040 1041 1056 1057 1072 1073 1088 1089 1104 1105 1120 1121 1136 Thr Lys Giu Giu Pro Arg Ser Lys Met Tyr Leu Gin Thr Ala His Thr Asp Asn Ala Gly Glu Thr Leu Leu Ser Leu Gin Al a Pro Ile Ala Trp Tyr Ser Lys Ala Asp Trp Val Pro Gly Met Gin Glu Leu Leu Gin Giu Phe Gin Met Leu Asp Vai Tyr Asn Se r AIla Leu Asn Asp Asp Val Val Leu Asn Leu Gin Ala Ala Pro Phe Phe Lys Val Pro Phe Arg Leu Ile Vai Pro Gly Thr Gly Ala Ser Pro Leu Ala Gin Ser Asp Met His Se r As n Giu Asp Ile Gin Al a Trp Leu Ala Tyr Leu Lys Leu Gly Arg Leu Gin Gin Gin Pro Ser Leu Asp Phe Asn Leu Pro Phe Pro Ala Gin Lys Pro His Gly Asn Ser Thr Gin Ala Aia Tyr Vai Leu Thr Arg Ala Giu Leu Thr Pro Gin Arg Asp Phe Leu Pro Gly Phe Asn Glu Asp Ala Arg Pro Leu Ser Leu Ala Gin Ala Met Leu Ala Tyr Ser Ser Leu Leu Gin Gin Leu Ala Leu Asp Ala Thr Ala Glu Ala Gly Ser Leu Val Lys Ser Asn Ala Asp Leu Tyr Ala Ser Asn Asp Gly Leu Gly Thr Leu Giu His Arg Asn Pro Leu Leu Tyr Leu Pro Arg Ala Thr Val Ala Vai Glu Arg Leu Asp Gly Leu Tyr Tyr His Val Ala Gin Pro Leu Ala Tyr Leu Arg Ala Gly Tyr Leu Arg Ser Asp Ile Trp, Leu Ile Gin Al a Asp Al a Asn Leu Met His Ala Thr Arg Thr Ala Tyr Arg Phe Asn Asp Tyr Gly Arg Val Al a Ser Lys Leu Asn Pro Leu Phe Ser Cys Alia Leu Asp Val Asp Gin Asp Val Ser Gly Asp Leu Ser Leu Vai Thr Ser- Phe Gin Ile Ala Asp Leu Val Thr Asp Gly Asn Gly LeU Tyr Phe Trp, Ala Thr Arg Phe Ala Asn Gin Gin Arg His Asp His Tyr Tyr -27 9- 1137 Thr Leu Tyr Gin Asn Asn Ile Ser Ser Ala Glu Gin Leu Val Met Asp 1152 1153 Thr Gin Thr Ser Ala Gin Ser Leu Ile Ser 5cr Ser Thr Gly Val Gin 1168 1169 Thr Ala Ser Gly Ala Leu Lys Val Ile Pro Asn Ile Phe Gly Leu Ala 1184 1185 Asp Gly Gly Ser Arg Tyr Giu Gly Val Thr Giu Ala Ile Ala Ile Gly 1200 1201 Leu Met Ala Ala Gly Gin Ala Thr Sd- Val Val Ala Giu Arg Leu Ala 1216 1217 Thr Thr Giu Asn Tyr Arg Arg Arg Arg Giu Giu Trp Gin Ile Gin Tyr 1232 
1233 Gln Gln Ala Gln Ser Glu Val Asp Ala Leu Gln Lys Gln Leu Asp Ala 1248
1249 Leu Ala Val Arg Glu Lys Ala Ala Gln Thr Ser Leu Gln Gln Ala Lys 1264
1265 Ala Gln Gln Val Gln Ile Arg Thr Met Leu Thr Tyr Leu Thr Thr Arg 1280
1281 Phe Thr Gln Ala Thr Leu Tyr Gln Trp Leu Ser Gly Gln Leu Ser Ala 1296
1297 Leu Tyr Tyr Gln Ala Tyr Asp Ala Val Val Ala Leu Cys Leu Ser Ala 1312
1313 Gln Ala Cys Trp Gln Tyr Glu Leu Gly Asp Tyr Ala Thr Thr Phe Ile 1328
1329 Gln Thr Gly Thr Trp Asn Asp His Tyr Arg Gly Leu Gln Val Gly Glu 1344
1345 Thr Leu Gln Leu Asn Leu His Gln Met Glu Ala Ala Tyr Leu Val Arg 1360
1361 His Glu Arg Arg Leu Asn Val Ile Arg Thr Val Ser Leu Lys Ser Leu 1376
1377 Leu Gly Asp Asp Gly Phe Gly Lys Leu Lys Thr Glu Gly Lys Val Asp 1392
1393 Phe Pro Leu Ser Glu Lys Leu Phe Asp Asn Asp Tyr Pro Gly His Tyr 1408
1409 Leu Arg Gln Ile Lys Thr Val Ser Val Thr Leu Pro Thr Leu Val Gly 1424
1425 Pro Tyr Gln Asn Val Lys Ala Thr Leu Thr Gln Thr Ser Ser Ser Ile 1440
1441 Leu Leu Ala Ala Asp Ile Asn Gly Val Lys Arg Leu Asn Asp Pro Thr 1456
1457 Gly Lys Glu Gly Asp Ala Thr His Ile Val Thr Asn Leu Arg Ala Ser 1472
1473 Gln Gln Val Ala Leu Ser Ser Gly Ile Asn Asp Ala Gly Ser Phe Glu 1488
1489 Leu Arg Leu Glu Asp Glu Arg Tyr Leu Ser Phe Glu Gly Thr Gly Ala 1504
1505 Val Ser Lys Trp Thr Leu Asn Phe Pro Arg Ser Val Asp Glu His Ile 1520
1521 Asp Asp Lys Thr Leu Lys Ala Asp Glu Met Gln Ala Ala Leu Leu Ala 1536
1537 Asn Met Asp Asp Val Leu Val Gln Val His Tyr Thr Ala Cys Asp Gly 1552
1553 Gly Ala Ser Phe Ala Asn Gln Val Lys Lys Thr Leu Ser 1565
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 3132 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60 (tccC)
1 ATG AGT CCG TCT GAG ACT ACT CTT TAT ACT CAA ACC CCA ACA GTC AGC 48
1 Met Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gln Thr Pro Thr Val Ser 16
49 GTG TTA GAT AAT CGC GGT CTG TCC ATT CGT GAT ATT GGT TTT CAC CGT 96
17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg 32
97 ATT GTA ATC GGG GGG GAT ACT GAC ACC CGC GTC ACC CGT CAC CAG TAT 144
33 Ile Val Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gln Tyr 48
145 GAT GCC CGT GGA CAC CTG AAC TAC AGT ATT GAC CCA CGC TTG TAT GAT 192
49 Asp Ala Arg Gly His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp 64
193 GCA AAG CAG GCT GAT AAC TCA GTA AAG CCT AAT TTT GTC TGG CAG CAT 240
65 Ala Lys Gln Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gln His 80
241 GAT CTG GCC GGT CAT GCC CTG CGG ACA GAG AGT GTC GAT GCT GGT CGT 288
81 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg 96
289 ACT GTT GCA TTG AAT GAT ATT GAA GGT CGT TCG GTA ATG ACA ATG AAT 336
97 Thr Val Ala Leu Asn Asp Ile Glu Gly Arg Ser Val Met Thr Met Asn 112
337 GCG ACC GGT GTT CGT CAG ACC CGT CGC TAT GAA GGC AAC ACC TTG CCC 384
113 Ala Thr Gly Val Arg Gln Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro 128
385 GGT CGC TTG TTA TCT GTG AGC GAG CAA GTT TTC AAC CAA GAG AGT GCT 432
129 Gly Arg Leu Leu Ser Val Ser Glu Gln Val Phe Asn Gln Glu Ser Ala 144
433 AAA GTG ACA GAG CGC TTT ATC TGG GCT GGG AAT ACA ACC TCG GAG AAA 480
145 Lys Val Thr Glu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Glu Lys 160
481 GAG TAT AAC CTC TCC GGT CTG TGT ATA CGC CAC TAC GAC ACA GCG GGA 528
161 Glu Tyr Asn Leu Ser Gly Leu Cys Ile Arg His Tyr Asp Thr Ala Gly 176
529 GTG ACC CGG TTG ATG AGT CAG TCA CTG GCG GGC GCC ATG CTA TCC CAA 576
177 Val Thr Arg Leu Met Ser Gln Ser Leu Ala Gly Ala Met Leu Ser Gln 192
577 TCT CAC CAA TTG CTG GCG GAA GGG CAG GAG GCT AAC TGG AGC GGT GAC 624
193 Ser His Gln Leu Leu Ala Glu Gly Gln Glu Ala Asn Trp Ser Gly Asp 208
625 GAC GAA ACT GTC TGG CAG GGA ATG CTG GCA AGT GAG GTC TAT ACG ACA 672
209 Asp Glu Thr Val Trp Gln Gly Met Leu Ala Ser Glu Val Tyr Thr Thr 224
673 CAA AGT ACC ACT AAT GCC ATC GGG GCT TTA CTG ACC CAA ACC GAT GCG 720
225 Gln Ser Thr Thr Asn Ala Ile Gly Ala Leu Leu Thr Gln Thr Asp Ala 240
721 AAA GGC AAT ATT CAG CGT CTG GCT TAT GAC ATT GCC GGT CAG TTA AAA 768
241 Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Ile Ala Gly Gln Leu Lys 256
769 GGG AGT TGG TTG ACG GTG AAA GGC CAG AGT GAA CAG GTG ATT GTT AAG 816
257 Gly Ser Trp Leu Thr Val Lys Gly Gln Ser Glu Gln Val Ile Val Lys 272
817 TCC CTG AGC TGG TCA GCC GCA GGT CAT AAA TTG CGT GAA GAG CAC GGT 864
273 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly 288
865 AAC GGC GTG GTT ACG GAG TAC AGT TAT GAG CCG GAA ACT CAA CGT CTG 912
289 Asn Gly Val Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu 304
913 ATA GGT ATC ACC ACC CGG CGT GCC GAA GGG AGT CAA TCA GGA GCC AGA 960
305 Ile Gly Ile Thr Thr Arg Arg Ala Glu Gly Ser Gln Ser Gly Ala Arg 320
961 GTA TTG CAG GAT CTA CGC TAT AAG TAT GAT CCG GTG GGG AAT GTT ATC 1008
321 Val Leu Gln Asp Leu Arg Tyr Lys Tyr Asp Pro Val Gly Asn Val Ile 336
1009 AGT ATC CAT AAT GAT GCC GAA GCT ACC CGC TTT TGG CGT AAT CAG AAA 1056
337 Ser Ile His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys 352
1057 GTG GAG CCG GAG AAT CGC TAT GTT TAT GAT TCT CTG TAT CAG CTT ATG 1104
353 Val Glu Pro Glu Asn Arg Tyr Val Tyr Asp Ser Leu Tyr Gln Leu Met 368
1105 AGT GCG ACA GGG CGT GAA ATG GCT AAT ATC GGT CAG CAA AGC AAC CAA 1152
369 Ser Ala Thr Gly Arg Glu Met Ala Asn Ile Gly Gln Gln Ser Asn Gln 384
1153 CTT CCC TCA CCC GTT ATA CCT GTT CCT ACT GAC GAC AGC ACT TAT ACC 1200
385 Leu Pro Ser Pro Val Ile Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr 400
1201 AAT TAC CTT CGT ACC TAT ACT TAT GAC CGT GGC GGT AAT TTG GTT CAA 1248
401 Asn Tyr Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gln 416
1249 ATC CGA CAC AGT TCA CCC GCG ACT CAA AAT AGT TAC ACC ACA GAT ATC 1296
417 Ile Arg His Ser Ser Pro Ala Thr Gln Asn Ser Tyr Thr Thr Asp Ile 432
1297 ACC GTT TCA AGC CGC AGT AAC CGG GCG GTA TTG AGT ACA TTA ACG ACA 1344
433 Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr 448
1345 GAT CCA ACC CGA GTG GAT GCG CTA TTT GAT TCC GGC GGT CAT CAG AAG 1392
449 Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gln Lys 464
1393 ATG TTA ATA CCG GGG CAA AAT CTG GAT TGG AAT ATT CGG GGT GAA TTG 1440
465 Met Leu Ile Pro Gly Gln Asn Leu Asp Trp Asn Ile Arg Gly Glu Leu 480
1441 CAA CGA GTC ACA CCG GTG AGC CGT GAA AAT AGC AGT GAC AGT GAA TGG 1488
481 Gln Arg Val Thr Pro Val Ser Arg Glu Asn Ser Ser Asp Ser Glu Trp 496
1489 TAT CGC TAT AGC AGT GAT GGC ATG CGG CTG CTA AAA GTG AGT GAA CAG 1536
497 Tyr Arg Tyr Ser Ser Asp Gly Met Arg Leu Leu Lys Val Ser Glu Gln 512
1537 CAG ACG GGC AAC AGT ACT CAA GTA CAA CGG GTG ACT TAT CTG CCG GGA 1584
513 Gln Thr Gly Asn Ser Thr Gln Val Gln Arg Val Thr Tyr Leu Pro Gly 528
1585 TTA GAG CTA CGG ACA ACT GGG GTT GCA GAT AAA ACA ACC GAA GAT TTG 1632
529 Leu Glu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Glu Asp Leu 544
1633 CAG GTG ATT ACG GTA GGT GAA GCG GGT CGC GCA CAG GTA AGG GTA TTG 1680
545 Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu 560
1681 CAC TGG GAA AGT GGT AAG CCG ACA GAT ATT GAC AAC AAT CAG GTC CGC 1728
561 His Trp Glu Ser Gly Lys Pro Thr Asp Ile Asp Asn Asn Gln Val Arg 576
1729 1776 577 1777 1824 593 1825 1872 609 1873 1920 625 1921 1968 641 1969 30 2016 657 2017 35 2064 673 2065 40 2112 689 2113 45 2160 705 2161 2208 721 2209 2256 737 2257 2304 753 2305 2352 769 2353 2400
TAC
Tyr
GAA
Glu
GCG
Ala
CGT
Arg
TAC
Tyr
GCG
Ala
CCC
Pro
AAT
Asn
GAG
Glu
ATT
Ile
ACG
Thr
GCG
Ala
AAA
Lys
GTA
CTG
Leu
AGT
Ser
AGA
Arg
GAG
Glu
CCT
Pro
GGG
Gly
GAC
Asp
TGG
Trp
TCA
Ser
GCG
Ala
ATC
Ile
TTG
Leu
CTT
Leu
GGC
GGC TCC Gly Ser GAA GAG Glu Glu CAG ACA Gin Thr GAT GCC Asp Ala GTG GGT Vai Gly AAT TTG Asn Leu GAC GGA Asp Gly GCT TCA Ala Ser AGA CGG Arg Arg GGC GGT Gly Gly GTC ATT Val Ile GGA TAT Gly Tyr CGA CTC Arg Leu GCT GCC AGC CAC Ser Gin TAT TAT Tyr Tyr GAA GCC Glu Ala ACT GGA Thr Gly CGA TGG Arg Trp TAC CGA Tyr Arg TTA CCA Leu Ala TTT TTG Phe Leu GGA CAA Gly Gin CTT GCG Leu Aia CTG GGG Leu Gly AAC GTC Asn Val GTA CAC Val Gin GGA GCG CTT GAA CTG Leu Giu Leu CCC TAT GGC Pro Tyr Gly AGC TAC AAA Ser Tyr Lys TTC TAT TAT Leu Tyr Tyr TTG ACT GCT Leu Ser Ala ATG GTG AGG Met Val Arg CCC TCT CCA Pro Ser Pro TTT CGT AAA Phe Arg Lys AAA ATT GGC Lys Ile Gly GCT ACC ATT Ala Thr Ile GTT GCG GCC Vai Ala Ala GGT AGC CTG Cly Ser Leu GGG AAA TCG Gly Lys Ser AGT TCA GCC
CAT
Asp
GGT
Gly
TTT
Phe
TAC
Tyr
CAT
Asp
AAT
Asn
AAT
Asn
CCT
Pro
AGA
Arg
GCC
Ala
GTA
Val
CTG
Leu
ACC
Thr
GCG
AGC
Ser
ACG
Thr
ATT
Ile
GGC
Gly
CCC
Pro
AAC
Asn
AGA
Arg
GAT
Asp
GCC
Ala
GCT
Ala
GGC
Gly
GAA
Glu
TTA
Leu
GCT
592 608 624 640 656 672 688 704 720 736 752 768 784 800 785 Val Gin Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala -28 2401 TAT GGC GCA COG GCA CAA GOT GTC GOT GTT GCA TCA GCC GCC GGG GCG 2448 801 Tyr Gly Ala Arg Ala Gin Gly Val Gly Val Ala Ser Ala Ala Gly Ala 2449 2496 817 2497 2544 833 2545 2592 849 2593 2640 865 2641 2688 30 881 2689 2736 35 897 2737 2784 40 913 2785- 2832 929 2833 2880 945 2881 2928 961 2929 2976 977 2977 3024 993 1008 3025 3072 GTA ACA Val Thr GOC GOC Gly Gly TTA 000 Leu Gly GO OCG Gly Ala GOT ATC Gly Ile OGT TTA Gly Leu OTG GOT Val Gly ATG GOT Met Gly TAT GCC Tyr Ala CCT GTC Pro Val ATT OGA Ile Gly AGA OCO Arg Ala GCT GTO Ala Val ATT 000 Ile Oly GCC TCT Ala Ser GGT 000 Oly Oly 0CC OGT Ala Oly GTC OCT Val Ala GCC GCT Ala Ala OGA TTT Gly Phe GOT TTA Gly Leu GAG CCO Glu Pro GOC CTG Gly Leu AOT OCT Ser Ala
OGA
Gly 0CC Al a
ACC
Thr
ATO
Met
ATT
Ile
AOT
Ser
GT
Gly.
TTG
Leu 0CC Ala
ATA
Ile
CAC
His 0CC Ala
TCA
Ser
GG
Gly
CTT
Leu
ATC
Ile
GOC
Gly
AAC
Asn
TTG
Leu
AGT
Ser
AGA
Arg
TTT
Phe
AGA
Arg
GT
Gly TOO ATA Trp Ile AGT GCO Ser Ala ACC CAT Thr His ACC GOT Thr Oly ACC TAT Thr Tyr CCC 0CC Pro Ala GOT OCT Gly Ala AGO CTC Arg Leu CAA TTA Gin Leu AGT OTT Ser Val OTG ATO Val Met AOT GOT Ser Gly AAT AAT OCT OAT Asn Asn Ala Asp OTA OOC ACC ATT Val Oly Thr Ile OAA GTC GOG OCA Giu Val Gly Ala ACO CAA 000 AOT Thr Gin Oly Ser TAT GOC TCC TOO Tyr Oly Ser Trp OGA CAT TTA OCO Oly His Leu Ala OAA ATO OCT GTC Olu Met Ala-Val TTA GGC COO OTT Leu Gly Arg Val OTA CAT TTC AGT Val His Phe Ser CTC GGC 000 CTT Leu Gly Gly Leu OGA AGA GAG AOT Oly Arg Olu Ser ATA OAT CAT 0TC Ile Asp His Val COO 000 Arg Gly OAT ACT Asp Thr OCO OCG Ala Ala ACT CG Thr Arg ATT GOT Ile Gly AAT TAC Asn Tyr AAC AGA Asn Arg OTC AOC Val Ser OTC 0CC Val Ala OTC GOT Val Oly TOO ATT Trp Ile GCT GOC Ala Gly
ATT
Ile
ATG
Met
OOT
Gly
OCA
Ala
TTT
Phe
OCA
Ala
ATA
Ile
CCA
Pro
AGA
Axrg
GGT
Gly
TCC
Ser
ATG
Met 816 832 848 864 880 896 912 928 944 960 976 992 ATT GOT AAT CAO ATC AGA OGC AGO GTC .TTO ACC ACA ACC 000 ATC OCT -285- 1009 Ile Giy Asn Gin Ile Arg Giy Arg Val1 Leu Thr Thr Thr Giy Ile Ala 1024 3073 AAT GCG ATA GAC TAT GGC ACC AGT GCT GTG GGA GCC GrCA CGA CGA GTfl 3120 1025 Asn Ala Ile Asp Tyr Gly Thr Ser Ala Vai Giy Ala Ala Arg Arg Val 1040 3121 TTT TCT TTG TAA 3132 1041 Phe Ser Leu End 1043 INFORMATION FOR SEQ ID NO:6i SEQUENCE CHARACTERISTICS: LENGTH: 1043 amino acids TYPE: amino acid TOPOLOGY: linear MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6i (TccC peptide) 1 Met Ser Pro Ser Giu Thr Thr Leu Tyr Thr Gin Thr Pro Thr Vai Ser 16 *17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg 32 33 Ile Vai Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gin Tyr 48 49 Asp Ala Arg Giy His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp 64 Ala Lys Gin Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp, Gin His 35 81 Asp Leu Ala Giy His Ala Leu Arg Thr Glu Ser Val Asp Ala Giy Arg 96 *97 Thr Val Ala Leu As n Asp Ile Giu Gly Arg Ser Val Met Thr Met Asn 112 113 Ala Thr Gly Val Arg Gin Thr Arg Arg Tyr Giu Gly Asn Thr Leu Pro 128 129 Gly Arg Leu Leu Ser Val Ser Giu Gin Val Phe Asn Gin Giu Ser Ala 144 145 Lys Val Thr Gu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Gu Lys 160 *161 Giu Tyr Asn Leu Ser Gly Leu Cys Ile Arg His Tyr Asp Thr Ala Gly 176 177 Val Thr Arg Leu Met Ser Gin Ser Leu Ala Gly Ala Met Leu Ser Gin 192 193 Ser His Gin Leux Leu Ala Giu Gly Gin Giu Ala Asn Trp Ser Gly Asp 208 209 Asp Glu Thr Val Trp Gin Gly Met Leu Ala Ser Glu Val Tyr Thr Thr 224 225 Gin Ser Thr Thr Asn Ala Ilie Gly Ala Leu Leu Thr Gin Thr Asp Ala 240 241 Lys Gly Asn Ile Gin Arg Leu Ala Tyr Asp Ile Ala Giy Gin Leu Lys 256 257 Gly Ser Trp Leu Thr Val Lys Gly Gin Ser Giu Gin Val Ile Val Lys 272 273 Ser Leu Ser. 
Trp Ser Ala Ala Gly His Lys Leu Arg Giu Giu His Gly 288 289 Asn Gly Vai Vai Thr Giu Tyr Ser Tyr Giu Pro Giu Thr Gin Arg Leu 304 305 Ile Gly Ile Thr Thr Arg Arg Ala Gu Gly Ser Gin Ser Gly Ala Arg 320 321 Val Leu Gin Asp Leu Arg Tyr Lys Tyr Asp Pro Val Gly Asn Val Ile 336 337 Ser Ile His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gin Lys 352 -286- 353 369 385 401 417 433 449 465 481 497 513 529 25 545 561 577 593 609 35 625 641 657 40 673 689 45 705 721 737 753 769 785 801 817 833 849 65 865 881 897 913 Val Ser Leu Asn Ile Thr Asp Met Gin Tyr Gin Leu Gin His Tyr Giu Ala Arg Tyr Ala Pro Asn Giu Ile Ala Lys Val1 Tyr Val Gly Leu Giy Giy Giy Val Giu Pro Ala Thr Pro Ser Tyr Leu Arg His Val Ser Pro Thr Leu Ile Ary Val Arg Tyr Thr Gly Giu Leu Val Ile Trp Giu Ser Tyr Gly Gin Ile Trp, Tyr Ser Arg Tyr Gly Thr Ile Thr Arg Asn Gly met Ala Gly A-1-&Giy Gly Ile Gly Gly Gin Ser Gly Ala Thr Gly Gly Ala Giy Thr2 Ala Ala Ile His Leu Asp Gly Tyr Glu Gly Pro Arg Ser Ser Arg Pro Thr Ser Asn Arg Thr Ser Asp Ile Ala Gly Tyr Val Leu Thr Ser
G
1 y Gly kla ka krg kla Ile kla fly kla l a *Asr Arg *Val Thr Set Arg Val Giy Pro Ser Ser Thr Val1 Gly Asn Leu Ala Lys Gin Asp Thr Phe Ala Ile Ala Al a Leu Al a Ala Val1 Gly Ser Gly Gly Ala Al a LArC Gli Sle Tyr *Prc Set Asp Gin Val Asp Thr Thr Gly Lys Leu Ser Arg Glu Pro Gly Asp Trp, Ser Al a Ile Leu Leu Gly Gin Gly Ala Thr Met Ile Ser Gly
~TY
Met *Prc Thi Alz Asr Ala *Asn Ser Gly Gin Gly Glu Pro Leu Gin Asn Arg Trp Leu His Phe Met Ile Pro Met Al a Al a Gly Ser Gly Leu Ile Gly Asn Leu Val Ala Val Tyr Thr Arg Leu Leu *Arg *Met *Val *Val Aila Thr Gly Glu Gin Asp Val Asn Asp Ala Arg Giy Vai Giy Arg Ala2 Val Trp Ser2 Thr I Thr C Thr 'I Pro Gly -287- TyT Asn Prc Asp Gin Al a Phe Asp Giu Arg Gin Ala Gly Asp Ser Glu Thr Al a Gly Leu Gly Ser Arg Gly Ile T'yr eu kla Ile fly ~yr l a Asj Ile Thr IArg Asn Val Asp Trp Asn Leu Arg Asp Arg Ile Ser Tyr Giu Thr Arg Tyr Leu Phe Gly Leu Leu Asn Vai Gly Val Asn Val Giu Thr Tyr Gly Giu Sei Gl) Asj Gi) Set Leu Set Asn Ser Leu Val Lys Ala Asp Gin Tyr Ala Gly Trp, Arg Ala Leu Gin Ala Gly Vai Gin Aia Ala Asn Gly Val Gin Gly His Met 7Leu Tyr Gin rGin Gln Ser Asp Ser Thr Gly Asn Leu Tyr Thr Thr Ser Thr Leu *Giy Gly His Sle Arg Gly *Ser Asp Ser Lys Val Ser Thr Tyr Leu Thr Thr Giu Gin Val Arg Asn Asn Gin Leu Giu Leu Pro Tyr Gly Ser Tyr Lys Leu Tyr Tyr Leu Ser Ala Met Val Arg Pro Ser Pro Phe Arg Lys Lys Ile Gly2 Ala Thr Ile2 Val Aia Ala A Gl9iarefu I Gly Lys Ser 'I Ser Ser Ala Ser Ala Ala G Ala Asp Arg G Thr Ile Asp TI Giy Ala Ala A Gly Ser Thr A Ser Trp Ile G Leu Aia Asn Ala Val Asn A Lei AsI Tyi Val Asp Thr Gln Gi u Glu Glu Pro Asp Val1 Val Asp Gly Phe T'yr Asp Asn k.sn Pro krg kla lal .eu ~hr Lla ;iy ;iy 'hr ~la *rg ny 'yr .rg u Met i Gin *Thr Gin Ile *Thr Lys Leu *Trp *Gin Gly Leu Leu Arg Ser Thr Ile Gly Pro Asn Arg Asp Ala Ala Gly Glu Leu Ala Ala Ile Met Gly Al a Phe Aila Ile 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608 624 640 656 672 688 704 720 736 752 768 784 800 816 832 848 864 880 896 912 928 929 Met Gly Gly Gly Phe Leu Ser Arg Leu Leu Gly Arg Val Val Ser Pro 944 945 Tyr Ala Ala Gly Leu Ala Arg Gln Leu Val His Phe Ser Val Ala Arg 960 961 Pro Val Phe Glu Pro Ile Phe Ser Val Leu Gly Gly Leu Val Gly Gly 976 977 Ile Gly Thr Gly Leu His Arg.Val Met Gly Arg Glu Ser Trp Ile Ser 992 993 Arg Ala Leu Ser Ala Ala Gly Ser Gly Ile Asp His Val Ala Gly Met 1008 1009 
Ile Gly Asn Gln Ile Arg Gly Arg Val Leu Thr Thr Thr Gly Ile Ala 1024
1025 Asn Ala Ile Asp Tyr Gly Thr Ser Ala Val Gly Ala Ala Arg Arg Val 1040
1041 Phe Ser Leu 1043

INFORMATION FOR SEQ ID NO:62: TcaAiv
SEQUENCE CHARACTERISTICS:
LENGTH: 5 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62: TcaAiv
Asn Ile Gly Gly Asp
1

INFORMATION FOR SEQ ID NO:63: TcaAii-syn
SEQUENCE CHARACTERISTICS:
LENGTH: 20 amino acids
(B) TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63: TcaAii-syn
Cys Leu Arg Gly Asn Ser Pro Thr Asn Pro Asp Lys Asp Gly Ile
1 5 10
Phe Ala Gln Val Ala

INFORMATION FOR SEQ ID NO:64: TcaAiii-syn
SEQUENCE CHARACTERISTICS:
LENGTH: 20 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64: TcaAiii-syn
Cys Tyr Thr Pro Asp Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe
1 5 10
Arg Ser Ala Asp Gly

INFORMATION FOR SEQ ID NO:65: TcaBi-syn
SEQUENCE CHARACTERISTICS:
LENGTH: 19 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65: TcaBi-syn
His Gly Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu
1 5 10
Ser Ile Asn Thr
19

INFORMATION FOR SEQ ID NO:66: TcaBii-syn
SEQUENCE CHARACTERISTICS:
LENGTH: 20 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66: TcaBii-syn
Cys Val Asp Pro Lys Thr Leu Gln Arg Gln Gln Ala Gly Gly Asp
1 5 10 15
Gly Thr Gly Ser Ser

INFORMATION FOR SEQ ID NO:67: TcaC-syn
SEQUENCE CHARACTERISTICS:
LENGTH: 20 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67: TcaC-syn
Cys Tyr Lys Ala Pro Gln Arg Gln Glu Asp Gly Asp Ser Asn Ala
1 5 10
Val Thr Tyr Asp Lys

INFORMATION FOR SEQ ID NO:68: TcbAii-syn
SEQUENCE CHARACTERISTICS:
LENGTH: 20 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68: TcbAii-syn
Cys Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe
1 5 10
Ser Ser Lys Asp Asp

INFORMATION FOR SEQ ID NO:69: TcbAiii-syn
SEQUENCE CHARACTERISTICS:
LENGTH: 20 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69: TcbAiii-syn
Cys Phe Asp Ser Tyr Ser Gln Leu Tyr Glu Glu Asn Ile Asn Ala
1 5 10
Gly Glu Gln Arg Ala

INFORMATION FOR SEQ ID NO:70: TcdAii-syn
SEQUENCE CHARACTERISTICS:
LENGTH: 22 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70: TcdAii-syn
Cys Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val
1 5 10
Tyr Gln Tyr Ser Gly Asn Thr

INFORMATION FOR SEQ ID NO:71: TcdAiii-syn
SEQUENCE CHARACTERISTICS:
LENGTH: 20 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71: TcdAiii-syn
Val Ser Gln Gly Ser Gly Ser Ala Gly Ser Gly Asn Asn Asn Leu
1 5 10
Ala Phe Gly Ala Gly

INFORMATION FOR SEQ ID NO:72:
SEQUENCE CHARACTERISTICS:
LENGTH: 12 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72: 160 kDa Hb
Met Gln Asp Ser Pro Glu Val Ala Ile Thr Thr Leu
1 5

INFORMATION FOR SEQ ID NO:73:
SEQUENCE CHARACTERISTICS:
LENGTH: 8 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73: 170 kDa WIR
Met Gln Arg Ser Ser Glu Val Ser
1 5

INFORMATION FOR SEQ ID NO:74:
SEQUENCE CHARACTERISTICS:
LENGTH: 12 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74: 180 kDa H9
Met Gln Asp Ile Pro Glu Val Gln Leu Asn
1 5

(2) INFORMATION FOR SEQ ID NO:75:
SEQUENCE CHARACTERISTICS:
LENGTH: 12 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75: 170 kDa Hm
Met Gln Asp Ser Pro Glu Val Ser Val Thr Gln Asn
1

INFORMATION FOR SEQ ID NO:76:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76: 74 kDa H9
Ser Glu Ser Leu Phe Thr Gln Ser Leu Lys Glu Ala Arg Arg Asp
1 5 10

INFORMATION FOR SEQ ID NO:77:
SEQUENCE CHARACTERISTICS:
LENGTH: 14 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77: 71 kDa Hb
Met Asn Leu Ile Glu Ala Lys Leu Gln Glu Asn Arg Asp Ala
1 5

INFORMATION FOR SEQ ID NO:78:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: 170 kDa H9
Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp
1 5 10

INFORMATION FOR SEQ ID NO:79:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79: 109 kDa Hm
Met Leu Asp Ile Met Glu Lys Gln Leu Asn Glu Ser Glu Arg Asp
1 5 10

INFORMATION FOR SEQ ID NO:80:
SEQUENCE CHARACTERISTICS:
LENGTH: 8 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80: 170 kDa WX-1
Met Gln Asp Ser Arg Glu Val Ser
1

INFORMATION FOR SEQ ID NO:81:
SEQUENCE CHARACTERISTICS:
LENGTH: 12 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81: 69 kDa H9
Leu Arg Ser Ala Xxx Ser Ala Leu Thr Thr Leu Leu
1 5

INFORMATION FOR SEQ ID NO:82:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82: 64 kDa HP88
Leu Lys Leu Ala Asp Asn Gly Tyr Phe Asn Glu Pro Leu Asn Val
1 5 10

INFORMATION FOR SEQ ID NO:83:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83: 70 kDa NC-1
Leu Lys Leu Ala Asp Asn Ser Tyr Phe Asn Glu Pro Leu Asn
1 5 10

INFORMATION FOR SEQ ID NO:84:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84: 60 kDa WIR
Ser Lys Asp Glu Ser Lys Ala Asp Ser Gln Leu Val Tyr His Thr
1 5 10
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 14 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 25 (ii) MOLECULAR TYPE: protein FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:85: 58 kDa NC-1 30 Met Lys Lys Arg Gly Leu Thr Thr Asn Ala Gly Ala Pro Val 1 5 S" INFORMATION FOR SEQ ID NO:86: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single 40 TOPOLOGY: linear S(ii) MOLECULAR TYPE: protein FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:86: 60 kDa WX-12 Met Leu Asn Pro Ile Val Arg Lys Phe Glu Tyr Gly Glu His Thr 1 5 10 INFORMATION FOR SEQ ID NO:87: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULAR TYPE: protein FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:87: 60 kDa Hm Ala Glu Ile Tyr Asn Lys Asp Gly Asn Lys Leu Asp Leu Tyr Gly 1 5 10 -294- INFORMATION FOR SEQ ID NO:88: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULAR TYPE: protein FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:88: 140 kDa Hm Asn Leu Ile Glu Ala Thr Leu Glu Gln Asn Leu Arg Asp Ala 115 10 -295-

Claims (15)

1. A substantially pure microorganism culture of a Photorhabdus strain designated B2, DEP1, DEP2, DEP3, P. zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL155, GL217, or GL257.
2. A composition, comprising an effective amount of a Photorhabdus protein toxin that has functional activity against an insect, wherein the toxin is produced by a purified culture of Photorhabdus strain designated B2, DEP1, DEP2, DEP3, P. zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL155, GL217, or GL257.
3. A purified preparation comprising, a protein having an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
4. A chimeric DNA construct, adapted for expression in a prokaryotic or eukaryotic host comprising, 5' to 3' a transcriptional promoter active in the host; a DNA sequence encoding a Photorhabdus protein that has functional activity against an insect; and a transcriptional terminator, wherein the protein encoded by the DNA sequence has an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
5. A method of controlling an insect, comprising orally delivering to an insect an effective amount of a protein toxin that has functional activity against an insect, wherein the protein is produced by a purified culture of Photorhabdus strain designated B2, DEP1, DEP2, DEP3, P. zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2, MP3, MP4, GL98, GL101, GL138, GL155, GL217, or GL257.
6. The method of claim 5, wherein the insect is of the order Lepidoptera, Coleoptera, Hymenoptera, Diptera, Dictyoptera, Acarina or Homoptera.
7. The method of claim 6, wherein the insect species is from order Coleoptera and is Southern Corn Rootworm, Western Corn Rootworm, Colorado Potato Beetle, Mealworm, Boll Weevil or Turf Grub, or the insect species is from order Lepidoptera and is Beet Armyworm, Black Cutworm, Cabbage Looper, Codling Moth, Corn Earworm, European Corn Borer, Tobacco Hornworm, or Tobacco Budworm.
8. A DNA or RNA oligonucleotide coding for an amino acid sequence selected from SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
9. Use of an oligonucleotide of claim 8 as a probe to isolate genetic material from an Enterobacteriaceae, Photorhabdus or Photorhabdus luminescens organism.
10. A method for expressing a protein produced by a purified bacterial culture of the genus Photorhabdus in a prokaryotic or eukaryotic host in an effective amount so that the protein has functional activity against an insect, wherein the method comprises: constructing a chimeric DNA construct having 5' to 3' a promoter, a DNA sequence encoding a protein, a transcription terminator, and then transferring the chimeric DNA construct into the host, wherein the protein encoded by the DNA sequence has an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
11. A method of making an antibody against a protein fragment that is part of a protein having functional activity, where the protein is produced by bacteria of the Enterobacteriaceae family, wherein the method comprises: a) isolating a fragment of the protein, where the protein fragment is at least six amino acids; b) immunizing a mammalian species with the protein fragment; and c) harvesting serum containing antibody or antibody from the spleen of the mammalian species, where the antibody harvested is antibody to the protein fragment having functional activity.
12. The method of claim 11 wherein the protein fragment is selected from the group consisting of SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, and SEQ ID NO:71.
13. A method of making an antibody against a protein fragment that is part of a protein having functional activity, where the protein is produced by bacteria of the Enterobacteriaceae family, substantially as hereinbefore described with reference to any one of the examples.
14. An antibody made in accordance with the method of any one of claims 11, 12 or 13.
15. A method for expressing a protein produced by a purified bacterial culture of the genus Photorhabdus in a prokaryotic or eukaryotic host in an effective amount so that the protein has functional activity against an insect, substantially as hereinbefore described with reference to any one of the examples.
16. A protein expressed in accordance with the method of claim 10 or 15.
17. A composition, comprising an effective amount of a Photorhabdus protein toxin that has functional activity against an insect, substantially as hereinbefore described with reference to any one of the examples.
18. A substantially pure microorganism culture of a Photorhabdus strain of claim 1, substantially as hereinbefore described with reference to any one of the examples.
Dated 5 December, 2001 Dow AgroSciences LLC Wisconsin Alumni Research Foundation Patent Attorneys for the Applicant/Nominated Person SPRUSON & FERGUSON
AU97125/01A 1996-08-29 2001-12-06 Insecticidal protein toxins from photorhabdus Abandoned AU9712501A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU97125/01A AU9712501A (en) 1996-08-29 2001-12-06 Insecticidal protein toxins from photorhabdus

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US08705484 1996-08-29
WO618003 1996-11-06
US08743699 1996-11-06
AU97125/01A AU9712501A (en) 1996-08-29 2001-12-06 Insecticidal protein toxins from photorhabdus

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU28299/97A Division AU2829997A (en) 1996-08-29 1997-05-05 Insecticidal protein toxins from (photorhabdus)

Publications (1)

Publication Number Publication Date
AU9712501A (en) 2002-02-07

Family

ID=3764392

Family Applications (1)

Application Number Title Priority Date Filing Date
AU97125/01A Abandoned AU9712501A (en) 1996-08-29 2001-12-06 Insecticidal protein toxins from photorhabdus

Country Status (1)

Country Link
AU (1) AU9712501A (en)

Similar Documents

Publication Publication Date Title
AU729228B2 (en) Insecticidal protein toxins from photorhabdus
TW509722B (en) Insecticidal protein toxins from photorhabdus
EP2142009B1 (en) Hemipteran- and coleopteran- active toxin proteins from bacillus thuringiensis
US6766613B2 (en) Materials and methods for controlling pests
IL128165A (en) Insecticidal protein toxins from xenorhabdus
JPH10506532A (en) Novel pesticidal proteins and strains
JPH07503845A (en) Peptides effective in killing insects
US6528484B1 (en) Insecticidal protein toxins from Photorhabdus
US7569748B2 (en) Nucleic acid encoding an insecticidal protein toxin from photorhabdus
US6280722B1 (en) Antifungal Bacillus thuringiensis strains
EP1130970B1 (en) Insecticidal agents
ES2348509T5 (en) New insecticidal proteins from Bacillus thuringiensis
AU9712501A (en) Insecticidal protein toxins from photorhabdus
MXPA06007784A (en) Toxin complex proteins and genes from xenorhabdus bovienii.
US6150156A (en) Bacillus thuringiensis isolates active against sucking insects
MXPA99001288A (en) Insecticidal protein toxins from xenorhabdus

Legal Events

Date Code Title Description
PC1 Assignment before grant (sect. 113)

Owner name: WISCONSIN ALUMNI RESEARCH FOUNDATION

Free format text: THE FORMER OWNER WAS: DOW AGROSCIENCES LLC, WISCONSIN ALUMNI RESEARCH FOUNDATION