AU2829997A - Insecticidal protein toxins from (photorhabdus) - Google Patents
- Publication number
- AU2829997A
- Authority
- AU
- Australia
- Prior art keywords
- seq
- protein
- photorhabdus
- dna
- purified
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01N—PRESERVATION OF BODIES OF HUMANS OR ANIMALS OR PLANTS OR PARTS THEREOF; BIOCIDES, e.g. AS DISINFECTANTS, AS PESTICIDES OR AS HERBICIDES; PEST REPELLANTS OR ATTRACTANTS; PLANT GROWTH REGULATORS
- A01N63/00—Biocides, pest repellants or attractants, or plant growth regulators containing microorganisms, viruses, microbial fungi, animals or substances produced by, or obtained from, microorganisms, viruses, microbial fungi or animals, e.g. enzymes or fermentates
- A01N63/20—Bacteria; Substances produced thereby or obtained therefrom
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01N—PRESERVATION OF BODIES OF HUMANS OR ANIMALS OR PLANTS OR PARTS THEREOF; BIOCIDES, e.g. AS DISINFECTANTS, AS PESTICIDES OR AS HERBICIDES; PEST REPELLANTS OR ATTRACTANTS; PLANT GROWTH REGULATORS
- A01N63/00—Biocides, pest repellants or attractants, or plant growth regulators containing microorganisms, viruses, microbial fungi, animals or substances produced by, or obtained from, microorganisms, viruses, microbial fungi or animals, e.g. enzymes or fermentates
- A01N63/50—Isolated enzymes; Isolated proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Organic Chemistry (AREA)
- Genetics & Genomics (AREA)
- Zoology (AREA)
- Biotechnology (AREA)
- Wood Science & Technology (AREA)
- Molecular Biology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Microbiology (AREA)
- Biomedical Technology (AREA)
- Biochemistry (AREA)
- Biophysics (AREA)
- Plant Pathology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Medicinal Chemistry (AREA)
- Environmental Sciences (AREA)
- Gastroenterology & Hepatology (AREA)
- Agronomy & Crop Science (AREA)
- Pest Control & Pesticides (AREA)
- Virology (AREA)
- Dentistry (AREA)
- Physics & Mathematics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Agricultural Chemicals And Associated Chemicals (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Peptides Or Proteins (AREA)
- Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
- Pretreatment Of Seeds And Plants (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
Description
INSECTICIDAL PROTEIN TOXINS FROM PHOTORHABDUS
Cross-reference to Related Application
This patent application is a continuation-in-part of U.S. Patent Application Serial Number 08/743,699 filed on November 6, 1996, which is a continuation-in-part of U.S. Patent Application Serial Number 08/705,484 filed on August 28, 1996, which is a continuation-in-part of U.S. Patent Application Serial Number 08/608,423 filed February 28, 1996, which is a continuation-in-part of U.S. Patent Application Serial Number 08/395,947 filed February 28, 1995, which was a continuation-in-part of U.S. Patent Application Serial Number 08/063,615 filed May 18, 1993. This application is also a continuation-in-part of provisional U.S. Patent Application Serial Number 60/007,255 filed November 6, 1995.
Field of the Invention
The present invention relates to toxins isolated from bacteria and the use of said toxins as insecticides.
Background of the Invention
Many insects are widely regarded as pests to homeowners, to picnickers, to gardeners, and to farmers and others whose investments in agricultural products are often destroyed or diminished as a result of insect damage to field crops. Particularly in areas where the growing season is short, significant insect damage can mean the loss of all profits to growers and a dramatic decrease in crop yield. Scarce supply of particular agricultural products invariably results in higher costs to food processors and, then, to the ultimate consumers of food plants and products derived from those plants.
Preventing insect damage to crops and flowers and eliminating the nuisance of insect pests have typically relied on strong organic pesticides and insecticides with broad toxicities. These synthetic products have come under attack by the general population as being too harsh on the environment and on those exposed to such agents. Similarly, in non-agricultural settings, homeowners would be satisfied to have insects avoid their homes or outdoor meals without needing to kill the insects.
The extensive use of chemical insecticides has raised environmental and health concerns for farmers, companies that produce the insecticides, government agencies, public interest groups, and the public in general. The development of less intrusive pest management strategies has been spurred along both by societal concern for the environment and by the development of biological tools which exploit mechanisms of insect management. Biological control agents present a promising alternative to chemical insecticides.
SUBSTITUTE SHEET (RULE 26)
Organisms at every evolutionary development level have devised means to enhance their own success and survival. The use of biological molecules as tools of defense and aggression is known throughout the animal and plant kingdoms. In addition, the relatively new tools of the genetic engineer allow modifications to biological insecticides to accomplish particular solutions to particular problems. One such agent, Bacillus thuringiensis (Bt), is an effective insecticidal agent, and is widely commercially used as such. In fact, the insecticidal agent of the Bt bacterium is a protein which has such limited toxicity, it can be used on human food crops on the day of harvest. To non-targeted organisms, the Bt toxin is a digestible non-toxic protein.
Another known class of biological insect control agents are certain genera of nematodes known to be vectors of transmission for insect-killing bacterial symbionts. Nematodes containing insecticidal bacteria invade insect larvae. The bacteria then kill the larvae. The nematodes reproduce in the larval cadaver. The nematode progeny then eat the cadaver from within. The bacteria-containing nematode progeny thus produced can then invade additional larvae.
In the past, insecticidal nematodes in the Steinernema and Heterorhabditis genera were used as insect control agents. Apparently, each genus of nematode hosts a particular species of bacterium. In nematodes of the Heterorhabditis genus, the symbiotic bacterium is Photorhabdus luminescens.
Although these nematodes are effective insect control agents, it is presently difficult, expensive, and inefficient to produce, maintain, and distribute nematodes for insect control.
It has been known in the art that one may isolate an insecticidal toxin from Photorhabdus luminescens that has activity only when injected into Lepidopteran and Coleopteran insect larvae. This has made it impossible to effectively exploit the insecticidal properties of the nematode or its bacterial symbiont. What would be useful would be a more practical, less labor-intensive wide-area delivery method of an insecticidal toxin which would retain its biological properties after delivery. It would be highly desirable to discover toxins with oral activity produced by the genus Photorhabdus. The isolation and use of these toxins are desirable for reasons of efficacy. Until applicants' discoveries, these toxins had not been isolated or characterized.
Summary of the Invention
The native toxins are protein complexes that are produced and secreted by growing bacterial cells of the genus Photorhabdus; of particular interest are the proteins produced by the species Photorhabdus luminescens. The protein complexes, with a molecular size of approximately 1,000 kDa, can be separated by SDS-PAGE gel analysis into numerous component proteins. The toxins contain no hemolysin, lipase, type C phospholipase, or nuclease activities. The toxins exhibit significant toxicity upon exposure or administration to a number of insects.
The present invention provides an easily administered insecticidal protein as well as the expression of toxin in a heterologous system.
The present invention also provides a method for delivering insecticidal toxins that are functionally active and effective against many orders of insects.
Objects, advantages, and features of the present invention will become apparent from the following specification.
Brief Description of the Drawings
Fig. 1 is an illustration of a match of cloned DNA isolates used as a part of sequence genes for the toxin of the present invention.
Fig. 2 is a map of three plasmids used in the sequencing process.
Fig. 3 is a map illustrating the inter-relationship of several partial DNA fragments.
Fig. 4 is an illustration of a homology analysis between the protein sequences of TcbAii and TcaBii proteins.
Fig. 5 is a phenogram of Photorhabdus strains. The relationship of Photorhabdus strains was defined by rep-PCR. The upper axis of Fig. 5 measures the percentage similarity of strains based on scoring of rep-PCR products (i.e., 0.0 [no similarity] to 1.0 [100% similarity]). At the right axis, the numbers and letters indicate the various strains tested: 14=W-14, Hm=Hm, H9=H9, 7=WX-7, 1=WX-1, 2=WX-2, 88=HP88, NC-1=NC-1, 4=WX-4, 9=WX-9, 8=WX-8, 10=WX-10, WIR=WIR, 3=WX-3, 11=WX-11, 5=WX-5, 6=WX-6, 12=WX-12, X14=WX-14, 15=WX-15, Hb=Hb, B2=B2, 48 through 52=ATCC 43948 through ATCC 43952. Vertical lines separating horizontal lines indicate the degree of relatedness (as read from the extrapolated intersection of the vertical line with the upper axis) between strains or groups of strains at the base of the horizontal lines (e.g., strain W-14 is approximately 60% similar to strains H9 and Hm).
Fig. 6 is an illustration of the genomic maps of the W-14 strain.
Fig. 6A is an illustration of the tca and tcb loci and primary gene products.
Fig. 7 is a phenogram of Photorhabdus strains as defined by rep-PCR. The upper axis of Fig. 7 measures the percentage similarity of strains based on scoring of rep-PCR products (i.e., 0.0 [no similarity] to 1.0 [100% similarity]). At the right axis, the numbers and letters indicate the various strains tested. Vertical lines separating horizontal lines indicate the degree of relatedness (as read from the extrapolated intersection of the vertical line with the upper axis) between strains or groups of strains at the base of the horizontal lines (e.g., strain Indicus is approximately 30% similar to strains MP1 and HB Oswego). Note that the Photorhabdus strains on the phenogram are as follows: 14 = W-14; Hm = Hm; H9 = H9; 7 = WX-7; 1 = WX-1; 2 = WX-2; 88 = HP88; NC1 = NC-1; 4 = WX-4; 9 = WX-9; 8 = WX-8; 10 = WX-10; 30 = W30; WIR = WIR; 3 = WX-3; 11 = WX-11; 5 = WX-5; 6 = WX-6; 12 = WX-12; 15 = WX-15; X14 = WX-14; Hb = Hb; B2 = B2; 48 = ATCC 43948; 49 = ATCC 43949; 50 = ATCC 43950; 51 = ATCC 43951; 52 = ATCC 43952.
Detailed Description of the Invention
The present inventions are directed to the discovery of a unique class of insecticidal protein toxins from the genus Photorhabdus that have oral toxicity against insects. A unique feature of Photorhabdus is its bioluminescence. Photorhabdus may be isolated from a variety of sources. One such source is nematodes, more particularly nematodes of the genus Heterorhabditis. Another such source is human clinical samples from wounds; see Farmer et al. 1989 J. Clin. Microbiol. 27 pp. 1594-1600. These saprophytic strains are deposited in the American Type Culture Collection (Rockville, MD) as ATCC #s 43948, 43949, 43950, 43951, and 43952, and are incorporated herein by reference. It is possible that other sources could harbor Photorhabdus bacteria that produce insecticidal toxins. Such sources in the environment could be either terrestrial or aquatic based.
The genus Photorhabdus is taxonomically defined as a member of the Family Enterobacteriaceae, although it has certain traits atypical of this family. For example, strains of this genus are nitrate reduction negative, yellow and red pigment producing, and bioluminescent. This latter trait is otherwise unknown within the Enterobacteriaceae. Photorhabdus has only recently been described as a genus separate from Xenorhabdus (Boemare et al., 1993 Int. J. Syst. Bacteriol. 43, 249-255). This differentiation is based on DNA-DNA hybridization studies, phenotypic differences (e.g., presence [Photorhabdus] or absence [Xenorhabdus] of catalase and bioluminescence) and the Family of the nematode host (Xenorhabdus: Steinernematidae; Photorhabdus: Heterorhabditidae). Comparative cellular fatty-acid analyses (Janse et al. 1990, Lett. Appl. Microbiol. 10, 131-135; Suzuki et al. 1990, J. Gen. Appl. Microbiol., 36, 393-401) support the separation of Photorhabdus from Xenorhabdus.
In order to establish that the strain collection disclosed herein was comprised of Photorhabdus strains, the strains were characterized based on recognized traits which define Photorhabdus and differentiate it from other Enterobacteriaceae and Xenorhabdus species (Farmer, 1984 Bergey's Manual of Systematic Bacteriology Vol. 1 pp. 510-511; Akhurst and Boemare 1988, J. Gen. Microbiol. 134 pp. 1835-1845; Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255, which are incorporated herein by reference). The traits studied were the following: gram stain negative rods, organism size, colony pigmentation, inclusion bodies, presence of catalase, ability to reduce nitrate, bioluminescence, dye uptake, gelatin hydrolysis, growth on selective media, growth temperature, survival under anaerobic conditions, and motility. Fatty acid analysis was used to confirm that the strains herein all belong to the single genus Photorhabdus.
Currently, the bacterial genus Photorhabdus is comprised of a single defined species, Photorhabdus luminescens (ATCC Type strain #29999, Poinar et al., 1977, Nematologica 23, 97-102). A variety of related strains have been described in the literature (e.g., Akhurst et al. 1988 J. Gen. Microbiol., 134, 1835-1845; Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255; Putz et al. 1990, Appl. Environ. Microbiol., 56, 181-186). Numerous Photorhabdus strains have been characterized herein. Because there is currently only one species (luminescens) defined within the genus Photorhabdus, the luminescens species traits were used to characterize the strains herein. As can be seen in Fig. 5, these strains are quite diverse. It is not unforeseen that in the future there may be other Photorhabdus species that will have some of the attributes of the luminescens species as well as some different characteristics that are presently not defined as a trait of Photorhabdus luminescens. However, the scope of the invention herein extends to any Photorhabdus species or strains which produce proteins that have functional activity as insect control agents, regardless of other traits and characteristics. Furthermore, as is demonstrated herein, the bacteria of the genus Photorhabdus produce proteins that have functional activity as defined herein. Of particular interest are proteins produced by the species Photorhabdus luminescens. The inventions herein should in no way be limited to the strains which are disclosed herein. These strains illustrate for the first time that proteins produced by diverse isolates of Photorhabdus are toxic upon exposure to insects. Thus, included with the inventions described herein are the strains specified herein and any mutants thereof, as well as any strains or species of the genus Photorhabdus that have the functional activity described herein.
There are several terms that are used herein that have a particular meaning and are as follows.
By "functional activity" it is meant herein that the protein toxin(s) function as insect control agents in that the proteins are orally active, or have a toxic effect, or are able to disrupt or deter feeding, which may or may not cause death of the insect. When an insect comes into contact with an effective amount of toxin delivered via transgenic plant expression, formulated protein composition(s), sprayable protein composition(s), a bait matrix or other delivery system, the results are typically death of the insect, or the insects do not feed upon the source which makes the toxins available to the insects.
By the use of the term "genetic material" herein, it is meant to include all genes, nucleic acid, DNA and RNA.
By "homolog" it is meant an amino acid sequence that is identified as possessing homology to a reference W-14 toxin polypeptide amino acid sequence.
By "homology" it is meant an amino acid sequence that has a similarity index of at least 33% and/or an identity index of at least 26% to a reference W-14 toxin polypeptide amino acid sequence, as scored by the GAP algorithm using the BLOSUM 62 protein scoring matrix (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, WI).
By "identity" is meant an amino acid sequence that contains an identical residue at a given position, following alignment with a reference W-14 toxin polypeptide amino acid sequence by the GAP algorithm.
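The thresholds in the "homology" definition can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by some external tool (the specification uses GCG GAP, which is not reimplemented here) and that gaps are written as "-". The function names, the choice to compute percent identity over gap-free columns only, and the example sequences are hypothetical illustrations, not part of the specification. The "and/or" wording is read as a logical OR.

```python
def percent_identity(aligned_ref, aligned_query, gap="-"):
    """Percent of gap-free alignment columns that hold identical residues.

    Assumption: columns where either sequence has a gap are ignored.
    GCG GAP may use a different denominator.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for ref_res, query_res in zip(aligned_ref.upper(), aligned_query.upper()):
        if ref_res == gap or query_res == gap:
            continue  # skip gapped columns
        compared += 1
        if ref_res == query_res:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def meets_homology_definition(similarity_pct, identity_pct):
    """Apply the stated thresholds: similarity >= 33% and/or identity >= 26%."""
    return similarity_pct >= 33.0 or identity_pct >= 26.0


# Hypothetical aligned fragments, for illustration only.
identity = percent_identity("MKT-LV", "MRTALV")
print(identity, meets_homology_definition(similarity_pct=0.0, identity_pct=identity))
```

The similarity index itself depends on the BLOSUM 62 matrix and on how GAP counts similar residue pairs, so the sketch takes it as an input rather than computing it.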
The protein toxins discussed herein are typically referred to as "insecticides". By insecticides it is meant herein that the protein toxins have a "functional activity" as further defined herein and are used as insect control agents.
By the use of the term "oligonucleotides" it is meant a macromolecule consisting of a short chain of nucleotides of either RNA or DNA. Such length could be at least one nucleotide, but typically is in the range of about 10 to about 12 nucleotides. The determination of the length of the oligonucleotide is well within the skill of an artisan and should not be a limitation herein. Therefore, oligonucleotides may be less than 10 or greater than 12 nucleotides in length.
By the use of the term "Photorhabdus toxin" it is meant any protein produced by a Photorhabdus microorganism strain which has functional activity against insects, where the Photorhabdus toxin could be formulated as a sprayable composition, expressed by a transgenic plant, formulated as a bait matrix, delivered via baculovirus, or delivered by any other applicable host or delivery system.
By the use of the term "toxic" or "toxicity" as used herein it is meant that the toxins produced by Photorhabdus have "functional activity" as defined herein.
By "truncated peptide" it is meant herein to include any peptide that is a fragment(s) of the peptides observed to have functional activity.
By "substantial sequence homology" is meant either a DNA fragment having a nucleotide sequence sufficiently similar to another DNA fragment to produce a protein having similar biochemical properties, or a polypeptide having an amino acid sequence sufficiently similar to another polypeptide to exhibit similar biochemical properties.
Fermentation broths from selected strains reported in Table 20 were used to determine the breadth of insecticidal toxin production by the Photorhabdus genus and the insecticidal spectrum of these toxins, and to provide source material to purify the toxin complexes. The strains characterized herein have been shown to have oral toxicity against a variety of insect orders. Such insect orders include but are not limited to Coleoptera, Homoptera, Lepidoptera, Diptera, Acarina, Hymenoptera and Dictyoptera. As with other bacterial toxins, the rate of mutation of the bacteria in a population causes many related toxins slightly different in sequence to exist. Toxins of interest here are those which produce protein complexes toxic to a variety of insects upon exposure, as described herein. Preferably, the toxins are active against Lepidoptera, Coleoptera, Homoptera, Diptera, Hymenoptera, Dictyoptera and Acarina. The inventions herein are intended to capture the protein toxins homologous to protein toxins produced by the strains herein and any derivative strains thereof, as well as any protein toxins produced by Photorhabdus. These homologous proteins may differ in sequence, but do not differ in function from those toxins described herein. Homologous toxins are meant to include protein complexes of between 300 kDa and 2,000 kDa that are comprised of at least two (2) subunits, where a subunit is a peptide which may or may not be the same as the other subunit. Various protein subunits have been identified and are taught in the Examples herein. Typically, the protein subunits are between about 18 kDa and about 230 kDa; between about 160 kDa and about 230 kDa; about 100 kDa to about 160 kDa; about 80 kDa to about 100 kDa; and about 50 kDa to about 80 kDa. As discussed above, some Photorhabdus strains can be isolated from nematodes.
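The size criteria stated above for homologous toxin complexes can be written as a short check. The following Python sketch is illustrative only: the function names, the treatment of every range boundary as inclusive, and the example masses are assumptions, and the word "about" in the specification is not modeled.

```python
# Narrower subunit size classes listed in the text (kDa), treated as inclusive.
SUBUNIT_CLASSES_KDA = [
    ("160-230 kDa", 160.0, 230.0),
    ("100-160 kDa", 100.0, 160.0),
    ("80-100 kDa", 80.0, 100.0),
    ("50-80 kDa", 50.0, 80.0),
]


def fits_complex_criteria(complex_kda, subunits_kda):
    """True if the complex is 300-2,000 kDa and has at least two subunits."""
    return 300.0 <= complex_kda <= 2000.0 and len(subunits_kda) >= 2


def subunit_size_classes(subunit_kda):
    """Return labels of the listed size classes that contain the subunit mass."""
    return [
        label
        for label, low, high in SUBUNIT_CLASSES_KDA
        if low <= subunit_kda <= high
    ]


# Hypothetical masses, for illustration only.
print(fits_complex_criteria(1000.0, [200.0, 60.0]))
print(subunit_size_classes(60.0))
```

A mass that falls exactly on a shared boundary (for example 100 kDa) matches two classes under the inclusive assumption, which reflects the overlapping ranges as written.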
Some nematodes, elongated cylindrical parasitic worms of the phylum Nematoda, have evolved an ability to exploit insect larvae as a favored growth environment. The insect larvae provide a source of food for growing nematodes and an environment in which to reproduce. One dramatic effect that follows invasion of larvae by certain nematodes is larval death. Larval death results from the presence of, in certain nematodes, bacteria that produce an insecticidal toxin which arrests larval growth and inhibits feeding activity.
Interestingly, it appears that each genus of insect parasitic nematode hosts a particular species of bacterium, uniquely adapted for symbiotic growth with that nematode. In the interim since this research was initiated, the bacterial genus Xenorhabdus was reclassified into the genera Xenorhabdus and Photorhabdus. Bacteria of the genus Photorhabdus are characterized as being symbionts of Heterorhabditis nematodes, while Xenorhabdus species are symbionts of the Steinernema species. This change in nomenclature is reflected in this specification, but in no way should a change in nomenclature alter the scope of the inventions described herein.
The peptides and genes that are disclosed herein are named according to the guidelines recently published in the Journal of Bacteriology "Instructions to Authors" pp. i-xii (Jan. 1996), which is incorporated herein by reference. The following peptides and genes were isolated from Photorhabdus strain W-14.
Table 1
Peptide/Gene Nomenclature Toxin Complex
a Sequence ID Nos. in brackets are peptide N-termini; b numbers in parentheses are N-termini of internal peptide tryptic fragments; deduced from gene sequence; internal gene fragment.
The sequences listed above are grouped by genomic region. More specifically, the Photorhabdus luminescens bacterium (W-14) has at least four distinct genomic regions: tca, tcb, tcc and tcd. As can be seen in Table 1, peptide products are produced from these distinct genomic regions. Furthermore, as illustrated in the Examples, specifically Examples 15 and 21, individual gene products produced from three genomic regions are associated with insect activity. There is also considerable homology between these four genomic regions.
As is further illustrated in the Examples, the tcbA gene was expressed in E. coli as two possible biologically active protein fragments (TcbA and TcbAii/iii). The tcdA gene was also expressed in E. coli. As illustrated in Example 16, when the native unprocessed TcbA toxin was treated with the endogenous metalloproteases or insect gut contents containing proteases, the TcbA protein toxin was processed into smaller subunits that were less than the size of the native peptides, and Southern Corn Rootworm activity increased. The smaller toxin peptides remained associated as part of a toxin complex. It may be desirable in some situations to increase activation of the toxin(s) by proteolytic processing or by using truncated peptides. Thus, it may be more desirable to use truncated peptide(s) in some applications, e.g., commercial transgenic plant applications. In addition to the W-14 strain, there are other species within the Photorhabdus genus that have functional activity which is differential (specifically see Tables 20 and 36). Even though there is differential activity, the amino acid sequences in some cases have substantial sequence homology. Moreover, the molecular probes indicate that some genes contained in the strains are homologous to the genes contained in the W-14 strain. In fact, all of the strains illustrated herein have one or more homologs of W-14 toxin genes. The antibody data in Example 26 and the N-terminal sequence data in Example 25 further support the conclusion that there is homology and identity (based on amino acid sequence) between the protein toxin(s) produced by these strains. At the molecular level, the W-14 gene probes indicated that the homologs or the W-14 genes themselves (Tables 37, 38, and 39) are dispersed throughout the Photorhabdus genus. Further, it is possible that new toxin genes exist in other strains which are not homologous to W-14, but maintain overall protein attributes (see specifically Examples 14 and 25).
Even though there is homology or identity between toxin genes produced by the Photorhabdus strains, the strains themselves are quite diverse. Using polymerase chain reaction technology further discussed in Example 22, most of the strains illustrated herein are quite distinguishable. For example, as can be seen in Fig. 5, the relative similarity of some of the strains, such as HP88 and NC-1, was about 0.8, which indicates that the strains are similar, while that of HP88 and Hb was about 0.1, which indicates substantial diversity. Therefore, even though the insect toxin genes or gene products that the strains produce are the same or similar, the strains themselves are diverse.
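To make the 0.0 to 1.0 similarity scale concrete, the following Python sketch scores two rep-PCR fingerprints as sets of band positions. The specification does not state which similarity coefficient was used; the Jaccard coefficient here is an assumption chosen for illustration, and the band positions are hypothetical values picked so that the results match the 0.8 and 0.1 magnitudes mentioned above. They are not data from Fig. 5.

```python
def band_similarity(bands_a, bands_b):
    """Jaccard similarity of two band sets: shared bands / all observed bands.

    Assumption: each fingerprint is reduced to a set of scored band positions.
    Returns 0.0 when neither fingerprint has any band.
    """
    union = bands_a | bands_b
    if not union:
        return 0.0
    return len(bands_a & bands_b) / len(union)


# Hypothetical fingerprints (band position labels), for illustration only.
strain_x = {1, 2, 3, 4}
strain_y = {1, 2, 3, 4, 5}
strain_z = {4, 6, 7, 8, 9, 10, 11}

print(band_similarity(strain_x, strain_y))  # similar strains
print(band_similarity(strain_x, strain_z))  # diverse strains
```

A phenogram such as Fig. 5 would then be built by clustering the pairwise similarity values; the clustering step is not shown.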
In view of the data further disclosed in the Examples and discussions herein, it is clear that a new and unique family of insecticidal protein toxin(s) has been discovered. It has been further illustrated herein that these toxin(s) widely exist within bacterial strains of the Photorhabdus genus. It may also be the case that these toxin genes widely exist within the family Enterobacteriaceae. Antibodies prepared as described in Example 21 or gene probes prepared as described in Example 25 may be used to further screen for bacterial strains within the family Enterobacteriaceae that produce the homologous toxin(s) that have functional activity. It may also be the case that specific primer sets exist that could facilitate the identification of new genes within the Photorhabdus genus or family Enterobacteriaceae.
As stated above, the antibodies may be used to rapidly screen bacteria of the genus Photorhabdus or the family Enterobacteriaceae for homologous toxin products as illustrated in Example 26. Those skilled in the art are quite familiar with the use of antibodies as an analysis or screening tool (see US Patent No. 5,430,137, which is incorporated herein by reference). Moreover, it is generally accepted in the literature that antibodies are elicited against 6 to 20 amino acid residue segments that tend to occupy exposed surfaces of polypeptides (Current Protocols in Immunology, Coligan et al., National Institutes of Health, John Wiley & Sons, Inc.). Usually these segments consist of contiguous amino acid residues; however, in certain cases they may be formed by non-contiguous amino acids that are constrained by specific conformation. The amino acid segments recognized by antibodies are highly specific and commonly referred to as epitopes. The amino acid fragment can be generated by chemical and/or enzymatic cleavage of the native protein, by automated solid-phase peptide synthesis, or by production from genetically engineered organisms. Polypeptide fragments can be isolated by a variety and/or combination of HPLC and FPLC chromatographic methods known in the art. Selection of a polypeptide fragment can be aided by the use of algorithms, for example Kyte and Doolittle, 1982, Journal of Molecular Biology 157: 105-132 and Chou and Fasman, 1974, Biochemistry 13: 222-245, that predict those sequences most likely to be exposed on the surface of the protein. For preparation of immunogen containing the polypeptide fragment of interest, in general, polypeptides are covalently coupled using chemical reactions to carrier proteins such as keyhole limpet hemocyanin via free amino (lysine), sulfhydryl (cysteine), phenolic (tyrosine) or carboxylic (aspartate or glutamate) groups. Immunogen with an adjuvant is injected into animals, such as mice, rabbits, or chickens, to elicit an immune response against the immunogen. Antibody titer in antisera of injected animals against the polypeptide fragment can be determined by a variety of immunological methods such as ELISA and Western blot. Alternatively, monoclonal antibodies can be prepared using spleen cells of the injected animal for fusion with tumor cells to produce immortalized hybridoma cells producing a single antibody species. Hybridoma cells are screened using immunological methods to select lines that produce a specific antibody to the polypeptide fragment of interest. Purification of antibodies from different sources can be performed by a variety of antigen affinity or antibody affinity columns or other chromatographic HPLC or FPLC methods.
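The use of a hydropathy algorithm to choose candidate polypeptide fragments can be sketched as follows. This Python example applies the Kyte and Doolittle hydropathy scale with a sliding window and reports the most hydrophilic window, on the common heuristic that hydrophilic stretches are more likely to be surface exposed. The function names, the default window of 7 residues, and the example sequence are assumptions made for illustration; the Chou and Fasman method cited above is not implemented, and the result is only a rough guide to epitope selection.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in seq]  # KeyError on unknown residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, segment, mean hydropathy) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical sequence, for illustration only.
print(most_hydrophilic_window("IIIIKKKKIIII", window=4))
```

In practice the window length would be chosen within the 6 to 20 residue range mentioned above, and candidate segments would still need experimental confirmation.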
The toxins described herein are quite unique in that the toxins have functional activity, which is key to developing an insect management strategy. In developing an insect management strategy, it is possible to delay or circumvent the protein degradation process by injecting a protein directly into an organism, avoiding its digestive tract. In such cases, the protein administered to the organism will retain its function until it is denatured, non-specifically degraded, or eliminated by the immune system in higher organisms. Injection into insects of an insecticidal toxin has potential application only in the laboratory, and then only on large insects which are easily injected. The observation that the insecticidal protein toxins herein described exhibit their toxic activity after oral ingestion or contact with the toxins permits the development of an insect management plan based solely on the ability to incorporate the protein toxins into the insect diet. Such a plan could result in the production of insect baits. The Photorhabdus toxins may be administered to insects in a purified form. The toxins may also be delivered in amounts from about 1 to about 100 mg / liter of broth. This may vary with formulation conditions, conditions of the inoculum source, techniques for isolation of the toxin, and the like. The toxins may be administered as an exudate secretion or cellular protein originally expressed in a heterologous prokaryotic or eukaryotic host. Bacteria are typically the hosts in which proteins are expressed. Eukaryotic hosts could include but are not limited to plants, insects and yeast. Alternatively, the toxins may be produced in bacteria or transgenic plants in the field or in the insect by a baculovirus vector. Typically the toxins will be introduced to the insect by incorporating one or more of the toxins into the insects' feed.
Complete lethality to feeding insects is useful but is not required to achieve useful toxicity. If the insects avoid the toxin or cease feeding, that avoidance will be useful in some applications, even if the effects are sublethal. For example, if insect resistant transgenic crop plants are desired, a reluctance of insects to feed on the plants is as useful as lethal toxicity to the insects, since the ultimate objective is protection of the plants rather than killing the insect.
There are many other ways in which toxins can be incorporated into an insect's diet. As an example, it is possible to adulterate the larval food source with the toxic protein by spraying the food with a protein solution, as disclosed herein. Alternatively, the purified protein could be genetically engineered into an otherwise harmless bacterium, which could then be grown in culture, and either applied to the food source or allowed to reside in the soil in an area in which insect eradication was desirable. Also, the protein could be genetically engineered directly into an insect food source. For instance, the major food source of many insect larvae is plant material. By incorporating genetic material that encodes the insecticidal properties of the Photorhabdus toxins into the genome of a plant eaten by a particular insect pest, the adult or larvae would die after consuming the food plant. Numerous members of the monocotyledonous and dicotyledonous genera have been transformed. Transgenic agronomic crops as well as fruits and vegetables are of commercial interest. Such crops include but are not limited to maize, rice, soybeans, canola, sunflower, alfalfa, sorghum, wheat, cotton, peanuts, tomatoes, potatoes, and the like. Several techniques exist for introducing foreign genetic material into plant cells, and for obtaining plants that stably maintain and express the introduced gene. Such techniques include acceleration of genetic material coated onto microparticles directly into cells (U.S. Patents 4,945,050 to Cornell and 5,141,131 to DowElanco). Plants may be transformed using Agrobacterium technology, see U.S. Patent 5,177,010 to University of Toledo, 5,104,310 to Texas A&M,
European Patent Application 0131624B1, European Patent Applications 120516, 159418B1 and 176,112 to Schilperoort, U.S. Patents 5,149,645, 5,469,976, 5,464,763 and 4,940,838 and 4,693,976 to Schilperoort, European Patent Applications 116718, 290799, 320500 all to MaxPlanck, European Patent Applications 604662 and 627752 to Japan Tobacco, European Patent Applications 0267159, and 0292435 and U.S. Patent 5,231,019 all to Ciba Geigy, U.S. Patents 5,463,174 and 4,762,785 both to Calgene, and U.S. Patents 5,004,863 and
5,159,135 both to Agracetus. Other transformation technology includes whiskers technology, see U.S. Patents 5,302,523 and 5,464,765 both to Zeneca. Electroporation technology has also been used to transform plants, see WO 87/06614 to Boyce Thompson Institute, 5,472,869 and 5,384,253 both to Dekalb, WO9209696 and WO9321335 both to PGS. All of these transformation patents and publications are incorporated by reference. In addition to numerous technologies for transforming plants, the type of tissue which is contacted with the foreign genes may vary as well. Such tissue would include but would not be limited to embryogenic tissue, callus tissue type I and II, hypocotyl, meristem, and the like. Almost all plant tissues may be transformed during dedifferentiation using appropriate techniques within the skill of an artisan. Another variable is the choice of a selectable marker. The preference for a particular marker is at the discretion of the artisan, but any of the following selectable markers may be used along with any other gene not listed herein which could function as a selectable marker. Such selectable markers include but are not limited to the aminoglycoside phosphotransferase gene of transposon Tn5 (Aph II), which encodes resistance to the antibiotics kanamycin, neomycin and G418, as well as those genes which code for resistance or tolerance to glyphosate; hygromycin; methotrexate; phosphinothricin (bialaphos); imidazolinones, sulfonylureas and triazolopyrimidine herbicides, such as chlorosulfuron; bromoxynil, dalapon and the like.
In addition to a selectable marker, it may be desirable to use a reporter gene. In some instances a reporter gene may be used without a selectable marker. Reporter genes are genes which are typically not present or expressed in the recipient organism or tissue. The reporter gene typically encodes a protein which provides for some phenotypic change or enzymatic property. Examples of such genes are provided in K. Weising et al., Ann. Rev. Genetics, 22, 421 (1988), which is incorporated herein by reference. A preferred reporter gene is the glucuronidase (GUS) gene.
Regardless of transformation technique, the gene is preferably incorporated into a gene transfer vector adapted to express the Photorhabdus toxins in the plant cell by including in the vector a plant promoter. In addition to plant promoters, promoters from a variety of sources can be used efficiently in plant cells to express foreign genes. For example, promoters of bacterial origin, such as the octopine synthase promoter, the nopaline synthase
promoter, the mannopine synthase promoter; promoters of viral origin, such as the cauliflower mosaic virus (35S and 19S), reengineered 35S, known as 35T (see PCT/US96/16582, WO 97/13402 published April 17, 1997, which is incorporated herein by reference) and the like may be used. Plant promoters include, but are not limited to, ribulose-1,5-bisphosphate (RUBP) carboxylase small subunit (ssu), beta-conglycinin promoter, phaseolin promoter, ADH promoter, heat-shock promoters and tissue specific promoters. Promoters may also contain certain enhancer sequence elements that may improve the transcription efficiency. Typical enhancers include but are not limited to Adh-intron 1 and Adh-intron 6. Constitutive promoters may be used. Constitutive promoters direct continuous gene expression in all cell types and at all times (e.g., actin, ubiquitin, CaMV 35S). Tissue specific promoters are responsible for gene expression in specific cell or tissue types, such as the leaves or seeds (e.g., zein, oleosin, napin, ACP) and these promoters may also be used. Promoters may also be active during a certain stage of the plant's development as well as active in plant tissues and organs. Examples of such promoters include but are not limited to pollen-specific, embryo specific, corn silk specific, cotton fiber specific, root specific, seed endosperm specific promoters and the like.
Under certain circumstances it may be desirable to use an inducible promoter. An inducible promoter is responsible for expression of genes in response to a specific signal, such as: physical stimulus (heat shock genes); light (RUBP carboxylase); hormone (Em); metabolites; and stress. Other desirable transcription and translation elements that function in plants may be used. Numerous plant-specific gene transfer vectors are known to the art.
In addition, it is known that to obtain high expression of bacterial genes in plants it is preferred to reengineer the bacterial genes so that they are more efficiently expressed in the cytoplasm of plants. Maize is one such plant where it is preferred to reengineer the bacterial gene(s) prior to transformation to increase the expression level of the toxin in the plant. One reason for the reengineering is the very low G+C content of the native bacterial gene(s) (and consequent skewing towards high A+T content). This results in the generation of sequences mimicking or duplicating plant gene control sequences that are known to be highly A+T rich. The presence of some A+T-rich sequences within the DNA of the gene(s) introduced into plants (e.g., TATA box regions normally found in gene promoters) may result in aberrant
transcription of the gene(s). On the other hand, the presence of other regulatory sequences residing in the transcribed mRNA (e.g., polyadenylation signal sequences (AAUAAA), or sequences complementary to small nuclear RNAs involved in pre-mRNA splicing) may lead to RNA instability. Therefore, one goal in the design of reengineered bacterial gene(s), more preferably referred to as plant optimized gene(s), is to generate a DNA sequence having a higher G+C content, and preferably one close to that of plant genes coding for metabolic enzymes. Another goal in the design of the plant optimized gene(s) is to generate a DNA sequence that not only has a higher G+C content, but in which the sequence changes are made so as not to hinder translation.
An example of a plant that has a high G+C content is maize. The table below illustrates how high the G+C content is in maize. As in maize, it is thought that G+C content in other plants is also high.
Table 2 Compilation of G+C Contents of Protein Coding Regions of Maize Genes
Number of genes in class given in parentheses. Standard deviations given in parentheses. Combined groups mean ignored in calculation of overall mean.
For the data in Table 2, coding regions of the genes were extracted from GenBank (Release 71) entries, and base compositions were calculated using the MacVector™ program (IBI, New Haven, CT). Intron sequences were ignored in the calculations. Group I and II storage protein gene sequences were distinguished by their marked difference in base composition.
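The base composition calculation described for Table 2 can be illustrated with a minimal sketch. The function names below are hypothetical, and this is not the MacVector program used to compile the table; it only shows the arithmetic of a G+C percentage over a coding region, both overall and at the third codon position, assuming the intron sequences have already been removed.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the A/C/G/T bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def third_position_gc(cds: str) -> float:
    """Return the G+C percentage at the third position of each codon.

    Assumes cds is an in-frame coding sequence starting at codon position 1.
    """
    return gc_content(cds.upper()[2::3])
```

A coding region with a high overall G+C percentage and an even higher third-position value would be consistent with the codon bias discussed in the following paragraph.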
Due to the plasticity afforded by the redundancy of the genetic code (i.e., some amino acids are specified by more than one codon), evolution of the genomes of different organisms or classes of organisms has resulted in differential usage of redundant codons. This "codon bias" is reflected in the mean base composition of protein coding regions. For example, organisms with relatively low G+C contents utilize codons having A or T in the third position of redundant codons, whereas those having higher G+C contents utilize codons having G or C in the third position. It is thought that the presence of "minor" codons within a gene's mRNA may reduce the absolute translation rate of that mRNA, especially when the relative abundance of the charged tRNA corresponding to the minor codon is low. An extension of this is that the diminution of translation rate by individual minor codons would be at least additive for multiple minor codons. Therefore, mRNAs having high relative contents of minor codons would have correspondingly low translation rates. This rate would be reflected by the synthesis of low levels of the encoded protein. In order to reengineer the bacterial gene(s), the codon bias of the plant is determined. The codon bias is the statistical codon distribution that the plant uses for coding its proteins. After determining the bias, the percent frequency of the codons in the gene(s) of interest is determined. The primary codons preferred by the plant should be determined, as well as the second and third choices of preferred codons. The amino acid sequence of the protein of interest is reverse translated so that the resulting nucleic acid sequence codes for the same protein as the native bacterial gene, but the resulting nucleic acid sequence corresponds to the first preferred codons of the desired plant. The new sequence is analyzed for restriction enzyme sites that might have been created by the modification.
The identified sites are further modified by replacing the codons with second or third choice preferred codons. Other sites in the sequence which could affect the transcription or translation of the gene of interest are the exon:intron 5' or 3' junctions, poly A addition signals, or RNA polymerase termination signals. The sequence is further analyzed and modified to reduce the frequency of TA or GC doublets. In
addition to the doublets, G or C sequence blocks that have more than about four residues that are the same can affect transcription of the sequence. Therefore, these blocks are also modified by replacing the codons of first or second choice, etc. with the next preferred codon of choice. It is preferred that the plant optimized gene(s) contain about 63% of first choice codons, between about 22% and about 37% second choice codons, and between 15% and 0% third choice codons, wherein the total percentage is 100%. Most preferably, the plant optimized gene(s) contain about 63% of first choice codons, at least about 22% second choice codons, about 7.5% third choice codons, and about 7.5% fourth choice codons, wherein the total percentage is 100%. The method described above enables one skilled in the art to modify gene(s) that are foreign to a particular plant so that the genes are optimally expressed in plants. The method is further illustrated in application PCT/US96/16582, WO 97/13402, published April 17, 1997.
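The doublet and sequence-block checks described above can be sketched as a simple scan. This is an illustrative sketch only: the function names are hypothetical, and the threshold of four identical residues is taken from the "more than about four" wording above rather than from any disclosed software.

```python
import re


def doublet_counts(seq: str) -> dict:
    """Count overlapping TA and GC dinucleotides in a DNA sequence."""
    seq = seq.upper()
    counts = {"TA": 0, "GC": 0}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    return counts


def gc_blocks(seq: str, max_run: int = 4) -> list:
    """Return (start, block) for each run of one repeated G or C longer than max_run."""
    n = max_run + 1
    pattern = re.compile(r"G{%d,}|C{%d,}" % (n, n))
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]
```

Each block reported by `gc_blocks` marks a stretch where a codon of first or second choice would be exchanged for the next preferred codon, after which the scan would be repeated.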
Thus, in order to design plant optimized gene(s), the amino acid sequence of the toxins is reverse translated into a DNA sequence, utilizing a nonredundant genetic code established from a codon bias table compiled for the gene DNA sequence for the particular plant being transformed. The resulting DNA sequence, which is completely homogeneous in codon usage, is further modified to establish a DNA sequence that, besides having a higher degree of codon diversity, also contains strategically placed restriction enzyme recognition sites, desirable base composition, and a lack of sequences that might interfere with transcription of the gene, or translation of the product mRNA.
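The reverse translation and site removal steps summarized above can be sketched as follows. The ranked codon table is hypothetical and deliberately tiny; it is not a maize codon bias table and covers only the residues used in the example. The sketch builds a sequence from first choice codons and then exchanges single codons for lower ranked choices until an unwanted recognition site no longer occurs.

```python
# Hypothetical ranked codon preferences, first choice first.
# Illustration only; NOT a real plant codon bias table.
RANKED_CODONS = {
    "M": ["ATG"],
    "E": ["GAG", "GAA"],
    "L": ["CTC", "CTG", "CTT"],
    "K": ["AAG", "AAA"],
}


def reverse_translate(protein, ranks=None):
    """Build DNA using, for each residue, the codon at the given preference rank."""
    ranks = ranks or [0] * len(protein)
    return "".join(RANKED_CODONS[aa][r] for aa, r in zip(protein, ranks))


def remove_site(protein, site):
    """Exchange codons for lower ranked choices until `site` is absent."""
    ranks = [0] * len(protein)
    dna = reverse_translate(protein, ranks)
    while site in dna:
        pos = dna.find(site)
        first, last = pos // 3, (pos + len(site) - 1) // 3
        best = None
        # Try each codon overlapping the site, best ranked alternative first.
        for i in range(first, last + 1):
            for r in range(1, len(RANKED_CODONS[protein[i]])):
                trial = ranks[:i] + [r] + ranks[i + 1:]
                trial_dna = reverse_translate(protein, trial)
                if trial_dna.count(site) < dna.count(site):
                    best = (trial, trial_dna)
                    break
            if best:
                break
        if best is None:
            raise ValueError("site cannot be removed with the available codons")
        ranks, dna = best
    return dna
```

With this table, the peptide `MELK` reverse translates to a sequence containing GAGCTC (the SacI recognition sequence), and `remove_site("MELK", "GAGCTC")` exchanges the glutamate codon for its second choice. A full implementation would also check the other features listed above, such as base composition and polyadenylation signals.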
It is theorized that bacterial genes may be more easily expressed in plants if the bacterial genes are expressed in the plastids. Thus, it may be possible to express bacterial genes in plants, without optimizing the genes for plant expression, and obtain high expression of the protein. See U.S. Patent Nos. 4,762,785; 5,451,513 and 5,545,817, which are incorporated herein by reference.
One of the issues regarding commercially exploiting transgenic plants is resistance management. This is of particular concern with Bacillus thuringiensis toxins. There are numerous companies commercially exploiting Bacillus thuringiensis and there has been much concern about insects becoming resistant to Bt toxins. One strategy for insect resistance management would be to combine the toxins produced by Photorhabdus with toxins such as Bt, vegetative insect proteins (Ciba Geigy) or other toxins. The combinations could be formulated
for a sprayable application or could be molecular combinations. Plants could be transformed with Photorhabdus genes that produce insect toxins and with other insect toxin genes such as Bt. European Patent Application 0400246A1 describes transformation of 2 Bt genes in a plant, which could be any 2 genes. Another way to produce a transgenic plant that contains more than one insect resistance gene would be to produce two plants, with each plant containing an insect resistance gene. These plants would be backcrossed using traditional plant breeding techniques to produce a plant containing more than one insect resistance gene.
In addition to producing a transformed plant containing plant optimized gene(s), there are other delivery systems where it may be desirable to reengineer the bacterial gene(s). Along the same lines, a genetically engineered, easily isolated protein toxin fusing together both a molecule attractive to insects as a food source and the insecticidal activity of the toxin may be engineered and expressed in bacteria or in eukaryotic cells using standard, well-known techniques. After purification in the laboratory, such a toxic agent with "built-in" bait could be packaged inside standard insect trap housings.
Another delivery scheme is the incorporation of the genetic material of toxins into a baculovirus vector. Baculoviruses infect particular insect hosts, including those desirably targeted with the Photorhabdus toxins. Infectious baculovirus harboring an expression construct for the Photorhabdus toxins could be introduced into areas of insect infestation to thereby intoxicate or poison infected insects.
Transfer of the insecticidal properties requires nucleic acid sequences encoding the amino acid sequences for the Photorhabdus toxins integrated into a protein expression vector appropriate to the host in which the vector will reside. One way to obtain a nucleic acid sequence encoding a protein with insecticidal properties is to isolate the native genetic material which produces the toxins from Photorhabdus, using information deduced from the toxin's amino acid sequence, large portions of which are set forth below. As described below, methods of purifying the proteins responsible for toxin activity are also disclosed. Using N-terminal amino acid sequence data, such as set forth below, one can construct oligonucleotides complementary to all, or a section of, the DNA bases that encode the first amino acids of the toxin. These oligonucleotides can be radiolabeled and used as
molecular probes to isolate the genetic material from a genomic library built from genetic material isolated from strains of Photorhabdus. The genetic library can be cloned in plasmid, cosmid, phage or phagemid vectors. The library could be transformed into Escherichia coli and screened for toxin production by the transformed cells using antibodies raised against the toxin or direct assays for insect toxicity.
This approach requires the production of a battery of oligonucleotides, since the degenerate genetic code allows an amino acid to be encoded in the DNA by any of several three-nucleotide combinations. For example, the amino acid arginine can be encoded by nucleic acid triplets CGA, CGC, CGG, CGT, AGA, and AGG. Since one cannot predict which triplet is used at those positions in the toxin gene, one must prepare oligonucleotides with each potential triplet represented. More than one DNA molecule corresponding to a protein subunit may be necessary to construct a sufficient number of oligonucleotide probes to recover all of the protein subunits necessary to achieve oral toxicity.
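The size of the oligonucleotide battery implied by the degenerate code can be estimated with a short calculation. The sketch below is illustrative: the function name is hypothetical and the peptides in the usage note are not taken from the toxin sequences. It multiplies the number of codons in the standard genetic code for each residue of a peptide.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_pool_size(peptide: str) -> int:
    """Number of distinct coding sequences that could encode the peptide."""
    return prod(CODON_DEGENERACY[aa] for aa in peptide.upper())
```

For a single arginine the function returns 6, matching the six triplets listed above, while a stretch rich in methionine and tryptophan yields a much smaller pool, which is why such stretches are usually favored when choosing the region for probe design.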
From the amino acid sequence of the purified protein, genetic materials responsible for the production of toxins can readily be isolated and cloned, in whole or in part, into an expression vector using any of several techniques well-known to one skilled in the art of molecular biology. A typical expression vector is a DNA plasmid, though other transfer means including, but not limited to, cosmids, phagemids and phage are also envisioned. In addition to features required or desired for plasmid replication, such as an origin of replication and antibiotic resistance or other form of a selectable marker such as the bar gene of Streptomyces hygroscopicus or viridochromogenes, protein expression vectors normally additionally require an expression cassette which incorporates the cis-acting sequences necessary for transcription and translation of the gene of interest. The cis-acting sequences required for expression in prokaryotes differ from those required in eukaryotes and plants. A eukaryotic expression cassette requires a transcriptional promoter upstream (5') to the gene of interest, a transcriptional termination region such as a poly-A addition site, and a ribosome binding site upstream of the gene of interest's first codon. In bacterial cells, a useful transcriptional promoter that could be included in the vector is the T7 RNA Polymerase-binding promoter.
Promoters, as previously described herein, are known to efficiently promote transcription of mRNA. Also upstream from the gene of interest the vector may include a nucleotide sequence encoding a
signal sequence known to direct a covalently linked protein to a particular compartment of the host cells such as the cell surface.
Insect viruses, or baculoviruses, are known to infect and adversely affect certain insects. The effect of the viruses on insects is slow, and viruses do not stop the feeding of insects. Thus viruses are not viewed as being useful as insect pest control agents. Combining the Photorhabdus toxin genes into a baculovirus vector could provide an efficient way of transmitting the toxins while increasing the lethality of the virus. In addition, since different baculoviruses are specific to different insects, it may be possible to use a particular toxin to selectively target particularly damaging insect pests. A particularly useful vector for the toxin genes is the nuclear polyhedrosis virus. Transfer vectors using this virus have been described and are now the vectors of choice for transferring foreign genes into insects. The virus-toxin gene recombinant may be constructed in an orally transmissible form. Baculoviruses normally infect insect victims through the mid-gut intestinal mucosa. The toxin gene inserted behind a strong viral coat protein promoter would be expressed and should rapidly kill the infected insect.
In addition to an insect virus or baculovirus or transgenic plant delivery system for the protein toxins of the present invention, the proteins may be encapsulated using Bacillus thuringiensis encapsulation technology such as but not limited to U.S. Patent Nos. 4,695,455; 4,695,462; 4,861,595 which are all incorporated herein by reference. Another delivery system for the protein toxins of the present invention is formulation of the protein into a bait matrix, which could then be used in above and below ground insect bait stations. Examples of such technology include but are not limited to PCT Patent Application WO 93/23998, which is incorporated herein by reference.
As is described above, it might become necessary to modify the sequence encoding the protein when expressing it in a non-native host, since the codon preferences of other hosts may differ from that of Photorhabdus. In such a case, translation may be quite inefficient in a new host unless compensating modifications to the coding sequence are made. Additionally, modifications to the amino acid sequence might be desirable to avoid inhibitory cross-reactivity with proteins of the new host, or to refine the insecticidal properties of the protein in the new host. A genetically modified toxin gene might encode a toxin exhibiting, for example, enhanced or reduced toxicity, altered insect
resistance development, altered stability, or modified target species specificity.
In addition to the Photorhabdus genes encoding the toxins, the scope of the present invention is intended to include related nucleic acid sequences which encode amino acid biopolymers homologous to the toxin proteins and which retain the toxic effect of the Photorhabdus proteins in insect species after oral ingestion.
For instance, the toxins used in the present invention seem to first inhibit larval feeding before death ensues. By manipulating the nucleic acid sequence of Photorhabdus toxins or its controlling sequences, genetic engineers placing the toxin gene into plants could modulate its potency or its mode of action to, for example, keep the eating-inhibitory activity while eliminating the absolute toxicity to the larvae. This change could permit the transformed plant to survive until harvest without having the unnecessarily dramatic effect on the ecosystem of wiping out all target insects. All such modifications of the gene encoding the toxin, or of the protein encoded by the gene, are envisioned to fall within the scope of the present invention.
Other envisioned modifications of the nucleic acid include the addition of targeting sequences to direct the toxin to particular parts of the insect larvae for improving its efficiency.
Strains W-14, ATCC 55397, 43948, 43949, 43950, 43951, 43952 have been deposited in the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA. Amino acid and nucleotide sequence data for the W-14 native toxin (ATCC 55397) are presented below. Isolation of the genomic DNA for the toxins from the bacterial hosts is also exemplified herein. The other strains identified herein have been deposited with the United States Department of Agriculture, 1815 North University Drive, Peoria, IL 61604.
Standard molecular biology techniques were followed and are taught in the specification herein. Additional information may be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989), Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press; Current Protocols in Molecular Biology, ed. F. M. Ausubel et al., (1997), which are both incorporated herein by reference.
The following abbreviations are used throughout the Examples: Tris = tris(hydroxymethyl)aminomethane; SDS = sodium dodecyl sulfate; EDTA = ethylenediaminetetraacetic acid; IPTG = isopropylthio-β-galactoside; X-gal = 5-bromo-4-chloro-3-indolyl-β-D-galactoside;
CTAB = cetyltrimethylammonium bromide; kbp = kilobase pairs; dATP, dCTP, dGTP, dTTP, I = 2'-deoxynucleoside 5'-triphosphates of adenine, cytosine, guanine, thymine, and inosine, respectively; ATP = adenosine 5'-triphosphate.
Example 1
Purification of Toxin from Photorhabdus luminescens and
Demonstration of Toxicity after Oral Delivery of Purified Toxin
The insecticidal protein toxin of the present invention was purified from Photorhabdus luminescens strain W-14, ATCC Accession Number 55397. Stock cultures of Photorhabdus luminescens were maintained on petri dishes containing 2% Proteose Peptone No. 3 (i.e., PP3, Difco Laboratories, Detroit MI) in 1.5% agar, incubated at 25°C and transferred weekly. Colonies of the primary form of the bacteria were inoculated into 200 ml of PP3 broth supplemented with 0.5% polyoxyethylene sorbitan mono-stearate (Tween 60, Sigma Chemical Company, St. Louis, MO) in a one liter flask. The broth cultures were grown for 72 hours at 30°C on a rotary shaker. The toxin proteins can be recovered from cultures grown in the presence or absence of Tween; however, the absence of Tween can affect the form of the bacteria grown and the profile of proteins produced by the bacteria. In the absence of Tween, a variant shift occurs insofar as the molecular weight of at least one identified toxin subunit shifts from about 200 kDa to about 185 kDa.
The 72 hour cultures were centrifuged at 10,000 x g for 30 minutes to remove cells and debris. The supernatant fraction that contained the insecticidal activity was decanted and brought to 50 mM K2HPO4 by adding an appropriate volume of 1.0 M K2HPO4. The pH was adjusted to 8.6 by adding potassium hydroxide. This supernatant fraction was then mixed with DEAE-Sephacel (Pharmacia LKB Biotechnology) which had been equilibrated with 50 mM K2HPO4. The toxic activity was adsorbed to the DEAE resin. This mixture was then poured into a 2.6 x 40 cm column and washed with 50 mM K2HPO4 at room temperature at a flow rate of 30 ml/hr until the effluent reached a steady baseline UV absorbance at 280 nm. The column was then washed with 150 mM KCl until the effluent again reached a steady 280 nm baseline. Finally the column was washed with 300 mM KCl and fractions were collected. Fractions containing the toxin were pooled and filter sterilized using a 0.2 micron pore membrane filter. The toxin was then concentrated and equilibrated to 100 mM KPO4, pH 6.9, using an ultrafiltration membrane with a molecular weight cutoff of 100 kDa
at 4°C (Centriprep 100, Amicon Division-W.R. Grace and Company). A 3 ml sample of the toxin concentrate was applied to the top of a 2.6 x 95 cm Sephacryl S-400 HR gel filtration column (Pharmacia LKB Biotechnology). The eluent buffer was 100 mM KPO4, pH 6.9, which was run at a flow rate of 17 ml/hr, at 4°C. The effluent was monitored at 280 nm.
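The "appropriate volume" of 1.0 M phosphate stock needed to bring the culture supernatant to 50 mM can be worked out from a simple mass balance. The sketch below is illustrative only: the function name and the 190 ml supernatant volume in the usage note are hypothetical, and it assumes the supernatant contributes no phosphate of its own before the stock is added.

```python
def stock_volume_to_add(sample_ml: float, target_mm: float, stock_mm: float) -> float:
    """Volume of stock (ml) to add so the final mixture is at target_mm.

    Mass balance: stock_mm * v = target_mm * (sample_ml + v),
    so v = sample_ml * target_mm / (stock_mm - target_mm).
    Assumes the sample initially contains none of the solute.
    """
    if stock_mm <= target_mm:
        raise ValueError("stock must be more concentrated than the target")
    return sample_ml * target_mm / (stock_mm - target_mm)
```

For a hypothetical 190 ml of supernatant, `stock_volume_to_add(190, 50, 1000)` gives 10 ml of 1.0 M stock, for a final volume of 200 ml at 50 mM.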
Fractions were collected and tested for toxic activity. Toxicity of chromatographic fractions was examined in a biological assay using Manduca sexta larvae. Fractions were either applied directly onto the insect diet (Gypsy moth wheat germ diet, ICN Biochemicals Division - ICN Biomedicals, Inc.) or administered by intrahemocelic injection of a 5 μl sample through the first proleg of 4th or 5th instar larvae using a 30 gauge needle. The weight of each larva within a treatment group was recorded at 24 hour intervals. Toxicity was presumed if the insect ceased feeding and died within several days of consuming treated insect diet or if death occurred within 24 hours after injection of a fraction.
The toxic fractions were pooled and concentrated using the Centriprep-100 and were then analyzed by HPLC using a 7.5 mm x 60 cm TSK-GEL G-4000 SW gel permeation column with 100 mM potassium phosphate, pH 6.9 eluent buffer running at 0.4 ml/min. This analysis revealed the toxin protein to be contained within a single sharp peak that eluted from the column with a retention time of approximately 33.6 minutes. This retention time corresponded to an estimated molecular weight of 1,000 kDa. Peak fractions were collected for further purification while fractions not containing this protein were discarded. The peak eluted from the HPLC absorbed UV light at 218 and 280 nm but did not absorb at 405 nm. Absorbance at 405 nm was shown to be an attribute of xenorhabdin antibiotic compounds.
Electrophoresis of the pooled peak fractions in a non-denaturing agarose gel (Metaphor Agarose, FMC BioProducts) showed that two protein complexes were present in the peak. The peak material, buffered in 50 mM Tris-HCl, pH 7.0, was separated on a 1.5% agarose stacking gel buffered with 100 mM Tris-HCl at pH 7.0 and a 1.9% agarose resolving gel buffered with 200 mM Tris-borate at pH 8.3 under standard buffer conditions (anode buffer 1 M Tris-HCl, pH 8.3; cathode buffer 0.025 M Tris, 0.192 M glycine). The gels were run at 13 mA constant current at 15°C until the phenol red tracking dye reached the end of the gel. Two protein bands were visualized in the agarose gels using Coomassie brilliant blue staining.
The slower migrating band was referred to as "protein band 1" and the faster migrating band was referred to as "protein band 2." The two protein bands were present in approximately equal amounts. The Coomassie stained agarose gels were used as a guide to precisely excise the two protein bands from unstained portions of the gels. The excised pieces containing the protein bands were macerated and a small amount of sterile water was added. As a control, a portion of the gel that contained no protein was also excised and treated in the same manner as the gel pieces containing the protein. Protein was recovered from the gel pieces by electroelution into 100 mM Tris-borate pH 8.3, at 100 volts (constant voltage) for two hours. Alternatively, protein was passively eluted from the gel pieces by adding an equal volume of 50 mM Tris-HCl, pH 7.0, to the gel pieces, then incubating at 30°C for 16 hours. This allowed the protein to diffuse from the gel into the buffer, which was then collected.
Results of insect toxicity tests using HPLC-purified toxin (33.6 min. peak) and agarose gel purified toxin demonstrated toxicity of the extracts. Injection of 1.5 μg of the HPLC purified protein killed within 24 hours. Both protein bands 1 and 2, recovered from agarose gels by passive elution or electroelution, were lethal upon injection. The protein concentration estimated for these samples was less than 50 ng/larva. A comparison of the weight gain and the mortality between the groups of larvae injected with protein bands 1 or 2 indicated that protein band 1 was more toxic by injection delivery.
When HPLC-purified toxin was applied to larval diet at a concentration of 7.5 μg/larva, it caused a halt in larval weight gain (24 larvae tested). The larvae began to feed, but after consuming only a very small portion of the toxin treated diet they began to show pathological symptoms induced by the toxin and ceased feeding. The insect frass became discolored and most larvae showed signs of diarrhea. Significant insect mortality resulted when several 5 μg toxin doses were applied to the diet over a 7-10 day period.
Agarose-separated protein band 1 significantly inhibited larval weight gain at a dose of 200 ng/larva. Larvae fed similar concentrations of protein band 2 were not inhibited and gained weight at the same rate as the control larvae. Twelve larvae were fed eluted protein and 45 larvae were fed protein-containing agarose pieces. These two sets of data indicate that protein band 1 was orally toxic to Manduca sexta. In this experiment it appeared that protein band 2 was not toxic to Manduca sexta.
Further analysis of protein bands 1 and 2 by SDS-PAGE under denaturing conditions showed that each band was composed of several smaller protein subunits. Proteins were visualized by Coomassie brilliant blue staining followed by silver staining to achieve maximum sensitivity.
The protein subunits in the two bands were very similar. Protein band 1 contained 8 protein subunits of 25.1, 56.2, 60.8, 65.6, 166, 171, 184 and 208 kDa. Protein band 2 had an identical profile except that the 25.1, 60.8, and 65.6 kDa proteins were not present. The 56.2, 60.8, 65.6, and 184 kDa proteins were present in the complex of protein band 1 at approximately equal concentrations and represented 80% or more of the total protein content of that complex.
The native HPLC-purified toxin was further characterized as follows. The toxin was heat labile in that after being heated to 60°C for 15 minutes it lost its ability to kill or to inhibit weight gain when injected or fed to Manduca sexta larvae. Assays were designed to detect lipase, type C phospholipase, nuclease or red blood cell hemolysis activities and were performed with purified toxin. None of these activities were present. Antibiotic zone inhibition assays were also done and the purified toxin failed to inhibit growth of Gram-negative or -positive bacteria, yeast or filamentous fungi, indicating that the toxin is not a xenorhabdin antibiotic. The native HPLC-purified toxin was tested for ability to kill insects other than Manduca sexta. Table 3 lists insects killed by the HPLC-purified Photorhabdus luminescens toxin in this study.
Table 3
Insects Killed by Photorhabdus luminescens Toxin
Common Name         Order          Genus and species       Route of Delivery
Tobacco hornworm    Lepidoptera    Manduca sexta           Oral and injected
Mealworm            Coleoptera     Tenebrio molitor        Oral
Pharaoh ant         Hymenoptera    Monomorium pharaonis    Oral
German cockroach    Dictyoptera    Blattella germanica     Oral and injected
Mosquito            Diptera        Aedes aegypti           Oral
Further Characterization of the High Molecular Weight Toxin Complex
In yet further analysis, the toxin protein complex from W-14 growth medium was subjected to further characterization. The culture conditions and initial purification steps through the S-400 HR column were identical to those described above. After isolation of the high molecular weight toxin complex from the S-400 HR column fractions, the toxic fractions were equilibrated with 10 mM Tris-HCl, pH 8.6, and concentrated in Centriplus 100 (Amicon) concentrators. The protein toxin complex was then applied to a weak anion exchange (WAX) column, Vydac 301VPH575 (Hesperia, CA), at a flow rate of 0.5 ml/min. The proteins were eluted with a linear potassium chloride gradient, 0-250 mM KCl, in 10 mM Tris-HCl, pH 8.6, for 50 min. Eight protein peaks were detected by absorbance at 280 nm. Bioassays using neonate southern corn rootworm (Diabrotica undecimpunctata howardi, SCR) larvae and tobacco hornworm (Manduca sexta, THW) were performed on all fractions eluted from the HPLC column. THW were grown on Gypsy Moth wheat germ diet (ICN) at 25°C with a 16 hr light / 8 hr dark cycle. SCR were grown on Southern Corn Rootworm Larval Insecta-Diet (BioServ) at 25°C with a 16 hr light / 8 hr dark cycle.
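The programmed gradient described above (0-250 mM KCl over 50 min at 0.5 ml/min) can be sketched numerically. The following Python snippet is only an illustration: the function names are hypothetical, and it assumes an ideal linear gradient with no delay volume between the pump and the column outlet, which the text does not state.

```python
# Illustrative sketch of the programmed linear KCl gradient (0-250 mM over 50 min).
# Assumption (not from the source): no delay volume, so the programmed
# concentration equals the concentration at the column at the same time.
START_MM = 0.0        # starting KCl concentration, mM
END_MM = 250.0        # final KCl concentration, mM
DURATION_MIN = 50.0   # gradient length, min


def gradient_conc_mM(t_min):
    """Programmed KCl concentration (mM) at t_min minutes into the gradient."""
    if not 0.0 <= t_min <= DURATION_MIN:
        raise ValueError("time is outside the gradient window")
    return START_MM + (END_MM - START_MM) * t_min / DURATION_MIN


def gradient_volume_ml(flow_ml_per_min=0.5):
    """Total buffer volume delivered during the gradient at the stated flow rate."""
    return flow_ml_per_min * DURATION_MIN


if __name__ == "__main__":
    print("slope (mM/min):", gradient_conc_mM(1.0))
    print("gradient volume (ml):", gradient_volume_ml())
```

Under these assumptions the gradient rises by 5 mM KCl per minute and spans 25 ml of buffer.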
The highest mortality for SCR and THW larvae was observed for peak 6, which eluted with ca. 112 mM to 132 mM KCl. SDS-PAGE analysis of peak 6 showed predominant peptides of 170 kDa, 66 kDa, 63 kDa, 59.5 kDa and 31 kDa. Western blot analysis was performed on the peak 6 protein fraction with a mixture of polyclonal antibodies made against TcaAii-syn, TcaAiii-syn, TcaBii-syn, TcaC-syn, and TcbAii-syn peptides (described in Example 21) and C5F2, a monoclonal antibody against the TcbAiii peptide. Peak 6 contained immuno-reactive bands of 170 kDa, 90 kDa, 66 kDa, 59.5 kDa and 31 kDa. These are very close to the predicted sizes for TcaC (166 kDa), TcaAii + TcaAiii (92 kDa), TcaAiii (66 kDa), TcaBii (60 kDa) and TcaAii (25 kDa), respectively. Peak 6, which was further analyzed by native agarose gel electrophoresis as described herein, migrated as a single band with similar mobility to that of band 1.
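The pairing of immuno-reactive bands with predicted peptide sizes can be checked with a simple nearest-size match. This is a hedged sketch only: the matching rule and the function names are illustrative assumptions, not the method used in the specification, and the sizes are the values quoted in the paragraph above.

```python
# Illustrative nearest-size matching of observed Western blot bands (kDa)
# to the predicted peptide sizes (kDa) quoted in the text.
# The nearest-neighbour rule is an assumption made for this sketch.
PREDICTED_KDA = [166.0, 92.0, 66.0, 60.0, 25.0]
OBSERVED_KDA = [170.0, 90.0, 66.0, 59.5, 31.0]


def nearest_predicted(observed, predicted=PREDICTED_KDA):
    """Return the predicted size closest to an observed band size."""
    return min(predicted, key=lambda p: abs(p - observed))


def match_bands(observed_list=OBSERVED_KDA):
    """List (observed, matched predicted, percent deviation) for each band."""
    rows = []
    for obs in observed_list:
        pred = nearest_predicted(obs)
        rows.append((obs, pred, 100.0 * (obs - pred) / pred))
    return rows


if __name__ == "__main__":
    for obs, pred, dev in match_bands():
        print(f"{obs:6.1f} kDa -> {pred:6.1f} kDa ({dev:+.1f}%)")
```

With these numbers each observed band maps to a distinct predicted size, and the 31 kDa band shows the largest relative deviation from its predicted 25 kDa value.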
The protein concentration of the purified peak 6 toxin protein was determined using the BCA reagents (Pierce). Dilutions of the protein were made in 10 mM Tris, pH 8.6, and applied to the diet bioassays. After 240 hours all neonate larvae on diet bioassays that received ng or greater of the peak 6 protein fraction were dead. The group of larvae that received 90 ng of the same fraction
had 40% mortality. After 240 hrs the survivors that received 90 ng and 20 ng of peak 6 protein fraction were ca. 10% and 70%, respectively, of the control weight.
Example 2
Insecticide Utility
The Photorhabdus luminescens utility and toxicity were further characterized. Photorhabdus luminescens (strain W-14) culture broth was produced as follows. The production medium was 2% Bacto Proteose Peptone® Number 3 (PP3, Difco Laboratories, Detroit, Michigan) in Milli-Q® deionized water. Seed culture flasks consisted of 175 ml medium placed in a 500 ml tribaffled flask with a Delong neck, covered with a Kaput and autoclaved for 20 minutes, T=250°F. Production flasks consisted of 500 ml in a 2.8 liter tribaffled flask with a Delong neck, covered by a Shin-etsu silicone foam closure. These were autoclaved for 45 minutes, T=250°F. The seed culture was incubated at 28°C at 150 rpm in a gyrotory shaking incubator with a 2 inch throw. After 16 hours of growth, 1% of the seed culture was placed in the production flask, which was allowed to grow for 24 hours before harvest. Production of the toxin appears to be during log phase growth. The microbial broth was transferred to a 1 L centrifuge bottle and the cellular biomass was pelleted (30 minutes at 2500 RPM at 4°C, [R.C.F. = about 1600] HG-4L Rotor, RC3 Sorvall centrifuge, DuPont, Wilmington, DE). The primary broth was chilled at 4°C for 8-16 hours and recentrifuged for at least 2 hours (conditions above) to further clarify the broth by removal of a putative mucopolysaccharide which precipitated upon standing. (An alternative processing method combined both steps and involved the use of a 16 hour clarification centrifugation, same conditions as above.) This broth was then stored at 4°C prior to bioassay or filtration.
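The relationship between the stated rotor speed (2500 RPM) and the stated relative centrifugal force (about 1600) follows the standard conversion RCF = 1.118 x 10^-5 x r x RPM^2, with r in centimeters. The sketch below uses that general formula; the effective rotor radius is not given in the text, so the value computed here is only what the two stated numbers would imply, and the function names are hypothetical.

```python
# Standard RPM <-> RCF conversion (r in cm). The rotor radius is NOT stated in
# the source; it is back-calculated here from the quoted RPM and RCF values.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_from_rcf(rcf, rpm):
    """Effective rotor radius (cm) implied by a given RCF and speed."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


if __name__ == "__main__":
    implied_radius = radius_from_rcf(1600.0, 2500.0)
    print(f"implied effective radius: {implied_radius:.1f} cm")
    print(f"check: {rcf_from_rpm(2500.0, implied_radius):.0f} x g")
```

The same helper can be used to reproduce the pelleting conditions on a rotor of different radius by solving for the speed that gives about 1600 x g.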
Photorhabdus culture broth and protein toxin(s) purified from this broth showed activity (mortality and/or growth inhibition, reduced adult emergence) against a number of insects. More specifically, the activity is seen against corn rootworm (larvae and adult), Colorado potato beetle, and turf grubs, which are members of the insect order Coleoptera. Other members of the Coleoptera include wireworms, pollen beetles, flea beetles, seed beetles and weevils. Activity has also been observed against aster leafhopper, which is a member of the order Homoptera. Other members of the Homoptera include planthoppers, pear psylla, apple
sucker, scale insects, whiteflies, and spittle bugs, as well as numerous host specific aphid species. The broth and purified fractions are also active against beet armyworm, cabbage looper, black cutworm, tobacco budworm, European corn borer, corn earworm, and codling moth, which are members of the order Lepidoptera. Other typical members of this order are clothes moth, Indian mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm, and fall armyworm. Activity is also seen against fruitfly and mosquito larvae, which are members of the order Diptera. Other members of the order Diptera are pea midge, carrot fly, cabbage root fly, turnip root fly, onion fly, crane fly, house fly, and various mosquito species. Activity is seen against carpenter ant and Argentine ant, which are members of the order that also includes fire ants, odorous house ants, and little black ants.
The broth and purified fractions are useful for reducing populations of insects and were used in a method of inhibiting an insect population. The method may comprise applying to a locus of the insect an effective insect inactivating amount of the active described. Results are reported in Table 4.
Activity against corn rootworm larvae was tested as follows. Photorhabdus culture broth (filter sterilized, cell-free) or purified HPLC fractions were applied directly to the surface (about 1.5 cm²) of 0.25 ml of artificial diet in 30 μl aliquots following dilution in control medium or 10 mM sodium phosphate buffer, pH 7.0, respectively. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR) hatched from sterilized eggs, with second instar SCR grown on artificial diet or with second instar Diabrotica virgifera virgifera (Western corn rootworm, WCR) reared on corn seedlings grown in Metromix®. Second instar larvae were weighed prior to addition to the diet. The plates were sealed, placed in a humidified growth chamber and maintained at 27°C for the appropriate period (4 days for neonate and adult SCR, 2-5 days for WCR larvae, 7-14 days for second instar SCR). Mortality and weight determinations were scored as indicated. Generally, 16 insects per treatment were used in all studies. Control mortalities were as follows: neonate larvae, <5%; adult beetles, 5%. Activity against Colorado potato beetle was tested as follows. Photorhabdus culture broth or control medium was applied to the surface (about 2.0 cm²) of 1.5 ml of standard artificial diet held in the wells of a 24-well tissue culture plate. Each well received
50 μl of treatment and was allowed to air dry. Individual second instar Colorado potato beetle (Leptinotarsa decemlineata, CPB) larvae were then placed onto the diet and mortality was scored after 4 days. Ten larvae per treatment were used in all studies. Control mortality was 3.3%.
Activity against Japanese beetle grubs and beetles was tested as follows. Turf grubs (Popillia japonica, 2nd-3rd instar) were collected from infested lawns and maintained in the laboratory in soil/peat mixture with carrot slices added as additional diet. Turf beetles were pheromone-trapped locally and maintained in the laboratory in plastic containers with maple leaves as food. Following application of undiluted Photorhabdus culture broth or control medium to corn rootworm artificial diet (30 μl/1.54 cm², beetles) or carrot slices (larvae), both stages were placed singly in a diet well and observed for any mortality and feeding. In both cases there was a clear reduction in the amount of feeding (and feces production) observed.
Activity against mosquito larvae was tested as follows. The assay was conducted in a 96-well microtiter plate. Each well contained 200 μl of aqueous solution (Photorhabdus culture broth, control medium or H2O) and approximately 20 1-day old larvae (Aedes aegypti). There were 6 wells per treatment. The results were read at 2 hours after infestation and did not change over the three day observation period. No control mortality was seen.
Activity against fruitflies was tested as follows. Purchased Drosophila melanogaster medium was prepared using 50% dry medium and 50% liquid, the liquid being either water, control medium or Photorhabdus culture broth. This was accomplished by placing 8.0 ml of dry medium in each of 3 rearing vials per treatment and adding 8.0 ml of the appropriate liquid. Ten late instar Drosophila melanogaster maggots were then added to each vial. The vials were held on a laboratory bench, at room temperature, under fluorescent ceiling lights. Pupal or adult counts were made after 3, 7 and 10 days of exposure. Incorporation of Photorhabdus culture broth into the diet media for fruitfly maggots caused a slight (17%) but significant reduction in day-10 adult emergence as compared to water and control medium (3% reduction).
Activity against aster leafhopper was tested as follows. The ingestion assay for aster leafhopper (Macrosteles severini) is designed to allow ingestion of the active without other external contact. The reservoir for the active/"food" solution is made by making 2 holes in the center of the bottom portion of a 35 x 10 mm Petri dish. A 2 inch Parafilm® square is placed across the top of
the dish and secured with an "O" ring. A 1 oz. plastic cup is then infested with approximately 7 leafhoppers and the reservoir is placed on top of the cup, Parafilm down. The test solution is then added to the reservoir through the holes. In tests using undiluted Photorhabdus culture broth, the broth and control medium were dialyzed against water to reduce control mortality. Mortality is reported at day 2, where 26.5% control mortality was seen. In the tests using purified fractions (200 mg protein/ml) a final concentration of 5% sucrose was used in all treatments to improve survivability of the aster leafhoppers. The assay was held in an incubator at 28°C, 70% RH with a 16/8 photoperiod. The assay was graded for mortality at 72 hours. Control mortality was 5.5%.
Activity against Argentine ants was tested as follows. A 1.5 ml aliquot of 100% Photorhabdus culture broth, control medium or water was pipetted into 2.0 ml clear glass vials. The vials were plugged with a piece of cotton dental wick that was moistened with the appropriate treatment. Each vial was placed into a separate 60 x 16 mm Petri dish with 8 to 12 adult Argentine ants (Linepithema humile). There were three replicates per treatment. Bioassay plates were held on a laboratory bench, at room temperature under fluorescent ceiling lights. Mortality readings were made after 5 days of exposure. Control mortality was 24%.
Activity against carpenter ant was tested as follows. Black carpenter ant workers (Camponotus pennsylvanicus) were collected from trees on DowElanco property in Indianapolis, IN. Tests with Photorhabdus culture broth were performed as follows. Each plastic bioassay container (7 1/8" x 3") held fifteen workers, a paper harborage and 10 ml of broth or control media in a plastic shot glass. A cotton wick delivered the treatment to the ants through a hole in the shot glass lid. All treatments contained 5% sucrose. Bioassays were held in the dark at room temperature and graded at 19 days. Control mortality was 9%. Assays delivering purified fractions utilized artificial ant diet mixed with the treatment (purified fraction or control solution) at a rate of 0.2 ml treatment/2.0 g diet in a plastic test tube. The final protein concentration of the purified fraction was less than 10 μg/g diet. Ten ants per treatment, a water source, harborage and the treated diet were placed in sealed plastic containers and maintained in the dark at 27°C in a humidified incubator. Mortality was scored at day 10. No control mortality was seen.
Activity against various lepidopteran larvae was tested as follows. Photorhabdus culture broth or purified fractions were
applied directly to the surface (about 1.5 cm²) of 0.25 ml of standard artificial diet in 30 μl aliquots following dilution in control medium or 10 mM sodium phosphate buffer, pH 7.0, respectively. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with a single neonate larva. European corn borer (Ostrinia nubilalis) and corn earworm (Helicoverpa zea) eggs were supplied from commercial sources and hatched in-house, whereas beet armyworm (Spodoptera exigua), cabbage looper (Trichoplusia ni), tobacco budworm (Heliothis virescens), codling moth (Laspeyresia pomonella) and black cutworm (Agrotis ipsilon) larvae were supplied internally. Following infestation with larvae, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period. Mortality and weight determinations were scored at days 5-7 for Photorhabdus culture broth and days 4-7 for the purified fraction. Generally, 16 insects per treatment were used in all studies. Control mortality ranged from 4-12.5% for control medium and was less than 10% for phosphate buffer.
Table 4
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth and Purified Toxin Fraction on Mortality and Growth Inhibition of Different Insect Orders/Species
Mort. = mortality, growth inhibition, na = not applicable, nt = not tested, a.f. = anti-feedant
Example 3
Insecticide Utility upon Soil Application
Photorhabdus luminescens (strain W-14) culture broth was shown to be active against corn rootworm when applied directly to soil or a soil-mix (Metromix®). Activity against neonate SCR and WCR in Metromix® was tested as follows (Table 5). The test was run using corn seedlings (United Agriseeds brand CL614) that were germinated in the light on moist filter paper for 6 days. After the roots were approximately 3-6 cm long, a single kernel/seedling was planted in a 591 ml clear plastic cup with 50 gm of dry Metromix®. Twenty neonate SCR or WCR were then placed directly on the roots of the seedling and covered with Metromix®. Upon infestation, the seedlings were then drenched with 50 ml total volume of a diluted broth solution. After drenching, the cups were sealed and left at room temperature in the light for 7 days. Afterwards, the seedlings were washed to remove all Metromix® and the roots were excised and weighed. Activity was rated as the percentage of corn root remaining relative to the control plants and as leaf damage induced by feeding. Leaf damage was scored visually and rated as either -, +, ++, or +++, with - representing no damage and +++ representing severe damage.
Activity against neonate SCR in soil was tested as follows (Table 6). The test was run using corn seedlings (United Agriseeds brand CL614) that were germinated in the light on moist filter paper for 6 days. After the roots were approximately 3-6 cm long, a single kernel/seedling was planted in a 591 ml clear plastic cup with 150 gm of soil from a field in Lebanon, IN planted the previous year with corn. This soil had not been previously treated with insecticides. Twenty neonate SCR were then placed directly on the roots of the seedling and covered with soil. After infestation, the seedlings were drenched with 50 ml total volume of a diluted broth solution. After drenching, the unsealed cups were incubated in a high relative humidity chamber (80%) at 78°F. Afterwards, the seedlings were washed to remove all soil and the roots were excised and weighed. Activity was rated as the percentage of corn root remaining relative to the control plants and as leaf damage induced by feeding. Leaf damage was scored visually and rated as either -, +, ++, or +++, with - representing no damage and +++ representing severe damage.
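The specification mixes Fahrenheit values (78°F for the humidity chamber, 250°F for autoclaving in Example 2) with Celsius values elsewhere. A minimal conversion helper, given purely as a reader's aid with a hypothetical function name, is:

```python
# Convert the Fahrenheit temperatures quoted in the text to Celsius.
def fahrenheit_to_celsius(deg_f):
    """Standard conversion: C = (F - 32) * 5 / 9."""
    return (deg_f - 32.0) * 5.0 / 9.0


if __name__ == "__main__":
    for label, deg_f in (("humidity chamber", 78.0), ("autoclave", 250.0)):
        print(f"{label}: {deg_f:.0f} F = {fahrenheit_to_celsius(deg_f):.1f} C")
```

On this conversion the soil assay chamber ran at roughly the same temperature as the 25-28°C incubations used in the diet assays.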
Table 5
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on Rootworm Larvae after Post-Infestation Drenching (Metromix®)
Treatment            Larvae   Leaf Damage   Root Weight (g)    %
Broth (1.56% v/v)    +        -             0.4830 ± 0.031     104
Western Corn Rootworm
Water                -        -
Broth (2.0% v/v)     -        -
Water                +
Broth (2.0% v/v)     +
Table 6
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on Southern Corn Rootworm Larvae after Post-Infestation Drenching (Soil)
Treatment   Larvae   Leaf Damage   Root Weight (g)    %
                                   0.2148 ± 0.014     100
                                   0.2260 ± 0.016     103
                     +++           0.0916 ± 0.009     43
                                   0.2428 ± 0.032     113
Activity of Photorhabdus luminescens (strain W-14) culture broth against second instar turf grubs in Metromix® was observed in tests conducted as follows (Table 7). Approximately 50 gm of dry Metromix® was added to a 591 ml clear plastic cup. The Metromix® was then drenched with 50 ml total volume of a 50% (v/v) diluted Photorhabdus broth solution. The dilution of crude broth was made with water, with 50% broth being prepared by adding 25 ml of crude broth to 25 ml of water for 50 ml total volume. A 1% (w/v) solution of proteose peptone #3 (PP3), which is a 50% dilution of the normal media concentration, was used as a broth control. After drenching, five second instar turf grubs were placed on the top of the moistened Metromix®. Healthy turf grub larvae burrowed rapidly into the Metromix®. Those larvae that did not burrow within 1 h were
removed and replaced with fresh larvae. The cups were sealed and placed in a 28°C incubator, in the dark. After seven days, larvae were removed from the Metromix® and scored for mortality. Activity was rated as the percentage of mortality relative to control.
Table 7
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on Turf Grub after Pre-Infestation Drenching (Metromix®)
Treatment                    Mortality*   Mortality %
Water                        7/15         47
Control medium (1.0% w/v)    12/19        63
Broth (50% v/v)              17/20        85
*expressed as a ratio of dead/living larvae
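The percentage column of Table 7 can be reproduced from the ratio column. The sketch below is an illustration only; it assumes that the second number in each ratio is the total number of larvae scored, because that reading is the one under which the tabulated percentages are recovered. The dictionary and function names are hypothetical.

```python
# Recompute the "Mortality %" column of Table 7 from the ratio column.
# Assumption: each ratio is (dead larvae, total larvae scored).
TABLE_7_COUNTS = {
    "Water": (7, 15),
    "Control medium (1.0% w/v)": (12, 19),
    "Broth (50% v/v)": (17, 20),
}


def percent_mortality(dead, total):
    """Percent mortality rounded to the nearest whole percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * dead / total)


if __name__ == "__main__":
    for treatment, (dead, total) in TABLE_7_COUNTS.items():
        print(f"{treatment}: {percent_mortality(dead, total)}%")
```

Under that assumption the three rows give 47, 63 and 85 percent, in agreement with the table.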
Example 4
Insecticide Utility upon Leaf Application
Activity of Photorhabdus broth against European corn borer was seen when the broth was applied directly to the surface of maize leaves (Table 8). In these assays Photorhabdus broth was diluted 100-fold with culture medium and applied manually to the surface of excised maize leaves at a rate of about 6.0 μl/cm² of leaf surface. The leaves were air dried and cut into equal sized strips approximately 2 x 2 inches. The leaves were rolled, secured with paper clips and placed in 1 oz plastic shot glasses with 0.25 inch of 2% agar on the bottom surface to provide moisture. Twelve neonate European corn borers were then placed onto the rolled leaf and the cup was sealed. After incubation for 5 days at 27°C in the dark, the samples were scored for feeding damage and recovered larvae.
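The application rate above can be turned into approximate per-strip quantities. The following sketch is a rough illustration under stated assumptions: it treats each strip as a flat 2 x 2 inch square treated on one face only, which the text does not specify, and the function names are hypothetical.

```python
# Rough per-strip arithmetic for the leaf application described above.
# Assumptions (not from the source): flat 2 x 2 inch strip, one treated face.
CM_PER_INCH = 2.54


def dilution_to_percent_vv(fold):
    """Percent (v/v) of broth after an n-fold dilution."""
    return 100.0 / fold


def strip_area_cm2(width_in=2.0, height_in=2.0):
    """Area of one leaf strip in square centimeters."""
    return (width_in * CM_PER_INCH) * (height_in * CM_PER_INCH)


def applied_volume_ul(area_cm2, rate_ul_per_cm2=6.0):
    """Volume of diluted broth applied to a given leaf area."""
    return area_cm2 * rate_ul_per_cm2


if __name__ == "__main__":
    area = strip_area_cm2()
    print(f"broth concentration: {dilution_to_percent_vv(100):.1f}% v/v")
    print(f"strip area: {area:.1f} cm2")
    print(f"diluted broth per strip: {applied_volume_ul(area):.0f} ul")
```

The 100-fold dilution corresponds to the 1.0% (v/v) broth treatment listed in Table 8.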
Table 8
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on European Corn Borer Larvae Following Pre-Infestation Application to Excised Maize Leaves
Treatment           Leaf Damage   Larvae Recovered   Weight (mg)
Water               Extensive     55/120             0.42 mg
Control Medium      Extensive     40/120             0.50 mg
Broth (1.0% v/v)    Trace         3/120              0.15 mg
Activity of the culture broth against neonate tobacco budworm (Heliothis virescens) was demonstrated using a leaf dip methodology. Fresh cotton leaves were excised from the plant and leaf disks were cut with an 18.5 mm cork-borer. The disks were individually immersed in control medium (PP3) or Photorhabdus luminescens (strain W-14) culture broth which had been concentrated approximately 10-fold using an Amicon (Beverly, MA) Proflux M12 tangential filtration system with a 10 kDa filter. Excess liquid was removed and a straightened paper clip was placed through the center of the disk. The paper clip was then wedged into a plastic, 1.0 oz shot glass containing approximately 2.0 ml of 1% agar. This served to suspend the leaf disk above the agar. Following drying of the leaf disk, a single neonate tobacco budworm larva was placed on the disk and the cup was capped. The cups were then sealed in a plastic bag and placed in a darkened, 27°C incubator for 5 days.
At this time the remaining larvae and leaf material were weighed to establish a measure of leaf damage (Table 9) .
Table 9
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on Tobacco Budworm Neonates in a Cotton-Leaf Dip Assay
                      Final Weights (mg)
Treatment             Leaf Disk      Larvae
Control leaves        55.7 ± 1.3     na*
Control Medium        34.0 ± 2.9     4.3 ± 0.91
Photorhabdus broth    54.3 ± 1.4     0.0**
* - not applicable, ** - no live larvae found
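The leaf-disk weights in Table 9 can be read as a measure of leaf tissue consumed. The sketch below is only an illustration: it assumes that the "Control leaves" row (no larvae) is a fair baseline for the starting disk weight, which the text implies but does not state, and it ignores the reported standard errors. Names are hypothetical.

```python
# Estimate leaf tissue consumed from the mean final disk weights in Table 9.
# Assumption: the uninfested "Control leaves" mean is the baseline weight.
CONTROL_LEAF_MG = 55.7
FINAL_LEAF_MG = {
    "Control Medium": 34.0,
    "Photorhabdus broth": 54.3,
}


def consumed_mg(final_mg, baseline_mg=CONTROL_LEAF_MG):
    """Leaf mass missing relative to the baseline disk weight."""
    return baseline_mg - final_mg


def percent_consumed(final_mg, baseline_mg=CONTROL_LEAF_MG):
    """Leaf mass missing as a percentage of the baseline disk weight."""
    return 100.0 * consumed_mg(final_mg, baseline_mg) / baseline_mg


if __name__ == "__main__":
    for treatment, final in FINAL_LEAF_MG.items():
        print(f"{treatment}: {consumed_mg(final):.1f} mg "
              f"({percent_consumed(final):.1f}% of disk)")
```

On these means, disks dipped in control medium lost roughly 22 mg to feeding, while broth-dipped disks lost about 1.4 mg, which is within the reported standard errors of the uninfested control.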
Example 5. Part A
Characterization of Toxin Peptide Components
In a subsequent analysis, the toxin protein subunits of the bands isolated as in Example 1 were resolved on a 7% SDS
polyacrylamide electrophoresis gel with a ratio of 30:0.8 (acrylamide:BIS-acrylamide). This gel matrix facilitates better resolution of the larger proteins. The gel system used to estimate the Band 1 and Band 2 subunit molecular weights in Example 1 was an 18% gel with a ratio of 38:0.18 (acrylamide:BIS-acrylamide), which allowed for a broader range of size separation, but less resolution of higher molecular weight components.
In this analysis, 10, rather than 8, protein bands were resolved. Table 10 reports the calculated molecular weights of the 10 resolved bands, and directly compares the molecular weights estimated under these conditions to those of the prior example. It is not surprising that additional bands were detected under the different separation conditions used in this example. Variations between the prior and new estimates of molecular weight are also to be expected given the differences in analytical conditions. In the analysis of this example, it is thought that the higher molecular weight estimates are more accurate than in Example 1, as a result of improved resolution. However, these are estimates based on SDS PAGE analysis, which are typically not analytically precise and result in estimates of peptides which may have been further altered due to post- and co-translational modifications.
Amino acid sequences were determined for the N-terminal portions of five of the 10 resolved peptides. Table 10 correlates the molecular weight of the proteins and the identified sequences. In SEQ ID NO: 2, certain analyses suggest that the proline at residue 5 may be an asparagine (asn). In SEQ ID NO: 3, certain analyses suggest that the amino acid residues at positions 13 and 14 are both arginine (arg). In SEQ ID NO: 4, certain analyses suggest that the amino acid residue at position 6 may be either alanine (ala) or serine (ser). In SEQ ID NO: 5, certain analyses suggest that the amino acid residue at position 3 may be aspartic acid (asp).
Table 10
ESTIMATE    NEW ESTIMATE*    SEQ. LISTING
208         200.2 kDa        SEQ ID NO: 1
184         175.0 kDa        SEQ ID NO: 2
65.6        68.1 kDa         SEQ ID NO: 3
60.8        65.1 kDa         SEQ ID NO: 4
56.2        58.3 kDa         SEQ ID NO: 5
25.1        23.2 kDa         SEQ ID NO: 15
*New estimates are based on SDS PAGE and are not based on gene sequences. SDS PAGE is not analytically precise.
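The size of the shift between the earlier and the new molecular weight estimates in Table 10 can be quantified as a percent change. This is a minimal sketch using only the six rows that carry a sequence listing in the table; the function names are hypothetical.

```python
# Percent change between the earlier and new SDS PAGE estimates in Table 10.
# Each pair is (earlier estimate, new estimate), both in kDa.
ESTIMATES_KDA = [
    (208.0, 200.2),
    (184.0, 175.0),
    (65.6, 68.1),
    (60.8, 65.1),
    (56.2, 58.3),
    (25.1, 23.2),
]


def percent_change(old, new):
    """Signed percent change of the new estimate relative to the old one."""
    return 100.0 * (new - old) / old


def largest_shift(pairs=ESTIMATES_KDA):
    """Return the (old, new) pair with the largest absolute percent change."""
    return max(pairs, key=lambda pair: abs(percent_change(*pair)))


if __name__ == "__main__":
    for old, new in ESTIMATES_KDA:
        print(f"{old:6.1f} -> {new:6.1f} kDa ({percent_change(old, new):+.1f}%)")
```

All six estimates move by less than 8 percent in either direction, which is in keeping with the statement that SDS PAGE sizing is not analytically precise.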
Example 5. Part B
Characterization of Toxin Peptide Components
New N-terminal sequence, SEQ ID NO: 15, Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr, was obtained by further N-terminal sequencing of peptides isolated from native HPLC-purified toxin as described in Example 5, Part A, above. This peptide comes from the tcaA gene. The peptide labeled TcaAii starts at position 254 and goes to position 491, where the TcaAiii peptide starts, SEQ ID NO: 4. The estimated size of the peptide based on the gene sequence
Example 6
Characterization of Toxin Peptide Components
In yet another analysis, the toxin protein complex was re-isolated from the Photorhabdus luminescens growth medium (after culture without Tween) by performing a 10%-80% ammonium sulfate precipitation followed by an ion exchange chromatography step (Mono Q) and two molecular sizing chromatography steps. These conditions were like those used in Example 1. During the first molecular sizing step, a second biologically active peak was found at about 100 ± 10 kDa. Based upon protein measurements, this fraction was 20-50 fold less active than the larger, or primary, active peak of about 860 ± 100 kDa (native). During this isolation experiment a smaller active peak of about 325 ± 50 kDa that retained a considerable portion of the starting biological activity was also resolved. It is thought that the 325 kDa peak is related to or derived from the 860 kDa peak.
A 56 kDa protein was resolved in this analysis. The N-terminal sequence of this protein is presented in SEQ ID NO: 6. It is noteworthy that this protein shares significant identity and conservation with SEQ ID NO: 5 at the N-terminus, suggesting that the two may be encoded by separate members of a gene family and that the proteins produced by each gene are sufficiently similar to both be operable in the insecticidal toxin complex.
A second, prominent 185 kDa protein was consistently present in amounts comparable to that of protein 3 from Table 10, and may be the same protein or protein fragment. The N-terminal sequence of this 185 kDa protein is shown at SEQ ID NO: 7. Additional N-terminal amino acid sequence data were also obtained from isolated proteins. None of the determined N-terminal sequences appear identical to a protein identified in Table 10. Other proteins were present in the isolated preparation. One such protein has an estimated molecular weight of 108 kDa and an N-terminal sequence as shown in SEQ ID NO: 8. A second such protein has an estimated molecular weight of 80 kDa and an N-terminal sequence as shown in SEQ ID NO: 9.
When the protein material in the approximately 325 kDa active peak was analyzed by size, bands of approximately 51, 31, 28, and 22 kDa were observed. As in all cases in which a molecular weight was determined by analysis of electrophoretic mobility, these molecular weights were subject to error effects introduced by buffer ionic strength differences, electrophoresis power differences, and the like. One of ordinary skill would understand that definitive molecular weight values cannot be determined using these standard methods and that each was subject to variation. It was hypothesized that proteins of these sizes are degradation products of the larger protein species (of approximately 200 kDa size) that were observed in the larger primary toxin complex.
Finally, several preparations included a protein having the N-terminal sequence shown in SEQ ID NO: 10. This sequence was strongly homologous to known chaperonin proteins, accessory proteins known to function in the assembly of large protein complexes. Although the applicants could not ascribe such an assembly function to the protein identified in SEQ ID NO: 10, it was consistent with the existence of the described toxin protein complex that such a chaperonin protein could be involved in its assembly. Moreover, although such proteins have not directly been suggested to have toxic activity, this protein may be important to determining the overall structural nature of the protein toxin, and thus, may contribute to the toxic activity or durability of the complex in vivo after oral delivery.
Subsequent analysis of the stability of the protein toxin complex to proteinase K was undertaken. It was determined that after 24 hour incubation of the complex in the presence of a 10-fold molar excess of proteinase K, activity was virtually eliminated (mortality on oral application dropped to about 5%). These data confirm the proteinaceous nature of the toxin.
The toxic activity was also retained by a dialysis membrane, again confirming the large size of the native toxin complex.
Example 7
Isolation, Characterization and Partial Amino Acid Sequencing of Photorhabdus Toxins
Isolation and N-Terminal Amino Acid Sequencing: In a set of experiments conducted in parallel to Examples 5 and 6, ammonium sulfate precipitation of Photorhabdus proteins was performed by adjusting Photorhabdus broth, typically 2-3 liters, to a final concentration of either 10% or 20% by the slow addition of ammonium sulfate crystals. After stirring for 1 hour at 4°C, the material was centrifuged at 12,000 x g for 30 minutes. The supernatant was adjusted to 80% ammonium sulfate, stirred at 4°C for 1 hour, and centrifuged at 12,000 x g for 60 minutes. The pellet was resuspended in one-tenth the volume of 10 mM Na2·PO4, pH 7.0, and dialyzed against the same phosphate buffer overnight at 4°C. The dialyzed material was centrifuged at 12,000 x g for 1 hour prior to ion exchange chromatography.
A HR 16/50 Q Sepharose (Pharmacia) anion exchange column was equilibrated with 10 mM Na2PO4, pH 7.0. Centrifuged, dialyzed ammonium sulfate pellet was applied to the Q Sepharose column at a rate of 1.5 ml/min and washed extensively at 3.0 ml/min with equilibration buffer until the optical density (O.D. 280) reached less than 0.100. Next, either a 60 minute NaCl gradient ranging from 0 to 0.5 M at 3 ml/min, or a series of step elutions using 0.1 M, 0.4 M and finally 1.0 M NaCl for 60 minutes each was applied to the column. Fractions were pooled and concentrated using a Centriprep 100. Alternatively, proteins could be eluted by a single 0.4 M NaCl wash without prior elution with 0.1 M NaCl.
Two milliliter aliquots of concentrated Q Sepharose samples were loaded at 0.5 ml/min onto a HR 16/50 Superose 12 (Pharmacia) gel filtration column equilibrated with 10 mM Na2PO4, pH 7.0. The column was washed with the same buffer for 240 min at 0.5 ml/min and 2 min samples were collected. The void volume material was collected and concentrated using a Centriprep 100. Two milliliter aliquots of concentrated Superose 12 samples were loaded at 0.5 ml/min onto a HR 16/50 Sepharose CL-4B (Pharmacia) gel filtration column equilibrated with 10 mM Na2PO4, pH 7.0. The column was washed with the same buffer for 240 min at 0.5 ml/min and 2 min samples were collected.
The excluded protein peak was subjected to a second fractionation by application to a gel filtration column that used a Sepharose CL-4B resin, which separates proteins ranging from about 30 kDa to 1000 kDa. This fraction was resolved into two peaks: a minor peak at the void volume (>1000 kDa) and a major peak which eluted at an apparent molecular weight of about 860 kDa. Over a one week period subsequent samples subjected to gel filtration showed the gradual appearance of a third peak (approximately 325 kDa) that seemed to arise from the major peak, perhaps by limited proteolysis. Bioassays performed on the three peaks showed that the void peak had no activity, while the 860 kDa toxin complex fraction was highly active, and the 325 kDa peak was less active, although quite potent. SDS PAGE analysis of Sepharose CL-4B toxin complex peaks from different fermentation productions revealed two distinct peptide patterns, denoted "P" and "S". The two patterns had marked differences in the molecular weights and concentrations of peptide components in their fractions. The "S" pattern, produced most frequently, had 4 high molecular weight peptides (>150 kDa) while the "P" pattern had 3 high molecular weight peptides. In addition, the "S" peptide fraction was found to have 2-3 fold more activity against European Corn Borer. This shift may be related to variations in protein expression due to age of inoculum and/or other factors based on growth parameters of aged cultures.
Milligram quantities of peak toxin complex fractions determined to be "P" or "S" peptide patterns were subjected to preparative SDS PAGE, and transblotted with TRIS-glycine (Seprabuff™) to PVDF membranes (ProBlott™, Applied Biosystems) for 3-4 hours. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. Three peptides in the "S" pattern had unique N-terminal amino acid sequences compared to the sequences identified in the previous example. A 201 kDa (TcdAii) peptide set forth as SEQ ID NO:13 below shared 33% amino acid identity and 50% similarity (similarity and identity were calculated by hand) with SEQ ID NO:1 (TcbAii) (in Table 10 vertical lines denote amino acid
identities and colons indicate conservative amino acid substitutions). A second peptide of 197 kDa, SEQ ID NO:14 (TcdB), had 42% identity and 58% similarity with SEQ ID NO:2 (TcaC) (similarity and identity were calculated by hand). Yet a third peptide of 205 kDa was denoted TcdAii. In addition, a limited N-terminal amino acid sequence, SEQ ID NO:16 (TcbA), of a peptide of at least 235 kDa was identical with the amino acid sequence, SEQ ID NO:12, deduced from a cloned gene (tcbA), SEQ ID NO:11, containing a deduced amino acid sequence corresponding to SEQ ID NO:1 (TcbAii). This indicates that the larger 235+ kDa peptide was proteolytically processed to the 201 kDa peptide (TcbAii) (SEQ ID NO:1) during fermentation, possibly resulting in activation of the molecule. In yet another sequence, the sequence originally reported as SEQ ID NO:5 (TcaBii) in Example 5 above was found to contain an aspartic acid residue (Asp) at the third position rather than glycine (Gly) and two additional amino acids, Gly and Asp, at the eighth and ninth positions, respectively. In yet two other sequences, SEQ ID NO:2 (TcaC) and SEQ ID NO:3 (TcaBi), additional amino acid sequence was obtained. Densitometric quantitation was performed using a sample that was identical to the "S" preparation sent for N-terminal analysis. This analysis showed that the 201 kDa and 197 kDa peptides represent 7.0% and 7.2%, respectively, of the total Coomassie brilliant blue stained protein in the "S" pattern and are present in amounts similar to the other abundant peptides. It was speculated that these peptides may represent protein homologs, analogous to the situation found with other bacterial toxins, such as various CryI Bt toxins. These proteins vary from 40-90% similarity at their N-terminal amino acid sequence, which encompasses the toxic fragment.
Internal Amino Acid Sequencing
To facilitate cloning of toxin peptide genes, internal amino acid sequences of selected peptides were obtained as follows. Milligram quantities of peak 2A fractions determined to be "P" or "S" peptide patterns were subjected to preparative SDS PAGE, and transblotted with TRIS-glycine (Seprabuff™) to PVDF membranes (ProBlott™, Applied Biosystems) for 3-4 hours. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. Three peptides, referred to as TcbAii (containing SEQ ID NO:1), TcdAii, and TcaBi (containing SEQ ID NO:3) were subjected to trypsin digestion by
Harvard MicroChem followed by HPLC chromatography to separate individual peptides. N-terminal amino acid analysis was performed on selected tryptic peptide fragments. Two internal peptides were sequenced for the peptide TcdAii (205 kDa peptide), referred to as TcdAii-PT111 (SEQ ID NO:17) and TcdAii-PT79 (SEQ ID NO:18). Two internal peptides were sequenced for the peptide TcaBi (68 kDa peptide), referred to as TcaBi-PT158 (SEQ ID NO:19) and TcaBi-PT108 (SEQ ID NO:20). Four internal peptides were sequenced for the peptide TcbAii (201 kDa peptide), referred to as TcbAii-PT103 (SEQ ID NO:21), TcbAii-PT56 (SEQ ID NO:22), TcbAii-PT81(a) (SEQ ID NO:23), and TcbAii-PT81(b) (SEQ ID NO:24).
Table 11 N-Terminal Amino Acid Sequences (similarity and identity were calculated by hand)

201 kDa (33% identity & 50% similarity to SEQ ID NO:1)
L I G Y N N Q F S G * A        SEQ ID NO:13
| | | : |
F I Q G Y S D L F G N - A      SEQ ID NO:1

197 kDa (42% identity & 58% similarity to SEQ ID NO:2)
M Q N S Q T F S V G E L        SEQ ID NO:14
M Q D S P E V S I T T L        SEQ ID NO:2
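The hand-calculated identity figure for the 197 kDa peptide can be checked with a short script. This is a minimal sketch and not part of the original analysis: the function name `percent_identity` is hypothetical, the two 12-residue sequences from Table 11 are treated as ungapped and aligned position by position, and residue 8 of SEQ ID NO:14 is assumed to be Ser.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions carrying the same residue (no gaps)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# N-terminal sequences from Table 11 (one-letter code).
# Assumption: position 8 of SEQ ID NO:14 is read as S.
SEQ_ID_14 = "MQNSQTFSVGEL"  # 197 kDa peptide (TcdB)
SEQ_ID_2 = "MQDSPEVSITTL"   # TcaC

print(round(percent_identity(SEQ_ID_14, SEQ_ID_2)))
```

Five of the twelve positions match, which rounds to the 42% identity reported for this pair. Similarity is not computed here because it depends on which substitutions are counted as conservative, and the source does not list them.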
Construction of a Cosmid Library of Photorhabdus luminescens W-14 Genomic DNA and its Screening to Isolate Genes Encoding Peptides Comprising the Toxic Protein Preparation
As a prerequisite for the production of Photorhabdus insect toxic proteins in heterologous hosts, and for other uses, it is necessary to isolate and characterize the genes that encode those peptides. This objective was pursued in parallel by two approaches. One approach, described later, was based on the use of monoclonal and polyclonal antibodies raised against the purified toxin, which were then used to isolate clones from an expression library. The other approach, described in this example, is based on the use of the N-terminal and internal amino acid sequence data to design degenerate oligonucleotides for use in PCR amplification. Either method can be used to identify DNA clones that contain the peptide-encoding genes so as to permit the isolation of the respective genes, and the determination of their DNA base sequence.
Genomic DNA Isolation
Photorhabdus luminescens strain W-14 (ATCC accession number 55397) was grown on 2% proteose peptone #3 agar (Difco Laboratories, Detroit, MI) and insecticidal toxin competence was maintained by repeated bioassay after passage, using the method described in Example 1 above. A 50 ml shake culture was produced in a 175 ml baffled flask in 2% proteose peptone #3 medium, grown at 28°C and 150 rpm for approximately 24 hours. 15 ml of this culture was pelleted and frozen in its medium at -20°C until it was thawed for DNA isolation. The thawed culture was centrifuged (700 x g, 30 min) and the floating orange mucopolysaccharide material was removed. The remaining cell material was centrifuged (25,000 x g, 15 min) to pellet the bacterial cells, and the medium was removed and discarded. Genomic DNA was isolated by an adaptation of the CTAB method described in section 2.4.1 of Current Protocols in Molecular Biology (Ausubel et al. eds, John Wiley & Sons, 1994) [modified to include a salt shock and with all volumes increased 10-fold]. The pelleted bacterial cells were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml, then 12 ml of 5 M NaCl was added; this mixture was centrifuged 20 min at 15,000 x g. The pellet was resuspended in 5.7 ml TE, and 300 μl of 10% SDS and 60 μl of 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY; in sterile distilled water) were added to the suspension. This mixture was incubated at 37°C for 1 hr, then approximately 10 mg lysozyme (Worthington Biochemical Corp., Freehold, NJ) was added. After an additional 45 min, 1 ml of 5 M NaCl and 800 μl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were added. This preparation was incubated 10 min at 65°C, then gently agitated and further incubated and agitated for approximately 20 min to assist clearing of the cellular material. An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed gently and centrifuged.
After two extractions with an equal volume of PCI (phenol/chloroform/isoamyl alcohol; 50:49:1, v/v/v; equilibrated with 1 M Tris-HCl, pH 8.0; Intermountain Scientific Corporation, Kaysville, UT), the DNA was precipitated with 0.6 volume of isopropanol. The DNA precipitate was gently removed with a glass rod, washed twice with 70% ethanol, dried, and dissolved in 2 ml STE (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 1 mM EDTA). This preparation contained 2.5 mg/ml DNA, as determined by optical density at 260 nm (i.e., OD260).
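An OD260 reading is usually converted to a DNA concentration through the common approximation that an absorbance of 1.0 at 260 nm (1 cm path length) corresponds to about 50 μg/ml of double-stranded DNA. The sketch below is illustrative only: the function name, the absorbance reading, and the dilution factor are hypothetical values chosen to be consistent with the concentration reported above, since the source does not state the actual readings.

```python
UG_PER_ML_PER_A260 = 50.0  # common approximation for double-stranded DNA, 1 cm path


def dsdna_mg_per_ml(a260: float, dilution_factor: float) -> float:
    """Estimate the dsDNA concentration (mg/ml) of the undiluted stock."""
    ug_per_ml = a260 * UG_PER_ML_PER_A260 * dilution_factor
    return ug_per_ml / 1000.0


# Hypothetical reading: A260 = 0.5 measured on a 1:100 dilution of the stock.
conc = dsdna_mg_per_ml(a260=0.5, dilution_factor=100)
total_mg = conc * 2.0  # the DNA was dissolved in 2 ml STE
print(f"{conc:.1f} mg/ml, {total_mg:.1f} mg total")
```

At the stated concentration and volume, the preparation would hold roughly 5 mg of genomic DNA, far more than the 200 μg used for the partial digest.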
The molecular size range of the isolated genomic DNA was evaluated for suitability for library construction. CHEF gel analysis was performed in 1.5% agarose (SeaKem® LE, FMC BioProducts, Rockland, ME) gels with 0.5X TBE buffer (44.5 mM Tris-HCl pH 8.0, 44.5 mM H3BO3, 1 mM EDTA) on a BioRad CHEF-DR II apparatus with a Pulsewave 760 Switcher (Bio-Rad Laboratories, Inc., Richmond, CA). The running parameters were: initial A time, 3 sec; final A time, 12 sec; 200 volts; running temperature, 4-18°C; run time, 16.5 hr. Ethidium bromide staining and examination of the gel under ultraviolet light indicated the DNA ranged from 30-250 kbp in size.
Construction of Library
A partial Sau3A I digest was made of this Photorhabdus genomic DNA preparation. The method was based on section 3.1.3 of Ausubel (supra). Adaptations included running smaller scale reactions under various conditions until nearly optimal results were achieved. Several scaled-up large reactions with varied conditions were run, the results analyzed on CHEF gels, and only the best large scale preparation was carried forward. In the optimal case, 200 μg of Photorhabdus genomic DNA was incubated with 1.5 units of Sau3A I (New England Biolabs, "NEB", Beverly, MA) for 15 min at 37°C in 2 ml total volume of 1X NEB 4 buffer (supplied as 10X by the manufacturer). The reaction was stopped by adding 2 ml of PCI and centrifuging at 8000 x g for 10 min. To the supernatant were added 200 μl of 5 M NaCl plus 6 ml of ice-cold ethanol. This preparation was chilled for 30 min at -20°C, then centrifuged at 12,000 x g for 15 min. The supernatant was removed and the precipitate was dried in a vacuum oven at 40°C, then resuspended in 400 μl STE. Spectrophotometric assay indicated about 40% recovery of the input DNA. The digested DNA was size fractionated on a sucrose gradient according to section 5.3.2 of CPMB (op. cit.). A 10% to 40% (w/v) linear sucrose gradient was prepared with a gradient maker in Ultra-Clear™ tubes (Beckman Instruments, Inc., Palo Alto, CA) and the DNA sample was layered on top. After centrifugation (26,000 rpm, 17 hr, Beckman SW41 rotor, 20°C), fractions (about 750 μl) were drawn from the top of the gradient and analyzed by CHEF gel electrophoresis (as described earlier). Fractions containing Sau3A I fragments in the size range 20-40 kbp were selected and DNA was precipitated by a modification (amounts of all solutions increased approximately 6.3-fold) of the method in section 5.3.3 of Ausubel (supra). After overnight precipitation, the DNA was collected by centrifugation (17,000 x g, 15 min), dried, redissolved in TE, pooled into a final volume of 80 μl, and reprecipitated with the addition of 8 μl 3 M sodium acetate and 220 μl ethanol. The pellet collected by centrifugation as above was resuspended in 12 μl TE. Concentration of the DNA was determined by Hoechst 33258 dye (Polysciences, Inc., Warrington, PA) fluorometry in a Hoefer TKO100 fluorimeter (Hoefer Scientific Instruments, San Francisco, CA). Approximately 2.5 μg of the size-fractionated DNA was recovered.
Thirty μg of cosmid pWE15 DNA (Stratagene, La Jolla, CA) was digested to completion with 100 units of restriction enzyme BamH I (NEB) in the manufacturer's buffer (final volume of 200 μl, 37°C, 1 hr). The reaction was extracted with 100 μl of PCI and DNA was precipitated from the aqueous phase by addition of 20 μl 3 M sodium acetate and 550 μl -20°C absolute ethanol. After 20 min at -70°C, the DNA was collected by centrifugation (17,000 x g, 15 min), dried under vacuum, and dissolved in 180 μl of 10 mM Tris-HCl, pH 8.0. To this were added 20 μl of 10X CIP buffer (100 mM Tris-HCl, pH 8.3; 10 mM ZnCl2; 10 mM MgCl2), and 1 μl (0.25 units) of 1:4 diluted calf intestinal alkaline phosphatase (Boehringer Mannheim Corporation, Indianapolis, IN). After 30 min at 37°C, the following additions were made: 2 μl 0.5 M EDTA, pH 8.0; 10 μl 10% SDS; 0.5 μl of 20 mg/ml proteinase K (as above), followed by incubation at 55°C for 30 min. Following sequential extractions with 100 μl of PCI and 100 μl phenol (Intermountain Scientific Corporation, equilibrated with 1 M Tris-HCl, pH 8.0), the dephosphorylated DNA was precipitated by addition of 72 μl of 7.5 M ammonium acetate and 550 μl -20°C ethanol, incubation on ice for 30 min, and centrifugation as above. The pelleted DNA was washed once with 500 μl -20°C 70% ethanol, dried under vacuum, and dissolved in 20 μl of TE buffer. Ligation of the size-fractionated Sau3A I fragments to the BamH I-digested and phosphatased pWE15 vector was accomplished using T4 ligase (NEB) by a modification (i.e., use of premixed 10X ligation buffer supplied by the manufacturer) of the protocol in section 3.33 of Ausubel. Ligation was carried out overnight in a total volume of 20 μl at 15°C, followed by storage at -20°C.
Four μl of the cosmid DNA ligation reaction, containing about 1 μg of DNA, was packaged into bacteriophage lambda using a commercial packaging extract (Gigapack® III Gold Packaging Extract, Stratagene), following the manufacturer's directions. The packaged preparation was stored at 4°C until use. The packaged cosmid preparation was used to infect Escherichia coli XL1 Blue MR cells
(Stratagene) according to the Gigapack® III Gold protocols ("Titering the Cosmid Library"), as follows. XL1 Blue MR cells were grown in LB medium (g/L: Bacto-tryptone, 10; Bacto-yeast extract, 5; Bacto-agar, 15; NaCl, 5; [Difco Laboratories, Detroit, MI]) containing 0.2% (w/v) maltose plus 10 mM MgSO4, at 37°C. After 5 hr growth, cells were pelleted at 700 x g (15 min) and resuspended in 6 ml of 10 mM MgSO4. The culture density was adjusted with 10 mM MgSO4 to OD600 = 0.5. The packaged cosmid library was diluted 1:10 or 1:20 with sterile SM medium (0.1 M NaCl, 10 mM MgSO4, 50 mM Tris-HCl pH 7.5, 0.01% w/v gelatin), and 25 μl of the diluted preparation was mixed with 25 μl of the diluted XL1 Blue MR cells. The mixture was incubated at 25°C for 30 min (without shaking), then 200 μl of LB broth was added, and incubation was continued for approximately 1 hr with occasional gentle shaking. Aliquots (20-40 μl) of this culture were spread on LB agar plates containing 100 mg/l ampicillin (i.e., LB-Amp100) and incubated overnight at 37°C. To store the library without amplification, single colonies were picked and inoculated into individual wells of sterile 96-well microwell plates, each well containing 75 μl of Terrific Broth (TB media: 12 g/l Bacto-tryptone, 24 g/l Bacto-yeast extract, 0.4% v/v glycerol, 17 mM KH2PO4, 72 mM K2HPO4) plus 100 mg/l ampicillin (i.e., TB-Amp100), and incubated (without shaking) overnight at 37°C. After replicating the 96-well plate into a copy plate, 75 μl/well of filter-sterilized TB:glycerol (1:1, v/v; with, or without, 100 mg/l ampicillin) was added to the plate, it was shaken briefly at 100 rpm, 37°C, and then closed with Parafilm® (American National Can, Greenwich, CT) and placed in a -70°C freezer for storage. Copy plates were grown and processed identically to the master plates. A total of 40 such master plates (and their copies) were prepared.
Screening of the Library with Radiolabeled DNA Probes
To prepare colony filters for probing with radioactively labeled probes, ten 96-well plates of the library were thawed at 25°C (bench top at room temperature). A replica plating tool with 96 prongs was used to inoculate a fresh 96-well copy plate containing 75 μl/well of TB-Amp100. The copy plate was grown overnight (stationary) at 37°C, then shaken about 30 min at 100 rpm at 37°C. A total of 800 colonies was represented in these copy plates, due to nongrowth of some isolates. The replica tool was used to inoculate duplicate impressions of the 96-well arrays onto Magna NT (MSI, Westboro, MA) nylon membranes (0.45 micron, 220 x
250 mm) which had been placed on solid LB-Amp100 (100 ml/dish) in Bio-assay plastic dishes (Nunc, 243 x 243 x 18 mm; Curtin Matheson Scientific, Inc., Wood Dale, IL). The colonies were grown on the membranes at 37°C for about 3 hr. A positive control colony (a bacterial clone containing a GZ4 sequence insert, see below) was grown on a separate Magna NT membrane (Nunc, 0.45 micron, 82 mm circle) on LB medium supplemented with 35 mg/l chloramphenicol (i.e., LB-Cam35), and processed alongside the library colony membranes. Bacterial colonies on the membranes were lysed, and the DNA was denatured and neutralized according to a protocol taken from the Genius™ System User's Guide version 2.0 (Boehringer Mannheim, Indianapolis, IN). Membranes were placed colony side up on filter paper soaked with 0.5 N NaOH plus 1.5 M NaCl for 15 min to denature, and neutralized on filter paper soaked with 1 M Tris-HCl pH 8.0, 1.5 M NaCl for 15 min. After UV-crosslinking using a Stratagene UV Stratalinker set on auto crosslink, the membranes were stored dry at 25°C until use. Membranes were trimmed into strips containing the duplicate impressions of a single 96-well plate, then washed extensively by the method of section 6.4.1 in CPMB (op. cit.): 3 hr at 25°C in 3X SSC, 0.1% (w/v) SDS, followed by 1 hr at 65°C in the same solution, then rinsed in 2X SSC in preparation for the hybridization step (20X SSC = 3 M NaCl, 0.3 M sodium citrate, pH 7.0).
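Because the wash and rinse solutions are specified as multiples of the 20X SSC stock, their component molarities follow by simple proportion. A minimal sketch, in which the function name is hypothetical and the stock composition is the one given above:

```python
SSC_20X = {"NaCl": 3.0, "sodium citrate": 0.3}  # molar, as defined for the 20X stock


def ssc_molarity(fold: float) -> dict[str, float]:
    """Molar concentration of each component in SSC of the given strength."""
    return {name: molar * fold / 20.0 for name, molar in SSC_20X.items()}


for fold in (3, 2):
    parts = ssc_molarity(fold)
    summary = ", ".join(f"{m * 1000:.0f} mM {name}" for name, m in parts.items())
    print(f"{fold}X SSC: {summary}")
```

The same helper can be applied to any other SSC strength used in a hybridization or wash step.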
Amplification of a Specific Genomic Fragment of a TcaC Gene
Based on the N-terminal amino acid sequence determined for the purified TcaC peptide fraction [disclosed herein as SEQ ID NO:2], a pool of degenerate oligonucleotides (pool S4Psh) was synthesized by standard β-cyanoethyl chemistry on an Applied BioSystem ABI394 DNA/RNA Synthesizer (Perkin Elmer, Foster City, CA). The oligonucleotides were deprotected 8 hours at 55°C, dissolved in water, quantitated by spectrophotometric measurement, and diluted for use. This pool corresponds to the determined N-terminal amino acid sequence of the TcaC peptide. The determined amino acid sequence and the corresponding degenerate DNA sequence are given below, where A, C, G, and T are the standard DNA bases, and I represents inosine:
Amino Acid    Met Gln Asp Ser Pro Glu Val
S4Psh         5' ATG CA(A/G) GA(T/C) (T/A)(C/G)(T/A) CCI GA(A/G) GT 3'
Another set of degenerate oligonucleotides was synthesized (pool P2.3.5R), representing the complement of the coding strand for the determined amino acid sequence of SEQ ID NO:17:
Amino Acid    Ala Phe Asn Ile Asp Asp Val
Codons        5' GCN TT(T/C) AA(T/C) AT(A/T/C) GA(T/C) GA(T/C) GT 3'
P2.3.5R       3' CG(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA 5'
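The size of each degenerate pool follows directly from the notation: every parenthesized position contributes as many alternatives as it lists, and inosine (I) counts as a single base. The following sketch counts the length and the number of distinct sequences in each pool; the helper name `pool_stats` is hypothetical, and the primer strings are transcribed from the tables above.

```python
import re
from math import prod


def pool_stats(primer: str) -> tuple[int, int]:
    """Return (length in bases, number of distinct sequences) for a degenerate pool."""
    compact = primer.replace(" ", "")
    groups = re.findall(r"\(([^)]*)\)", compact)  # one entry per degenerate position
    fixed = re.sub(r"\([^)]*\)", "", compact)     # unambiguous bases, including I
    length = len(fixed) + len(groups)
    degeneracy = prod(len(group.split("/")) for group in groups)
    return length, degeneracy


S4PSH = "ATG CA(A/G) GA(T/C) (T/A)(C/G)(T/A) CCI GA(A/G) GT"
P2_3_5R = "CG(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA"

for name, primer in (("S4Psh", S4PSH), ("P2.3.5R", P2_3_5R)):
    length, degeneracy = pool_stats(primer)
    print(f"{name}: {length} bases, {degeneracy} sequences")
```

Both pools are 20 bases long; S4Psh comprises 64 sequences and P2.3.5R comprises 192, in agreement with the pool sizes cited in the discussion of the amplified sequence.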
These oligonucleotides were used as primers in Polymerase Chain Reactions (PCR®, Roche Molecular Systems, Branchburg, NJ) to amplify a specific DNA fragment from genomic DNA prepared from Photorhabdus strain W-14 (see above). A typical reaction (50 μl) contained 125 pmol of each primer pool P2Psh and P2.3.5R, 253 ng of genomic template DNA, 10 nmol each of dATP, dCTP, dGTP, and dTTP, 1X GeneAmp® PCR buffer, and 2.5 units of AmpliTaq® DNA polymerase (both from Roche Molecular Systems; 10X GeneAmp® buffer is 100 mM Tris-HCl pH 8.3, 500 mM KCl, 0.01% w/v gelatin). Amplifications were performed in a Perkin Elmer Cetus DNA Thermal Cycler (Perkin Elmer, Foster City, CA) using 35 cycles of 94°C (1.0 min), 55°C (2.0 min), 72°C (3.0 min), followed by an extension period of 7.0 min at 72°C. Amplification products were analyzed by electrophoresis through 2% w/v NuSieve® 3:1 agarose (FMC BioProducts) in TEA buffer (40 mM Tris-acetate, 2 mM EDTA, pH 8.0). A specific product of estimated size 250 bp was observed amongst numerous other amplification products by ethidium bromide (0.5 μg/ml) staining of the gel and examination under ultraviolet light. The region of the gel containing an approximately 250 bp product was excised, and a small plug (0.5 mm dia.) was removed and used to supply template for PCR amplification (40 cycles). The reaction (50 μl) contained the same components as above, minus genomic template DNA. Following amplification, the ends of the fragments were made blunt and were phosphorylated by incubation at 25°C for 20 min with 1 unit of T4 DNA polymerase (NEB), 1 nmol ATP, and 2.15 units of T4 kinase (Pharmacia Biotech Inc., Piscataway, NJ).
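The amounts listed for a typical reaction translate into final concentrations by dividing by the 50 μl reaction volume. A brief sketch of that arithmetic follows; the function and variable names are hypothetical, and the input amounts are those stated above.

```python
def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in μM from pmol and μl (1 pmol/μl equals 1 μM)."""
    return amount_pmol / volume_ul


REACTION_UL = 50.0
primer_um = micromolar(125.0, REACTION_UL)        # 125 pmol of each primer pool
dntp_um = micromolar(10.0 * 1000.0, REACTION_UL)  # 10 nmol of each dNTP = 10,000 pmol
template_ng_per_ul = 253.0 / REACTION_UL          # 253 ng genomic template DNA

print(f"primer pool: {primer_um:.1f} μM each")
print(f"dNTPs: {dntp_um:.0f} μM each")
print(f"template: {template_ng_per_ul:.2f} ng/μl")
```

Note that the primer concentration applies to the pool as a whole; each individual sequence within a degenerate pool is present at a correspondingly lower concentration.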
DNA fragments were separated from residual primers by electrophoresis through 1% w/v GTG® agarose (FMC) in TEA. A gel slice containing fragments of apparent size 250 bp was excised, and the DNA was extracted using a Qiaex kit (Qiagen Inc., Chatsworth, CA).
The extracted DNA fragments were ligated to plasmid vector pBC KS(+) (Stratagene) that had been digested to completion with restriction enzyme Sma I and extracted in a manner similar to that described for pWE15 DNA above. A typical ligation reaction (16.3 μl) contained 100 ng of digested pBC KS(+) DNA, 70 ng of 250 bp fragment DNA, 1 nmol [Co(NH3)6]Cl3, and 3.9 Weiss units of T4 DNA ligase (Collaborative Biomedical Products, Bedford, MA), in 1X
ligation buffer (50 mM Tris-HCl, pH 7.4; 10 mM MgCl2; 10 mM dithiothreitol; 1 mM spermidine; 1 mM ATP; 100 mg/ml bovine serum albumin). Following overnight incubation at 14°C, the ligated products were transformed into frozen, competent Escherichia coli DH5α cells (Gibco BRL) according to the supplier's recommendations, and plated on LB-Cam35 plates containing IPTG (119 μg/ml) and X-gal (50 μg/ml). Independent white colonies were picked, and plasmid DNA was prepared by a modified alkaline-lysis/PEG precipitation method (PRISM™ Ready Reaction DyeDeoxy™ Terminator Cycle Sequencing Kit Protocols; ABI/Perkin Elmer). The nucleotide sequence of both strands of the insert DNA was determined, using T7 primers (pBC KS(+) bases 601-623: TAAAACGACGGCCAGTGAGCGCG) and LacZ primers (pBC KS(+) bases 792-816: ATGACCATGATTACGCCAAGCGCGC) and protocols supplied with the PRISM™ sequencing kit (ABI/Perkin Elmer). Nonincorporated dye-terminator dideoxyribonucleotides were removed by passage through Centri-Sep 100 columns (Princeton Separations, Inc., Adelphia, NJ) according to the manufacturer's instructions. The DNA sequence was obtained by analysis of the samples on an ABI Model 373A DNA Sequencer (ABI/Perkin Elmer). The DNA sequences of two isolates, GZ4 and HB14, were found to be as illustrated in Fig. 1.
This sequence illustrates the following features: i) bases 1-20 represent one of the 64 possible sequences of the S4Psh degenerate oligonucleotides, ii) the sequence of amino acids 1-3 and 6-12 corresponds exactly to that determined for the N-terminus of TcaC (disclosed as SEQ ID NO:2), iii) the fourth amino acid encoded is a cysteine residue rather than serine; this difference is encoded within the degeneracy for the serine codons (see above), iv) the fifth amino acid encoded is proline, corresponding to the TcaC N-terminal sequence given as SEQ ID NO:2, v) bases 257-276 encode one of the 192 possible sequences designed into the degenerate pool, vi) the TGA termination codon introduced at bases 268-270 is the result of complementarity to the degeneracy built into the oligonucleotide pool at the corresponding position, and does not indicate a shortened reading frame for the corresponding gene.
Labeling of a TcaC Peptide Gene-specific Probe
DNA fragments corresponding to the above 276 bases were amplified (35 cycles) by PCR® in a 100 μl reaction volume, using 100 pmol each of P2Psh and P2.3.5R primers, 10 ng of plasmids GZ4 or HB14 as templates, 20 nmol each of dATP, dCTP, dGTP, and dTTP, 5
units of AmpliTaq® DNA polymerase, and 1X concentration of GeneAmp® buffer, under the same temperature regimes as described above. The amplification products were extracted from a 1% GTG® agarose gel by Qiaex kit and quantitated by fluorometry. The extracted amplification products from plasmid HB14 template (approximately 400 ng) were split into five aliquots and labeled with 32P-dCTP using the High Prime Labeling Mix (Boehringer Mannheim) according to the manufacturer's instructions. Nonincorporated radioisotope was removed by passage through NucTrap® Probe Purification Columns (Stratagene), according to the supplier's instructions. The specific activity of the labeled DNA product was determined by scintillation counting to be 3.11 x 10^8 dpm/μg. This labeled DNA was used to probe membranes prepared from 800 members of the genomic library.
Screening with a TcaC-peptide Gene Specific Probe
The radiolabeled HB14 probe was boiled approximately 10 min, then added to "minimal hyb" solution. [Note: The "minimal hyb" method is taken from a CERES protocol, "Restriction Fragment Length Polymorphism Laboratory Manual version 4.0", sections 4-40 and 4-47; CERES/NPI, Salt Lake City, UT. NPI is now defunct, with its successors operating as Linkage Genetics]. "Minimal hyb" solution contains 10% w/v PEG (polyethylene glycol, M.W. approx. 8000), 7% w/v SDS, 0.6X SSC, 10 mM sodium phosphate buffer (from a 1 M stock containing 95 g/l NaH2PO4·1H2O and 84.5 g/l Na2HPO4·7H2O), 5 mM EDTA, and 100 mg/ml denatured salmon sperm DNA. Membranes were blotted dry briefly then, without prehybridization, 5 strips of membrane were placed in each of 2 plastic boxes containing 75 ml of "minimal hyb" and 2.6 ng/ml of radiolabeled HB14 probe. These were incubated overnight with slow shaking (50 rpm) at 60°C. The filters were washed three times for approximately 10 min each at 25°C in "minimal hyb wash solution" (0.25X SSC, 0.2% SDS), followed by two 30-min washes with slow shaking at 60°C in the same solution. The filters were placed on paper covered with Saran Wrap® (Dow Brands, Indianapolis, IN) in a light-tight autoradiographic cassette and exposed to X-Omat X-ray film (Kodak, Rochester, NY) with two DuPont Cronex Lightning-Plus C1 enhancers (Sigma Chemical Co., St. Louis, MO), for 4 hr at -70°C. Upon development (standard photographic procedures), significant signals were evident in both replicates amongst a high background of weaker, more irregular signals. The filters were again washed for about 4 hr at 68°C in "minimal hyb wash solution" and then placed again in the cassettes, and film was exposed overnight at -70°C. Twelve possible positives were identified due to strong signals on both of the duplicate 96-well colony impressions. No signal was seen with negative control membranes (colonies of XL1 Blue MR cells containing pWE15), and a very strong signal was seen with positive control membranes (DH5α cells containing the GZ4 isolate of the PCR product) that had been processed concurrently with the experimental samples.
The twelve putative hybridization-positive colonies were retrieved from the frozen 96-well library plates and grown overnight at 37°C on solid LB-Amp100 medium. They were then patched (3/plate, plus three negative controls: XL1 Blue MR cells containing the pWE15 vector) onto solid LB-Amp100. Two sets of membranes (Magna NT nylon, 0.45 micron) were prepared for hybridization. The first set was prepared by placing a filter directly onto the colonies on a patch plate, then removing it with adherent bacterial cells, and processing as below. Filters of the second set were placed on plates containing LB-Amp100 medium, then inoculated by transferring cells from the patch plates onto the filters. After overnight growth at 37°C, the filters were removed from the plates and processed.
Bacterial cells on the filters were lysed and DNA denatured by placing each filter colony-side-up on a pool (1.0 ml) of 0.5 N NaOH in a plastic plate for 3 min. The filters were blotted dry on a paper towel, then the process was repeated with fresh 0.5 N NaOH. After blotting dry, the filters were neutralized by placing each on a 1.0 ml pool of 1 M Tris-HCl, pH 7.5 for 3 min, blotted dry, and reneutralized with fresh buffer. This was followed by two similar soakings (5 min each) on pools of 0.5 M Tris-HCl pH 7.5 plus 1.5 M NaCl. After blotting dry, the DNA was UV crosslinked to the filter (as above), and the filters were washed (25°C, 100 rpm) in about 100 ml of 3X SSC plus 0.1% (w/v) SDS (4 times, 30 min each with fresh solution for each wash). They were then placed in a minimal volume of prehybridization solution [6X SSC plus 1% w/v each of Ficoll 400 (Pharmacia), polyvinylpyrrolidone (av. M.W. 360,000; Sigma) and bovine serum albumin Fraction V (Sigma)] for 2 hr at 65°C, 50 rpm. The prehybridization solution was removed, and replaced with the HB14 32P-labeled probe that had been saved from the previous hybridization of the library membranes and which had been denatured at 95°C for 5 min. Hybridization was performed at 60°C for 16 hr with shaking at 50 rpm.
Following removal of the labeled probe solution, the membranes were washed 3 times at 25°C (50 rpm, 15 min) in 3X SSC (about 150 ml each wash). They were then washed for 3 hr at 68°C (50 rpm) in 0.25X SSC plus 0.2% SDS (minimal hyb wash solution), and exposed to X-ray film as described above for 1.5 hr at 25°C (no enhancer screens). This exposure revealed very strong hybridization signals to cosmid isolates 22G12, 25A10, 26A5, and 26B10, and a very weak signal with cosmid isolate 8B10. No signal was seen with the negative control (pWE15) colonies, and a very strong signal was seen with positive control membranes (DH5α cells containing the GZ4 isolate of the PCR product) that had been processed concurrently with the experimental samples.
Amplification of a Specific Genomic Fragment of a TcaB Gene
Based on the N-terminal amino acid sequence determined for the purified TcaBi peptide fraction (disclosed here as SEQ ID NO:3), a pool of degenerate oligonucleotides (pool P8F) was synthesized as described for peptide TcaC. The determined amino acid sequence and the corresponding degenerate DNA sequence are given below, where A, C, G, and T are the standard DNA bases, and I represents inosine:
Amino Acid     Leu     Phe Thr Gln     Thr Leu     Lys Glu Ala Arg
P8F        5'  (C/T)TI TTT ACI CA(A/G) ACI (C/T)TI AAA GAA GCI (A/C)G  3'
Another set of degenerate oligonucleotides was synthesized (pool P8.108.3R), representing the complement of the coding strand for the determined amino acid sequence of the TcaBi-PT108 internal peptide (disclosed herein as SEQ ID NO:20):
Amino Acid     Met Tyr     Tyr     Ile       Gln     Ala         Gln     Gln
Codons         ATG TA(T/C) TA(T/C) AT(T/C/A) CA(A/G) GC(A/C/G/T) CA(A/G) CA(A/G)
P8.108.3R  3'  TAC AT(A/G) AT(A/G) TA(A/G/T) GT(T/C) CGI         GT(T/C) GT  5'
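The size of a degenerate pool follows directly from the bracketed alternatives in the notation above. The sketch below is illustrative only and is not part of the disclosed method: it counts and enumerates the members of the Phe-to-Arg portion of pool P8F as printed, treating inosine (I) as a single fixed base. The function names `degeneracy` and `expand` are hypothetical helpers introduced here.

```python
import re
from itertools import product
from math import prod


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides in a pool written like 'CA(A/G) ACI'."""
    alternatives = re.findall(r"\(([^)]*)\)", primer)
    return prod(len(group.split("/")) for group in alternatives)


def expand(primer: str) -> list[str]:
    """Enumerate every member of the pool; inosine (I) is left as a literal base."""
    parts = re.split(r"(\([^)]*\))", primer.replace(" ", ""))
    options = [p[1:-1].split("/") if p.startswith("(") else [p] for p in parts if p]
    return ["".join(combo) for combo in product(*options)]


# Phe-to-Arg portion of pool P8F, copied from the primer line above.
P8F_PHE_TO_ARG = "TTT ACI CA(A/G) ACI (C/T)TI AAA GAA GCI (A/C)G"

print(degeneracy(P8F_PHE_TO_ARG))   # 8
print(expand(P8F_PHE_TO_ARG)[0])    # TTTACICAAACICTIAAAGAAGCIAG
```

Each additional two-fold degenerate position doubles the pool, which is why inosine is used at the four-fold degenerate third codon positions.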
These oligonucleotides were used as primers for PCR® using HotStart 50 Tubes™ (Molecular Bio-Products, Inc., San Diego, CA) to amplify a specific DNA fragment from genomic DNA prepared from Photorhabdus strain W-14 (see above). A typical reaction (50 μl) contained (bottom layer) 25 pmol of each primer pool P8F and P8.108.3R, with 2 nmol each of dATP, dCTP, dGTP, and dTTP, in 1X GeneAmp® PCR buffer, and (top layer) 230 ng of genomic template DNA, 8 nmol each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of AmpliTaq® DNA polymerase, in 1X GeneAmp® PCR buffer. Amplifications were performed by 35 cycles as described for the TcaC peptide. Amplification products were analyzed by electrophoresis through 0.7% w/v SeaKem® LE agarose (FMC) in TEA buffer. A specific product of estimated size 1600 bp was observed.
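The amounts given for the two layers can be converted to final concentrations once the layers mix. The following is a minimal sketch of that arithmetic, assuming the full 50 μl reaction volume after the layers combine; the helper name `micromolar` is hypothetical.

```python
def micromolar(amount_nmol: float, volume_ul: float) -> float:
    """Concentration in micromolar: nmol per microliter is mM, so scale by 1000."""
    return amount_nmol * 1000.0 / volume_ul


REACTION_UL = 50.0

# Each dNTP: 2 nmol in the bottom layer plus 8 nmol in the top layer.
dntp_um = micromolar(2 + 8, REACTION_UL)

# Each primer pool: 25 pmol = 0.025 nmol.
primer_um = micromolar(25 / 1000.0, REACTION_UL)

print(f"each dNTP: about {dntp_um:.0f} uM")            # about 200 uM
print(f"each primer pool: about {primer_um:.1f} uM")   # about 0.5 uM
```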
Four such reactions were pooled, and the amplified DNA was extracted from a 1.0% SeaKem® LE gel by Qiaex kit as described for the TcaC peptide. The extracted DNA was used directly as the template for sequence determination (PRISM™ Sequencing Kit) using the P8F and P8.108.3R primer pools. Each reaction contained about 100 ng template DNA and 25 pmol of one primer pool, and was processed according to standard protocols as described for the TcaC peptide. An analysis of the sequence derived from extension of the P8F primers revealed the short DNA sequence (and encoded amino acid sequence):
GAT GCA TTG NTT GCT
Asp Ala Leu (Val) Ala
which corresponds to a portion of the N-terminal peptide sequence disclosed as SEQ ID NO:3 (TcaBi).
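The ambiguous codon NTT is compatible with more than one residue, which is why Val is shown in parentheses. The sketch below is an illustration, not the authors' procedure: it lists the residues each codon of the short read could encode, using a deliberately partial codon table that covers only the codons needed here.

```python
from itertools import product

# Partial standard genetic code: only the codons needed for this read.
CODON_TABLE = {
    "GAT": "Asp", "GCA": "Ala", "GCT": "Ala", "TTG": "Leu",
    "TTT": "Phe", "CTT": "Leu", "ATT": "Ile", "GTT": "Val",
}


def possible_residues(codon: str) -> list[str]:
    """Residues consistent with a codon in which N stands for any base."""
    pools = ["ACGT" if base == "N" else base for base in codon]
    return sorted({CODON_TABLE["".join(c)] for c in product(*pools)})


read = "GAT GCA TTG NTT GCT".split()
for codon in read:
    print(codon, possible_residues(codon))
# NTT resolves to ['Ile', 'Leu', 'Phe', 'Val']; the other codons are unambiguous.
```

The ambiguous position is compatible with Val, so the read does not conflict with the assignment shown in parentheses.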
Labeling of a TcaBi-peptide Gene-specific Probe
Approximately 50 ng of gel-purified TcaBi DNA fragment was labeled with 32P-dCTP as described above, and nonincorporated radioisotopes were removed by passage through a NICK Column (Pharmacia). The specific activity of the labeled DNA was determined to be 6 x 10^9 dpm/μg. This labeled DNA was used to probe colony membranes prepared from members of the genomic library that had hybridized to the TcaC-peptide specific probe.
The membranes containing the 12 colonies identified in the TcaC-probe library screen (see above) were stripped of radioactive TcaC-specific label by boiling twice for approximately 30 min each time in 1 liter of 0.1X SSC plus 0.1% SDS. Removal of radiolabel was checked with a 6 hr film exposure. The stripped membranes were then incubated with the TcaB peptide-specific probe prepared above. The labeled DNA was denatured by boiling for 10 min, and then added to the filters that had been incubated for 1 hr in 100 ml of "minimal hyb" solution at 60°C. After overnight hybridization at this temperature, the probe solution was removed, and the filters were washed as follows (all in 0.3X SSC plus 0.1% SDS): once for 5 min at 25°C, once for 1 hr at 60°C in fresh solution, and once for 1 hr at 63°C in fresh solution. After 1.5 hr exposure to X-ray film by standard procedures, 4 strongly hybridizing colonies were observed. These were, as with the TcaC-specific probe, isolates 22G12, 25A10, 26A5, and 26B10.
The same TcaBi probe solution was diluted with an equal volume (about 100 ml) of "minimal hyb" solution, and then used to screen the membranes containing the 800 members of the genomic library. After hybridization, washing, and exposure to X-ray film as described above, only the four cosmid clones 22G12, 25A10, 26A5, and 26B10 were found to hybridize strongly to this probe.
Isolation of Subclones Containing Genes Encoding TcaC and TcaBi Peptides, and Determination of DNA Base Sequence Thereof
Three hybridization-positive cosmids in strain XL1 Blue MR were grown with shaking overnight (200 rpm) at 30°C in 100 ml TB-Amp100. After harvesting the cells by centrifugation, cosmid DNA was prepared using a commercially available kit (BIGprep™, 5 Prime 3 Prime, Inc., Boulder, CO), following the manufacturer's protocols. Only one cosmid, 26A5, was successfully isolated by this procedure. When digested with restriction enzyme EcoR I (NEB) and analyzed by gel electrophoresis, fragments of approximate sizes 14, 10, 8 (vector), 5, 3.3, 2.9, and 1.5 kbp were detected. A second attempt to isolate cosmid DNA from the same three strains (8 ml cultures; TB-Amp100, 30°C) utilized a boiling miniprep method (Evans G. and G. Wahl, 1987, "Cosmid vectors for genomic walking and rapid restriction mapping," in Guide to Molecular Cloning Techniques, Meth. Enzymology, Vol. 152, S. Berger and A. Kimmel, eds., pgs. 604-610). Only one cosmid, 25A10, was successfully isolated by this method. When digested with restriction enzyme EcoR I (NEB) and analyzed by gel electrophoresis, this cosmid showed a fragmentation pattern identical to that previously seen with cosmid 26A5.
A 0.15 μg sample of 26A5 cosmid DNA was used to transform 50 μl of E. coli DH5α cells (Gibco BRL), by the supplier's protocols. A single colony isolate of that strain was inoculated into 4 ml of TB-Amp100, and grown for 8 hr at 37°C. Chloramphenicol was added to a final concentration of 225 μg/ml, incubation was continued for another 24 hr, then cells were harvested by centrifugation and frozen at -20°C. Isolation of the 26A5 cosmid DNA was by a standard alkaline lysis miniprep (Maniatis et al., op. cit., p. 382), modified by increasing all volumes by 50% and with stirring or gentle mixing, rather than vortexing, at every step. After washing the DNA pellet in 70% ethanol, it was dissolved in TE containing 25 μg/ml ribonuclease A (Boehringer Mannheim).
Identification of EcoR I Fragments Hybridizing to GZ4-derived and TcaBi Probes
Approximately 0.4 μg of cosmid 25A10 (from XL1 Blue MR cells) and about 0.5 μg of cosmid 26A5 (from chloramphenicol-amplified DH5α cells) were each digested with about 15 units of EcoR I (NEB) for 85 min, frozen overnight, then heated at 65°C for five min, and electrophoresed in a 0.7% agarose gel (SeaKem® LE, 1X TEA, 80 volts, 90 min). The DNA was stained with ethidium bromide as described above, and photographed under ultraviolet light. The EcoR I digest of cosmid 25A10 was a complete digestion, but the sample of cosmid 26A5 was only partially digested under these conditions. The agarose gel containing the DNA fragments was subjected to depurination, denaturation and neutralization, followed by Southern blotting onto a Magna NT nylon membrane, using a high salt (20X SSC) protocol, all as described in section 2.9 of Ausubel et al. (CPMB, op. cit.). The transferred DNA was then UV-crosslinked to the nylon membrane as before.
A TcaC-peptide specific DNA fragment corresponding to the insert of plasmid isolate GZ4 was amplified by PCR® in a 100 μl reaction volume as described previously above. The amplification products from three such reactions were pooled and were extracted from a 1% GTG® agarose gel by Qiaex kit, as described above, and quantitated by fluorometry. The gel-purified DNA (100 ng) was labeled with 32P-dCTP using the High Prime Labeling Mix (Boehringer Mannheim) as described above, to a specific activity of 6.34 x 10s dpm/μg.
The 32P-labeled GZ4 probe was boiled 10 min, then added to "minimal hyb" buffer (at 1 ng/ml), and the Southern blot membrane containing the digested cosmid DNA fragments was added, and incubated for 4 hr at 60°C with gentle shaking at 50 rpm. The membrane was then washed 3 times at 25°C for about 5 min each (minimal hyb wash solution), followed by two washes for 30 min each at 60°C. The blot was exposed to film (with enhancer screens) for about 30 min at -70°C. The GZ4 probe hybridized strongly to the 5.0 kbp (apparent size) EcoR I fragment of both cosmids, 26A5 and 25A10.
The membrane was stripped of radioactivity by boiling for about 30 min in 0.1X SSC plus 0.1% SDS, and absence of radiolabel was checked by exposure to film. It was then hybridized at 60°C for 3.5 hours with the (denatured) TcaBi probe in "minimal hyb" buffer previously used for screening the colony membranes (above), washed as described previously, and exposed to film for 40 min at -70°C with two enhancer screens. With both cosmids, the TcaBi probe hybridized lightly with the about 5.0 kbp EcoR I fragment, and strongly with a fragment of approximately 2.9 kbp.
The sample of cosmid 26A5 DNA previously described (from DH5α cells) was used as the source of DNA from which to subclone the bands of interest. This DNA (2.5 μg) was digested with about 3 units of EcoR I (NEB) in a total volume of 30 μl for 1.5 hr, to give a partial digest, as confirmed by gel electrophoresis. Ten μg of pBC KS(+) DNA (Stratagene) were digested for 1.5 hr with 20 units of EcoR I in a total volume of 20 μl, leading to total digestion as confirmed by electrophoresis. Both EcoR I-cut DNA preparations were diluted to 50 μl with water, to each an equal volume of PCI was added, the suspension was gently mixed, spun in a microcentrifuge and the aqueous supernatant was collected. DNA was precipitated by 150 μl ethanol, and the mixture was placed at -20°C overnight. Following centrifugation and drying, the EcoR I-digested pBC KS(+) was dissolved in 100 μl TE, and the partially digested 26A5 was dissolved in 20 μl TE. DNA recovery was checked by fluorometry. In separate reactions, approximately 60 ng of EcoR I-digested pBC KS(+) DNA was ligated with approximately 180 ng or 270 ng of partially digested cosmid 26A5 DNA. Ligations were carried out in a volume of 20 μl at 15°C for 5 hr, using T4 ligase and buffer from New England BioLabs. The ligation mixture, diluted to 100 μl with sterile TE, was used to transform frozen, competent DH5α cells (Gibco BRL) according to the supplier's instructions. Varying amounts (25-200 μl) of the transformed cells were plated on freshly prepared solid LB-Cam35 medium with 1 mM IPTG and 50 mg/l X-gal. Plates were incubated at 37°C about 20 hr, then chilled in the dark for approximately 3 hr to intensify color for insert selection. White colonies were picked onto patch plates of the same composition and incubated overnight at 37°C.
Two colony lifts of each of the selected patch plates were prepared as follows. After picking white colonies to fresh plates, round Magna NT nylon membranes were pressed onto the patch plates, the membrane was lifted off, and subjected to denaturation, neutralization and UV crosslinking as described above for the library colony membranes. The crosslinked colony lifts were vigorously washed, including gently wiping off the excess cell debris with a tissue. One set was hybridized with the GZ4 (TcaC) probe solution described earlier, and the other set was hybridized with the TcaBi probe solution described earlier, according to the "minimal hyb" protocol, followed by washing and film exposure as described for the library colony membranes.
Colonies showing hybridization signals either only with the GZ4 probe, with both GZ4 and TcaBi probes, or only with the TcaBi probe, were selected for further work, and cells were streaked for single colony isolation onto LB-Cam35 media with IPTG and X-gal as before. Approximately 35 single colonies, from 16 different isolates, were picked into liquid LB-Cam35 media and grown overnight at 37°C; the cells were collected by centrifugation and plasmid DNA was isolated by a standard alkaline lysis miniprep according to Maniatis et al. (op. cit., p. 368). DNA pellets were dissolved in TE + 25 μg/ml ribonuclease A and DNA concentration was determined by fluorometry. The EcoR I digestion pattern was analyzed by gel electrophoresis. The following isolates were picked as useful. Isolate A17.2 contains religated pBC KS(+) only and was used for a (negative) control. Isolates D38.3 and C44.1 each contain only the 2.9 kbp, TcaBi-hybridizing EcoR I fragment inserted into pBC KS(+). These plasmids, named pDAB2000 and pDAB2001, respectively, are illustrated in Fig. 2. Isolate A35.3 contains only the approximately 5 kbp, GZ4-hybridizing EcoR I fragment, inserted into pBC KS(+). This plasmid was named pDAB2002 (also Fig. 2). These isolates provided templates for DNA sequencing.
Plasmids pDAB2000 and pDAB2001 were prepared using the BIGprep™ kit as before. Cultures (30 ml) were grown overnight in TB-Cam35 to an OD600 of 2, then plasmid was isolated according to the manufacturer's directions. DNA pellets were redissolved in 100 μl TE each, and sample integrity was checked by EcoR I digestion and gel electrophoretic analysis. Sequencing reactions were run in duplicate, with one replicate using as template pDAB2000 DNA, and the other replicate using as template pDAB2001 DNA. The reactions were carried out using the dideoxy dye terminator cycle sequencing method, as described above for the sequencing of the GZ4/HB14 DNAs. Initial sequencing runs utilized as primers the LacZ and T7 primers described above, plus primers based on the determined sequence of the TcaBi PCR amplification product (TH1 = ATTGCAGACTGCCAATCGCTTCGG, TH12 = GAGAGTATCCAGACCGCGGATGATCTG).
After alignment and editing of each sequencing output, each was truncated to between 250 and 350 bases, depending on the integrity of the chromatographic data as interpreted by the Perkin Elmer Applied Biosystems Division SeqEd 675 software. Subsequent sequencing "steps" were made by selecting appropriate sequence for new primers. With a few exceptions, primers (synthesized as described above) were 24 bases in length with a 50% G+C composition. Sequencing by this method was carried out on both strands of the approximately 2.9 kbp EcoR I fragment.
To further serve as template for DNA sequencing, plasmid DNA from isolate pDAB2002 was prepared by BIGprep™ kit. Sequencing reactions were performed and analyzed as described above. Initially, a T3 primer (pBS SK(+) bases 774-796: CGCGCAATTAACCCTCACTAAAG) and a T7 primer (pBS KS(+) bases 621-643: GCGCGTAATACGACTCACTATAG) were used to prime the sequencing reactions from the flanking vector sequences, reading into the insert DNA. Another set of primers (GZ4F: GTATCGATTACAACGCTGTCACTTCCC; TH13: GGGAAGTGACAGCGTTGTAATCGATAC; TH14: ATGTTGGGTGCGTCGGCTAATGGACATAAC; and LW1-204: GGGAAGTGACAGCGTTGTAATCGATAC) was made to prime from internal sequences, which were determined previously by degenerate oligonucleotide-mediated sequencing of subcloned TcaC-peptide PCR products. From the data generated during the initial rounds of sequencing, new sets of primers were designed and used to walk the entire length of the about 5 kbp fragment. A total of 55 oligo primers was used, enabling the identification of 4832 total bp of contiguous sequence.
When the DNA sequence of the EcoR I fragment insert of pDAB2002 is combined with part of the determined sequence of the pDAB2000/pDAB2001 isolates, a total contiguous sequence of 6005 bp is generated (disclosed herein as SEQ ID NO:25). When long open reading frames were translated into the corresponding amino acids, the sequence clearly shows the TcaBi N-terminal peptide (disclosed as SEQ ID NO:3), encoded by bases 68-124, immediately following a methionine residue (start of translation). Upstream lies a potential ribosome binding site (bases 51-58), and downstream, at bases 215-277, is encoded the TcaBi-PT158 internal peptide (disclosed herein as SEQ ID NO:19). Further downstream, in the same reading frame, at bases 1787-1822, exists a sequence encoding the TcaBi-PT108 internal peptide (disclosed herein as SEQ ID NO:20). Also in the same reading frame, at bases 1946-1972, is encoded the TcaBii N-terminal peptide (disclosed herein as SEQ ID NO:5), and the reading frame continues uninterrupted to a translation termination codon at nucleotides 3632-3634.
The lack of an in-frame stop codon between the end of the sequence encoding TcaBi-PT108 and the start of the TcaBii encoding region, and the lack of a discernible ribosome binding site immediately upstream of the TcaBii coding region, indicate that peptides TcaBii and TcaBi are encoded by a single open reading frame (3567 bp, beginning at base pair 65 in SEQ ID NO:25), and are most likely derived from a single primary gene product TcaB of 1189 amino acids (131,586 Daltons; disclosed herein as SEQ ID NO:26) by post-translational cleavage. If the amino acid immediately preceding the TcaBii N-terminal peptide represents the C-terminal amino acid of peptide TcaBi, then the predicted mass of TcaBi (627 amino acids) is 70,814 Daltons (disclosed herein as SEQ ID NO:28), somewhat higher than the size observed by SDS-PAGE (68 kDa). This peptide would be encoded by a contiguous stretch of 1881 base pairs (disclosed herein as SEQ ID NO:27). It is thought that the native C-terminus of TcaBi lies somewhat closer to the C-terminus of TcaBi-PT108. The molecular mass of PT108 [3.438 kDa, determined during N-terminal amino acid sequence analysis of this peptide] predicts a size of 30 amino acids. Using the size of this peptide to designate the C-terminus of the TcaBi coding region [Glu at position 604 of SEQ ID NO:28], the derived size of TcaBi is determined to be 604 amino acids or 68,463 Daltons, more in agreement with experimental observations.
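The base coordinates and peptide lengths quoted for the tcaB reading frame can be cross-checked with simple coordinate arithmetic. The sketch below is a consistency check only, using the figures stated in the text (including the 1686 bp, 562 amino acid TcaBii coding region); the helper names `orf_end`, `residues`, and `residue_at` are hypothetical.

```python
def orf_end(start: int, length_bp: int) -> int:
    """Last base (1-based, inclusive) of a coding stretch of the given length."""
    return start + length_bp - 1


def residues(length_bp: int) -> int:
    """Number of amino acids encoded by a coding stretch (stop codon excluded)."""
    if length_bp % 3:
        raise ValueError("coding length is not a multiple of 3")
    return length_bp // 3


def residue_at(base: int, orf_start: int) -> int:
    """1-based residue whose codon contains the given base."""
    return (base - orf_start) // 3 + 1


TCAB_START, TCAB_BP = 65, 3567          # open reading frame in SEQ ID NO:25
TCABI_BP, TCABII_BP = 1881, 1686        # SEQ ID NO:27 and SEQ ID NO:29
PT108_FIRST_BASE, PT108_RESIDUES = 1787, 30

print(orf_end(TCAB_START, TCAB_BP))              # 3631, followed by the stop codon at 3632-3634
print(residues(TCAB_BP))                         # 1189
print(orf_end(TCAB_START, TCABI_BP))             # 1945, so TcaBii begins at base 1946
print(residues(TCABI_BP), residues(TCABII_BP))   # 627 562
# A 30-residue PT108 starting at base 1787 ends at residue 604 of TcaBi.
print(residue_at(PT108_FIRST_BASE, TCAB_START) + PT108_RESIDUES - 1)  # 604
```

The two cleavage products account for the whole precursor: 627 + 562 = 1189 amino acids.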
Translation of the TcaBii peptide coding region of 1686 base pairs (disclosed herein as SEQ ID NO:29) yields a protein of 562 amino acids (disclosed herein as SEQ ID NO:30) with predicted mass of 60,789 Daltons, which corresponds well with the observed 61 kDa. A potential ribosome binding site (bases 3682-3687) is found 48 bp downstream of the stop codon for the tcaB open reading frame. At bases 3694-3726 is found a sequence encoding the N-terminus of peptide TcaC (disclosed as SEQ ID NO:2). The open reading frame initiated by this N-terminal peptide continues uninterrupted to base 6005 (2361 base pairs, disclosed herein as the first 2361 base pairs of SEQ ID NO:31). A gene (tcaC) encoding the entire TcaC peptide (apparent size about 165 kDa, about 1500 amino acids) would comprise about 4500 bp. Another isolate containing cloned EcoR I fragments of cosmid 26A5, E20.6, was also identified by its homology to the previously mentioned GZ4 and TcaB probes. Agarose gel analysis of EcoR I digests of the DNA of the plasmid harbored by this strain (pDAB2004, Fig. 2) revealed insert fragments of estimated sizes 2.9, 5, and 3.3 kbp. DNA sequence analysis initiated from primers designed from the sequence of plasmid pDAB2002 revealed that the 3.3 kbp EcoR I fragment of pDAB2004 lies adjacent to the 5 kbp EcoR I fragment represented in pDAB2002. The 2361 base pair open reading frame discovered in pDAB2002 continues uninterrupted for another 2094 bases in pDAB2004 [disclosed herein as base pairs 2362 to 4458 of SEQ ID NO:31]. DNA sequence analysis using the parent cosmid 26A5 DNA as template confirmed the continuity of the open reading frame. Altogether, the open reading frame (tcaC, SEQ ID NO:31) comprises 4455 base pairs, and encodes a protein (TcaC) of 1485 amino acids [disclosed herein as SEQ ID NO:32]. The calculated molecular size of 166,214 Daltons is consistent with the estimated size of the TcaC peptide (165 kDa), and the derived amino acid sequence matches exactly that disclosed for the TcaC N-terminal sequence [SEQ ID NO:2].
The lack of an amino acid sequence corresponding to SEQ ID NO:17 (used to design the degenerate oligonucleotide primer pool) in the discovered sequence indicates that the PCR® products found in isolates GZ4 and HB14, which were used as probes in the initial library screen, were fortuitously generated by reverse-strand priming by one of the primers in the degenerate pool. Further, the derived protein sequence does not include the internal fragment disclosed herein as SEQ ID NO:18. These sequences reveal that plasmid pDAB2004 contains the complete coding region for the TcaC peptide.
Further analysis of SEQ ID NO:25 reveals the end of an open reading frame (bases 1-43), which encodes the final 13 amino acids of the TcaAiii peptide, disclosed herein as SEQ ID NO:35. Only 24 bases separate the end of the TcaAiii coding region and the start of the TcaBi coding region. Included within the 24 bases are sequences that may serve as a ribosome binding site. Although possible, it is not likely that a Photorhabdus gene promoter is encoded within this short region. We propose that genomic region tca, which includes three long open reading frames [tcaA (SEQ ID NO:33), tcaB (SEQ ID NO:25, bases 65-3634), and tcaC (SEQ ID NO:31), which is separated from the end of tcaB by only 59 bases], is regulated as an operon, with transcription initiating upstream of the start of the tcaA gene (SEQ ID NO:33), and resulting in a polycistronic messenger RNA.
Example 9 Screening of the Photorhabdus Genomic Library for Genes Encoding the TcbAii Peptide
This example describes a method used to identify DNA clones that contain the TcbAii peptide-encoding genes, the isolation of the gene, and the determination of its partial DNA base sequence.
Primers and PCR Reactions
The TcbAii polypeptide of the insect active preparation is about 206 kDa. The amino acid sequence of the N-terminus of this peptide is disclosed as SEQ ID NO:1. Four pools of degenerate oligonucleotide primers ("Forward primers": TH-4, TH-5, TH-6, and TH-7) were synthesized to encode a portion of this amino acid sequence, as described in Example 8, and are shown below.
Table 12
Amino Acid      Phe     Ile Gln     Gly Tyr     Ser     Asp     Leu     Phe
TH-4        5'- TT(T/C) ATI CA(A/G) GGI TA(T/C) TCI     GA(T/C) CTI     TT -3'
TH-5        5'- TT(T/C) ATI CA(A/G) GGI TA(T/C) AG(T/C) GA(T/C) CTI     TT -3'
TH-6        5'- TT(T/C) ATI CA(A/G) GGI TA(T/C) TCI     GA(T/C) TT(A/G) TT -3'
TH-7        5'- TT(T/C) ATI CA(A/G) GGI TA(T/C) AG(T/C) GA(T/C) TT(A/G) TT -3'
In addition, a primary ("a") and a secondary ("b") sequence of an internal peptide preparation (TcbAii-PT81) have been determined and are disclosed herein as SEQ ID NO:23 and SEQ ID NO:24, respectively. Four pools of degenerate oligonucleotides ("Reverse Primers": TH-8, TH-9, TH-10 and TH-11) were similarly designed and synthesized to encode the reverse complement of sequences that encode a portion of the peptide of SEQ ID NO:23, as shown below.
Table 13
Leu Thr Ser Phe Glu Gln Val Ala Asn
GAI TGI AGI
TT(A/G) TGI AGI
GAI TGI TC(G/A)
TT(A/G) TGI TC(G/A)
Sets of these primers were used in PCR® reactions to amplify TcbAii-encoding gene fragments from the genomic Photorhabdus luminescens W-14 DNA prepared in Example 6. All PCR reactions were run with the "Hot Start" technique using AmpliWax™ gems and other Perkin Elmer reagents and protocols. Typically, a mixture (total volume 11 μl) of MgCl2, dNTPs, 10X GeneAmp® PCR Buffer II, and the primers was added to tubes containing a single wax bead. [10X GeneAmp® PCR Buffer II is composed of 100 mM Tris-HCl, pH 8.3, and 500 mM KCl.] The tubes were heated to 80°C for 2 minutes and allowed to cool. To the top of the wax seals, a solution containing 10X GeneAmp® PCR Buffer II, DNA template, and AmpliTaq® DNA polymerase was added. Following melting of the wax seal and mixing of components by thermal cycling, final reaction conditions (volume of 50 μl) were: 10 mM Tris-HCl, pH 8.3; 50 mM KCl; 2.5 mM MgCl2; 200 μM each in dATP, dCTP, dGTP, dTTP; 1.25 μM in a single Forward primer pool; 1.25 μM in a single Reverse primer pool; 1.25 units of AmpliTaq® DNA polymerase; and 170 ng of template DNA.
The reactions were placed in a thermocycler (as in Example 8) and run with the following program:
Table 14
A series of amplifications was run at three different annealing temperatures (55°, 60°, 65°C) using the degenerate primer pools. Reactions with annealing at 65°C had no amplification products visible following agarose gel electrophoresis. Reactions having a 60°C annealing regime and containing primers TH-5+TH-10 produced an amplification product that had a mobility corresponding to 2.9 kbp. A lesser amount of the 2.9 kbp product was produced under these conditions with primers TH-7+TH-10. When reactions were annealed at 55°C, these primer pairs produced more of the 2.9 kbp product, and this product was also produced by primer pairs TH-5+TH-8 and TH-5+TH-11. Additional very faint 2.9 kbp bands were seen in lanes containing amplification products from primer pairs TH-7 plus TH-8, TH-9, TH-10, or TH-11.
To obtain sufficient PCR amplification product for cloning and DNA sequence determination, 10 separate PCR reactions were set up using the primers TH-5+TH-10, and were run using the above conditions with a 55°C annealing temperature. All reactions were pooled and the 2.9 kbp product was purified by Qiaex extraction from an agarose gel as described above.
Additional sequences determined for TcbAii internal peptides are disclosed herein as SEQ ID NO:21 and SEQ ID NO:22. As before, degenerate oligonucleotides (Reverse primers TH-17 and TH-18) were made corresponding to the reverse complement of sequences that encode a portion of the amino acid sequence of these peptides.
Table 15 From SEQ ID NO: 21
Amino Acid     Met Glu     Thr Gln     Asn     Ile Gln     Glu     Pro
TH-17      3'- TAC CT(T/C) TGI GT(T/C) TT(A/G) TAI GT(T/C) GT(T/C) GG -5'
Table 16 From SEQ ID NO: 22
Amino Acid     Asn     Pro Ile Asn     Ile Asn     Thr Gly Ile Asp
TH-18      3'- TT(A/G) GGI TAI TT(A/G) TAI TT(A/G) TGI CCI TAI CT(A/G) -5'
Degenerate oligonucleotides TH-18 and TH-17 were used in an amplification experiment with Photorhabdus luminescens W-14 DNA as template and primers TH-4, TH-5, TH-6, or TH-7 as the 5'- (Forward) primers. These reactions amplified products of approximately 4 kbp and 4.5 kbp, respectively. These DNAs were transferred from agarose gels to nylon membranes and hybridized with a 32P-labeled probe (as described above) prepared from the 2.9 kbp product amplified by the TH-5+TH-10 primer pair. Both the 4 kbp and the 4.5 kbp amplification products hybridized strongly to the 2.9 kbp probe. These results were used to construct a map ordering the TcbAii internal peptide sequences as shown in Fig. 3. Approximate distances between the primers are shown in nucleotides in Fig. 3.
DNA Sequence of the 2.9 kbp TcbAii-encoding Fragment
Approximately 200 ng of the purified 2.9 kbp fragment (prepared above) was precipitated with ethanol and dissolved in 17 μl of water. One-half of this was used as sequencing template with 25 pmol of the TH-5 pool as primers; the other half was used as template for TH-10 priming. Sequencing reactions were as given in Example 8. No reliable sequence was produced using the TH-10 primer pool; however, reactions with the TH-5 primer pool produced the sequence disclosed below:
1 AATCGTGTTG ATCCCTATGC CGNGCCGGGT TCGGTGGAAT CGATGTCCTC ACCGGGGGTT
61 TATTNGAGGG ANTNGTCCCG TGAGGCCAAA AATGGAATG AAAGAAGTTC AATTTNTTAC
121 CTAGATAAAC GTCGCCCGGN TTTAGAAAGN TTATGNTCA GCCAGAAAAT TTTGGTTGAG
181 GAAATTCCAC CGNTGGTTCT CTCTATTGAT TNGGGCCTGG CCGGGTTCGA ANNAAAACNA
241 GGAAATNCAC AAGTTGAGGT GATGGNTTTG TNGCNANCTT NTCGTTTAGG TGGGGAGAAA
301 CCTTNTCANC ACGNTTNTGA AACTGTCCGG GAAATCGTCC ATGANCGTGA NCCAGGNTTN
361 CGCCATTGG
Based on this sequence, a sequencing primer (TH-21, 5'-CCGGGCGACGTTTATCTAGG-3') was designed to reverse complement bases 120-139, and initiate polymerization towards the 5' end (i.e., TH-5 end) of the gel-purified 2.9 kbp TcbAii-encoding PCR fragment. The determined sequence is shown below, and is compared to the biochemically determined N-terminal peptide sequence of TcbAii (SEQ ID NO:1).
TcbAii 2.9 kbp PCR Fragment Sequence Confirmation [Underlined amino acids = encoded by degenerate oligonucleotides]
SEQ ID NO:1   F I Q G Y S D L F G - - A
2.9 kbp seq   GC ATG CA IG GGIG TAIT AGIT GAIC CTIG TTIT GGIT AAIT CGIT GCIT
              M Q G Y S D F G N R A
From the homology of the derived amino acid sequence to the biochemically determined one, it is clear that the 2.9 kbp PCR fragment represents the TcbA coding region. This 2.9 kbp fragment was then used as a hybridization probe to screen the Photorhabdus W-14 genomic library prepared in Example 8 for cosmids containing the TcbAii-encoding gene.
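The design of primer TH-21 can be checked against the TH-5-primed read. The sketch below is illustrative and assumes the block numbering printed with the sequence: it takes base 120 (the last base of the block starting at 61) and bases 121-139 (the first 19 bases of the block starting at 121) and compares them with the reverse complement of TH-21. The function name `reverse_complement` is a hypothetical helper.

```python
# Watson-Crick complements; N (unresolved base) maps to N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


TH21 = "CCGGGCGACGTTTATCTAGG"

# From the read above: base 120 ends the block "AATTTNTTAC";
# bases 121-139 open the next block "CTAGATAAAC GTCGCCCGGN".
BASE_120 = "C"
BASES_121_TO_139 = "CTAGATAAAC" + "GTCGCCCGG"
target = BASE_120 + BASES_121_TO_139

print(reverse_complement(TH21))            # CCTAGATAAACGTCGCCCGG
print(reverse_complement(TH21) == target)  # True
```

Because TH-21 anneals to the strand shown and extends toward lower base numbers, its extension product reads back through the TH-5 priming site.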
Screening the Photorhabdus Cosmid Library
The 2.9 kb gel-purified PCR fragment was labeled with 32P using the Boehringer Mannheim High Prime labeling kit as described in Example 8. Filters containing remnants of approximately 800 colonies from the cosmid library were screened as described previously (Example 8), and positive clones were streaked for isolated colonies and rescreened. Three clones (8A11, 25G8, and 26D1) gave positive results through several screening and characterization steps. No hybridization of the TcbAii-specific probe was ever observed with any of the four cosmids identified in Example 8, which contain the tcaB and tcaC genes. DNA from cosmids 8A11, 25G8, and 26D1 was digested with restriction enzymes Bgl II, EcoR I or Hind III (either alone or in combination with one another), and the fragments were separated on an agarose gel and transferred to a nylon membrane as described in Example 8. The membrane was hybridized with 32P-labeled probe prepared from the 4.5 kbp fragment (generated by amplification of Photorhabdus genomic DNA with primers TH-5+TH-17). The patterns generated from cosmid DNAs 8A11 and 26D1 were identical to those generated with similarly cut genomic DNA on the same membrane. It is concluded that cosmids 8A11 and 26D1 are accurate representations of the genomic TcbAii encoding locus. However, cosmid 25G8 has a single Bgl II fragment which is slightly larger than the corresponding fragment of the genomic DNA. This may result from positioning of the insert within the vector.
DNA Sequence of the tcbA-encoding Gene
The membrane hybridization analysis of cosmid 26D1 revealed that the 4.5 kbp probe hybridized to a single large EcoR I fragment (greater than 9 kbp). This fragment was gel purified and ligated into the EcoR I site of pBC KS(+) as described in Example 8, to generate plasmid pBC-S1/R1. The partial DNA sequence of the insert DNA of this plasmid was determined by "primer walking" from the flanking vector sequence, using procedures described in Example 8. Further sequence was generated by extension from new oligonucleotides designed from the previously determined sequence. When compared to the determined DNA sequence for the tcbA gene identified by other methods (disclosed herein as SEQ ID NO:11, as described in Example 12 below), complete homology was found to nucleotides 1-272, 319-826, 2578-3036, and 3068-3540 (total bases = 1712). It was concluded that both approaches can be used to identify DNA fragments encoding the TcbAii peptide.
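The stated total of 1712 matching bases follows from the four nucleotide ranges when each range is counted inclusively. A minimal check, with the variable names chosen here for illustration:

```python
# Ranges of SEQ ID NO:11 reported as completely homologous (1-based, inclusive).
MATCHED_RANGES = [(1, 272), (319, 826), (2578, 3036), (3068, 3540)]

lengths = [last - first + 1 for first, last in MATCHED_RANGES]
total = sum(lengths)

print(lengths)  # [272, 508, 459, 473]
print(total)    # 1712
```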
Analysis of the Derived Amino Acid Sequence of the tcbA Gene
The sequence of the DNA fragment identified as SEQ ID NO:11 encodes a protein whose derived amino acid sequence is disclosed herein as SEQ ID NO:12. Several features verify the identity of the gene as that encoding the TcbA protein. The TcbAii N-terminal peptide (SEQ ID NO:1; Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala) is encoded as amino acids 88-100. The TcbAii internal peptide TcbAii-PT81(a) (SEQ ID NO:23) is encoded as amino acids 1065-1077, and TcbAii-PT81(b) (SEQ ID NO:24) is encoded as amino acids 1571-1592. Further, the internal peptide TcbAii-PT56 (SEQ ID NO:22) is encoded as amino acids 1474-1488, and the internal peptide TcbAii-PT103 (SEQ ID NO:21) is encoded as amino acids 1614-1639. It is obvious that this gene is an authentic clone encoding the TcbAii peptide as isolated from insecticidal protein preparations of Photorhabdus luminescens strain W-14.
The protein isolated as peptide TcbAii is derived from cleavage of a longer peptide. Evidence for this is provided by the fact that the nucleotides encoding the TcbAii N-terminal peptide (SEQ ID NO:1) are preceded by 261 bases (encoding 87 N-terminal-proximal amino acids) of a longer open reading frame (SEQ ID NO:11). This reading frame begins with nucleotides that encode the amino acid sequence Met Gln Asn Ser Leu, which corresponds to the N-terminal sequence of the large peptide TcbA, and is disclosed herein as SEQ ID NO:16. It is thought that TcbA is the precursor protein for TcbAii.
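The coordinate arithmetic in this analysis (a 13-residue peptide at amino acids 88-100, preceded by 261 bases encoding 87 residues) can be reproduced with a small helper. This is a sketch only; it assumes positions are counted from the first base of the open reading frame, and the helper name is hypothetical.

```python
def codon_span(residue_number):
    """Return the (first, last) base of the codon for a 1-based residue number.

    Assumption: bases are numbered from the first base of the open reading
    frame, so residue 1 occupies bases 1-3.
    """
    first = 3 * (residue_number - 1) + 1
    return first, first + 2


# The N-terminal peptide (SEQ ID NO:1) is encoded as amino acids 88-100.
peptide_first, peptide_last = 88, 100
peptide_length = peptide_last - peptide_first + 1

# Bases and residues lying upstream of the first peptide codon.
upstream_bases = codon_span(peptide_first)[0] - 1
upstream_residues = upstream_bases // 3

print(peptide_length, upstream_bases, upstream_residues)
```

The values agree with the 13 residues listed for SEQ ID NO:1 and with the 261 bases and 87 amino acids cited as preceding it.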
Relationship of tcbA, tcaB and tcaC Genes
The tcaB and tcaC genes are closely linked and may be transcribed as a single mRNA (Example 8). The tcbA gene is borne on cosmids that apparently do not overlap the ones harboring the tcaB and tcaC cluster, since the respective genomic library screens identified different cosmids. However, comparison of the amino acid sequences encoded by the tcaB and tcaC genes with the tcbA gene reveals a substantial degree of homology. The amino acid conservation (Protein Alignment Mode of MacVector™ Sequence Analysis Software, scoring matrix pam250, hash value = 2; Oxford Molecular Group, Campbell, CA) is shown in Fig. 4. On the score line of each panel in Fig. 4, up carets (^) indicate homology or conservative amino acid changes, and down carets (v) indicate nonhomology.
This analysis shows that the amino acid sequence of the TcbA peptide from residues 1739 to 1894 is highly homologous to amino acids 441 to 603 of the TcaBi peptide (162 of the total 627 amino acids of TcaB; SEQ ID NO:28). In addition, the sequence of TcbA amino acids 1932 to 2459 is highly homologous to amino acids 12 to 531 of peptide TcaBii (520 of the total 562 amino acids; SEQ ID NO:30). Considering that the TcbA peptide (SEQ ID NO:12) comprises 2505 amino acids, a total of 684 amino acids (27%) at the C-proximal end of it is homologous to the TcaBi or TcaBii peptides, and the homologies are arranged colinear to the arrangement of the putative TcaB preprotein (SEQ ID NO:26). A sizeable gap in the TcbA homology coincides with the junction between the TcaBi and TcaBii portions of the TcaB preprotein. Clearly the TcbA and TcaB gene products are evolutionarily related, and it is proposed that they share some common function(s) in Photorhabdus.
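The 684-residue total and the 27% figure follow from the two TcbA residue ranges given above. A minimal sketch of that arithmetic, assuming inclusive residue ranges; the variable names are illustrative only.

```python
# TcbA residue ranges reported as homologous to the two TcaB-derived peptides
# (SEQ ID NO:28 and SEQ ID NO:30). Assumed 1-based and inclusive.
tcba_length = 2505
homologous_segments = [(1739, 1894), (1932, 2459)]

homologous_residues = sum(end - start + 1 for start, end in homologous_segments)
percent_homologous = 100 * homologous_residues / tcba_length

# Residues lying between the two homologous TcbA segments.
gap_residues = homologous_segments[1][0] - homologous_segments[0][1] - 1

print(homologous_residues, round(percent_homologous), gap_residues)
```

The computed gap is simply the count of TcbA residues between the two segments; the text describes it as coinciding with the junction between the two portions of the TcaB preprotein.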
Example 10
Characterization of Zinc-metalloproteases in Photorhabdus Broth:
Protease Inhibition, Classification, and Purification
Protease Inhibition and Classification Assays: Protease assays were performed using FITC-casein dissolved in water as substrate (0.08% final assay concentration). Proteolysis reactions were performed at 25°C for 1 h in the appropriate buffer with 25 μl of Photorhabdus broth (150 μl total reaction volume). Samples were also assayed in the presence and absence of dithiothreitol. After incubation, an equal volume of 12% trichloroacetic acid was added to precipitate undigested protein. Following precipitation for 0.5 h and subsequent centrifugation, 100 μl of the supernatant was placed into a 96-well microtiter plate and the pH of the solution was adjusted by addition of an equal volume of 4N NaOH. Proteolysis was then quantitated using a Fluoroskan II fluorometric plate reader at excitation and emission wavelengths of 485 and 538 nm, respectively. Protease activity was tested over a range from pH 5.0-10.0 in 0.5 unit increments. The following buffers were used at 50 mM final concentration: sodium acetate (pH 5.0-6.5); Tris-HCl (pH 7.0-8.0); and bis-Tris propane (pH 8.5-10.0). To identify the class of protease(s) observed, crude broth was treated with a variety of protease inhibitors (0.5 μg/μl final concentration) and then examined for protease activity at pH 8.0
using the substrate described above. The protease inhibitors used included E-64 (L-trans-epoxysuccinylleucylamido[4-guanidino]-butane), 3,4 dichloroisocoumarin, leupeptin, pepstatin, amastatin, ethylenediaminetetraacetic acid (EDTA) and 1,10 phenanthroline.
Protease assays performed over a pH range revealed that indeed protease(s) were present which exhibited maximal activity at about pH 8.0 (Table 17). Addition of DTT did not have any effect on protease activity. Crude broth was then treated with a variety of protease inhibitors (Table 18). Treatment of crude broth with the inhibitors described above revealed that 1,10 phenanthroline caused complete inhibition of all protease activity when added at a final concentration of 50 μg, with the IC50 = 5 μg in 100 μl of a 2 mg/ml crude broth solution. These data indicate that the most abundant protease(s) found in the Photorhabdus broth are from the zinc-metalloprotease class of enzymes.
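The pH series and buffer assignments used in the assay (pH 5.0-10.0 in 0.5 unit steps, three buffers at 50 mM) can be laid out as a simple lookup. This is an illustrative sketch only; the function names and data layout are assumptions made for the example.

```python
# Buffer ranges as stated for the assay (all at 50 mM final concentration).
BUFFERS = [
    (5.0, 6.5, "sodium acetate"),
    (7.0, 8.0, "Tris-HCl"),
    (8.5, 10.0, "bis-Tris propane"),
]


def ph_series(start=5.0, stop=10.0, step=0.5):
    """List the assay pH values from start to stop inclusive."""
    count = int(round((stop - start) / step))
    return [round(start + i * step, 1) for i in range(count + 1)]


def buffer_for(ph):
    """Return the buffer whose stated range contains the given pH."""
    for low, high, name in BUFFERS:
        if low <= ph <= high:
            return name
    raise ValueError(f"no buffer listed for pH {ph}")


assay_plan = {ph: buffer_for(ph) for ph in ph_series()}
for ph, name in assay_plan.items():
    print(f"pH {ph:>4}: {name}")
```

Each of the eleven pH points falls in exactly one of the stated buffer ranges, so the lookup never raises for the series used here.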
Table 17
Effect of pH on the Protease Activity Found in a Day 1 Production of Photorhabdus luminescens (Strain W-14)

pH      Flu. Units (a)    Percent Activity (b)
5.0      3013 ± 78          17
5.5      7994 ± 448         45
6.0     12965 ± 483         74
6.5     14390 ± 1291        82
7.0     14386 ± 1287        82
7.5     14135 ± 198         80
8.0     17582 ± 831        100
8.5     16183 ± 953         92
9.0     16795 ± 760         96
9.5     16279 ± 1022        93
10.0    15225 ± 210         87

(a) Flu. Units = Fluorescence Units (maximum = about 28,000, background = about 2200).
(b) Percent activity relative to the maximum at pH 8.0.
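The percent activity column of Table 17 can be recomputed from the mean fluorescence readings. The sketch below assumes each percentage is the reading divided by the largest reading (pH 8.0), rounded to a whole number, with no background subtraction; that assumption is inferred from the arithmetic and is not stated in the table.

```python
# Mean fluorescence readings from Table 17, keyed by assay pH.
readings = {
    5.0: 3013, 5.5: 7994, 6.0: 12965, 6.5: 14390, 7.0: 14386, 7.5: 14135,
    8.0: 17582, 8.5: 16183, 9.0: 16795, 9.5: 16279, 10.0: 15225,
}


def percent_of_peak(values):
    """Express each reading as a whole-number percentage of the largest one."""
    peak = max(values.values())
    return {ph: round(100 * value / peak) for ph, value in values.items()}


percent_activity = percent_of_peak(readings)
for ph in sorted(percent_activity):
    print(f"pH {ph:>4}: {percent_activity[ph]}%")
```

Under this assumption the recomputed values match the published column at every pH.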
Table 18
Effect of Different Protease Inhibitors on the Protease Activity at pH 8 Found in a Day 1 Production of Photorhabdus luminescens (Strain W-14)

Inhibitor                   Corrected Flu. Units (a)    Percent Inhibition (b)
Control                     13053                        0
E-64                        14259                        0
1,10 Phenanthroline (c)        15                       99
Leupeptin                   13074                        0
Pepstatin (c)               13441                        0
Amastatin                   12474                        4
DMSO Control                12005                        8
Methanol Control            12125                        7

(a) Corrected Flu. Units = Fluorescence Units - background (~2200 flu. units).
(b) Percent inhibition relative to protease activity at pH 8.0.
(c) Inhibitors were dissolved in methanol.
(d) Inhibitors were dissolved in DMSO.
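The percent inhibition column of Table 18 can be recomputed from the corrected fluorescence values. This sketch assumes inhibition is measured against the untreated control, truncated to a whole number and floored at zero when a treated sample reads higher than the control; the truncation and flooring conventions are assumptions that happen to reproduce the published column.

```python
# Background-corrected fluorescence values from Table 18.
corrected_units = {
    "E-64": 14259,
    "1,10 Phenanthroline": 15,
    "Leupeptin": 13074,
    "Pepstatin": 13441,
    "Amastatin": 12474,
    "DMSO Control": 12005,
    "Methanol Control": 12125,
}
CONTROL_UNITS = 13053


def percent_inhibition(value, control=CONTROL_UNITS):
    """Whole-number percent loss of activity relative to the control.

    Readings above the control are reported as 0 (assumed convention).
    """
    percent = 100 * (1 - value / control)
    return max(0, int(percent))


inhibition = {name: percent_inhibition(v) for name, v in corrected_units.items()}
for name, percent in inhibition.items():
    print(f"{name}: {percent}%")
```

Only 1,10 phenanthroline gives inhibition well above the solvent controls, which is the basis for the zinc-metalloprotease classification in the text.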
The isolation of a zinc-metalloprotease was performed by applying dialyzed 10-80% ammonium sulfate pellet to a Q Sepharose column equilibrated at 50 mM Na2PO4, pH 7.0 as described in Example 5 for Photorhabdus toxin. After extensive washing, a 0 to 0.5 M NaCl gradient was used to elute toxin protein. The majority of biological activity and protein was eluted from 0.15-0.45 M NaCl. However, it was observed that the majority of proteolytic activity was present in the 0.25-0.35 M NaCl fraction with some activity in the 0.15-0.25 M NaCl fraction. SDS PAGE analysis of the 0.25-0.35 M NaCl fraction showed a major peptide band of approximately 60 kDa. The 0.15-0.25 M NaCl fraction contained a similar 60 kDa band but at lower relative protein concentration. Subsequent gel filtration of this fraction using a Superose 12 HR 16/50 column resulted in a major peak migrating at 57.5 kDa that contained a predominant (> 90% of total stained protein) 58.5 kDa band by SDS PAGE analysis. Additional analysis of this fraction using various protease inhibitors as described above determined that the protease was a zinc-metalloprotease. Nearly all of the protease activity present in Photorhabdus broth at day 1 of fermentation corresponded to the about 58 kDa zinc-metalloprotease.
In yet a second isolation of zinc-metalloprotease(s), W-14 Photorhabdus broth grown for three days was taken and protease activity was visualized using sodium dodecyl sulfate-polyacrylamide
gel electrophoresis (SDS-PAGE) laced with gelatin as described in Schmidt, T.M., Bleakley, B. and Nealson, K.M. 1988. SDS running gels (5.5 x 8 cm) were made with 12.5% polyacrylamide (40% stock solution of acrylamide/bis-acrylamide; Sigma Chemical Co., St. Louis, MO) into which 0.1% gelatin final concentration (Biorad EIA grade reagent; Richmond, CA) was incorporated upon dissolving in water. SDS-stacking gels (1.0 x 8 cm) were made with 5% polyacrylamide, also laced with 0.1% gelatin. Typically, 2.5 μg of protein to be tested was diluted in 0.03 ml of SDS-PAGE loading buffer without dithiothreitol (DTT) and loaded onto the gel.
Proteins were electrophoresed in SDS running buffer (Laemmli, U.K. 1970. Nature 227, 680) at 0°C and at 8 mA. After electrophoresis was complete, the gel was washed for 2 h in 2.5% (v/v) Triton X-100. Gels were then incubated for 1 h at 37°C in 0.1 M glycine (pH 8.0). After incubation, gels were fixed and stained overnight with 0.1% amido black in methanol-acetic acid-water (30:10:60, vol./vol./vol.; Sigma Chemical Co.). Protease activity was visualized as light areas against a dark, amido black stained background due to proteolysis and subsequent diffusion of incorporated gelatin. At least three distinct bands produced by proteolytic activity at 58-, 41-, and 38 kDa were observed.
Activity assays of the different proteases in W-14 day three culture broth were performed using FITC-casein dissolved in water as substrate (0.02% final assay concentration). Proteolysis experiments were performed at 37°C for 0-0.5 h in 0.1 M Tris-HCl (pH 8.0) with different protein fractions in a total volume of 0.15 ml. Reactions were terminated by addition of an equal volume of 12% trichloroacetic acid (TCA) dissolved in water. After incubation at room temperature for 0.25 h, samples were centrifuged at 10,000 x g for 0.25 h and 0.10 ml aliquots were removed and placed into 96-well microtiter plates. The solution was then neutralized by the addition of an equal volume of 2 N sodium hydroxide, followed by quantitation using a Fluoroskan II fluorometric plate reader with excitation and emission wavelengths of 485 and 538 nm, respectively. Activity measurements were performed using FITC-casein with different protease concentrations at 37°C for 0-10 min. A unit of activity was arbitrarily defined as the amount of enzyme needed to produce 1000 fluorescent units/min and specific activity was defined as units/mg of protease.
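The unit definition above (one unit = 1000 fluorescent units per minute; specific activity = units per mg of protease) translates directly into two small functions. The sketch below is illustrative; the function names and the sample numbers in the usage lines are hypothetical and are not measurements from the specification.

```python
FLUORESCENT_UNITS_PER_ACTIVITY_UNIT = 1000.0  # per minute, as defined in the text


def activity_units(fluorescence_increase, minutes):
    """Convert a fluorescence increase over a time interval into activity units."""
    rate_per_min = fluorescence_increase / minutes
    return rate_per_min / FLUORESCENT_UNITS_PER_ACTIVITY_UNIT


def specific_activity(units, protease_mg):
    """Activity units per mg of protease."""
    return units / protease_mg


# Hypothetical example: 5000 fluorescent units gained in 10 min by 0.002 mg enzyme.
example_units = activity_units(5000, 10)
example_specific = specific_activity(example_units, 0.002)
print(example_units, example_specific)
```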
Inhibition studies were performed using two zinc-metalloprotease inhibitors, 1,10 phenanthroline and N-(α-rhamnopyranosyloxyhydroxyphosphinyl)-Leu-Trp (phosphoramidon), with stock solutions of the inhibitors dissolved in 100% ethanol and water, respectively. Stock concentrations were typically 10 mg/ml and 5 mg/ml for 1,10 phenanthroline and phosphoramidon, respectively, with final concentrations of inhibitor at 0.5-1.0 mg/ml per reaction. Treatment of three day W-14 crude broth with 1,10 phenanthroline, an inhibitor of all zinc metalloproteases, resulted in complete elimination of all protease activity, while treatment with phosphoramidon, an inhibitor of thermolysin-like proteases (Weaver, L.H., Kester, W.R., and Matthews, B.W. 1977. J. Mol. Biol. 114, 119-132), resulted in about 56% reduction of protease activity. The residual proteolytic activity could not be further reduced with additional phosphoramidon.
The proteases of three day W-14 Photorhabdus broth were purified as follows: 4.0 liters of broth were concentrated using an Amicon spiral ultrafiltration cartridge Type S1Y100 attached to an Amicon M-12 filtration device. The flow-through material having native proteins less than 100 kDa in size (3.8 L) was concentrated to 0.375 L using an Amicon spiral ultrafiltration cartridge Type S1Y10 attached to an Amicon M-12 filtration device. The retentate material contained proteins ranging in size from 10-100 kDa. This material was loaded onto a Pharmacia HR16/10 column which had been packed with PerSeptive Biosystems (Framingham, MA) Poros® 50 HQ strong anion exchange packing that had been equilibrated in 10 mM sodium phosphate buffer (pH 7.0). Proteins were loaded on the column at a flow rate of 5 ml/min, followed by washing unbound protein with buffer until A280 = 0.00. Afterwards, proteins were eluted using a NaCl gradient of 0-1.0 M NaCl in 40 min at a flow rate of 7.5 ml/min. Fractions were assayed for protease activity, supra., and active fractions were pooled.
Proteolytically active fractions were diluted with 50% (v/v) 10 mM sodium phosphate buffer (pH 7.0) and loaded onto a Pharmacia HR 10/10 Mono Q column equilibrated in 10 mM sodium phosphate. After washing the column with buffer until A280 = 0.00, proteins were eluted using a NaCl gradient of 0-0.5 M NaCl for 1 h at a flow rate of 2.0 ml/min. Fractions were assayed for protease activity. Those fractions having the greatest amount of phosphoramidon-sensitive protease activity, the phosphoramidon-sensitive activity being due to the 41/38 kDa protease, infra., were pooled. These fractions were found to elute at a range of 0.15-0.25 M NaCl. Fractions containing a predominance of phosphoramidon-insensitive protease activity, the 58 kDa protease, were also pooled. These fractions were found to elute at a range of 0.25-0.35 M NaCl.
The phosphoramidon-sensitive protease fractions were then concentrated to a final volume of 0.75 ml using a Millipore Ultrafree®-15 centrifugal filter device Biomax-5K NMWL membrane. This material was applied at a flow rate of 0.5 ml/min to a Pharmacia HR 10/30 column that had been packed with Pharmacia Sephadex G-50 equilibrated in 10 mM sodium phosphate buffer (pH 7.0)/0.1 M NaCl. Fractions having the maximal phosphoramidon-sensitive protease activity were then pooled and centrifuged over a Millipore Ultrafree®-15 centrifugal filter device Biomax-50K NMWL membrane. Proteolytic activity analysis, supra., indicated this material to have only phosphoramidon-sensitive protease activity. Pooling of the phosphoramidon-insensitive protease, the 58 kDa protein, was followed by concentrating in a Millipore Ultrafree®-15 centrifugal filter device Biomax-50K NMWL membrane and further separation on a Pharmacia Superdex-75 column. Fractions containing the protease were pooled.
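The two salt gradients described in the purification can be related to elution volume and time with simple arithmetic. The sketch below assumes a linear gradient and ignores column and system delay volumes; neither assumption is stated in the text, and the function names are illustrative.

```python
def gradient_volume_ml(duration_min, flow_ml_per_min):
    """Total volume delivered over the gradient."""
    return duration_min * flow_ml_per_min


def nacl_molar_at(elapsed_min, duration_min, start_molar, end_molar):
    """NaCl concentration at a given time, assuming a linear gradient."""
    fraction = elapsed_min / duration_min
    return start_molar + fraction * (end_molar - start_molar)


def minutes_at_molar(target_molar, duration_min, start_molar, end_molar):
    """Time at which a linear gradient reaches the target concentration."""
    return duration_min * (target_molar - start_molar) / (end_molar - start_molar)


# Poros 50 HQ step: 0-1.0 M NaCl in 40 min at 7.5 ml/min.
poros_volume = gradient_volume_ml(40, 7.5)
# Mono Q step: 0-0.5 M NaCl over 60 min at 2.0 ml/min.
mono_q_volume = gradient_volume_ml(60, 2.0)
# Nominal Mono Q window for the 0.15-0.25 M NaCl fractions.
window = (minutes_at_molar(0.15, 60, 0.0, 0.5), minutes_at_molar(0.25, 60, 0.0, 0.5))
print(poros_volume, mono_q_volume, window)
```

Because delay volume is ignored, the computed window is only a nominal position on the programmed gradient, not a predicted retention time.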
Analysis of the purified 58- and 41/38 kDa proteases revealed that, while both types of protease were completely inhibited with 1,10 phenanthroline, only the 41/38 kDa protease was inhibited with phosphoramidon. Further analysis of crude broth indicated that protease activity of day 1 W-14 broth has 23% of the total protease activity due to the 41/38 kDa protease, increasing to 44% in day three W-14 broth. Standard SDS-PAGE analysis for examining protein purity and obtaining amino terminal sequence was performed using 4-20% gradient MiniPlus SepraGels purchased from Integrated Separation Systems (Natick, MA). Proteins to be amino-terminal sequenced were blotted onto PVDF membrane following purification, infra., (ProBlott™ Membranes, Applied Biosystems, Foster City, CA), visualized with 0.1% amido black, excised, and sent to Cambridge Prochem, Cambridge, MA, for sequencing.
Deduced amino terminal sequences of the 58- (SEQ ID NO:45) and 41/38 kDa (SEQ ID NO:44) proteases from three day old W-14 broth
were DV-GSEKANEKLK (SEQ ID NO: 45) and DSGDDDKVTNTDIHR (SEQ ID NO: 44), respectively.
Sequencing of the 41/38 kDa protease revealed several amino termini, each one having an additional amino acid removed by proteolysis. Examination of the primary, secondary, tertiary and quaternary sequences for the 38 and 41 kDa polypeptides allowed for deduction of the sequence shown above and revealed that these two proteases are homologous.
Example 11, Part A
Screening of Photorhabdus Genomic Library Via Use of Antibodies for Genes Encoding TcbA Peptide
In parallel to the sequencing described above, suitable probing and sequencing was done based on the TcbAii peptide (SEQ ID NO:1). This sequencing was performed by preparing bacterial culture broths and purifying the toxin as described in Examples 1 and 2 above.
Genomic DNA was isolated from the Photorhabdus luminescens strain W-14 grown in Grace's insect tissue culture medium. The bacteria were grown in 5 ml of culture medium in a 250 ml Erlenmeyer flask at 28°C and 250 rpm for approximately 24 hours. Bacterial cells from 100 ml of culture medium were pelleted at 5000 x g for 10 minutes. The supernatant was discarded, and the cell pellets then were used for the genomic DNA isolation.
The genomic DNA was isolated using a modification of the CTAB method described in Section 2.4.3 of Ausubel (supra.). The section entitled "Large Scale CsCl prep of bacterial genomic DNA" was followed through step 6. At this point, an additional chloroform/isoamyl alcohol (24:1) extraction was performed, followed by a phenol/chloroform/isoamyl alcohol (25:24:1) extraction step and a final chloroform/isoamyl alcohol (24:1) extraction. The DNA was precipitated by the addition of a 0.6 volume of isopropanol. The precipitated DNA was hooked and wound around the end of a bent glass rod, dipped briefly into 70% ethanol as a final wash, and dissolved in 3 ml of TE buffer.
The DNA concentration, estimated by optical density at 280/260 nm, was approximately 2 mg/ml.
Using this genomic DNA, a library was prepared. Approximately 50 μg of genomic DNA was partly digested with Sau3A I. Then NaCl density gradient centrifugation was used to size fractionate the partially digested DNA fragments. Fractions containing DNA
fragments with an average size of 12 kb, or larger, as determined by agarose gel electrophoresis, were ligated into the plasmid BlueScript (Stratagene, La Jolla, California) and transformed into an E. coli DH5α or DHB10 strain.
Separately, purified aliquots of the protein were sent to the biotechnology hybridoma center at the University of Wisconsin, Madison for production of monoclonal antibodies to the proteins. The material that was sent was the HPLC purified fraction containing native bands 1 and 2 which had been denatured at 65°C, and 20 μg of which was injected into each of four mice. Stable monoclonal antibody-producing hybridoma cell lines were recovered after spleen cells from unimmunized mouse were fused with a stable myeloma cell line. Monoclonal antibodies were recovered from the hybridomas.
Separately, polyclonal antibodies were created by taking native agarose gel purified band 1 (see Example 1) protein, which was then used to immunize a New Zealand white rabbit. The protein was prepared by excising the band from the native agarose gels, briefly heating the gel pieces to 65°C to melt the agarose, and immediately emulsifying with adjuvant. Freund's complete adjuvant was used for the primary immunizations and Freund's incomplete was used for 3 additional injections at monthly intervals. For each injection, approximately 0.2 ml of emulsified band 1, containing 50 to 100 micrograms of protein, was delivered by multiple subcutaneous injections into the back of the rabbit. Serum was obtained 10 days after the final injection and additional bleeds were performed at weekly intervals for 3 weeks. The serum complement was inactivated by heating to 56°C for 15 minutes and then stored at -20°C.
The monoclonal and polyclonal antibodies were then used to screen the genomic library for the expression of antigens which could be detected by the epitope. Positive clones were detected on nitrocellulose filter colony lifts.
An immunoblot analysis of the positive clones was undertaken. An analysis of the clones as defined by both immunoblot and Southern analysis resulted in the tentative identification of four genomic regions.
In the first region was a gene encoding the peptide designated here as TcbAii. Full DNA sequence of this gene (tcbA) was obtained. It is set forth as SEQ ID NO:11. Confirmation that the sequence encodes the internal sequence of SEQ ID NO:1 is demonstrated by the presence of SEQ ID NO:1 at amino acid number 88
from the deduced amino acid sequence created by the open reading frame of SEQ ID NO:11. This can be confirmed by referring to SEQ ID NO:12, which is the deduced amino acid sequence created by SEQ ID NO:11.
The second region of toxin peptides contains the segments referred to above as TcaBi, TcaBii and TcaC. Following the screening of the library with the polyclonal antisera, this second region of toxin genes was identified by several clones which produced different size proteins, all of which cross-reacted with the polyclonal antibody on an immunoblot and were also found to share DNA homology on a Southern blot. Sequence comparison revealed that they belonged to the gene complex designated TcaB and TcaC above.
Two other regions of antibody toxin clones were also isolated in the polyclonal screen. These regions produced proteins that cross-react with a polyclonal antibody and also shared DNA homology with the regions as determined by Southern blotting. Thus, it appears that the Photorhabdus luminescens extracellular protein genes represent a family of genes which are evolutionarily related. To further pursue the concept that there might be evolutionarily related variations in the toxin peptides contained within this organism, two approaches have been undertaken to examine other strains of Photorhabdus luminescens for the presence of related proteins. This was done both by PCR amplification of genomic DNA and by immunoblot analysis using the polyclonal and monoclonal antibodies.
The results indicate that related proteins are produced by Photorhabdus luminescens strains WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-11, WX-12, WX-15 and W-14.
Example 11, Part B
Sequence and Analysis of tcc Toxin Clones
Further DNA sequencing was performed on plasmids isolated from E. coli clones described in Example 11, Part A. The nucleotide sequence from the third region of E. coli clones was shown to be three closely linked open reading frames at this genomic locus. This locus was designated tcc, with the three open reading frames designated tccA (SEQ ID NO:56), tccB (SEQ ID NO:58) and tccC (SEQ ID NO:60). The close linkage between these open reading frames is revealed by examination of SEQ ID NO:56, in which 93 bp separate the stop codon of tccA from the start codon of tccB (bases 2992-2994 of SEQ ID NO:56), and by examination of SEQ ID NO:58, in which
131 bases separate the stop codon of tccB and the start codon of tccC (bases 4930-4932 of SEQ ID NO:58). The physical map is presented in Fig. 6B.
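The spacing between the open reading frames can be turned into coordinates for the downstream start codon. The sketch below rests on two assumptions that the text does not state explicitly: that the quoted base positions are those of the upstream stop codon, and that the intergenic count excludes both the stop codon and the start codon. The helper name is hypothetical, and the resulting positions are arithmetic consequences of those assumptions, not values taken from the sequence listing.

```python
def downstream_start_codon(stop_codon_last_base, intergenic_bases):
    """Return (first, last) base of the next start codon.

    Assumes the intergenic count excludes both flanking codons.
    """
    first = stop_codon_last_base + intergenic_bases + 1
    return first, first + 2


# tccA stop codon at bases 2992-2994 of SEQ ID NO:56, followed by 93 bp.
tccB_start = downstream_start_codon(2994, 93)
# tccB stop codon at bases 4930-4932 of SEQ ID NO:58, followed by 131 bases.
tccC_start = downstream_start_codon(4932, 131)
print(tccB_start, tccC_start)
```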
The deduced amino acid sequence from the tccA open reading frame indicates that the gene encodes a protein of 105,459 Da. This protein was designated TccA (SEQ ID NO:57). The first 12 amino acids of this protein match the N-terminal sequence obtained from a 108 kDa protein, SEQ ID NO:8, previously identified as part of the toxin complex.
The deduced amino acid sequence from the tccB open reading frame indicates that this gene encodes a protein of 175,716 Da. This protein was designated TccB (SEQ ID NO:59). The first 11 amino acids of this protein match the N-terminal sequence obtained from a protein with estimated molecular weight of 185 kDa, SEQ ID NO:7. Similarity analysis revealed that the TccB protein is related to the proteins identified as TcbA (SEQ ID NO:12; 37% similarity and 28% identity), TcdA (SEQ ID NO:47; 35% similarity and 28% identity), and TcaB (SEQ ID NO:26; 32% similarity and 26% identity), using the GAP algorithm of the Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wisconsin. The deduced amino acid sequence of tccC indicated that this open reading frame encodes a protein of 111,694 Da, and the protein product was designated TccC (SEQ ID NO:61).
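The deduced masses can be compared with the sizes reported for the corresponding purified proteins. This is a minimal sketch; the function name is illustrative, and the comparison itself is offered as a reading aid rather than an analysis performed in the specification.

```python
def percent_over_deduced(deduced_da, reported_kda):
    """Percent by which the reported size exceeds the sequence-deduced mass."""
    reported_da = reported_kda * 1000
    return 100 * (reported_da - deduced_da) / deduced_da


# Deduced masses (Da) and reported protein sizes (kDa) quoted in the text.
tcca_difference = percent_over_deduced(105_459, 108)
tccb_difference = percent_over_deduced(175_716, 185)
print(round(tcca_difference, 1), round(tccb_difference, 1))
```

Both reported sizes are within a few percent of the deduced masses, which is consistent with the N-terminal sequence matches described above.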
Example 12
Characterization of Photorhabdus Strains
In order to establish that the collection described herein was comprised of Photorhabdus strains, the strains herein were assessed in terms of recognized microbiological traits that are characteristic of Photorhabdus and which differentiate it from other Enterobacteriaceae and Xenorhabdus spp. (Farmer, J. J. 1984. Bergey's Manual of Systematic Bacteriology, Vol. 1, pp. 510-511 (ed. Krieg, N. R. and Holt, J. G.). Williams & Wilkins, Baltimore; Akhurst and Boemare, 1988; Boemare et al., 1993). These characteristic traits are as follows: Gram's stain negative rods, organism size of 0.5-2 μm in width and 2-10 μm in length, red/yellow colony pigmentation, presence of crystalline inclusion bodies, presence of catalase, inability to reduce nitrate, presence of bioluminescence, ability to take up dye from growth media, positive for protease production, growth-temperature range below 37°C, survival under anaerobic conditions and positively motile
(Table 20). Reference Escherichia coli, Xenorhabdus and Photorhabdus strains were included in all tests for comparison. The overall results are consistent with all strains being part of the family Enterobacteriaceae and the genus Photorhabdus.
A luminometer was used to establish the bioluminescence of each strain and provide a quantitative and relative measurement of light production. For measurement of relative light emitting units, the broths from each strain (cells and media) were measured at three time intervals after inoculation in liquid culture (6, 12, and 24 hr) and compared to background luminosity (uninoculated media and water). Prior to measuring light emission from the various broths, cell density was established by measuring light absorbance (560 nm) in a Gilford Systems (Oberlin, OH) spectrophotometer using a sipper cell. Appropriate dilutions were then made (to normalize optical density to 1.0 unit) before measuring luminosity. Aliquots of the diluted broths were then placed into cuvettes (300 μl each) and read in a Bio-Orbit 1251 Luminometer (Bio-Orbit Oy, Turku, Finland). The integration period for each sample was 45 seconds. The samples were continuously mixed (spun in baffled cuvettes) while being read to provide oxygen availability. A positive test was determined as being > 5-fold background luminescence (about 5-10 units). In addition, colony luminosity was detected with photographic film overlays and visually, after adaptation in a darkroom.
The Gram's staining characteristics of each strain were established with a commercial Gram's stain kit (BBL, Cockeysville, MD) used in conjunction with Gram's stain control slides (Fisher Scientific, Pittsburgh, PA). Microscopic evaluation was then performed using a Zeiss microscope (Carl Zeiss, Germany) 100X oil immersion objective lens (with 10X ocular and 2X body magnification).
Microscopic examination of individual strains for organism size, cellular description and inclusion bodies (the latter after logarithmic growth) was performed using wet mount slides (10X ocular, 2X body and 40X objective magnification) with oil immersion and phase contrast microscopy with a micrometer (Akhurst, R.J. and Boemare, N.E. 1990. Entomopathogenic Nematodes in Biological Control (ed. Gaugler, R. and Kaya, H.), pp. 75-90. CRC Press, Boca Raton, USA; Baghdiguian S., Boyer-Giglio M.H., Thaler, J.O., Bonnot G., Boemare N. 1993. Biol. Cell 79, 177-185).
Colony pigmentation was observed after
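The luminometer procedure above has two numeric steps: diluting each broth so its optical density is normalized to 1.0, and calling a strain positive when its reading exceeds five times background. A minimal sketch follows; the function names and the sample readings in the usage lines are hypothetical, and the choice to leave broths already at or below OD 1.0 undiluted is an assumption.

```python
def dilution_factor(measured_od, target_od=1.0):
    """Fold dilution needed to bring a broth to the target optical density.

    Broths already at or below the target are left undiluted (assumption).
    """
    if measured_od <= target_od:
        return 1.0
    return measured_od / target_od


def is_luminescent(sample_units, background_units, fold=5.0):
    """Positive when the sample reading is more than `fold` times background."""
    return sample_units > fold * background_units


# Hypothetical readings for illustration only.
print(dilution_factor(2.5))       # broth at OD 2.5 needs a 2.5-fold dilution
print(is_luminescent(60.0, 10.0))  # 60 units against a background of 10
print(is_luminescent(50.0, 10.0))  # exactly 5-fold is not above the cutoff
```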
inoculation on Bacto nutrient agar (Difco Laboratories, Detroit, MI) prepared as per label instructions. Incubation occurred at 28°C and descriptions were produced after 5-7 days.
To test for the presence of the enzyme catalase, a colony of the test organism was removed on a small plug from a nutrient agar plate and placed into the bottom of a glass test tube. One ml of a household hydrogen peroxide solution was gently added down the side of the tube. A positive reaction was recorded when bubbles of gas (presumptive oxygen) appeared immediately or within 5 seconds. Controls of uninoculated nutrient agar and hydrogen peroxide solution were also examined.
To test for nitrate reduction, each culture was inoculated into 10 ml of Bacto Nitrate Broth (Difco Laboratories, Detroit, MI). After 24 hours incubation at 28°C, nitrite production was tested by the addition of two drops of sulfanilic acid reagent and two drops of alpha-naphthylamine reagent (see Difco Manual, 10th edition, Difco Laboratories, Detroit, MI, 1984). The generation of a distinct pink or red color indicates the formation of nitrite from nitrate.
The ability of each strain to take up dye from growth media was tested with Bacto MacConkey agar containing the dye neutral red, Bacto Tergitol-7 agar containing the dye bromothymol blue and Bacto EMB Agar containing the dye eosin-Y (agars from Difco Laboratories, Detroit, MI, all prepared according to label instructions). After inoculation on these media, dye uptake was recorded after incubation at 28°C for 5 days. Growth on these latter media is characteristic for members of the family Enterobacteriaceae.
Motility of each strain was tested using a solution of Bacto Motility Test Medium (Difco Laboratories, Detroit, MI) prepared as per label instructions. A butt-stab inoculation was performed with each strain and motility was judged macroscopically by a diffuse zone of growth spreading from the line of inoculum. In many cases, motility was also observed microscopically from liquid culture under wet mount slides.
Biochemical nutrient evaluation for each strain was performed using BBL Enterotube II (Becton Dickinson, Germany). Product instructions were followed with the exception that incubation was carried out at 28°C for 5 days. Results were consistent with previously cited reports for Photorhabdus.
The production of protease was tested by observing hydrolysis of gelatin using Bacto gelatin (Difco Laboratories, Detroit, MI) plates made as per label instructions. Cultures were inoculated and the plates were incubated at 28°C for 5 days.
To assess growth at different temperatures, agar plates [2% proteose peptone #3 with two percent Bacto-Agar (Difco, Detroit, MI) in deionized water] were streaked from a common source of inoculum. Plates were sealed with Nesco film and incubated at 20, 28 and 37°C for up to three weeks. Plates showing no growth at 37°C showed no cell viability after transfer to a 28°C incubator for one week.
Oxygen requirements for Photorhabdus strains were tested in the following manner. A butt-stab inoculation into fluid thioglycolate broth medium (Difco, Detroit, MI) was made. The tubes were incubated at room temperature for one week and cultures were then examined for type and extent of growth. The indicator resazurin demonstrates the level of medium oxidation or the aerobiosis zone (Difco Manual, 10th edition, Difco Laboratories, Detroit, MI). Growth zone results obtained for the Photorhabdus strains tested were consistent with those of a facultative anaerobic microorganism.
Table 19
Taxonomic Traits of Photorhabdus Strains
C=Bioluminescence, D=Cell form, E=Motility, F=Nitrate reduction, G=Presence of catalase, H=Gelatin hydrolysis, I=Dye uptake, J=Pigmentation, K=Growth on EMB agar, L=Growth on MacConkey agar, M=Growth on Tergitol-7 agar, N=Facultative anaerobe, O=Growth at 20°C, P=Growth at 28°C, Q=Growth at 37°C; † +/- = positive or negative for trait, rd=rod, S=sized within Genus descriptors, RO=red-orange, LR=light red, R=red, O=orange, Y=yellow, T=tan, LY=light yellow, YT=yellow tan, and LO=light orange.
Cellular fatty acid analysis is a recognized tool for bacterial characterization at the genus and species level (Tornabene, T. G. 1985. Lipid Analysis and the Relationship to Chemotaxonomy in Methods in Microbiology, Vol. 18, 209-234; Goodfellow, M. and O'Donnell, A. G. 1993. Roots of Bacterial Systematics in Handbook of New Bacterial Systematics (ed. Goodfellow, M. & O'Donnell, A. G.), pp. 3-54. London: Academic Press Ltd.); these references are incorporated herein by reference. The technique was used to confirm that our collection was related at the genus level. Cultures were shipped to an external contract laboratory for fatty acid methyl ester analysis (FAME) using a Microbial ID (MIDI, Newark, DE, USA) Microbial Identification System (MIS). The MIS system consists of a Hewlett Packard HP5890A gas chromatograph with a 25 m x 0.2 mm 5% methylphenyl silicone fused silica capillary column. Hydrogen is used as the carrier gas and a flame-ionization detector functions in conjunction with an automatic sampler, integrator and computer. The computer compares the sample fatty acid methyl esters to a microbial fatty acid library and against a calibration mix of known fatty acids. As selected by the contract laboratory, strains were grown for 24 hours at 28°C on trypticase soy agar prior to analysis. Extraction of samples was performed by the contract lab as per standard FAME methodology. There was no direct identification of the strains to any luminescent bacterial group other than Photorhabdus. When the cluster analysis was performed, which compares the fatty acid profiles of a group of isolates, the strain fatty acid profiles were related at the genus level.
The evolutionary diversity of the Photorhabdus strains in our collection was measured by analysis of PCR (Polymerase Chain Reaction) mediated genomic fingerprinting using genomic DNA from each strain. This technique is based on families of repetitive DNA sequences present throughout the genome of diverse bacterial species (reviewed by Versalovic, J., Schneider, M., de Bruijn, F. J. and Lupski, J. R. 1994. Methods Mol. Cell. Biol., 5, 25-40). Three of these, the repetitive extragenic palindromic sequence (REP), the enterobacterial repetitive intergenic consensus (ERIC) and the BOX element, are thought to play an important role in the organization of the bacterial genome. Genomic organization is believed to be shaped by selection, and the differential dispersion of these elements within the genome of closely related bacterial strains can be used to discriminate these strains (e.g., Louws, F. J., Fulbright, D. W., Stephens, C. T. and de Bruijn, F. J. 1994. Appl. Environ. Microbiol. 60, 2286-2295). Rep-PCR utilizes oligonucleotide primers complementary to these repetitive sequences to amplify the variably sized DNA fragments lying between them. The resulting products are separated by electrophoresis to establish the DNA "fingerprint" for each strain.
To isolate genomic DNA from our strains, cell pellets were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a
final volume of 10 ml and 12 ml of 5 M NaCl was then added. This mixture was centrifuged for 20 min at 15,000 x g. The resulting pellet was resuspended in 5.7 ml of TE, and 300 μl of 10% SDS and 60 μl of 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY) were added. This mixture was incubated at 37°C for 1 hr, approximately 10 mg of lysozyme was then added and the mixture was incubated for an additional 45 min. One milliliter of 5 M NaCl and 800 μl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were then added and the mixture was incubated for 10 min at 65°C, gently agitated, then incubated and agitated for an additional 20 min to aid clearing of the cellular material. An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed gently, then centrifuged. Two extractions were then performed with an equal volume of phenol/chloroform/isoamyl alcohol (50:49:1). Genomic DNA was precipitated with 0.6 volume of isopropanol.
Precipitated DNA was removed with a glass rod, washed twice with 70% ethanol, dried and dissolved in 2 ml of STE (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 1 mM EDTA). The DNA was then quantitated by optical density at 260 nm. To perform rep-PCR analysis of Photorhabdus genomic DNA, the following primers were used: REP1R-I, 5'-IIIICGICGICATCIGGC-3' and REP2-I, 5'-ICGICTTATCIGGCCTAC-3'. PCR was performed using the following 25 μl reaction: 7.75 μl H2O, 2.5 μl 10X LA buffer (PanVera Corp., Madison, WI), 16 μl dNTP mix (2.5 mM each), 1 μl of each primer at 50 pM/μl, 1 μl DMSO, 1.5 μl genomic DNA (concentrations ranged from 0.075-0.480 μg/μl) and 0.25 μl TaKaRa EX Taq (PanVera Corp., Madison, WI). The PCR amplification was performed in a Perkin Elmer DNA Thermal Cycler (Norwalk, CT) using the following conditions: 95°C/7 min, then 35 cycles of 94°C/1 min, 44°C/1 min, 65°C/8 min, followed by 15 min at 65°C. After cycling, the 25 μl reaction was added to 5 μl of 6X gel loading buffer (0.25% bromophenol blue, 40% w/v sucrose in H2O). A 15 x 20 cm 1% agarose gel was then run in TBE buffer (0.09 M Tris-borate, 0.002 M EDTA) using 8 μl of each reaction. The gel was run for approximately 16 hours at 45 V. Gels were then stained in 20 μg/ml ethidium bromide for 1 hour and destained in TBE buffer for approximately 3 hours. Polaroid photographs of the gels were then taken under UV illumination.
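The quantitation of DNA by optical density at 260 nm can be sketched as follows. This is a minimal illustration only: the conversion factor of 50 μg/ml per absorbance unit for double-stranded DNA is a common laboratory convention that is assumed here, and the function name, dilution factor and path length are hypothetical and do not come from the text.

```python
# Assumed convention (not stated in the text): an A260 of 1.0 in a 1 cm
# path corresponds to roughly 50 ug/ml of double-stranded DNA.
UG_PER_ML_PER_A260 = 50.0


def dsdna_conc_ug_per_ml(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate the dsDNA concentration of the undiluted stock in ug/ml."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a260 * UG_PER_ML_PER_A260 * dilution_factor / path_cm


if __name__ == "__main__":
    # Hypothetical reading: A260 of 0.20 on a 1:10 dilution of the stock.
    conc = dsdna_conc_ug_per_ml(0.20, dilution_factor=10)
    print(f"{conc:.1f} ug/ml = {conc / 1000:.3f} ug/ul")
```

With the hypothetical reading shown, the estimate lies within the 0.075-0.480 μg/μl range of genomic DNA concentrations reported for the PCR reactions.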
The presence or absence of bands at specific sizes for each strain was scored from the photographs and entered as a similarity
matrix in the numerical taxonomy software program, NTSYS-pc (Exeter Software, Setauket, NY). Controls of E. coli strain HB101 and Xanthomonas oryzae pv. oryzae assayed at the same time produced PCR "fingerprints" corresponding to published reports (Versalovic, J., Koeuth, T. and Lupski, J. R. 1991. Nucleic Acids Res. 19, 6823-6831; Vera Cruz, C. M., Halda-Alija, L., Louws, F., Skinner, D. Z., George, M. L., Nelson, R. J., de Bruijn, F. J., Rice, C. and Leach, J. E. 1995. Int. Rice Res. Notes, 20, 23-24; Vera Cruz, C. M., Ardales, E. Y., Skinner, D. Z., Talag, J., Nelson, R. J., Louws, F. J., Leung, H., Mew, T. W. and Leach, J. E. 1996. Phytopathology (in press), respectively). The data from Photorhabdus strains were then analyzed with a series of programs within NTSYS-pc: SIMQUAL (Similarity for Qualitative data) to generate a matrix of similarity coefficients (using the Jaccard coefficient) and SAHN (Sequential, Agglomerative, Hierarchical and Nested) clustering [using the UPGMA (Unweighted Pair-Group Method with Arithmetic Averages) method], which groups related strains and can be expressed as a phenogram (Fig. 5). The COPH (cophenetic values) and MXCOMP (matrix comparison) programs were used to generate a cophenetic value matrix and compare the correlation between this and the original matrix upon which the clustering was based. A resulting normalized Mantel statistic (r) was generated, which is a measure of the goodness of fit for a cluster analysis (r = 0.8-0.9 represents a very good fit). In our case r = 0.919. Therefore, our collection is comprised of a diverse group of easily distinguishable strains representative of the Photorhabdus genus.
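The similarity step described above can be illustrated with a short sketch. It computes the Jaccard coefficient on presence/absence band scores, which is the coefficient the text names; it is not the NTSYS-pc implementation, and the strain labels and band profiles below are hypothetical placeholders rather than data from this work.

```python
def jaccard(profile_a, profile_b):
    """Jaccard coefficient for two equal-length presence/absence profiles.

    Bands shared by both strains are divided by bands present in at least
    one strain; positions where both strains lack a band are ignored.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same band positions")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return shared / either if either else 0.0


def similarity_matrix(profiles):
    """Return (names, matrix) of pairwise Jaccard coefficients."""
    names = sorted(profiles)
    matrix = [[jaccard(profiles[p], profiles[q]) for q in names] for p in names]
    return names, matrix


if __name__ == "__main__":
    # Hypothetical band scores (1 = band present at that size, 0 = absent).
    bands = {
        "strain_A": [1, 1, 0, 1, 0],
        "strain_B": [1, 0, 0, 1, 0],
        "strain_C": [0, 1, 1, 0, 1],
    }
    names, matrix = similarity_matrix(bands)
    for name, row in zip(names, matrix):
        print(name, ["%.2f" % value for value in row])
```

A matrix of this kind is the input that an average-linkage (UPGMA) clustering routine would then group into a phenogram.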
Example 13 Insecticidal Utility of Toxin(s) Produced by Various Photorhabdus Strains
Initial "seed" cultures of the various Photorhabdus strains were produced by inoculating 175 ml of 2% Proteose Peptone #3 (PP3) (Difco Laboratories, Detroit, MI) liquid media with a primary variant subclone in a 500 ml tribaffled flask with a Delong neck, covered with a Kaput. Inoculum for each seed culture was derived from oil-overlay agar slant cultures or plate cultures. After inoculation, these flasks were incubated for 16 hrs at 28°C on a rotary shaker at 150 rpm. These seed cultures were then used as
uniform inoculum sources for a given fermentation of each strain. Additionally, overlaying the post-log seed culture with sterile mineral oil, adding a sterile magnetic stir bar for future resuspension and storing the culture in the dark at room temperature provided long-term preservation of inoculum in a toxin-competent state. The production broths were inoculated by adding 1% of the actively growing seed culture to fresh 2% PP3 media (e.g., 1.75 ml per 175 ml fresh media). Production of broths occurred in either 500 ml tribaffled flasks (see above), or 2800 ml baffled, convex bottom flasks (500 ml volume) covered by a silicon foam closure. Production flasks were incubated for 24-48 hrs under the above mentioned conditions. Following incubation, the broths were dispensed into sterile 1 L polyethylene bottles, spun at 2600 x g for 1 hr at 10°C and decanted from the cell and debris pellet. The liquid broth was then vacuum filtered through Whatman GF/D (2.7 μm retention) and GF/B (1.0 μm retention) glass filters to remove debris. Further broth clarification was achieved with a tangential flow microfiltration device (Pall Filtron, Northborough, MA) using a 0.5 μm open-channel filter. When necessary, additional clarification could be obtained by chilling the broth (to 4°C) and centrifuging for several hours at 2600 x g. Following these procedures, the broth was filter sterilized using a 0.2 μm nitrocellulose membrane filter. Sterile broths were then used directly for biological assay or biochemical analysis, or were concentrated (up to 15-fold) using a 10,000 MW cut-off, M12 ultrafiltration device (Amicon, Beverly, MA) or centrifugal concentrators (Millipore, Bedford, MA and Pall Filtron, Northborough, MA) with a 10,000 MW pore size. In the case of centrifugal concentrators, the broth was spun at 2000 x g for approximately 2 hr.
The 10,000 MW permeate was added to the corresponding retentate to achieve the desired concentration of components greater than 10,000 MW. Heat inactivation of processed broth samples was achieved by heating the samples at 100°C in a sand-filled heat block for 10 minutes.
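The volume arithmetic behind the inoculation and concentration steps can be sketched as follows. The function names and the worked volumes (other than the 1.75 ml per 175 ml example given in the text) are hypothetical, and the sketch assumes that components above the cut-off are fully retained, which the text does not state.

```python
def inoculum_volume_ml(fresh_medium_ml, percent=1.0):
    """Volume of seed culture needed for a given percent (v/v) inoculum."""
    return fresh_medium_ml * percent / 100.0


def fold_concentration(start_volume_ml, retentate_volume_ml):
    """Fold concentration of retained components (assumes full retention)."""
    return start_volume_ml / retentate_volume_ml


def permeate_to_add_ml(start_volume_ml, retentate_volume_ml, target_fold):
    """Permeate to add back so the retentate reaches the target fold."""
    target_volume = start_volume_ml / target_fold
    if target_volume < retentate_volume_ml:
        raise ValueError("retentate is not yet concentrated enough")
    return target_volume - retentate_volume_ml


if __name__ == "__main__":
    print(inoculum_volume_ml(175))             # 1% inoculum for 175 ml
    print(fold_concentration(1000, 50))        # hypothetical 1 L -> 50 ml
    print(permeate_to_add_ml(1000, 50, 10))    # dilute back to 10-fold
```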
The broth(s) and toxin complex(es) from different Photorhabdus strains are useful for reducing populations of insects and were used in a method of inhibiting an insect population which comprises applying to a locus of the insect an effective insect inactivating amount of the active described. A demonstration of the breadth of insecticidal activity observed from broths of a selected group of Photorhabdus strains fermented as described above is shown in Table 20. It is possible that additional insecticidal activities could be detected with these strains through increased concentration of the broth or by employing different fermentation methods. Consistent with the activity being associated with a protein, the insecticidal activity of all strains tested was heat labile (see above).
Culture broth(s) from diverse Photorhabdus strains show differential insecticidal activity (mortality and/or growth inhibition, reduced adult emergence) against a number of insects. More specifically, the activity is seen against corn rootworm larvae and boll weevil larvae, which are members of the insect order Coleoptera. Other members of the Coleoptera include wireworms, pollen beetles, flea beetles, seed beetles and Colorado potato beetle. Activity is also observed against aster leafhopper and corn planthopper, which are members of the order Homoptera. Other members of the Homoptera include planthoppers, pear psylla, apple sucker, scale insects, whiteflies, spittle bugs as well as numerous host specific aphid species. The broths and purified toxin complex(es) are also active against tobacco budworm, tobacco hornworm and European corn borer, which are members of the order Lepidoptera. Other typical members of this order are beet armyworm, cabbage looper, black cutworm, corn earworm, codling moth, clothes moth, Indian mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm and fall armyworm. Activity is also seen against fruitfly and mosquito larvae, which are members of the order Diptera. Other members of the order Diptera are pea midge, carrot fly, cabbage root fly, turnip root fly, onion fly, crane fly, house fly and various mosquito species. Activity with broth(s) and toxin complex(es) is also seen against two-spotted spider mite, which is a member of the order Acarina, which includes strawberry spider mites, broad mites, citrus red mite, European red mite, pear rust mite and tomato russet mite. Activity against corn rootworm larvae was tested as follows. Photorhabdus culture broth(s) (0-15 fold concentrated, filter sterilized), 2% Proteose Peptone #3, purified toxin complex(es), or 10 mM sodium phosphate buffer, pH 7.0 were applied directly to the surface (about 1.5 cm2) of artificial diet (Rose, R. I. and McCabe,
J. M. (1973). J. Econ. Entomol. 66, 398-400) in 40 μl aliquots. Toxin complex was diluted in 10 mM sodium phosphate buffer, pH 7.0. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR) hatched from surface sterilized eggs. The plates were sealed, placed in a humidified growth chamber and maintained at 27°C for the appropriate period (3-5 days). Mortality and larval weight determinations were then scored. Generally, 16 insects per treatment were used in all studies. Control mortality was generally less than 5%.
Activity against boll weevil (Anthonomus grandis) was tested as follows. Concentrated (1-10 fold) Photorhabdus broths, control medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied in 60 μl aliquots to the surface of 0.35 g of artificial diet (Stoneville Yellow lepidopteran diet) and allowed to dry. A single, 12-24 hr boll weevil larva was placed on the diet, and the wells were sealed and held at 25°C, 50% RH for 5 days. Mortality and larval weights were then assessed. Control mortality ranged between 0-13%.
Activity against mosquito larvae was tested as follows. The assay was conducted in a 96-well microtiter plate. Each well contained 200 μl of aqueous solution (10-fold concentrated Photorhabdus culture broth(s), control medium (2% Proteose Peptone #3), 10 mM sodium phosphate buffer, toxin complex(es) at 0.23 mg/ml or H2O) and approximately 20 1-day old larvae (Aedes aegypti).
There were 6 wells per treatment. The results were read at 3-4 days after infestation. Control mortality was between 0-20%.
Activity against fruitflies was tested as follows. Purchased Drosophila melanogaster medium was prepared using 50% dry medium and 50% liquid of either water, control medium (2% Proteose Peptone #3), 10-fold concentrated Photorhabdus culture broth(s), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0. This was accomplished by placing 4.0 ml of dry medium in each of 3 rearing vials per treatment and adding 4.0 ml of the appropriate liquid. Ten late instar Drosophila melanogaster maggots were then added to each 25 ml vial. The vials were held on a laboratory bench, at room temperature, under fluorescent ceiling lights. Pupal or adult counts were made after 15 days of exposure.
Adult emergence was compared with that in the water and control medium treatments (0-16% reduction).
Activity against aster leafhopper adults (Macrosteles severini) and corn planthopper nymphs (Peregrinus maidis) was tested with an ingestion assay designed to allow ingestion of the active without other external contact. The reservoir for the active/"food" solution is made by making 2 holes in the center of the bottom portion of a 35 x 10 mm Petri dish. A 2 inch Parafilm M square is placed across the top of the dish and secured with an "O" ring. A 1 oz. plastic cup is then infested with approximately 7 hoppers and the reservoir is placed on top of the cup, Parafilm down. The test solution is then added to the reservoir through the holes. In tests using 10-fold concentrated Photorhabdus culture broth(s), the broth and control medium (2% Proteose Peptone #3) were dialyzed against 10 mM sodium phosphate buffer, pH 7.0, and sucrose (to 5%) was added to the resulting solution to reduce control mortality. Purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 was also tested. The assay was held in an incubator at 28°C, 70% RH with a 16/8 photoperiod. The assays were graded for mortality at 72 hours (day 3). Control mortality was less than 6%.
Activity against lepidopteran larvae was tested as follows. Concentrated (10-fold) Photorhabdus culture broth(s), control medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied directly to the surface (about 1.5 cm2) of standard artificial lepidopteran diet (Stoneville Yellow diet) in 40 μl aliquots. The diet plates were allowed to air-dry in a sterile flow-hood and each well was infested with a single, neonate larva. European corn borer (Ostrinia nubilalis) and tobacco hornworm (Manduca sexta) eggs were obtained from commercial sources and hatched in-house, whereas tobacco budworm (Heliothis virescens) larvae were supplied internally. Following infestation with larvae, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period. Mortality and weight determinations were scored at day 5. Generally, 16 insects per treatment were used in all studies. Control mortality generally ranged from about 4 to about 12.5% for control medium and was less than 10% for phosphate buffer.
Activity against two-spotted spider mite (Tetranychus urticae) was determined as follows. Young squash plants were trimmed to a single cotyledon and sprayed to run-off with 10-fold concentrated broth(s), control medium (2% Proteose Peptone #3), purified toxin complex(es), or 10 mM sodium phosphate buffer, pH 7.0. After drying, the plants were infested with a mixed population of spider mites and held at lab temperature and humidity for 72 hr. Live mites were then counted to determine levels of control.
Table 20
Observed Insecticidal Spectrum of Broths from Different Photorhabdus Strains

Photorhabdus strain    Sensitive* insect species
WX-1                   3**, 4, 5, 6, 7,
WX-2                   2, 4
WX-3                   1, 4
WX-4                   1, 4
WX-5                   4
WX-6                   4
WX-7                   3, 4, 5, 6, 7,
WX-8                   1, 2, 4
WX-9                   1
WX-10                  4
WX-11                  1, 4
WX-12                  2, 4, 5, 6, 7, [illegible]
WX-14                  1, 4
WX-15                  4
W30                    5,
NC-1                   3, 7, 8, 9
WIR                    5,
HP88                   4,
Hb                     5, 7, 8
Hm                     3, 4, 5, 7, 8
H9                     3, 4, 5, 6, 7, 8
W-14                   3, 4, 5, 6, 7, 8, 10
ATCC 43948
ATCC 43949
ATCC 43950
ATCC 43951
ATCC 43952

* > 25% mortality and/or growth inhibition vs. control
** 1, Tobacco budworm; 2, European corn borer; 3, Tobacco hornworm; 4, Southern corn rootworm; 5, Boll weevil; 6, Mosquito; 7, Fruit fly; 8, Aster leafhopper; 9, Corn planthopper; 10, Two-spotted spider mite.
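The scoring rule in the Table 20 footnote can be sketched as follows. This is one possible reading only: it assumes that "vs. control" means the treatment value minus the control value for mortality, and the reduction in mean larval weight relative to the control for growth inhibition. The function names and the example counts and weights are hypothetical.

```python
def percent_mortality(dead, total):
    """Percent of insects dead in a treatment."""
    return 100.0 * dead / total


def percent_growth_inhibition(mean_weight_treated, mean_weight_control):
    """Percent reduction in mean larval weight relative to the control."""
    return 100.0 * (1.0 - mean_weight_treated / mean_weight_control)


def is_sensitive(treated_mortality, control_mortality,
                 growth_inhibition=0.0, threshold=25.0):
    """Flag a species as sensitive under the assumed reading of the footnote:
    mortality exceeds control by more than the threshold, and/or growth
    inhibition exceeds the threshold."""
    excess_mortality = treated_mortality - control_mortality
    return excess_mortality > threshold or growth_inhibition > threshold


if __name__ == "__main__":
    # Hypothetical plate: 16 larvae per treatment, as in the assays above.
    treated = percent_mortality(6, 16)
    control = percent_mortality(1, 16)
    inhibition = percent_growth_inhibition(2.1, 2.4)
    print(treated, control, round(inhibition, 1),
          is_sensitive(treated, control, inhibition))
```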
Example 14
Non W-14 Photorhabdus Strains:
Purification, Characterization and Activity Spectrum
Purification
The protocol, as follows, is similar to that developed for the purification of W-14 and was established based on purifying those fractions having the most activity against Southern corn rootworm (SCR), as determined in bioassays (see Example 13). Typically, 4-20 L of broth that had been filtered, as described in Example 13, were received and concentrated using an Amicon spiral ultrafiltration cartridge Type S1Y100 attached to an Amicon M-12 filtration device. The retentate contained native proteins with molecular sizes greater than 100 kDa, whereas the flow-through material contained native proteins less than 100 kDa in size. The majority of the activity against SCR was contained in the 100 kDa retentate. The retentate was then continually diafiltered with 10 mM sodium phosphate (pH 7.0) until the filtrate reached an A280 < 0.100. Unless otherwise stated, all procedures from this point were performed in buffer, defined as 10 mM sodium phosphate (pH 7.0). The retentate was then concentrated to a final volume of approximately 0.20 L and filtered using a 0.45 μm Nalgene™ Filterware sterile filtration unit. The filtered material was loaded at 7.5 ml/min onto a Pharmacia HR16/10 column which had been packed with PerSeptive Biosystem Poros® 50 HQ strong anion exchange matrix equilibrated in buffer using a PerSeptive Biosystem Sprint® HPLC system. After loading, the column was washed with buffer until an A280 < 0.100 was achieved.
Proteins were then eluted from the column at 2.5 ml/min using buffer with 0.4 M NaCl for 20 min for a total volume of 50 ml. The column was then washed using buffer with 1.0 M NaCl at the same flow rate for an additional 20 min (final volume = 50 ml). Proteins eluted with 0.4 M and 1.0 M NaCl were placed in separate dialysis bags (Spectra/Por® Membrane MWCO: 2,000) and allowed to dialyze overnight at 4°C in 12 L buffer. The majority of the activity against SCR was contained in the 0.4 M fraction. The 0.4 M fraction was further purified by application of 20 ml to a Pharmacia XK 26/100 column that had been prepacked with Sepharose CL4B (Pharmacia) using a flow rate of 0.75 ml/min. Fractions were
pooled based on A280 peak profile and concentrated to a final volume of 0.75 ml using a Millipore Ultrafree®-15 centrifugal filter device Biomax-50K NMWL membrane. Protein concentrations were determined using a Biorad Protein Assay Kit with bovine gamma globulin as a standard.
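The elution volumes quoted in the purification follow from flow rate multiplied by time. A minimal sketch of that arithmetic is given below; the function names are hypothetical, and the loading-time example assumes the approximately 0.20 L retentate is loaded in full.

```python
def eluted_volume_ml(flow_ml_per_min, minutes):
    """Volume delivered at a constant flow rate."""
    return flow_ml_per_min * minutes


def run_time_min(volume_ml, flow_ml_per_min):
    """Time needed to pass a given volume at a constant flow rate."""
    return volume_ml / flow_ml_per_min


if __name__ == "__main__":
    # 0.4 M NaCl step: 2.5 ml/min for 20 min.
    print(eluted_volume_ml(2.5, 20))
    # Assumed full load of about 200 ml at 7.5 ml/min.
    print(round(run_time_min(200, 7.5), 1))
```

The first call reproduces the 50 ml total volume stated for each salt step.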
Characterization
The native molecular weight of the SCR toxin complex was determined using a Pharmacia HR 16/50 column that had been prepacked with Sepharose CL4B in buffer. The column was then calibrated using proteins of known molecular size, thereby allowing for calculation of the toxin's approximate native molecular size. As shown in Table 21, the molecular size of the toxin complex ranged from 777 kDa with strain Hb to 1,900 kDa with strain WX-14. The yield of toxin complex also varied, from strain WX-12 producing 0.8 mg/L to strain Hb, which produced 7.0 mg/L.
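One common way to turn such a column calibration into a size estimate is to fit the logarithm of molecular size against elution volume and interpolate. The sketch below assumes that approach; the text does not say which calculation was used, and the standards, elution volumes and function names are hypothetical placeholders.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_native_kda(standards, elution_ml):
    """Estimate size in kDa from (elution volume in ml, size in kDa) standards,
    assuming log10(size) is linear in elution volume."""
    volumes = [volume for volume, _ in standards]
    log_sizes = [math.log10(size) for _, size in standards]
    slope, intercept = fit_line(volumes, log_sizes)
    return 10 ** (slope * elution_ml + intercept)


if __name__ == "__main__":
    # Hypothetical calibration standards: (elution volume, size in kDa).
    standards = [(50.0, 1000.0), (75.0, 10 ** 2.5), (100.0, 100.0)]
    print(round(estimate_native_kda(standards, 55.0)))
```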
Proteins found in the toxin complex were examined for individual polypeptide size using SDS-PAGE analysis. Typically, 20 μg protein of the toxin complex from each strain was loaded onto a 2-15% polyacrylamide gel (Integrated Separation Systems) and electrophoresed at 20 mA in Biorad SDS-PAGE buffer. After completion of electrophoresis, the gels were stained overnight in Biorad Coomassie blue R-250 (0.2% in methanol:acetic acid:water; 40:10:40 v/v/v). Subsequently, gels were destained in methanol:acetic acid:water; 40:10:40 (v/v/v). The gels were then rinsed with water for 15 min and scanned using a Molecular Dynamics Personal Laser Densitometer®. Lanes were quantitated and molecular sizes were calculated by comparison with Biorad high molecular weight standards, which ranged from 200 to 45 kDa. Sizes of the individual polypeptides comprising the SCR toxin complex from each strain are listed in Table 22. The sizes of the individual polypeptides ranged from 230 kDa with strain WX-1 to 16 kDa with strain WX-7. Every strain, with the exception of strain Hb, had polypeptides comprising the toxin complex that were in the 160-230 kDa range, the 100-160 kDa range, and the 50-80 kDa range. These data indicate that the toxin complex may vary in peptide composition and components from strain to strain; however, in all cases the toxin appears to consist of a large, oligomeric protein complex.
Table 21
Characterization of a Toxin Complex from
Non W-14 Photorhabdus Strains
Activity Spectrum
As shown in Table 23, the toxin complexes purified from strains Hm and H9 were tested for activity against a variety of insects, with the toxin complex from strain W-14 for comparison. The assays were performed as described in Example 13. The toxin complex from all three strains exhibited activity against tobacco budworm, European corn borer, Southern corn rootworm, and aster leafhopper. Furthermore, the toxin complex from strains Hm and W-14 also exhibited activity against two-spotted spider mite. In addition, the toxin complex from W-14 exhibited activity against mosquito larvae. These data indicate that the toxin complex, while having similarities in activities between certain orders of insects, can also exhibit differential activities against other orders of insects.
Table 22
The Approximate Sizes (in kDa) of Peptides in a Purified
Toxin Complex From Non W-14 Photorhabdus
H9 Hb Hm HP 88 NC-1 WIR WX-1 WX-2 WX-7 WX-12 WX-14 W-14
170 230 200 200 180 160 190 170 180 160 120 170 150 110 140 110 160 120 87 139 89 110 110 75 130 79 98 82 43 110
I 74 76 64 33 92 62 58 37 28 87 51 53 30 26 80 40 41 23 73 39 35 22 59 37 31 21 56 33 28 19 51 30 24 18 37 28 22 16 33 27 32 25 26 23
Table 23
Observed Insecticidal Spectrum of a Purified Toxin Complex from Photorhabdus Strains

Photorhabdus strain    Sensitive* insect species
Hm Toxin Complex       1**, 2, 3, 5, 6, 7, 8
H9 Toxin Complex       1, 2, 3, 6, 7, 8
W-14 Toxin Complex     1, 2, 3, 4, 5, 6, 7, [illegible]

* = > 25% mortality or growth inhibition
** = 1, Tobacco budworm; 2, European corn borer; 3, Southern corn rootworm; 4, Mosquito; 5, Two-spotted spider mite; 6, Aster leafhopper; 7, Fruit fly; 8, Boll weevil
Example 15 Sub-Fractionation of Photorhabdus Protein Toxin Complex
The Photorhabdus protein toxin complex was isolated as described in Example 14. Next, about 10 mg toxin was applied to a MonoQ 5/5 column equilibrated with 20 mM Tris-HCl, pH 7.0 at a flow rate of 1 ml/min. The column was washed with 20 mM Tris-HCl, pH 7.0 until the optical density at 280 nm returned to baseline absorbance. The proteins bound to the column were eluted with a linear gradient of 0 to 1.0 M NaCl in 20 mM Tris-HCl, pH 7.0 at 1 ml/min for 30 min. One ml fractions were collected and subjected to Southern corn rootworm (SCR) bioassay (see Example 13). Peaks of activity were determined by a series of dilutions of each fraction in SCR bioassays. Two activity peaks against SCR were observed and were named A (eluted at about 0.2-0.3 M NaCl) and B (eluted at 0.3-0.4 M NaCl). Activity peaks A and B were pooled separately and both peaks were further purified using a 3-step procedure described below.
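The relationship between the linear gradient and the collected fractions can be sketched as follows. The sketch is idealized: it assumes the gradient reaches the fraction collector with no delay or mixing volume, which a real chromatography system would not satisfy, and the function names are hypothetical.

```python
def gradient_conc(t_min, start_m=0.0, end_m=1.0, duration_min=30.0):
    """NaCl concentration (M) of an ideal linear gradient at time t."""
    t = min(max(t_min, 0.0), duration_min)
    return start_m + (end_m - start_m) * t / duration_min


def fractions_in_window(low_m, high_m, flow_ml_per_min=1.0,
                        fraction_ml=1.0, duration_min=30.0):
    """Fraction numbers (1-based) whose midpoint concentration falls
    within [low_m, high_m] for a 0 to 1.0 M gradient."""
    minutes_per_fraction = fraction_ml / flow_ml_per_min
    count = int(duration_min / minutes_per_fraction)
    hits = []
    for number in range(1, count + 1):
        midpoint = (number - 0.5) * minutes_per_fraction
        conc = gradient_conc(midpoint, duration_min=duration_min)
        if low_m <= conc <= high_m:
            hits.append(number)
    return hits


if __name__ == "__main__":
    print("peak A window:", fractions_in_window(0.2, 0.3))
    print("peak B window:", fractions_in_window(0.3, 0.4))
```

Under these idealized assumptions, each 1 ml fraction spans about 0.033 M NaCl, so each activity peak would cover roughly three fractions.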
Solid (NH4)2SO4 was added to the above protein fraction to a final concentration of 1.7 M. Proteins were then applied to a phenyl-Superose 5/5 column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7, at 1 ml/min. Proteins bound to the column were eluted with a linear gradient of 1.7 M (NH4)2SO4, 0% ethylene glycol, 50 mM potassium phosphate, pH 7.0 to 25% ethylene glycol, 25 mM potassium phosphate, pH 7.0 (no (NH4)2SO4) at 0.5 ml/min. Fractions were dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0. Activities in each fraction against SCR were determined by bioassay.
The fractions with the highest activity were pooled and applied to a MonoQ 5/5 column which was equilibrated with 20 mM Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in 20
For the final step of purification, the most active fractions above (determined by SCR bioassay) were pooled and subjected to a second phenyl-Superose 5/5 column. Solid (NH4)2SO4 was added to a final concentration of 1.7 M. The solution was then loaded onto the column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7, at 1 ml/min. Proteins bound to the column were eluted with a linear gradient of 1.7 M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0 to 10 mM potassium phosphate, pH 7.0 at 0.5 ml/min. Fractions were dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0. Activities in each fraction against SCR were determined by bioassay.
The final protein purified from peak A by the above 3-step procedure was named toxin A and the final protein purified from peak B was named toxin B.
Characterization and Amino Acid Sequencing of Toxin A and Toxin B
In SDS-PAGE, both toxin A and toxin B contained two major (> 90% of total Coomassie stained protein) peptides: 192 kDa (named A1 and B1, respectively) and 58 kDa (named A2 and B2, respectively). Both toxin A and toxin B revealed only one major band in native PAGE, indicating that A1 and A2 were subunits of one protein complex, and B1 and B2 were subunits of one protein complex. Further, the native molecular weight of both toxin A and toxin B was determined to be 860 kDa by gel filtration chromatography. The relative molar concentration of A1 to A2 was judged to be a 1 to 1 equivalence as determined by densitometric analysis of SDS-PAGE gels. Similarly, B1 and B2 peptides were present at the same molar concentration.
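A molar ratio can be derived from densitometry by dividing each band's stain intensity by its polypeptide size. The sketch below assumes that Coomassie signal is proportional to protein mass, which is an approximation not stated in the text; the band intensities are hypothetical and the function name is illustrative.

```python
def molar_ratio(intensity_a, size_kda_a, intensity_b, size_kda_b):
    """Relative molar amount of band A to band B, assuming stain intensity
    is proportional to protein mass in each band."""
    moles_a = intensity_a / size_kda_a
    moles_b = intensity_b / size_kda_b
    return moles_a / moles_b


if __name__ == "__main__":
    # Hypothetical densitometer readings for the 192 kDa and 58 kDa bands.
    ratio = molar_ratio(7650.0, 192.0, 2300.0, 58.0)
    print(round(ratio, 2))
```

Under this assumption, two subunits present in equimolar amounts would give an intensity ratio close to the ratio of their sizes (192/58, about 3.3) rather than equal intensities.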
Toxin A and toxin B were electrophoresed in 10% SDS-PAGE and transblotted to PVDF membranes. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. The N-terminal amino acid sequence of B1 was determined to be identical to SEQ ID NO:1, the TcbAii region of the tcbA gene (SEQ ID NO:12, position 87 to 99). A unique N-terminal sequence was obtained for peptide B2 (SEQ ID NO:40). The N-terminal amino acid sequence of peptide B2 was identical to the TcbAiii region of the derived amino acid sequence
for the tcbA gene (SEQ ID NO:12, position 1935 to 1945). Therefore, the B toxin contained predominantly two peptides, TcbAii and TcbAiii, that were observed to be derived from the same gene product, TcbA. The N-terminal sequence of A2 (SEQ ID NO:41) was unique in comparison to the TcbAiii peptide and other peptides. The A2 peptide was denoted TcdAiii (see Example 17). SEQ ID NO:6 was determined to be a mixture of amino acid sequences SEQ ID NO:40 and 41. Peptides A1 and A2 were further subjected to internal amino acid sequencing. For internal amino acid sequencing, 10 μg of toxin A was electrophoresed in 10% SDS-PAGE and transblotted to PVDF membrane. After the blot was stained with amido black, peptides A1 and A2, denoted TcdAii and TcdAiii, respectively, were excised from the blot and sent to Harvard MicroChem and Cambridge ProChem. Peptides were subjected to trypsin digestion followed by HPLC chromatography to separate individual peptides. N-terminal amino acid analysis was performed on selected tryptic peptide fragments. Two internal amino acid sequences of peptide A1 (TcdAii-PK71, SEQ ID NO:38 and TcdAii-PK44, SEQ ID NO:39) were found to have significant homologies with deduced amino acid sequences of the TcbAii region of the tcbA gene (SEQ ID NO:12). Similarly, the N-terminal sequence (SEQ ID NO:41) and two internal sequences of peptide A2 (TcdAiii-PK57, SEQ ID NO:42 and TcdAiii-PK20, SEQ ID NO:43) also showed significant homology with deduced amino acid sequences of the TcbAiii region of the tcbA gene (SEQ ID NO:12).
In summary of the above results, the toxin complex has at least two active protein toxin complexes against SCR: toxin A and toxin B. Toxin A and toxin B are similar in their native and subunit molecular weights; however, their peptide compositions are different. Toxin A contained peptides TcdAii and TcdAiii as the major peptides, and toxin B contained TcbAii and TcbAiii as the major peptides.
Purification and Characterization of Toxin C, Tca Peptides
The Photorhabdus protein toxin complex was isolated as described above. Next, about 50 mg toxin was applied to a MonoQ 10/10 column equilibrated with 20 mM Tris-HCl, pH 7.0 at a flow rate of 2 ml/min. The column was washed with 20 mM Tris-HCl, pH 7.0
until the optical density at 280 nm returned to baseline level. The proteins bound to the column were eluted with a linear gradient of 0 to 1 M NaCl in 20 mM Tris-HCl, pH 7.0 at 2 ml/min for 60 min. Two ml fractions were collected and subjected to Western analysis using pAb TcaBii-syn antibody (see Example 21) as the primary antibody. Fractions that reacted with pAb TcaBii-syn antibody were combined and solid (NH4)2SO4 was added to a final concentration of
1.7 M. Proteins were then applied to a phenyl-Superose 10/10 column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7, at 1 ml/min. Proteins bound to the column were eluted with a linear gradient of 1.7 M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0 to 10 mM potassium phosphate, pH 7.0 at 1 ml/min for 120 min. Two ml fractions were collected, dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0, and analyzed by Western blots using pAb TcaBii-syn antibody as the primary antibody.
Fractions that cross-reacted with the antibody were pooled and applied to a MonoQ 5/5 column which was equilibrated with 20 mM Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in 20 mM Tris-HCl, pH 7.0 for 30 min.
The above fractions that reacted with pAb TcaBii-syn antibody were pooled and subjected to a phenyl-Superose 5/5 column. Solid (NH4)2SO4 was added to a final concentration of 1.7 M. The solution was then applied onto the column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7, at 1 ml/min. Proteins bound to the column were then eluted with a linear gradient of 1.7 M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0 to 10 mM potassium phosphate, pH 7.0 at 0.5 ml/min for 60 min. Fractions were dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0. For the final purification step, the above fractions that reacted with pAb TcaBii-syn antibody, as determined by Western analysis, were combined and applied to a Mono Q 5/5 column equilibrated with 20 mM Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in 20 mM Tris-HCl, pH 7.0 for 30 min.
The final purified protein fraction contained six major peptides as examined by SDS-PAGE: 165 kDa, 90 kDa, 64 kDa, 62 kDa, 58 kDa, and 22 kDa. The LD50 values for the insecticidal activity of this purified
fraction were determined to be 100 ng and 500 ng against SCR and ECB, respectively.
The above peptides were blotted to PVDF membranes, and blots were sent for amino acid analysis and five-residue N-terminal sequencing at Harvard MicroChem and Cambridge ProChem, respectively. The N-terminal amino acid sequence of the 165 kDa peptide was determined to be identical to that of peptide TcaC (SEQ ID NO:2, positions 1 to 5). The N-terminal amino acid sequence of the 90 kDa peptide was determined to be the TcaAii region of the derived amino acid sequence for the tcaA gene (SEQ ID NO:33, positions 254 to 258). The N-terminal amino acid sequence of the 64 kDa peptide was determined to be identical to that of peptide TcaBi (SEQ ID NO:3, positions 1 to 5). The N-terminal amino acid sequence of the 62 kDa peptide was determined to be a TcaA region of the derived amino acid sequence for the tcaA gene (SEQ ID NO:33, positions 489 to 493). The N-terminal amino acid sequence of the 58 kDa peptide was determined to be identical to that of peptide TcaBii (SEQ ID NO:5, positions 1 to 5). The N-terminal amino acid sequence of the 22 kDa peptide (SEQ ID NO:62) was determined to be a TcaA region, denoted TcaAiv, of the derived amino acid sequence for the tcaA gene (SEQ ID NO:34, positions 98 to 102). It is noted that the tcaA, tcaB, and tcaC genes all reside in the same tca operon (Fig. 6A).
Five μg each of the purified Tca fraction, purified toxin A, and purified toxin B were analyzed by Western blot using the following antibodies individually as the primary antibody: pAb TcaBii-syn antibody, mAb CF52 antibody, pAb TcdAii-syn antibody, and pAb TcdAiii-syn antibody (Example 21). With pAb TcaBii-syn antibody, only the purified Tca peptide fraction reacted, not toxin A or toxin B. With mAb CF52 antibody, only toxin B reacted, not the Tca peptide fraction or toxin A. With either pAb TcdAii-syn antibody or pAb TcdAiii-syn antibody, only toxin A reacted, not the Tca peptide fraction or toxin B. This indicated that the insecticidal activity observed in the purified Tca peptide fraction is independent of toxin A and toxin B. The purified Tca peptide fraction is a third unique protein toxin, denoted toxin C.
Example 16
Cleavage and Activation of TcbA Peptide
In the toxin B complex, peptides TcbAii and TcbAiii originate from the single gene product TcbA (Example 15). The processing of TcbA peptide to TcbAii and TcbAiii is presumably carried out by Photorhabdus protease(s), most likely the metalloproteases described in Example 10. In some cases, it was noted that when Photorhabdus W-14 broth was processed, TcbA peptide was present in toxin B complex as a major component, in addition to peptides TcbAii and TcbAiii. Procedures identical to those described for the purification of toxin B complex (Example 15) were used to enrich peptide TcbA from the toxin complex fraction of W-14 broth. The final purified material was analyzed by 4-20% gradient SDS-PAGE, and the major peptides were quantified by densitometry. It was determined that TcbA, TcbAii and TcbAiii comprised 58%, 36%, and 6%, respectively, of total protein. The identities of these peptides were confirmed by their respective molecular sizes in SDS-PAGE and by Western blot analysis using monospecific antibodies. The native molecular weight of this fraction was determined to be 860 kDa. The cleavage of TcbA was evaluated by treating the above purified material with purified 38 kDa and 58 kDa W-14 Photorhabdus metalloproteases (Example 10), and with trypsin as a control enzyme (Sigma, MO). The standard reaction consisted of 17.5 μg of the above purified fraction, 1.5 units of protease, and 0.1 M Tris buffer, pH 8.0, in a total volume of 100 μl. For the control reaction, protease was omitted. The reaction mixtures were incubated at 37°C for 90 min. At the end of the reaction, 20 μl was removed and immediately boiled with SDS-PAGE sample buffer for electrophoretic analysis by 4-20% gradient SDS-PAGE. It was determined from SDS-PAGE that in both the 38 kDa and 58 kDa protease treatments, the amount of peptides TcbAii and TcbAiii increased about 3-fold while the amount of TcbA peptide decreased proportionally (Table 24). The relative reduction and augmentation of selected peptides was confirmed by Western blot analyses. Furthermore, gel filtration of the cleaved material revealed that the native molecular size of the complex remained the same.
Upon trypsin treatment, peptides TcbA and TcbAii were nonspecifically digested into small peptides. This indicated that the 38 kDa and 58 kDa Photorhabdus proteases can
specifically process peptide TcbA into peptides TcbAii and TcbAiii. The remaining 80 μl of the protease-treated and untreated control reaction mixtures were serially diluted with 10 mM sodium phosphate buffer, pH 7.0, and analyzed by SCR bioassay. By comparing activity at several dilutions, it was determined that the 38 kDa protease treatment increased SCR insecticidal activity approximately 3- to 4-fold. The growth inhibition of the remaining insects in the protease treatment was also more severe than in the control (Table 24).
Table 24
Conversion and Activation of Peptide TcbA into Peptides TcbAii and TcbAiii
                                  Control    38 kDa protease treatment
TcbA (% of total protein)         58         18
TcbAii (% of total protein)       36         64
TcbAiii (% of total protein)      6          18
LD50 (μg protein)                 2.1        0.52
SCR Weight (mg/insect)*           0.2        0.1
* An indication of growth inhibition, obtained by measuring the average weight of live insects after 5 days on diet in the assay.
Activation and Processing of Toxin B by SCR Gut Proteases
In a second demonstration of proteolytic activation, it was examined whether W-14 toxins are processed by insects. Toxin B purified from Photorhabdus W-14 broth (see Example 15) was comprised predominantly of intact TcbA peptide, as judged by SDS-PAGE and Western blot analysis using monoclonal antibody. The LD50 of this fraction against SCR was determined to be around 700 ng. SCR larvae were grown on coleopteran diet until they reached the fourth instar stage (about 100-125 mg total weight per insect). SCR gut content was collected as follows: the guts were removed using dissecting scissors and forceps. After the excess fatty material that coats the gut lining was removed, about 40 guts were homogenized in a microcentrifuge tube containing 100 μl sterile water. The tube was then centrifuged at 14,000 rpm for 10 minutes and the pellet discarded. The supernatant was stored in a -70°C freezer until use.
The processing of toxin B by insect gut content was evaluated by treating the above purified toxin B with the collected SCR gut content. The reaction consisted of 40 μg toxin B (1 mg/ml), 50 μl
SCR gut content, and 0.1 M Tris buffer, pH 8.0, in a total volume of 100 μl. For the control reaction, SCR gut content was omitted. The reaction mixtures were incubated at 37°C overnight. At the end of the reaction, 10 μl was withdrawn and boiled with an equal volume of 2x SDS-PAGE sample buffer for SDS-PAGE analysis. The remaining 90 μl of reaction mixture was serially diluted with 10 mM sodium phosphate buffer, pH 7.0, and analyzed by SCR bioassay. SDS-PAGE analysis indicated that in the SCR gut content treatment, peptide TcbA was digested completely into smaller peptides. Analysis of the undenatured toxin fraction showed that the native size, about 860 kDa, remained the same even though the larger peptides were fragmented. In SCR bioassays, the LD50 of SCR gut-treated toxin B was found to be about 70 ng, representing a 10-fold increase in activity. In a separate experiment, proteinase K treatment completely eliminated toxin activity.
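The fold change quoted above follows directly from the two LD50 values, because a lower LD50 corresponds to higher potency. The short sketch below shows that arithmetic; the function name is illustrative and not taken from the original text.

```python
def fold_increase_in_potency(ld50_before, ld50_after):
    """Return how many times more potent a sample became.

    Potency is inversely related to LD50, so the fold increase is the
    ratio of the untreated LD50 to the treated LD50 (same units for both).
    """
    if ld50_before <= 0 or ld50_after <= 0:
        raise ValueError("LD50 values must be positive")
    return ld50_before / ld50_after


# Toxin B against SCR: about 700 ng before and about 70 ng after
# treatment with SCR gut content.
print(fold_increase_in_potency(700, 70))  # 10.0
```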
Example 17
Screening of the Library for a Gene Encoding the TcdAii Peptide
The cloning and characterization of a gene encoding the TcdAii peptide, described as SEQ ID NO:17 (internal peptide TcdAii-PT111 N-terminal sequence) and SEQ ID NO:18 (internal peptide TcdAii-PT79 N-terminal sequence), was completed. Two pools of degenerate oligonucleotides, designed to encode the amino acid sequences of SEQ ID NO:17 (Table 25) and SEQ ID NO:18 (Table 26), and the reverse complements of those sequences, were synthesized as described in Example 8. The DNA sequences of the oligonucleotides are given below:
Table 25 Degenerate Oligonucleotide for SEQ ID NO:17
Table 26 Degenerate Oligonucleotide for SEQ ID NO:18
* According to IUPAC-IUB codes for nucleotides, Y = C or T, H = A, C or T, N = A, C, G or T, K = G or T, R = A or G, and M = A or C.
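To make the IUPAC-IUB degeneracy codes in the footnote concrete, the sketch below computes how many distinct sequences a degenerate oligonucleotide pool represents, lists them, and forms the reverse complement. The oligonucleotide used here is a made-up example, not one of the primers from Tables 25 or 26, and the function names are illustrative.

```python
from itertools import product

# Standard IUPAC-IUB nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each code (for example, R = A/G pairs with Y = C/T).
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "M": "K", "K": "M", "S": "S", "W": "W",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in oligo.upper():
        count *= len(IUPAC[base])
    return count


def expand(oligo):
    """List every non-degenerate sequence in the pool."""
    choices = (IUPAC[base] for base in oligo.upper())
    return ["".join(bases) for bases in product(*choices)]


def reverse_complement(oligo):
    """Reverse complement that keeps degenerate positions degenerate."""
    return "".join(COMPLEMENT[base] for base in reversed(oligo.upper()))


example = "ATGAAYCAR"  # hypothetical oligo encoding Met-Asn-Gln
print(degeneracy(example))          # 4
print(expand(example))
print(reverse_complement(example))  # YTGRTTCAT
```

Pools with lower degeneracy generally give more specific amplification, which is one reason several alternative primers are typically designed for each peptide sequence.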
Polymerase Chain Reactions (PCR) were performed essentially as described in Example 8, using as forward primers P2.3.6.CB or P2.3.5, and as reverse primers P2.79.R.1 or P2.79R.CB, in all forward/reverse combinations, with Photorhabdus W-14 genomic DNA as template. In another set of reactions, primers P2.79.2 or P2.79.3 were used as forward primers, and P2.3.5R, P2.3.5RI, and P2.3R.CB were used as reverse primers, in all forward/reverse combinations. Only in the reactions containing P2.3.6.CB as the forward primer combined with P2.79.R.1 or P2.79R.CB as the reverse primer was a non-artifactual amplified product seen, of estimated size (mobility on agarose gels) 2500 base pairs. The order of the primers used to obtain this amplification product indicates that the peptide fragment TcdAii-PT111 lies amino-proximal to the peptide fragment TcdAii-PT79. The 2500 bp PCR products were ligated to the plasmid vector pCR™II (Invitrogen, San Diego, CA) according to the supplier's instructions, and the DNA sequences across the ends of the insert fragments of two isolates (HS24 and HS27) were determined using the supplier's recommended primers and the sequencing methods described previously. The sequence of both isolates was the same. New primers were synthesized based on the determined sequence and used to prime additional sequencing reactions to obtain a total of 2557 bases of the insert (SEQ ID NO:36). Translation of the partial peptide encoded by SEQ ID NO:36 yields the 845 amino acid sequence disclosed as SEQ ID NO:37. Protein homology analysis of this portion of the TcdAii peptide fragment reveals substantial amino acid homology to protein TcbA (SEQ ID NO:12): 68% similarity and 53% identity to residues 542 to 1390 using the Wisconsin Package Version 8.0, Genetics Computer Group (GCG), Madison, WI, or 60% similarity and 54% identity to residues 567 to 1389 using the Wisconsin Package Version 9.0. It is therefore apparent that the gene represented in part by SEQ ID NO:36 produces a protein of similar, but not identical, amino acid sequence to the TcbA protein, and which likely has similar, but not identical, biological activity to the TcbA protein. In yet another instance, a gene encoding the peptides TcdAii-PK44 and the TcdAiii 58 kDa N-terminal peptide, described as SEQ ID NO:39 (internal peptide TcdAii-PK44 sequence) and SEQ ID NO:41 (TcdAiii 58 kDa N-terminal peptide sequence), was isolated.
Two pools of degenerate oligonucleotides, designed to encode the amino acid sequences described as SEQ ID NO:39 (Table 28) and SEQ ID NO:41 (Table 27), and the reverse complements of those sequences, were synthesized as described in Example 8; their DNA sequences are given below.
Table 27 Degenerate Oligonucleotide for SEQ ID NO:41
Table 28 Degenerate Oligonucleotide for SEQ ID NO:39
Polymerase Chain Reactions (PCR) were performed essentially as described in Example 8, using as forward primers A1.44.1 or A1.44.2, and as reverse primers A2.3R or A2.4R, in all forward/reverse combinations, with Photorhabdus W-14 genomic DNA as template. In another set of reactions, primers A2.1 or A2.2 were used as forward primers, and A1.44.1R and A1.44.2R were used as reverse primers, in all forward/reverse combinations. Only in the reactions containing A1.44.1 or A1.44.2 as the forward primer combined with A2.3R as the reverse primer was a non-artifactual amplified product seen, of estimated size (mobility on agarose gels) 1400 base pairs. The order of the primers used to obtain this amplification product indicates that the peptide fragment TcdAii-PK44 lies amino-proximal to the 58 kDa peptide fragment of TcdAiii.
The 1400 bp PCR products were ligated to the plasmid vector pCR™II according to the supplier's instructions. The DNA sequences across the ends of the insert fragments of four isolates were determined using primers similar in sequence to the supplier's recommended primers and using the sequencing methods described previously. The nucleic acid sequences of all isolates differed, as expected, in the regions corresponding to the degenerate primer sequences, but the amino acid sequences deduced from these data were the same as the actual amino acid sequences determined previously for the peptides (SEQ ID NOS:41 and 39).
Screening of the W-14 genomic cosmid library as described in Example 8 with a radiolabeled probe comprised of the DNA prepared above (SEQ ID NO:36) identified five hybridizing cosmid isolates, namely 17D9, 20B10, 21D2, 27B10, and 26D1. These cosmids were distinct from those previously identified with probes corresponding to the genes described as SEQ ID NO:11 or SEQ ID NO:25. Restriction enzyme analysis and DNA blot hybridizations identified three EcoR I fragments, of approximate sizes 3.7, 3.7, and 1.1 kbp, that span the region comprising the DNA of SEQ ID NO:36. Screening of the W-14 genomic cosmid library using as probe the radiolabeled 1.4 kbp DNA fragment prepared in this example identified the same five cosmids (17D9, 20B10, 21D2, 27B10, and 26D1). DNA blot hybridization to EcoR I-digested cosmid DNAs also showed hybridization to the same subset of EcoR I fragments as seen with the 2.5 kbp TcdAii gene probe, indicating that both fragments are encoded on the genomic DNA.
DNA sequence determination of the cloned EcoR I fragments revealed an uninterrupted reading frame of 7551 base pairs (SEQ ID NO:46), encoding a 282.9 kDa protein of 2516 amino acids (SEQ ID NO:47). Analysis of the amino acid sequence of this protein revealed all expected internal fragments of peptide TcdAii (SEQ ID NOS:17, 18, 37, 38 and 39), the TcdAiii peptide N-terminus (SEQ ID NO:41), and all TcdAiii internal peptides (SEQ ID NOS:42 and 43). The peptides isolated and identified as TcdAii and TcdAiii are each products of the open reading frame, denoted tcdA, disclosed as SEQ ID NO:46. Further, SEQ ID NO:47 shows, starting at position 89, the sequence disclosed as SEQ ID NO:13, which is the N-terminal sequence of a peptide of size approximately 201 kDa, indicating that the initial protein produced from SEQ ID NO:46 is processed in a manner similar to that previously disclosed for SEQ ID NO:12. In addition, the protein is further cleaved to generate a product of size 209.2 kDa, encoded by SEQ ID NO:48 and disclosed as SEQ ID NO:49 (TcdAii peptide), and a product of size 63.6 kDa, encoded by SEQ ID NO:50 and disclosed as SEQ ID NO:51 (TcdAiii peptide). Thus, it is thought that the insecticidal activity identified as toxin A (Example 15) derives from the products of SEQ ID NO:46, as exemplified by the full-length protein of 282.9 kDa disclosed as SEQ ID NO:47, which is processed to produce the peptides disclosed as SEQ ID NOS:49 and 51. It is thought that the insecticidal activity identified as toxin B (Example 15) derives from the products of SEQ ID NO:11, as exemplified by the 280.6 kDa protein disclosed as SEQ ID NO:12. This protein is proteolytically processed to yield the 207.6 kDa peptide disclosed as SEQ ID NO:53, which is encoded by SEQ ID NO:52, and the 62.9 kDa peptide having the N-terminal sequence disclosed as SEQ ID NO:40, and further disclosed as SEQ ID NO:55, which is encoded by SEQ ID NO:54.
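The relationship between the length of the reading frame and the size of the encoded protein can be checked with simple arithmetic. The sketch below assumes that the 7551 base pair figure includes the terminating stop codon, which is consistent with the stated 2516 amino acids; the function name is illustrative.

```python
def orf_summary(orf_bp, protein_kda):
    """Return (residue count, mean residue mass in Da) for a reading frame.

    Assumes orf_bp counts every base from the first base of the start
    codon through the last base of the stop codon.
    """
    if orf_bp % 3 != 0:
        raise ValueError("reading frame length must be a multiple of 3")
    codons = orf_bp // 3
    residues = codons - 1  # the final codon is the stop codon
    mean_residue_da = protein_kda * 1000.0 / residues
    return residues, mean_residue_da


residues, mean_da = orf_summary(7551, 282.9)
print(residues)           # 2516
print(round(mean_da, 1))  # mean mass per residue, in Da
```

A mean residue mass of roughly 110 to 115 Da is typical for proteins, so the stated mass and length are mutually consistent.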
Amino acid sequence comparisons between the proteins disclosed as SEQ ID NO:12 and SEQ ID NO:47 reveal that they have 69% similarity and 54% identity using the Wisconsin Package Version 8.0, Genetics Computer Group (GCG), Madison, WI, or 60% similarity and 54% identity using version 9.0 of the program. This high degree of evolutionary relationship is not uniform throughout the entire amino acid sequence of these peptides, but is higher towards the carboxy-terminal end of the proteins, since the peptides disclosed as SEQ ID NO:51 (derived from SEQ ID NO:47) and SEQ ID
NO:55 (derived from SEQ ID NO:12) have 76% similarity and 64% identity using the Wisconsin Package Version 8.0, Genetics Computer Group (GCG), Madison, WI, or 71% similarity and 64% identity using version 9.0 of the program.
Example 18
Control of European Corn Borer-Induced Leaf Damage on Maize Plants by Spray Application of Photorhabdus (Strain W-14) Broth
The ability of Photorhabdus toxin(s) to reduce plant damage caused by insect larvae was demonstrated by measuring leaf damage caused by European corn borer (Ostrinia nubilalis) infested onto maize plants treated with Photorhabdus broth. Fermentation broth from Photorhabdus strain W-14 was produced and concentrated approximately 10-fold using ultrafiltration (10,000 MW pore size) as described in Example 13. The resulting concentrated broth was then filter sterilized using 0.2 micron nitrocellulose membrane filters. A similarly prepared sample of uninoculated 2% proteose peptone #3 was used for control purposes. Maize plants (an inbred line) were grown from seed to vegetative stage 7 or 8 in pots containing a soilless mixture in a greenhouse (27°C day; 22°C night, about 50% RH, 14 hr day-length, watered/fertilized as needed). The test plants were arranged in a randomized complete block design (3 reps/treatment, 6 plants/treatment) in a greenhouse with temperature about 22°C day; 18°C night, no artificial light, partial shading, about 50% RH, and watered/fertilized as needed. Treatments (uninoculated medium and concentrated Photorhabdus broth) were applied with a syringe sprayer: 2.0 ml was applied from directly (about 6 inches) over the whorl and an additional 2.0 ml was applied in a circular motion from approximately one foot above the whorl. In addition, one group of plants received no treatment. After the treatments had dried (approximately 30 minutes), twelve neonate European corn borer larvae (eggs obtained from commercial sources and hatched in-house) were applied directly to the whorl. After one week, the plants were scored for damage to the leaves using a modified Guthrie Scale (Koziel, M. G., Beland, G. L., Bowman, C., Carozzi, N. B., Crenshaw, R., Crossland, L., Dawson, J., Desai, N., Hill, M., Kadwell, S., Launis, K., Lewis,
K., Maddox, D., McPherson, K., Meghji, M. Z., Merlin, E., Rhodes, R., Warren, G. W., Wright, M. and Evola, S. V. (1993)
Bio/Technology, 11, 194-195) and the scores were compared statistically [T-test (LSD) p<0.05 and Tukey's Studentized Range (HSD) Test p<0.1]. The results are shown in Table 29. For reference, a score of 1 represents no damage, a score of 2 represents fine "window pane" damage on the unfurled leaf with no pinhole penetration, and a score of 5 represents leaf penetration with elongated lesions and/or mid-rib feeding evident on more than three leaves (lesions < 1 inch). These data indicate that broth or other protein-containing fractions may confer protection against specific insect pests when delivered in a sprayable formulation, or when the gene or a derivative thereof, encoding the protein or part thereof, is delivered via a transgenic plant or microbe.
Table 29
Effect of Photorhabdus Culture Broth on European Corn Borer-Induced Leaf Damage on Maize
Treatment Average Guthrie Score
No Treatment 5.02a
Uninoculated medium 5.15a
Photorhabdus Broth 2.24b
Means with different letters are statistically different (p<0.05 or p<0.1).
Example 19
Genetic Engineering of Genes for Expression in E. coli
Summary of Constructions
A series of plasmids was constructed to express the tcbA gene of Photorhabdus W-14 in Escherichia coli. A list of the plasmids is shown in Table 30. A brief description of each construction follows, as well as a summary of the E. coli expression data obtained.
Table 30 Expression Plasmids for the tcbA Gene
Abbreviations: Kan=kanamycin, Chl=chloramphenicol, Amp=ampicillin
Construction of pDAB2025
In Example 9, a large EcoR I fragment which hybridizes to the TcbAii probe is described. This fragment was subcloned into pBC (Stratagene, La Jolla, CA) to create pDAB2025. Sequence analysis indicates that the fragment is 8816 base pairs. The fragment encodes the tcbA gene with the initiating ATG at position 571 and the terminating TAA at position 8086. The fragment therefore carries 570 base pairs of Photorhabdus DNA upstream of the ATG and 730 base pairs downstream of the TAA.
Construction of Plasmid pDAB2026
The tcbA gene was PCR amplified from plasmid pDAB2025 using the following primers: 5' primer (SlAc51) 5' TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA TC 3' and 3' primer (SlAc31) 5' TTT AAA GCG GCC GCT TAA CGG ATG GTA TAA CGA ATA TG 3'. PCR was performed using a TaKaRa LA PCR kit from PanVera (Madison, WI) in the following reaction: 57.5 microliters water, 10 microliters 10X LA buffer, 16 microliters dNTPs (2.5 mM each stock solution), 20 microliters each primer at 10 pmoles/microliter, 300 ng of the plasmid pDAB2025 containing the W-14 tcbA gene, and one microliter of TaKaRa LA Taq polymerase. The cycling conditions were 98°C/20 sec, 68°C/5 min, 72°C/10 min for 30 cycles. A PCR product of the expected size, about 7526 bp, was isolated in a 0.8% agarose gel in TBE (100 mM Tris, 90 mM boric acid, 1 mM EDTA) buffer and purified using a Qiaex II kit from Qiagen (Chatsworth, CA). The purified tcbA gene was digested with Nco I and Not I, ligated into the baculovirus transfer vector pAcGP67B (PharMingen, San Diego, CA), and transformed into DH5α E. coli. The resulting recombinant is called pDAB2026. The tcbA gene was then cut from pDAB2026 and transferred to pET27b to create plasmid pDAB2027. A missense mutation in the tcbA gene was repaired in pDAB2027.
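The two amplification primers given above each carry a restriction site near the 5' end, which is what allows the PCR product to be cut with Nco I and Not I for cloning. The sketch below simply searches the primer sequences for the standard recognition sequences of these enzymes (CCATGG for Nco I and GCGGCCGC for Not I); the dictionary and function names are illustrative.

```python
# Recognition sequences of the two enzymes used for cloning.
SITES = {"Nco I": "CCATGG", "Not I": "GCGGCCGC"}

# Primer sequences as written in the text, 5' to 3'.
PRIMERS = {
    "SlAc51": "TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA TC",
    "SlAc31": "TTT AAA GCG GCC GCT TAA CGG ATG GTA TAA CGA ATA TG",
}


def find_sites(primer):
    """Return {enzyme: 1-based start position} for each site in the primer."""
    seq = primer.replace(" ", "").upper()
    found = {}
    for enzyme, site in SITES.items():
        index = seq.find(site)
        if index != -1:
            found[enzyme] = index + 1
    return found


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

In both primers the site begins at base 7, immediately after a six-base TTTAAA leader. The Nco I site also contains an ATG, which places a start codon at the cloning junction.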
The repaired tcbA gene contains two changes from the sequence shown in SEQ ID NO:11: an A>G at position 212, changing asparagine 71 to serine 71, and a G>A at position 229, changing alanine 77 to threonine 77. These changes are both upstream of the proposed TcbAii N-terminus.
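The nucleotide positions and residue numbers quoted for the two changes can be related by simple codon arithmetic. The sketch below assumes that position 1 is the first base of the initiating ATG; the example codons are hypothetical, chosen only because they encode the stated amino acids, and are not taken from SEQ ID NO:11.

```python
# Partial codon table covering only the amino acids in this illustration.
CODON_TABLE = {
    "AAT": "Asn", "AAC": "Asn",
    "AGT": "Ser", "AGC": "Ser",
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "ACT": "Thr", "ACC": "Thr", "ACA": "Thr", "ACG": "Thr",
}


def codon_position(nt_pos):
    """Map a 1-based coding position to (codon number, position in codon)."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def substitute(codon, pos_in_codon, new_base):
    """Return the codon with the base at the 1-based position replaced."""
    return codon[:pos_in_codon - 1] + new_base + codon[pos_in_codon:]


# (nucleotide position, new base, hypothetical original codon)
changes = [(212, "G", "AAT"), (229, "A", "GCT")]
for nt_pos, new_base, codon in changes:
    codon_no, offset = codon_position(nt_pos)
    mutated = substitute(codon, offset, new_base)
    print(nt_pos, codon_no, offset, CODON_TABLE[codon], "->", CODON_TABLE[mutated])
```

Position 212 falls on the second base of codon 71 and position 229 on the first base of codon 77, which matches the Asn to Ser and Ala to Thr changes for any asparagine or alanine codon.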
Construction of pDAB2028
The tcbA coding region of pDAB2027 was transferred to vector pET15b. This was accomplished using shotgun ligations; the DNAs were cut with restriction enzymes Nco I and Xho I. The resulting recombinant is called pDAB2028.
Expression of TcbA in E. coli from Plasmid pDAB2028
Expression of tcbA in E. coli was obtained by modification of the methods previously described by Studier et al. (Studier, F.W., Rosenberg, A., Dunn, J., and Dubendorff, J. (1990) Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol., 185, 60-89). Competent E. coli cells of strain BL21(DE3) were transformed with plasmid pDAB2028 and plated on LB agar containing 100 μg/mL ampicillin and 40 mM glucose. The transformed cells were plated to a density of several hundred isolated colonies per plate. Following overnight incubation at 37°C, the cells were scraped from the plates and suspended in LB broth containing 100 μg/mL ampicillin. Typical culture volumes were 200-500 mL. At time zero, culture densities (OD600) were 0.05-0.15, depending on the experiment. Cultures were shaken at one of three temperatures (22°C, 30°C or 37°C) until a density of 0.15-0.5 was obtained, at which time they were induced with 1 mM isopropylthio-β-galactoside (IPTG). Cultures were incubated at the designated temperature for 4-5 hours and then were transferred to 4°C until processing (12-72 hours).
Purification and Characterization of TcbA Expressed in E. coli from Plasmid pDAB2028
E. coli cultures expressing TcbA peptides were processed as follows. Cells were harvested by centrifugation at 17,000 x g, and the media was decanted and saved in a separate container.
The media was concentrated about 8x using the M12 (Amicon, Beverly, MA) filtration system and a 100 kD molecular mass cut-off filter. The concentrated media was loaded onto an anion exchange column and the bound proteins were eluted with 1.0 M NaCl. The 1.0 M NaCl elution peak was found to cause mortality against Southern corn rootworm (SCR) larvae (Table 30). The 1.0 M NaCl fraction was dialyzed against 10 mM sodium phosphate buffer, pH 7.0, concentrated, and subjected to gel filtration on Sepharose CL-4B (Pharmacia, Piscataway, NJ). The region of the CL-4B elution profile corresponding to the same calculated molecular weight (about 900 kDa) as the native W-14 toxin complex was collected, concentrated and bioassayed against larvae. The collected 900 kDa fraction was found to have insecticidal activity (see Table 31 below), with symptomology similar to that caused by native W-14 toxin complex. This fraction was subjected to Proteinase K and heat treatment; the activity in both cases was either eliminated or reduced, providing evidence that the activity is proteinaceous in nature. In addition, the active fraction tested immunologically positive for the TcbA and TcbAiii peptides in immunoblot analysis when tested with an anti-TcbAiii monoclonal antibody (Table 31).
The cell pellet was resuspended in 10 mM sodium phosphate buffer, pH 7.0, and lysed by passage through a Bio-Neb™ cell nebulizer (Glas-Col Inc., Terre Haute, IN). The pellets were
treated with DNase to remove DNA and centrifuged at 17,000 x g to separate the cell pellet from the cell supernatant. The supernatant fraction was decanted, filtered through a 0.2 micron filter to remove large particles, and subjected to anion exchange chromatography. Bound proteins were eluted with 1.0 M NaCl, dialyzed, and concentrated using Biomax™ (Millipore Corp, Bedford, MA) concentrators with a molecular mass cut-off of 50,000 Daltons. The concentrated fraction was subjected to gel filtration chromatography using Sepharose CL-4B beaded matrix. Bioassay data for material prepared in this way is shown in Table 30 and is denoted as "TcbA Cell Sup".
In yet another method, used to handle large amounts of material, the cell pellets were resuspended in 10 mM sodium phosphate buffer, pH 7.0, and thoroughly homogenized using a Kontes Glass Company (Vineland, NJ) 40 ml tissue grinder. The cellular debris was pelleted by centrifugation at 25,000 x g, and the cell supernatant was decanted, passed through a 0.2 micron filter, and subjected to anion exchange chromatography using a Pharmacia 10/10 column packed with Poros HQ 50 beads. The bound proteins were eluted with a NaCl gradient of 0.0 to 1.0 M. Fractions containing the TcbA protein were combined, concentrated using a 50 kDa concentrator, and subjected to gel filtration chromatography using Pharmacia CL-4B beaded matrix. The fractions containing TcbA oligomer, with a molecular mass of approximately 900 kDa, were collected and subjected to anion exchange chromatography using a Pharmacia Mono Q 10/10 column equilibrated with 20 mM Tris buffer, pH 7.3. A gradient of 0.0 to 1.0 M NaCl was used to elute recombinant TcbA protein. Recombinant TcbA eluted from the column at a salt concentration of approximately 0.3-0.4 M NaCl, the same molarity at which native TcbA oligomer is eluted from the Mono Q 10/10 column. The recombinant TcbA fraction was found to cause SCR mortality in bioassay experiments similar to those in Table 31.
A second set of expression constructions was prepared and tested for expression of the TcbA protein toxin.
Construction of pDAB2030: An Expression Plasmid for the tcbA Coding Region
The plasmid pDAB2028 (see herein) contains the tcbA coding region in the commercial vector pET15 (Novagen, Madison, WI),
which encodes an ampicillin selection marker. The plasmid pDAB2030 was created to express the tcbA coding region from a plasmid which encodes a kanamycin selection marker. This was done by cutting pET27 (Novagen, Madison, WI), a kanamycin selection plasmid, and pDAB2028 with Xba I and Xho I. This releases the entire multiple cloning site, including the tcbA coding region, from plasmid pDAB2028. The two cut plasmids were mixed and ligated. Recombinant plasmids were selected on kanamycin, and those containing the pDAB2028 fragment were identified by restriction analysis. The new recombinant plasmid is called pDAB2030.
Construction of Plasmid pDAB2031: Correction of Mutations in tcbA
The two mutations in the N-terminus of the tcbA coding region described earlier in Example 19 (SEQ ID NO:11; A>G at 212, changing asparagine 71 to serine 71; G>A at 229, changing alanine 77 to threonine 77) were corrected as follows. A PCR product was generated using the primers TH50 (5' ACC GTC TTC TTT ACG ATC AGT G 3') and SlAc51 (5' TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA TC 3') and pDAB2025 as template to generate a 1778 bp product. This PCR product was cloned into plasmid pCR2.1 (Invitrogen, San Diego, CA), and a clone was isolated and sequenced. The clone was digested with Nco I and PinA I, and a 1670 bp fragment was purified from a 1% agarose gel. A plasmid containing the mutated tcbA coding region (pDAB2030) was digested with Nco I and Not I and purified away from the 1670 bp fragment in a 0.8% agarose gel with Qiaex II (Qiagen, Chatsworth, CA). The corrected Nco I/PinA I fragment was then ligated into pDAB2030. The ligated DNA was transformed into DH5α E. coli. A clone was isolated, sequenced and found to be correct. This plasmid, containing the corrected tcbA coding region, is called pDAB2031.
Construction of pDAB2033 and pDAB2034: Expression Plasmids for tcbA
The expression plasmids pDAB2025 and pDAB2027-2031 all rely on the bacteriophage T7 expression system. An additional vector system was used for bacterial expression of the tcbA gene and its derivatives. The expression vector Trc99a (Pharmacia Biotech, Piscataway, NJ) contains a strong trc promoter upstream of a multiple cloning site with a 5' Nco I site, which is compatible with the tcbA coding region from pDAB2030 and pDAB2031. However, the plasmid does not have a compatible 3' site. Therefore, the Hind III site of Trc99a was cut and made blunt by treatment with T4 DNA
polymerase (Boehringer Mannheim, Indianapolis, IN) . The vector plasmid was then cut by Neo I followed by treatment with alkaline phosphatase. The plasmids pDAB2030 and pDAB2031 were each cut with Xho I (cuts at the 3' end of the tcbA coding region) followed by treatment with T4 DNA polymerase to blunt the ends . The plasmids were then cut with Neo I, the DNAs were extracted with phenol, ethanol precipitated and resuspended in buffer. The Trc99a and pDAB2030 and pDAB2031 plasmids were mixed separately, ligated and transformed into DH5α cells and plated on LB media containing ampicillin and 50 mM glucose. Recombinant plasmids were identified by restriction digestion. The new plasmids are called pDAB2033 (contains the tcbA coding sequence with the two mutations in tcbA-^) and pDAB2034 (contains the corrected version of tcbA from pDAB2031) .
Construction of Plasmid pDAB2032: An Expression Plasmid for tcbAiiAiii
A plasmid encoding the TcbAiiAiii portion of TcbA was created in a similar way to plasmid pDAB2031. A PCR product was generated using primers TH42 (5' TAG GTC TCC ATG GCT TTT ATA CAA GGT TAT AGT GAT CTG 3') and TH50 (5' ACC GTC TTC TTT ACG ATC AGT G 3') with plasmid pDAB2025 as template. This yielded a product of 1521 bp having an initiation codon at the beginning of the coding sequence of tcbAii. This PCR product was isolated in a 1% agarose gel and purified. The purified product was cloned into pCR2.1 as above and a correct clone was identified by DNA sequence analysis. This clone was digested with Nco I and PinA I; a 1414 bp fragment was isolated in a 1% agarose gel, ligated into the Nco I and PinA I sites of plasmid pDAB2030, and transformed into DH5α E. coli. This new plasmid, designed to express TcbAiiAiii in E. coli, is called pDAB2032.
Expression of tcbA and tcbAiiAiii from Plasmids pDAB2030, pDAB2031 and pDAB2032
Expression of tcbA in E. coli from plasmids pDAB2030, pDAB2031 and pDAB2032 was as described herein, except that expression of tcbAiiAiii was done in E. coli strain HMS174(DE3) (Novagen, Madison, WI).
Expression of tcbA from Plasmid pDAB2033
The plasmid pDAB2033 was transformed into BL21 cells (Novagen, Madison, WI) and plated on LB containing 100 micrograms/mL ampicillin and 50 mM glucose. The plates were spread such that several hundred well separated colonies were present on each plate following incubation at either 30°C or 37°C overnight. The colonies were scraped from the plates and suspended in LB containing 100 micrograms/mL ampicillin, but no glucose. Typical culture volume was 250 mL in a single 1 L baffle bottom flask. The cultures were induced when the culture reached a density of 0.3-0.6 OD600 nm. Most often this density was achieved immediately after suspension of the cells from the plates and did not require a growth period in liquid media. Two induction methods were used. Method 1: The cultures were induced with 1 mM IPTG at 37°C, shaken at 200 rpm on a platform shaker for 5 hours and harvested. Method 2: The cultures were induced with 25 micromolar IPTG at 30°C and shaken at 200 rpm for 15 hours at either 20°C or 30°C. The cultures were stored at 4°C until used for purification.
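The two induction methods differ in IPTG concentration, temperature and duration. As a rough sketch, the volume of IPTG stock needed for a 250 mL culture follows from C1·V1 = C2·V2. The 1 M stock concentration used below is an assumption for illustration only; the text does not state the stock used.

```python
def stock_volume_ul(final_conc_mM, culture_volume_mL, stock_conc_mM=1000.0):
    """Volume of stock (microliters) giving final_conc_mM in culture_volume_mL.

    Uses C1*V1 = C2*V2 and ignores the small volume added by the stock itself.
    stock_conc_mM defaults to a hypothetical 1 M IPTG stock (an assumption).
    """
    volume_mL = final_conc_mM * culture_volume_mL / stock_conc_mM
    return volume_mL * 1000.0


method_1 = stock_volume_ul(1.0, 250.0)    # Method 1: 1 mM IPTG
method_2 = stock_volume_ul(0.025, 250.0)  # Method 2: 25 micromolar IPTG
print(f"Method 1: {method_1:.2f} uL; Method 2: {method_2:.2f} uL")
```

With a different stock concentration, only the `stock_conc_mM` argument changes.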
Purification of TcbA from E. coli
Purification, bioassay and immunoblot analysis of TcbA and TcbAiiAiii were as described herein. Results of several representative E. coli expression experiments are shown in Table 32. All materials shown in Table 32 were purified from the media fraction of the cultures. The predicted native molecular weight is approximately 900 kD as described herein. The purity of the samples, i.e., the amount of TcbA relative to contaminating proteins, varied with each preparation.
Table 32
Bioassay Activity and Immunoblot Analysis of TcbA and Derivatives Produced in E. coli and Purified from the Culture Media
Scoring of growth inhibition on Southern Corn Rootworm as compared to control samples: 5-24% = "+", 25-49% = "++", 50-100% = "+++".
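The scoring bands in the Table 32 legend can be expressed as a small lookup. This is a minimal sketch: the function name is hypothetical, and the "-" symbol for values below 5% is an assumption, since the legend does not say how such samples were marked.

```python
def inhibition_score(percent_inhibition):
    """Map percent growth inhibition (relative to control) to the legend symbols."""
    if percent_inhibition >= 50:
        return "+++"
    if percent_inhibition >= 25:
        return "++"
    if percent_inhibition >= 5:
        return "+"
    return "-"  # below 5%: assumed inactive (symbol not given in the legend)


for value in (3, 12, 30, 75):
    print(value, inhibition_score(value))
```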
Example 20 Characterization of Toxin Peptides with Matrix-Assisted Laser Desorption Ionization Time-of-Flight Mass Spectroscopy
Toxins isolated from W-14 broth were purified as described in Example 15. In some cases, the TcaB protein toxin was pretreated with proteases (Example 16) that had been isolated from W-14 broth as previously described (Example 15). Protein molecular mass was determined using matrix-assisted laser desorption ionization time-of-flight mass spectroscopy, hereinafter MALDI-TOF, on a VOYAGER BIOSPECTROMETRY workstation with DELAYED EXTRACTION technology (PerSeptive Biosystems, Framingham, MA). Typically, the protein of interest (100-500 pmoles in 5 μl) was mixed with 1 μl of acetonitrile and dialyzed for 0.5 to 1 h on a Millipore VS filter having a pore size of 0.025 μm (Millipore Corp., Bedford, MA). Dialysis was performed by floating the filter on water (shiny side up) and then adding the protein-acetonitrile mixture as a droplet to the surface of the filter. After dialysis, the dialyzed protein was removed using a pipette and was then mixed with a matrix consisting of sinapinic acid and trifluoroacetic acid according to the manufacturer's instructions. The protein and matrix were allowed to co-crystallize on an approximately 3 cm2 gold-plated sample plate (PerSeptive Corp.). Excitation of the crystals and subsequent mass analysis were performed using the following conditions: laser setting of 3050; pressure of 4.55e-07; low mass gate of 1500.0; negative ions off; accelerating voltage of 25,000; grid voltage of
90.0%; guide wire voltage of 0.010%; linear mode; and a pulse delay time of 350 ns.
Protein mass analysis data are shown in Table 33. The data obtained from MALDI-TOF were compared to those hypothesized from gene sequence information and as previously determined by SDS-PAGE.
Table 33
Molecular Analysis of Peptides by MALDI-TOF, SDS-PAGE and Predicted Determination Based on Gene Sequence

Peptide                       Predicted (Gene)   SDS-PAGE        MALDI-TOF
TcbA                          280,634 Da         240,000 Da      281,040 Da
TcbAi/ii                      217,710 Da         not resolved    216,812 Da
TcbAii                        207,698 Da         201,000 Da      206,473 Da
TcbAiii                       62,943 Da          58,000 Da       63,520 Da
TcdAii                        209,218 Da         188,000 Da      208,186 Da
TcdAiii                       63,520 Da          56,000 Da       63,544 Da
TcbAii Protease Generated                        201,000 Da      216,614 Da'; 215,123 Da'; 210,391 Da'; 208,680 Da'
TcbAiii Protease Generated                       56,000 Da       64,111 Da

' Data normalized to TcbA; multiple fragments observed at TcbAi/ii.
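To make the comparison in Table 33 concrete, the relative difference between the gene-predicted mass and the MALDI-TOF mass can be computed for each peptide. The sketch below uses only the values listed in the table, with subscripts written as i, ii and iii; the function name is illustrative.

```python
# (peptide, mass predicted from gene sequence in Da, MALDI-TOF mass in Da), from Table 33
MASSES = [
    ("TcbA", 280634, 281040),
    ("TcbAi/ii", 217710, 216812),
    ("TcbAii", 207698, 206473),
    ("TcbAiii", 62943, 63520),
    ("TcdAii", 209218, 208186),
    ("TcdAiii", 63520, 63544),
]


def percent_deviation(predicted, observed):
    """Signed deviation of the observed mass from the predicted mass, in percent."""
    return 100.0 * (observed - predicted) / predicted


for name, predicted, observed in MASSES:
    print(f"{name:9s} {percent_deviation(predicted, observed):+.2f}%")
```

For these six entries the MALDI-TOF values fall within 1% of the gene-based predictions, whereas the SDS-PAGE estimates in the same table differ from the predictions by a larger margin.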
Example 21 Production of Peptide Specific Polyclonal Antibodies
Nine peptide components of the W-14 toxin complex, namely TcaAii, TcaAiii, TcaBi, TcaBii, TcaC, TcbAii, TcbAiii, TcdAii, and TcdAiii, were selected as targets against which antibodies were produced. Comprehensive DNA and deduced amino acid sequence data for these peptides indicated that the sequence homology between some of these peptides was substantial. If a whole peptide were used as the immunogen to induce antibody production, the resulting antibodies might bind to multiple peptides in the toxin preparation. To avoid this problem, antibodies were generated that would bind specifically to a unique region of each peptide of interest. The unique region (subpeptide) of each target peptide was selected based on the analyses described below. Each entire peptide sequence was analyzed using MacVector Protein Analysis Tool (IBI Sequence Analysis Software, International Biotechnologies, Inc., P.O. Box 9558, New Haven, CT 06535) to determine its antigenicity index. This program was designed to locate possible externally-located amino acid
sequences, i.e., regions that might be antigenic sites. This method combined information from hydrophilicity, surface probability, and backbone flexibility predictions along with the secondary structure predictions in order to produce a composite prediction of the surface contour of a protein. The scores for each of the analyses were normalized to a value between -1.0 and +1.0 (MacVector Manual). The antigenicity index value was obtained for the entire sequence of the target peptide. From each peptide, an area covering 19 or more amino acids that showed a high antigenicity index in the original sequence was re-analyzed to determine the antigenicity index of the subpeptide without the flanking residues. This re-analysis was necessary because the antigenicity index of a peptide could be influenced by the flanking amino acid residues. If the isolated subpeptide sequence did not maintain a high antigenicity index, a new region was chosen and the analysis was repeated.
Each selected subpeptide sequence was aligned and compared to all seven target peptide sequences using the MacVector™ alignment program. If a selected subpeptide sequence showed identity (greater than 20%) to another target peptide, a new region of 19 or more amino acids was isolated and re-analyzed. Unique subpeptide sequences covering 19 or more amino acids and showing a high antigenicity index were selected from all target peptides.
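The uniqueness screen above can be illustrated with a simplified stand-in. The sketch below does not reproduce the MacVector alignment; it assumes an ungapped, window-by-window comparison and treats a subpeptide as unique when no window of another target exceeds 20% identity. All function names and the toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent of positions with identical residues in two equal-length strings."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def best_identity(subpeptide, target):
    """Highest ungapped identity of subpeptide against any same-length window of target."""
    n = len(subpeptide)
    if len(target) < n:
        return 0.0
    return max(percent_identity(subpeptide, target[i:i + n])
               for i in range(len(target) - n + 1))


def is_unique(subpeptide, other_targets, threshold=20.0):
    """True if no other target peptide exceeds the identity threshold."""
    return all(best_identity(subpeptide, t) <= threshold for t in other_targets)


# Toy sequences for illustration only (not taken from the toxin peptides).
candidate = "ACDEFGHIKL"
print(is_unique(candidate, ["MMMMACDEFGHIKLMMMM"]))  # shares an identical stretch
print(is_unique(candidate, ["WWWWWWWWWWWWWW"]))      # shares no residues
```

A gapped alignment, as an alignment program would perform, can report higher identity than this ungapped scan, so the sketch is a lower bound on similarity rather than an equivalent test.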
The sequences of seven subpeptides were sent to Genemed Biotechnologies Inc. The last amino acid residue of each subpeptide was deleted because it showed no apparent effect on the antigenicity index. A cysteine residue was added to the N-terminus of each subpeptide sequence, except TcaBi-syn, which contains an internal cysteine residue. The presence of a cysteine residue facilitates conjugation to a carrier protein (KLH). The final peptide products corresponding to the appropriate toxin peptides and SEQ ID NOs. are shown in Table 34.
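The two sequence adjustments described above (dropping the last residue, then adding an N-terminal cysteine unless one is already present internally) can be sketched as follows. The function name is hypothetical, and the trailing "X" in the example inputs is a placeholder for the deleted residue, which the text does not identify.

```python
def prepare_for_synthesis(subpeptide):
    """Drop the last residue, then add an N-terminal Cys unless one is already internal."""
    trimmed = subpeptide[:-1]
    if "C" in trimmed:
        return "NH2-" + trimmed
    return "NH2-(C)" + trimmed


# "X" stands in for the unknown deleted C-terminal residue (placeholder only).
print(prepare_for_synthesis("LRGNSPTNPDKDGIFAQVAX"))
print(prepare_for_synthesis("HGQSYNDNNYCNFTLSINTX"))
```

The outputs use the same notation as Table 34, where "(C)" marks the added cysteine.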
Table 34
Amino Acid Sequences for Synthetic Peptides

SEQ ID NO. / Peptide / Amino Acid Sequence (SEQ ID NO. and peptide entries not legible)
NH2-(C)LRGNSPTNPDKDGIFAQVA
NH2-(C)YTPDQTPSFYETAFRSADG
NH2-HGQSYNDNNYCNFTLSINT
NH2-(C)VDPKTLQRQQAGGDGTGSS
NH2-(C)YKAPQRQEDGDSNAVTYDK
NH2-(C)YNENPSSEDKKWYFSSKDD
NH2-(C)FDSYSQLYEENINAGEQRA
NH2-(C)NPNNSSNKLMFYPVYQYSGNT
NH2-(C)VSQGSGSAGSGNNNLAFGAG
Each conjugated synthetic peptide was injected into two rabbits according to the Genemed accelerated program. The pre- and post-immune sera were available for testing after one month.
The preliminary test of both pre- and post-immune sera from each rabbit was performed by Genemed Biotechnologies Inc. Genemed reported that, by using both ELISA and Western blot techniques, they detected the reaction of post-immune sera to the respective synthetic peptides. Subsequently, the sera were tested with the whole target peptides by Western blot analysis. Two batches of partially purified Photorhabdus strain W-14 toxin complex were used as the antigen. The two samples had shown activity against the Southern corn rootworm. Their peptide patterns on an SDS-PAGE gel were slightly different.
Pre-cast SDS-polyacrylamide gels with a 4-20% gradient (Integrated Separation Systems, Natick, MA 01760) were used. Between 1 and 8 μg of protein was applied to each gel well. Electrophoresis was performed and the protein was electroblotted onto Hybond-ECL nitrocellulose membrane (Amersham International). The membrane was blocked with 10% milk in TBST (25 mM Tris HCl pH 7.4, 136 mM NaCl, 2.7 mM KCl, 0.1% Tween 20) for one hour at room temperature. Each rabbit serum was diluted in 10% milk/TBST to 1:500. Other dilutions between 1:50 and 1:1000 were also used. The serum was added to the membrane, which was placed on a platform rocker for at least one hour. The membrane was washed thoroughly with the blocking solution or TBST. A 1:2000 dilution of secondary antibodies (goat anti-mouse IgG conjugated to horseradish peroxidase; BioRad Laboratories) in 10% milk/TBST was applied to the membrane, which was placed on a platform rocker for one hour. The membrane was subsequently washed with an excess amount of TBST. The
detection of the protein was performed by using an ECL (Enhanced Chemiluminescence) detection kit (Amersham International).
Western blot analyses were performed to identify the binding specificity of each anti-synthetic peptide antibody. All synthetic polyclonal antibodies showed specificity toward processed and, when applicable, unprocessed target peptides from protein fractions derived from Photorhabdus culture broth. Various antibodies were shown to recognize either unprocessed or processed recombinant proteins derived from heterologous expression systems such as bacteria or insect cells, using baculovirus expression constructs. In one case, the anti-TcbAiii-syn antibody showed some cross-reactivity to the TcdAiii peptide. In a second case, the anti-TcaC-syn antibody recognized an unidentified 190 kDa peptide in W-14 toxin complex fractions.
Example 22 Characterization of Photorhabdus Strains
In order to establish that the collection described herein was comprised of Photorhabdus strains, the strains herein were assessed in terms of recognized microbiological traits that are characteristic of the bacterial genus Photorhabdus and which differentiate it from other Enterobacteriaceae and Xenorhabdus spp. (Farmer, J. J. 1984. Bergey's Manual of Systematic Bacteriology, Vol. 1, pp. 510-511 (ed. Krieg, N. R. and Holt, J. G.). Williams & Wilkins, Baltimore; Akhurst and Boemare, 1988, J. Gen. Microbiol. 134, 1835-1845; Forst and Nealson, 1996, Microbiol. Rev. 60, 21-43). These characteristic traits are as follows: Gram stain negative rods, organism size of 0.3-2 μm in width and 2-10 μm in length [with occasional filaments (15-50 μm) and spheroplasts], yellow to orange/red colony pigmentation on nutrient agar, presence of crystalline inclusion bodies, presence of catalase, inability to reduce nitrate, presence of bioluminescence, ability to take up dye from growth media, positive for protease production, growth at temperatures below 37°C, survival under anaerobic conditions and positive motility (Table 35). Test methods were checked using reference Escherichia coli, Xenorhabdus and Photorhabdus strains. The overall results are consistent with all strains being part of the family Enterobacteriaceae and the genus Photorhabdus. Note that DEP1, DEP2, and DEP3 refer to Photorhabdus strains obtained
from the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA (#29304, 29999 and 51583, respectively).
A luminometer was used to establish the bioluminescence associated with these Photorhabdus strains. To measure the presence or absence of relative light emitting units, the broths from each strain (cells and media) were measured at three time intervals after inoculation in liquid culture (24, 48, 72 hr) and compared to background luminosity (uninoculated media). Several Xenorhabdus strains were tested as negative controls for luminosity. Prior to measuring light emission from the various broths, cell density was established by measuring light absorbance (560 nm) in a Gilford Systems (Oberlin, OH) spectrophotometer using a sipper cell. The resulting light emitting units could then be normalized to density of cells. Aliquots of the broths were placed into 96-well microtiter plates (100 μl each) and read in a Packard Lumicount™ luminometer (Packard Instrument Co., Meriden, CT). The measurement period for each sample was 0.1 to 1.0 second. The samples were agitated in the luminometer for 10 sec prior to taking readings. A positive test was determined as being about 5-fold background luminescence (about 1-15 relative light units). In addition, degree of colony luminosity was confirmed with photographic film overlays and by eye, after visual adaptation in a darkroom. The Gram's staining characteristics of each strain were established with a commercial Gram's stain kit (BBL, Cockeysville, MD) used in conjunction with Gram's stain control slides (Fisher Scientific, Pittsburgh, PA). Microscopic evaluation was then performed using a Zeiss microscope (Carl Zeiss, Germany) with a 100X oil immersion objective lens (with 10X ocular and 2X body magnification). Microscopic examination of individual strains for organism size, cellular description and inclusion bodies (the latter two observations after logarithmic growth) was performed using wet mount slides (10X ocular, 2X body and 40X objective magnification) and phase contrast microscopy with a micrometer (Akhurst, R. J. and Boemare, N. E. 1990. Entomopathogenic Nematodes in Biological Control (ed. Gaugler, R. and Kaya, H.), pp. 75-90. CRC Press, Boca Raton, USA; Baghdiguian, S., Boyer-Giglio, M. H., Thaler, J. O., Bonnot, G., Boemare, N. 1993. Biol. Cell 79, 177-185). Colony pigmentation was observed after inoculation on Bacto nutrient agar (Difco Laboratories, Detroit, MI) prepared as per
label instructions. Incubation occurred at 28°C and descriptions were produced after 5 days. To test for the presence of the enzyme catalase, a colony of the test organism was removed on a small plug from a nutrient agar plate and placed into the bottom of a glass test tube. One ml of a household hydrogen peroxide solution was gently added down the side of the tube. A positive reaction was recorded when bubbles of gas (presumptive oxygen) appeared immediately or within 5 seconds. Controls of uninoculated nutrient agar and hydrogen peroxide solution were also examined. To test for nitrate reduction, each culture was inoculated into 10 ml of Bacto Nitrate Broth (Difco Laboratories, Detroit, MI). After 24 hours incubation with gentle agitation at 28°C, nitrite production was tested by the addition of two drops of sulfanilic acid reagent and two drops of alpha-naphthylamine reagent (see Difco Manual, 10th edition, Difco Laboratories, Detroit, MI, 1984). The generation of a distinct pink or red color indicates the formation of nitrite from nitrate, whereas the lack of color formation indicates that the strain is nitrate reduction negative. In the latter case, finely powdered zinc was added to further confirm the presence of unreduced nitrate, established by the formation of nitrite and the resultant red color. The ability of each strain to take up dye from growth media was tested with Bacto MacConkey agar containing the dye neutral red, Bacto Tergitol-7 agar containing the dye bromothymol blue, and Bacto EMB Agar containing the dye eosin-Y (formulated agars from Difco Laboratories, Detroit, MI, all prepared according to label instructions). After inoculation on these media, dye uptake was recorded after incubation at 28°C for 5 days. Growth on these latter media is characteristic for members of the family Enterobacteriaceae. Motility of each strain was tested using a solution of Bacto Motility Test Medium (Difco Laboratories, Detroit, MI) prepared as per label instructions. A butt-stab inoculation was performed with each strain and motility was judged macroscopically by a diffuse zone of growth spreading from the line of inoculum. The production of protease was tested by observing hydrolysis of gelatin using Bacto gelatin (Difco Laboratories, Detroit, MI) made as per label instructions. Cultures were inoculated and the tubes or plates were incubated at 28°C for 5 days. Gelatin hydrolysis was then checked at room temperature, i.e., less than 22°C. To assess growth at different
temperatures, agar plates [2% proteose peptone #3 with two percent Bacto-Agar (Difco, Detroit, MI) in deionized water] were streaked from a common source of inoculum. Plates were incubated at 20, 28 and 37°C for up to three weeks. The incubator temperature levels were checked with an electronic thermocouple and meter to ensure valid temperature settings. Oxygen requirements for Photorhabdus strains were tested in the following manner. A butt-stab inoculation into fluid thioglycolate broth medium (Difco, Detroit, MI) was made. The tubes were incubated at room temperature for one week and cultures were then examined for type and extent of growth. The indicator resazurin demonstrates the presence of medium oxygenation or the aerobiosis zone (Difco Manual, 10th edition, Difco Laboratories, Detroit, MI). Growth zone results obtained for the Photorhabdus strains tested were consistent with those of a facultative anaerobic microorganism. In the case of unclear results, the final agar concentration of fluid thioglycolate broth medium was raised to 0.75% and the growth characteristics rechecked.
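The luminometer criterion described earlier in this example (a positive test at about 5-fold background, with light units normalized to cell density) can be sketched as below. Function names are hypothetical, and subtracting the background before normalizing is an assumption; the text only states that readings could be normalized to density.

```python
def is_bioluminescent(sample_rlu, background_rlu, fold=5.0):
    """Positive when the broth reads at least `fold` times the uninoculated medium."""
    return sample_rlu >= fold * background_rlu


def rlu_per_density(sample_rlu, background_rlu, absorbance_560):
    """Background-subtracted light units per unit of absorbance at 560 nm.

    Background subtraction is an assumption made for this sketch.
    """
    return (sample_rlu - background_rlu) / absorbance_560


# Illustrative readings only (not measured values from the strains).
print(is_bioluminescent(12.0, 2.0))      # 6-fold over background
print(is_bioluminescent(8.0, 2.0))       # 4-fold over background
print(rlu_per_density(12.0, 2.0, 2.0))
```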
Table 35
Taxonomic Traits of Photorhabdus Strains
*: A=Gram's stain, B=Crystalline inclusion bodies, C=Bioluminescence, D=Cell form, E=Motility, F=Nitrate reduction, G=Presence of catalase, H=Gelatin hydrolysis, I=Dye uptake, J=Pigmentation on Nutrient Agar (some color shifts after Day 5), K=Growth on EMB agar, L=Growth on MacConkey agar, M=Growth on Tergitol-7 agar, N=Facultative anaerobe, O=Growth at 20°C, P=Growth at 28°C, Q=Growth at 37°C.
†: +=positive for trait, -=negative for trait; rd=rod, S=sized within Genus descriptors.
§: W=white, CR=cream, Y=yellow, YT=yellow tan, T=tan, PO=pale orange, O=orange, PR=pale red, R=red.
The evolutionary diversity of the Photorhabdus strains in our collection was measured by analysis of PCR (Polymerase Chain Reaction) mediated genomic fingerprinting using genomic DNA from each strain. This technique is based on families of repetitive DNA sequences present throughout the genome of diverse bacterial species (reviewed by Versalovic, J., Schneider, M., de Bruijn, F. J. and Lupski, J. R. 1994. Methods Mol. Cell. Biol. 5, 25-40). Three of these, repetitive extragenic palindromic sequence (REP), enterobacterial repetitive intergenic consensus (ERIC) and the BOX
element, are thought to play an important role in the organization of the bacterial genome. Genomic organization is believed to be shaped by selection, and the differential dispersion of these elements within the genome of closely related bacterial strains can be used to discriminate these strains (e.g., Louws, F. J., Fulbright, D. W., Stephens, C. T. and de Bruijn, F. J. 1994. Appl. Environ. Micro. 60, 2286-2295). Rep-PCR utilizes oligonucleotide primers complementary to these repetitive sequences to amplify the variably sized DNA fragments lying between them. The resulting products are separated by electrophoresis to establish the DNA "fingerprint" for each strain.
To isolate genomic DNA from our strains, cell pellets were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml and 12 ml of 5 M NaCl was then added. This mixture was centrifuged 20 min. at 15,000 x g. The resulting pellet was resuspended in 5.7 ml of TE, and 300 μl of 10% SDS and 60 μl of 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY) were added. This mixture was incubated at 37°C for 1 hr, approximately 10 mg of lysozyme was then added and the mixture was incubated for an additional 45 min. One milliliter of 5 M NaCl and 800 μl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were then added and the mixture was incubated 10 min. at 65°C, gently agitated, then incubated and agitated for an additional 20 min. to aid in clearing of the cellular material. An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed gently, then centrifuged. Two extractions were then performed with an equal volume of phenol/chloroform/isoamyl alcohol (50:49:1). Genomic DNA was precipitated with 0.6 volume of isopropanol. Precipitated DNA was removed with a glass rod, washed twice with 70% ethanol, dried and dissolved in 2 ml of STE (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 1 mM EDTA). The DNA was then quantitated by optical density at 260 nm. To perform rep-PCR analysis of Photorhabdus genomic DNA, the following primers were used: REP1R-I, 5'-IIIICGICGICATCIGGC-3', and REP2-I, 5'-ICGICTTATCIGGCCTAC-3'. PCR was performed using the following 25 μl reaction: 7.75 μl H2O, 2.5 μl 10X LA buffer (PanVera Corp., Madison, WI), 16 μl dNTP mix (2.5 mM each), 1 μl of each primer at 50 pM/μl, 1 μl DMSO, 1.5 μl genomic DNA (concentrations ranged from 0.075-0.480 μg/μl) and 0.25 μl TaKaRa EX Taq (PanVera Corp., Madison, WI). The PCR
amplification was performed in a Perkin Elmer DNA Thermal Cycler (Norwalk, CT) using the following conditions: 95°C/7 min., then 35 cycles of 94°C/1 min., 44°C/1 min., 65°C/8 min., followed by 15 min. at 65°C. After cycling, the 25 μl reaction was added to 5 μl of 6X gel loading buffer (0.25% bromophenol blue, 40% w/v sucrose in H2O). A 15 x 20 cm 1% agarose gel was then run in TBE buffer (0.09 M Tris-borate, 0.002 M EDTA) using 8 μl of each reaction. The gel was run for approximately 16 hours at 45 V. Gels were then stained in 20 μg/ml ethidium bromide for 1 hour and destained in TBE buffer for approximately 3 hours. Polaroid photographs of the gels were then taken under UV illumination.
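As a quick check on the cycling program stated above, the total programmed hold time can be added up. This sketch ignores ramping between temperatures, which depends on the instrument; the names are illustrative.

```python
INITIAL_DENATURATION_MIN = 7   # 95°C
CYCLES = 35
CYCLE_STEPS_MIN = [1, 1, 8]    # 94°C, 44°C, 65°C
FINAL_EXTENSION_MIN = 15       # 65°C


def program_minutes(initial, cycles, steps, final):
    """Total hold time in minutes; ramp times between steps are not included."""
    return initial + cycles * sum(steps) + final


total = program_minutes(INITIAL_DENATURATION_MIN, CYCLES,
                        CYCLE_STEPS_MIN, FINAL_EXTENSION_MIN)
print(f"{total} min ({total / 60:.1f} h)")
```

The long 8 minute extension step at 65°C accounts for most of the run time.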
The presence or absence of bands at specific sizes for each strain was scored from the photographs and entered as a similarity matrix in the numerical taxonomy software program NTSYS-pc (Exeter Software, Setauket, NY). Controls of E. coli strain HB101 and Xanthomonas oryzae pv. oryzae assayed under the same conditions produced PCR fingerprints corresponding to published reports (Versalovic, J., Koeuth, T. and Lupski, J. R. 1991. Nucleic Acids Res. 19, 6823-6831; Vera Cruz, C. M., Halda-Alija, L., Louws, F., Skinner, D. Z., George, M. L., Nelson, R. J., de Bruijn, F. J., Rice, C. and Leach, J. E. 1995. Int. Rice Res. Notes 20, 23-24; Vera Cruz, C. M., Ardales, E. Y., Skinner, D. Z., Talag, J., Nelson, R. J., Louws, F. J., Leung, H., Mew, T. W. and Leach, J. E. 1996. Phytopathology 86, 1352-1359). The data from Photorhabdus strains were then analyzed with a series of programs within NTSYS-pc: SIMQUAL (Similarity for Qualitative data) to generate a matrix of similarity coefficients (using the Jaccard coefficient) and SAHN (Sequential, Agglomerative, Hierarchical and Nested) clustering [using the UPGMA (Unweighted Pair-Group Method with Arithmetic Averages) method], which groups related strains and can be expressed as a phenogram (Fig. 7). The COPH (cophenetic values) and MXCOMP (matrix comparison) programs were used to generate a cophenetic value matrix and compare the correlation between this and the original matrix upon which the clustering was based. A resulting normalized Mantel statistic (r) was generated, which is a measure of the goodness of fit for a cluster analysis (r=0.8-0.9 represents a very good fit). In our case r=0.924. Therefore, the collection is comprised of a diverse group of easily distinguishable strains representative of the Photorhabdus genus.
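The SIMQUAL and SAHN steps can be illustrated in miniature. The sketch below is not NTSYS-pc; it is a plain-Python stand-in that computes Jaccard coefficients from band presence/absence vectors and merges clusters by average linkage (UPGMA). Strain names and band patterns are hypothetical, and the cophenetic correlation step is not included.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two band vectors (1 = band present, 0 = absent)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


def average_similarity(c1, c2, bands):
    """Mean pairwise Jaccard similarity between the members of two clusters."""
    total = sum(jaccard(bands[i], bands[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def upgma(names, bands):
    """Average-linkage clustering; returns merges as (members_a, members_b, similarity)."""
    clusters = [(i,) for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        # Join the pair of clusters with the highest average similarity.
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: average_similarity(pair[0], pair[1], bands))
        merges.append((tuple(names[i] for i in c1),
                       tuple(names[i] for i in c2),
                       average_similarity(c1, c2, bands)))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
    return merges


# Hypothetical strains and band patterns, for illustration only.
names = ["A", "B", "C"]
bands = [[1, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
merges = upgma(names, bands)
for left, right, similarity in merges:
    print(left, right, round(similarity, 3))
```

The order and similarity level of the merges correspond to the branch points of a phenogram such as the one referred to in Fig. 7.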
Example 23
Insecticidal Utility of Toxin(s) Produced by Various Photorhabdus Strains
Initial "storage" cultures of the various Photorhabdus strains were produced by inoculating 175 ml of 2% Proteose Peptone #3 (PP3) (Difco Laboratories, Detroit, MI) liquid medium with a primary variant colony in a 500 ml tribaffled flask with a Delong neck, covered with a Kaput closure. After inoculation, the flask was incubated for 24-72 hrs at 28°C on a rotary shaker at 150 rpm, until stationary phase was reached. The culture was transferred to a sterile bottle containing a sterile magnetic stir bar and the culture was overlayered with sterile mineral oil to limit exposure to air. The storage culture was kept in the dark at room temperature. These cultures were then used as inoculum sources for the fermentation of each strain.
"Seed" flasks or cultures were produced either by inoculating 2 ml of an oil overlayered storage culture or by transferring a primary variant colony into 175 ml sterile medium in a 500 ml tribaffled flask covered with a Kaput closure. (The use of other inoculum sources is also possible.) Typically, following 16 hours incubation at 28°C on a rotary shaker at 150 rpm, the seed culture was transferred into production flasks. Production flasks were usually inoculated by adding about 1% of the actively growing seed culture to sterile 2% PP3 medium (e.g., 2.0 ml per 175 ml sterile medium). Production of broths occurred in 500 ml tribaffled flasks covered with a Kaput closure. Production flasks were agitated at 28°C on a rotary shaker at 150 rpm. Production fermentations were terminated after 24-72 hrs, although successful fermentation is not confined to this time duration. Following appropriate incubation, the broths were dispensed into sterile 1.0 L polyethylene bottles, spun at 2600 x g for 1 hr at 10°C and decanted from the cell and debris pellet. Further broth clarification was achieved with a tangential flow microfiltration device (Pall Filtron, Northborough, MA) using a 0.5 μm open-channel poly-ether sulfone (PES) membrane filter. The resulting broths were then concentrated (up to 10-fold) using a 10,000 or 100,000 MW cut-off membrane, M12 ultra-filtration device (Amicon, Beverly, MA) or centrifugal concentrators (Millipore, Bedford, MA and Pall Filtron, Northborough, MA) with a 13,000 or
100,000 MW pore size. In the case of centrifugal concentrators, the broth was spun at 2000 x g for approximately 2 hr. The membrane permeate was added to the corresponding retentate to achieve the desired concentration of components greater than the pore size used. Following these procedures, the broth was used for biochemical analysis or filter sterilized using a 0.2 μm cellulose nitrate membrane filter for biological assessment. Heat inactivation of processed broth samples was achieved by heating the samples at 100°C in a sand-filled heat block for 10 minutes. The broth(s) and toxin complex(es) from different Photorhabdus strains are useful for reducing populations of insects and were used in a method of inhibiting an insect population which comprises applying to a locus of the insect an effective insect inactivating amount of the active described. A demonstration of the breadth of insecticidal activity observed from broths of a selected group of Photorhabdus strains fermented as described above is shown in Table 36. It is possible that improved or additional insecticidal activities could be detected with these strains through increased concentration of the broth or by employing different fermentation methods. Consistent with the activity being associated with a protein, the insecticidal activity of all strains tested was heat labile.
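The step of adding permeate back to the retentate amounts to a simple volume calculation. The following sketch assumes that components larger than the membrane cut-off are fully retained; the function name and the example volumes are hypothetical.

```python
def permeate_to_add_ml(starting_volume_ml, retentate_volume_ml, target_fold):
    """Permeate volume to add back so retained components end at target_fold.

    Assumes components above the membrane cut-off are fully retained.
    """
    target_volume_ml = starting_volume_ml / target_fold
    if retentate_volume_ml > target_volume_ml:
        raise ValueError("retentate is not yet concentrated enough")
    return target_volume_ml - retentate_volume_ml


# Example: 1000 ml of broth reduced to 60 ml, adjusted to a 10-fold concentrate.
print(permeate_to_add_ml(1000.0, 60.0, 10.0))
```

In this example the retentate is brought from 60 ml to 100 ml, which corresponds to a 10-fold concentration of the retained components.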
Culture broth(s) from diverse Photorhabdus strains show differential insecticidal activity (mortality and/or growth inhibition) against a number of insects. More specifically, the activity is seen against corn rootworm, which is a member of the insect order Coleoptera. Other members of the Coleoptera include boll weevils, wireworms, pollen beetles, flea beetles, seed beetles and Colorado potato beetle. The broths and purified toxin complex(es) are also active against tobacco budworm, tobacco hornworm and European corn borer, which are members of the order Lepidoptera. Other typical members of this order are beet armyworm, cabbage looper, black cutworm, corn earworm, codling moth, clothes moth, Indian mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm and fall armyworm. Activity is also observed against German cockroach, which is a member of the order Dictyoptera (or Blattodea). Other members of this order are oriental cockroach and American cockroach.
Activity against corn rootworm larvae was tested as follows. Photorhabdus culture broth(s) (10-fold concentrated, filter sterilized), 2% Proteose Peptone #3 (10-fold concentrated), purified toxin complex(es), or 10 mM sodium phosphate buffer, pH 7.0, were applied directly to the surface (about 1.5 cm2) of artificial diet (Rose, R. I. and McCabe, J. M. 1973. J. Econ. Entomol. 66, 398-400) in 40 μl aliquots. Toxin complex was diluted in 10 mM sodium phosphate buffer, pH 7.0. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR) hatched from surface sterilized eggs. The plates were sealed, placed in a humidified growth chamber and maintained at 27°C for the appropriate period (3-5 days). Mortality and larval weight determinations were then scored. Generally, 16 insects per treatment were used in all studies. Control mortality was generally less than 5%.
Activity against lepidopteran larvae was tested as follows. Concentrated (10-fold) Photorhabdus culture broth(s), control medium (2% Proteose Peptone #3), purified toxin complex(es), or 10 mM sodium phosphate buffer, pH 7.0, were applied directly to the surface (about 1.5 cm2) of standard artificial lepidopteran diet (Stoneville Yellow diet) in 40 μl aliquots. The diet plates were allowed to air-dry in a sterile flow-hood and each well was infested with a single, neonate larva. European corn borer (Ostrinia nubilalis) and tobacco hornworm (Manduca sexta) eggs were obtained from commercial sources and hatched in-house, whereas tobacco budworm (Heliothis virescens) larvae were supplied internally. Following infestation with larvae, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period. Mortality and weight determinations were scored at day 5. Generally, 16 insects per treatment were used in all studies. Control mortality generally ranged from about 0 to about 12.5% for control medium and was less than 10% for phosphate buffer. Activity against cockroach was tested as follows. Concentrated (10-fold) Photorhabdus culture broth(s) and control medium (2% Proteose Peptone #3) were applied directly to the surface (about 1.5 cm2) of standard artificial lepidopteran diet (Stoneville Yellow diet) in 40 μl aliquots. The diet plates were allowed to
air-dry in a sterile flow-hood and each well was infested with a single, CO2-anesthetized first instar German cockroach (Blattella germanica). Following infestation, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period. Mortality and weight determinations were scored at day 5. Control mortality was less than 10%.
Table 36
Observed Insecticidal Spectrum of Broths from Different Photorhabdus Strains
Photorhabdus strain    Sensitive* insect species**
* = ≥ 25% mortality and/or growth inhibition vs. control.
** = 1, Tobacco budworm; 2, European corn borer; 3, Tobacco hornworm; 4, Southern corn rootworm; 5, German cockroach.
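The sensitivity criterion in the footnote can be illustrated with a short sketch. It assumes the threshold reads "at least 25%", that "mortality vs. control" means the difference in percent mortality between treatment and control, and that growth inhibition is the percent reduction in mean larval weight relative to control. These definitions, the function names, and the example counts are assumptions for illustration; the text does not state how the comparison was computed.

```python
def percent_mortality(dead: int, total: int) -> float:
    """Percent of infested insects scored dead."""
    return 100.0 * dead / total


def growth_inhibition(mean_weight_treated: float, mean_weight_control: float) -> float:
    """Percent reduction in mean larval weight relative to control."""
    return 100.0 * (1.0 - mean_weight_treated / mean_weight_control)


def is_sensitive(dead_treated: int, dead_control: int, insects_per_treatment: int,
                 mean_weight_treated: float, mean_weight_control: float,
                 threshold: float = 25.0) -> bool:
    """Apply the 'mortality and/or growth inhibition vs. control' rule.

    Assumption: either criterion alone is sufficient, and mortality is
    compared as a simple difference in percentage points.
    """
    excess_mortality = (percent_mortality(dead_treated, insects_per_treatment)
                        - percent_mortality(dead_control, insects_per_treatment))
    inhibition = growth_inhibition(mean_weight_treated, mean_weight_control)
    return excess_mortality >= threshold or inhibition >= threshold


if __name__ == "__main__":
    # Hypothetical plate: 16 insects per treatment, as in the bioassays.
    print(is_sensitive(6, 1, 16, 10.0, 10.0))
```

With 16 insects per treatment, one death corresponds to 6.25 percentage points, so at least four more deaths than in the control are needed to reach the threshold on mortality alone.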
Example 24
Southern Analysis of Non-W-14 Photorhabdus Strains Using W-14 Gene Probes
Photorhabdus strains were grown on 2% proteose peptone #3 agar (Difco Laboratories, Detroit, MI) and insecticidal toxin competence was maintained by repeated bioassay after passage. A 50 ml shake culture was produced in 175 ml baffled flasks in 2% proteose peptone #3 medium, grown at 28°C and 150 rpm for approximately 24 hours. Fifteen ml of this culture were centrifuged (700 x g, 30 min) and frozen in its medium at -20°C until it was thawed (slowly in ice water) for DNA isolation. The thawed W-14 culture was centrifuged (900 x g, 15 min, 4°C), and the floating orange mucopolysaccharide material was removed. The remaining cell material was centrifuged (25,000 x g, 4°C) to pellet the bacterial cells, and the medium was removed and discarded.
Total DNA was isolated by an adaptation of the CTAB method described in section 2.4.1 of Ausubel et al. (1994). The modifications included a high salt shock, and all volumes were increased ten-fold over the "miniprep" recommended volumes. All centrifugations were at 4°C unless otherwise specified. The pelleted bacterial cells were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8) to a final volume of 10 ml, then 12 ml of 5 M NaCl were added; this mixture was centrifuged 20 min at 15,000 x g. The pellet was resuspended in 5.7 ml TE, and 300 μl of 10% SDS and 60 μl of 20 mg/ml proteinase K (in sterile distilled water, Gibco BRL Products, Grand Island, NY) were added to the suspension. The mixture was incubated at 37°C for 1 hr; then approximately 10 mg lysozyme (Worthington Biochemical Corp., Freehold, NJ) were added. After an additional 45 min incubation, 1 ml of 5 M NaCl and 800 μl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were added. This preparation was incubated 10 min at 65°C, then gently agitated and further incubated and agitated for approximately 20 min to assist clearing of the cellular material. An equal volume of chloroform/isoamyl alcohol solution (24:1, v:v) was added, mixed very gently, and the phases separated by centrifugation at 12,000 x g for 15 min. The upper (aqueous) phase was gently removed with a wide-bore pipette and extracted twice as above with an equal volume of PCI (phenol/chloroform/isoamyl alcohol; 50:49:1, v:v:v; equilibrated with 1 M Tris-HCl, pH 8.0; Intermountain Scientific Corporation, Kaysville, UT). The DNA precipitated with 0.6 volume of isopropanol was gently removed on a glass rod, washed twice with 70% ethanol, dried, and dissolved in 2 ml STE (10 mM Tris-HCl, 10 mM NaCl, 1 mM EDTA, pH 8). This preparation contained 2.5 mg/ml DNA, as determined by optical density at 260 nm.
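As a rough illustration of the optical density measurement, the sketch below converts an absorbance reading at 260 nm into a DNA concentration. It relies on the widely used approximation that one A260 unit of double-stranded DNA in a 1 cm path corresponds to about 50 μg/ml. The absorbance reading and dilution factor in the example are hypothetical; the text reports only the final concentration.

```python
def dsdna_concentration_ug_per_ml(a260: float, dilution_factor: float,
                                  ug_per_ml_per_a260: float = 50.0) -> float:
    """Estimate double-stranded DNA concentration of the undiluted stock.

    Uses the conventional conversion of ~50 ug/ml per A260 unit
    (1 cm path length); this factor is an approximation, not a value
    taken from the text.
    """
    return a260 * ug_per_ml_per_a260 * dilution_factor


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.5 on a 1:100 dilution of the stock.
    conc = dsdna_concentration_ug_per_ml(0.5, 100)
    print(f"{conc / 1000:.1f} mg/ml")
```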
Identification of Bgl II/Hind III Fragments Hybridizing to tc-Gene Specific Probes
Approximately 10 μg of genomic DNA was digested to completion with about 30 units each of Bgl II and Hind III (NEB) for 180 min, frozen overnight, then heated at 65°C for five min, and electrophoresed in a 0.8% agarose gel (SeaKem LE, 1X TEA, 80 volts, 90 min). The DNA was stained with ethidium bromide (50 μg/ml) as described earlier, and photographed under ultraviolet light. The DNA fragments in the agarose gel were subjected to depurination (5 min in 0.2 M HCl), denaturation (15 min in 0.5 M NaOH, 1.5 M NaCl), and neutralization (15 min in 0.5 M Tris-HCl pH 8.0, 1.5 M NaCl), with 3 rinses of distilled water between each step. The DNA was transferred by Southern blotting from the gel onto a NYTRAN nylon membrane (Amersham, Arlington Heights, IL) using a high salt (20X SSC) protocol, as described in section 2.9 of Ausubel et al. (CPMB, op. cit.). The transferred DNA was then UV-crosslinked to the nylon membrane using a Stratagene UV Stratalinker set on auto crosslink. The membranes were stored dry at 25°C until use.
Hybridization was performed using the ECL™ direct (Amersham, Arlington Heights, IL) labeling and detection system following protocols provided by the manufacturer. In brief, probes were prepared by covalently linking the denatured DNA to the enzyme horseradish peroxidase. Once labeled, the probe was used under hybridization conditions which maintain the enzymatic activity. Unhybridized probe was removed by two gentle washes of 20 minutes each at 42°C in 0.5x SSC, 0.4% SDS, and 6 M urea. This was followed by two washes of 5 minutes each at room temperature in 2x SSC. As directed by the manufacturer, ECL™ reagents were used to detect the hybridizing DNA bands. There are several factors which influence the ability to detect gene relatedness between various Photorhabdus strains and strain W-14. First, high stringency conditions have not been employed in these hybridizations. It is known in the art that varying the stringency of hybridization and wash conditions will influence the pattern and intensity of hybridizing bands. Second, blot-to-blot variation among Southern blots will influence the mobility of hybridizing bands and molecular weight estimates. Therefore, W-14 was included as a standard on all Southern blots.
Gene-specific probes derived from the W-14 toxin genes were used in these hybridizations. The following lists the specific coordinates within each gene sequence to which each probe corresponds: a probe specific for 1174 to 3642 of Sequence ID #25; a probe specific for tcaC: 3637 to 6005 of Sequence ID #25; a probe specific for tcbA: 2097 to 4964 of Sequence ID #11; and a probe specific for tcdA: 1660 to 4191 of Sequence ID #46. The following tables summarize Southern blot analyses of Photorhabdus strains. Where hybridization of probes occurred, the hybridized fragment(s) were noted as either identical to or different from the pattern observed for the W-14 strain.
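The probe coordinates above imply the following probe lengths. The sketch assumes the coordinates are 1-based and inclusive, which is the usual convention for sequence listings but is not stated in the text. The first probe is labeled generically because its gene name does not appear in the text as printed.

```python
# Probe coordinates as listed in the text: (Sequence ID, start, end).
PROBES = {
    "first probe (gene name not given in the text)": (25, 1174, 3642),
    "tcaC": (25, 3637, 6005),
    "tcbA": (11, 2097, 4964),
    "tcdA": (46, 1660, 4191),
}


def probe_length(start: int, end: int) -> int:
    """Length in bases, assuming 1-based inclusive coordinates."""
    return end - start + 1


def overlap(probe_a: tuple, probe_b: tuple) -> int:
    """Bases shared by two probes; zero if they lie on different sequences."""
    seq_a, start_a, end_a = probe_a
    seq_b, start_b, end_b = probe_b
    if seq_a != seq_b:
        return 0
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)


if __name__ == "__main__":
    for name, (seq_id, start, end) in PROBES.items():
        print(f"{name}: Sequence ID #{seq_id}, {probe_length(start, end)} bases")
```

Under this assumption the two probes drawn from Sequence ID #25 share only six bases (3637 to 3642), so they report on essentially separate regions of that sequence.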
Table, Southern Analysis of Photorhabdus Strains
I = Identical fragment pattern; D = Different fragment pattern.
Table 39
Southern Analysis of Photorhabdus Strains
= not determined; - = no detectable hybridization product; I = Identical fragment pattern; D = Different fragment pattern; + = Hybridization fragment pattern not determined.
From these analyses it is apparent that homologs of W-14 genes are dispersed throughout these diverse Photorhabdus strains, as evidenced by differences in gene fragment sizes between W-14 and the other strains.
Example 25
N-Terminal Amino Acid Sequences of Toxin Complex Peptides from Different Photorhabdus Strains
To examine their relationship, peptides isolated from different Photorhabdus strains, as described in Example 14, were subjected to
N-terminal amino acid sequencing. The N-terminal amino acid sequences of toxin peptides in several strains were compared to those of W-14 toxin peptides. In Table 40, a comparison of the toxin peptides sequenced to date shows that identical or homologous (at least 40% similarity to W-14 gene/peptides) toxin peptides were present in all of the strains. For example, the N-terminal amino acid sequence of TcaC, SEQ ID NO:2, was found to be identical to that of the 160 kDa peptide in HP88, and homologs were also present in strains WIR, H9, Hb, WX-1, and Hm. Some W-14 peptides or homologs have not been observed in other strains; however, not all peptides have been sequenced for toxin complexes from other strains because of N-terminal blockage or low abundance. In addition, many other N-terminal amino acid sequences (SEQ ID NOS: 82 to 88) have been obtained for toxin complex peptides from other strains that have no similarity to peptides from W-14 and in some cases were identical to each other. For example, an identical amino acid sequence, SEQ ID NO:82, was obtained for a 64 kDa peptide present in both HP88 and Hb strains, and a homologous sequence was obtained for a 70 kDa peptide in the NC-1 strain (SEQ ID NO:83).
Table 40
A Comparison of Amino Terminal Sequence Homology Between Proteins Isolated From Non-W-14 Strains
W-14 Peptide    Gene    SEQ ID NO:    Identical    Homology
TcaAn tcaA 15
TcaAin tcaA 4
TcaBi tcaB 3 7d kDa
71 kDa
TcaBu tcaB 5 61 kDa 61 kDa
TcaC tcaA 2 160 kDa
160 kDa
170 kDa
180 kDa
170 kDa
170 kDa
1 40
77 Hb 8_ kDa
TccB tccB 170 kDa
180 kDa
180 kDa
170 kDa
1"O kDa
140 kDa
lα0 kDa
TcdAn tcdA TcdAm tcdA 41 Hb 57 kDa
H9 69 kDa
Hb 86 kDa
HP88 86 kDa
Homology: amino acid sequences that had at least 40% similarity to W-14 gene/peptides. Similar residues were identified as being members of one of the following five groups: (P, A, G, S, T); (Q, N, E, B, D, Z); (H, K, R); (L, I, V, M); and (F, Y, W).
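The similarity rule in the footnote can be sketched as follows. The sketch assumes an ungapped, position-by-position comparison over the length of the shorter sequence, and it assumes the fourth residue group is (L, I, V, M). The text does not state how sequences were aligned, so this is an illustration of the grouping rule only. The second peptide in the example is hypothetical; the first is the TcaC N-terminus (SEQ ID NO:2) in one-letter code.

```python
# The five similarity groups from the Table 40 footnote (one-letter codes).
SIMILARITY_GROUPS = ["PAGST", "QNEBDZ", "HKR", "LIVM", "FYW"]


def residues_similar(a: str, b: str) -> bool:
    """True if two residues are identical or fall in the same group."""
    if a == b:
        return True
    return any(a in group and b in group for group in SIMILARITY_GROUPS)


def percent_similarity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that are identical or similar.

    Assumption: no gaps; positions are compared one-to-one over the
    length of the shorter sequence.
    """
    compared = min(len(seq_a), len(seq_b))
    if compared == 0:
        raise ValueError("both sequences must be non-empty")
    hits = sum(residues_similar(a, b) for a, b in zip(seq_a, seq_b))
    return 100.0 * hits / compared


if __name__ == "__main__":
    tcac_n_terminus = "MQDSPEVSITTW"      # SEQ ID NO:2
    hypothetical_peptide = "MKDSPEKSITTW"  # invented for illustration
    score = percent_similarity(tcac_n_terminus, hypothetical_peptide)
    print(f"{score:.1f}% similar; homolog by the 40% rule: {score >= 40.0}")
```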
Example 26
Immunological Analysis of Photorhabdus Strains
Culture broths of Photorhabdus strains were concentrated 10 to 15 times using a Centriprep-10 ultrafiltration device (Amicon, Inc., Beverly, MA 01915). The concentration of the protein ranged from 0.3 to 3.0 mg per ml. Ten to 20 μg of total protein was loaded in each well of a precast 4-20% polyacrylamide gel (Integrated Separation Systems, Natick, MA 01760). Gel electrophoresis was performed for 1.25 hours using a constant current set at 25 mA per gel. The gel was electro-blotted onto Hybond-ECL™ nitrocellulose membrane (Amersham Corporation, Arlington Hts, IL 60005) using a semi-dry electro-blotter (Pharmacia Biotech Inc., Piscataway, NJ
08854). A constant current was applied at 0.75 mA per cm2 for 2.5 hours. The membrane was blocked with 10% milk in TBST (25 mM Tris-HCl pH 7.4, 136 mM NaCl, 2.7 mM KCl, 0.1% Tween 20) for one hour at room temperature. Each primary antibody was diluted in 10% milk/TBST to 1:500. Other dilutions between 1:50 and 1:1000 were also used. The membrane was incubated in primary antibody for at least one hour. Then it was washed thoroughly with the blocking solution or TBST. A 1:2000 dilution of secondary antibodies (goat anti-mouse IgG or goat anti-rabbit IgG conjugated to horseradish peroxidase; BioRad Laboratories, Hercules, CA 94547) in 10% milk/TBST was applied to the membrane, which was placed on a platform rocker for one hour. The membrane was subsequently washed with an excess amount of TBST. The detection of the protein was performed by using an ECL (Enhanced Chemiluminescence) detection kit (Amersham International).
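The loading and dilution figures in this procedure reduce to simple arithmetic, sketched below. The function names and the 10 ml incubation volume are assumptions chosen for illustration; the text gives the protein amounts, concentrations, and dilution ratios but not the working volumes.

```python
def load_volume_ul(target_ug: float, concentration_mg_per_ml: float) -> float:
    """Volume of broth to load for a target protein mass.

    1 mg/ml is numerically equal to 1 ug/ul, so ug divided by mg/ml gives ul.
    """
    return target_ug / concentration_mg_per_ml


def antibody_volume_ul(final_volume_ml: float, dilution: float) -> float:
    """Antibody stock needed for a 1:dilution working solution."""
    return final_volume_ml * 1000.0 / dilution


if __name__ == "__main__":
    # Extremes of the ranges given in the text.
    print(f"10 ug at 0.3 mg/ml: {load_volume_ul(10, 0.3):.1f} ul")
    print(f"20 ug at 3.0 mg/ml: {load_volume_ul(20, 3.0):.1f} ul")
    # Hypothetical 10 ml incubation volume.
    print(f"primary at 1:500:    {antibody_volume_ul(10, 500):.0f} ul")
    print(f"secondary at 1:2000: {antibody_volume_ul(10, 2000):.0f} ul")
```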
A panel of peptide-specific antibodies generated against W-14 peptides was used to characterize the protein composition of broths from nine non-W-14 Photorhabdus strains using Western blot analysis. In addition, one monoclonal antibody (MAb-C5F2), which recognizes TcbAiii protein in W-14-derived toxin complex, was used. The results (Table 41) showed cross recognition by the antibodies of some of the proteins in these broths. In some cases, the proteins that were recognized by the antibodies were the same size as the W-14 target peptides. In other cases, the proteins that were recognized by the antibodies were smaller than the W-14 target peptides. These data indicate that some of the non-W-14 Photorhabdus strains may produce proteins similar to those of the W-14 strain. The difference in size could be due to deletion, protein processing, or degradation. Some of the strains did not contain protein(s) that could be recognized by some antibodies; however, it is possible that the concentration is significantly lower than that observed for W-14 peptides. When compared for various toxin peptide homologs, these results showed peptide diversity among the Photorhabdus strains.
Table 41
Cross Recognition by Monoclonal Antibodies or Polyclonal Antibodies Generated Against W-14 Peptides to Protein(s) in Broths of Selected Non-W-14 Photorhabdus
Additional non-W-14 Photorhabdus strains were characterized by Western blot analysis using the culture broth and/or partially purified protein fractions as antigen. The panel of antibodies included MAb-C5F2, MAb-DE1 (recognizing TcdAii), PAb-DE2 (recognizing TcaB), PAb-TcbAii-syn, PAb-TcaC-syn, PAb-TcaBii-syn, PAb-TcbAiii-syn, and PAb-TcaBi-syn. These antibodies showed cross-reactivity with proteins in the broth and in the partially purified fractions of non-W-14 strains.
The data indicate that antibodies could be used to identify proteins in the broth as well as in the partially purified protein fractions.
Table 42
Cross Recognition by Monoclonal Antibodies or Polyclonal Antibodies Generated Against W-14 Peptides to Protein(s) in Broths and/or Partially Purified Protein Fractions of Selected Non-W-14 Photorhabdus
Example 27
Bacterial Expression of the tcdA Coding Region
Engineering of the tcdA Gene for Bacterial Expression
The 5' and 3' ends of the tcdA coding region (SEQ ID NO:46) were modified to add useful cloning sites for inserting the segment into heterologous expression vectors. The ends were modified using unique primers in Polymerase Chain Reactions (PCR), performed essentially as described in Example 8. Primer sets, as described below, were used in conjunction with cosmid 21D2.4 as template to create products with the appropriately modified ends. The first primer set was used to modify the 5' end of the gene, to insert a unique Nco I site at the initiator codon using the forward primer A0F1 (5' GAT CGA TCG ATC CAT GGC CAA CGA GTC TGT AAA AGA GAT ACC TGA TG TAT TAA AAA GCC AGT GTG 3') and to add unique Bgl II, Sal I and Not I sites to facilitate insertion of the remainder of the gene using the reverse primer A0R1 (5' GAT CGA TCG TAC GCG
GCC GCT CGA TCG ATC GTC GAC CCA TTG ATT TGA GAT CTG GGC GGC GGG TAT CCA GAT AAT AAA CGG AGT CAC 3').
Another PCR reaction was designed to modify the 3' end of the gene by adding an additional stop codon and convenient restriction sites for cloning. The forward primer A0F2 (5' ACT GGC TGC GTG GTC GAC TGG CGG CGA TTT ACT 3') was used to amplify across a unique Sal I site in the gene, later used to clone the modified 3' end. The reverse primer A0R2 (5' CGA TGC ATG CTG CGG CCG CAG GCC TTC CTC GAG TCA TTA TTT AAT GGT GTA GCG AAT ATG CAA AAT 3') was used to insert a second stop codon (TGA) and cloning sites Xho I, Stu I and Not I. Bacterial expression vector pET27b (Novagen, Madison, WI) was modified to delete the Bgl II site at position 446, according to standard molecular biology techniques.
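The cloning sites engineered into the four primers can be checked with a short scan for the standard recognition sequences of the enzymes named above. This is a sketch for illustration rather than part of the described method. The primer strings are copied as printed, including the short "TG" group in A0F1, and the helper names are hypothetical. Because all six recognition sequences are palindromic, scanning one strand is sufficient.

```python
# Standard recognition sequences for the enzymes named in the text.
RECOGNITION_SITES = {
    "Nco I": "CCATGG",
    "Bgl II": "AGATCT",
    "Sal I": "GTCGAC",
    "Not I": "GCGGCCGC",
    "Xho I": "CTCGAG",
    "Stu I": "AGGCCT",
}

# Primers as printed in the text (5' to 3'); spaces are removed before use.
PRIMERS = {
    "A0F1": "GAT CGA TCG ATC CAT GGC CAA CGA GTC TGT AAA AGA GAT ACC TGA TG "
            "TAT TAA AAA GCC AGT GTG",
    "A0R1": "GAT CGA TCG TAC GCG GCC GCT CGA TCG ATC GTC GAC CCA TTG ATT TGA "
            "GAT CTG GGC GGC GGG TAT CCA GAT AAT AAA CGG AGT CAC",
    "A0F2": "ACT GGC TGC GTG GTC GAC TGG CGG CGA TTT ACT",
    "A0R2": "CGA TGC ATG CTG CGG CCG CAG GCC TTC CTC GAG TCA TTA TTT AAT GGT "
            "GTA GCG AAT ATG CAA AAT",
}


def clean(sequence: str) -> str:
    """Remove spacing and normalise case."""
    return sequence.replace(" ", "").upper()


def sites_in(sequence: str) -> set:
    """Names of enzymes whose recognition sequence occurs in the sequence."""
    seq = clean(sequence)
    return {name for name, site in RECOGNITION_SITES.items() if site in seq}


def reverse_complement(sequence: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(clean(sequence)))


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, sorted(sites_in(primer)))
    # On the coding strand, A0R2 reads ...TAA TGA...: two stop codons in tandem.
    print("TAATGA" in reverse_complement(PRIMERS["A0R2"]))
```

The scan agrees with the description: A0F1 carries the Nco I site, A0R1 carries Bgl II, Sal I and Not I, A0F2 spans the Sal I site, and A0R2 carries Xho I, Stu I and Not I with a TGA stop codon following a TAA on the coding strand.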
The 497 bp PCR product from the first amplification reaction (AOF1+AOR1), which modified the 5' end of the gene, was ligated to the modified pET27b vector according to the supplier's instructions. The DNA sequences of the amplified portion of three isolates were determined using the supplier's recommended primers and the sequencing methods described previously. The sequence of all isolates was the same.
One isolate was then used as a cloning vector to insert the middle portion of the tcdA gene on a 6341 bp Bgl II to Sal I fragment. The resulting clone was called MC4 and contained all but the 3'-most portion of the tcdA coding sequence. Finally, to complete the full-length coding region, the 832 bp PCR product from the second PCR amplification (AOF2+AOR2), which modified the 3' end of the gene, was ligated to isolate MC4 on a Sal I to Not I fragment, according to standard molecular biology techniques. The tcdA coding region was sequenced and found to be complete; the resulting plasmid is called pDAB2035.
Construction of Plasmids pDAB2036, pDAB2037 and pDAB2038 for Bacterial Expression of tcdA
The tcdA coding region was cut from plasmid pDAB2035 with restriction enzymes Nco I and Xho I and gel purified. The fragment was ligated into the Nco I and Xho I sites of the expression vector pET15 to create plasmid pDAB2036. Additionally, pDAB2035 was cut with Nco I and Not I to release the tcdA coding region, which was ligated into the Nco I and Not I sites of the expression vector pET28b to create plasmid pDAB2037. Finally, plasmid pDAB2035 was cut with Nco I and Stu I to release the tcdA coding region. This fragment was ligated into the expression vector Trc99a, which was cut with Hind III followed by treatment with T4 DNA polymerase to blunt the ends. The vector was then cut with Nco I and ligated with the Nco I/Stu I cut tcdA fragment. The resulting plasmid is called pDAB2038.
Expression of tcdA from Plasmid pDAB2038
Plasmid pDAB2038 was transformed into BL21 cells and expressed as described above for plasmid pDAB2033 in Example 19.
Purification of tcdA from E. coli
The expression culture was centrifuged at 10,300 x g for 30 min and the supernatant was collected. It was diluted with two volumes of H2O and applied at a flow rate of 7.5 ml/min to a Poros 50 HQ (PerSeptive Biosystems, MA) column (1.6 cm x 10 cm) which was pre-equilibrated with 10 mM sodium phosphate buffer, pH 7.0 (Buffer A). The column was washed with Buffer A until the optical density at 280 nm returned to baseline level. The proteins bound to the column were then eluted with 1 M NaCl in Buffer A.
The fraction was loaded in 20 ml aliquots onto a gel filtration column, Sepharose CL-4B (2.6 x 100 cm), which was equilibrated with Buffer A. The protein was eluted in Buffer A at a flow rate of 0.75 mL/min. Fractions with a retention time between 260 minutes and 460 minutes were pooled and applied at 1 mL/min to a Mono Q 5/5 column which was equilibrated with 20 mM Tris-HCl, pH 7.0 (Buffer B). The column was washed with Buffer B until the optical density at 280 nm returned to baseline level. The proteins bound to the column were eluted with a linear gradient of 0 to 1 M NaCl in Buffer B at 1 mL/min for 30 min. One milliliter fractions were collected, serially diluted, and subjected to SCR bioassay. Fractions eluted between 0.1 and 0.3 M NaCl were found to have the highest insecticidal activity. Western analysis of the active fractions using PAb-TcdAii-syn antibody and PAb-TcdAiii-syn antibody indicated the presence of peptides TcdAii and TcdAiii.
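The chromatography figures above can be turned into elution volumes and fraction numbers with a little arithmetic. The sketch below idealizes the run: it assumes no dead volume or gradient delay, that fraction 1 begins when the gradient starts, and that each fraction is assigned the salt concentration at its midpoint. None of these assumptions is stated in the text, so the fraction numbers are indicative only.

```python
def elution_volume_ml(retention_min: float, flow_ml_per_min: float) -> float:
    """Volume that has passed through the column at a given retention time."""
    return retention_min * flow_ml_per_min


def nacl_molarity(minute: float, gradient_min: float = 30.0,
                  start_m: float = 0.0, end_m: float = 1.0) -> float:
    """NaCl concentration of an ideal linear gradient at a given time."""
    fraction_done = min(max(minute / gradient_min, 0.0), 1.0)
    return start_m + fraction_done * (end_m - start_m)


def fractions_in_window(low_m: float, high_m: float, n_fractions: int = 30,
                        flow_ml_per_min: float = 1.0,
                        fraction_ml: float = 1.0) -> list:
    """1-based fraction numbers whose midpoint falls inside the NaCl window."""
    selected = []
    for k in range(1, n_fractions + 1):
        midpoint_min = (k - 0.5) * fraction_ml / flow_ml_per_min
        if low_m <= nacl_molarity(midpoint_min) <= high_m:
            selected.append(k)
    return selected


if __name__ == "__main__":
    # Gel filtration pool: 260 to 460 min at 0.75 mL/min.
    print(elution_volume_ml(260, 0.75), elution_volume_ml(460, 0.75))
    # Mono Q fractions expected between 0.1 and 0.3 M NaCl.
    print(fractions_in_window(0.1, 0.3))
```

Under these assumptions the gel filtration pool spans roughly 195 to 345 mL of eluate, and the most active Mono Q material corresponds to about six of the thirty 1 mL fractions.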
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: Ensign, Jerald C
Bowen, David J
Petell, James
Fatig, Raymond
Schoonover, Sue
ffrench-Constant, Richard
Orr, Gregory L
Merlo, Donald J
Roberts, Jean L
Rocheleau, Thomas A
(ii) TITLE OF INVENTION: Insecticidal Protein Toxins from Photorhabdus
(iii) NUMBER OF SEQUENCES: 88
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: DowElanco
(B) STREET: 9330 Zionsville Road
(C) CITY: Indianapolis
(D) STATE: IN
(E) COUNTRY: US
(F) ZIP: 46268
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/063,615
(B) FILING DATE: 18-MAY-1993
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/395,497
(B) FILING DATE: 28-FEB-1995
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 60/007,255
(B) FILING DATE: 06-NOV-1995
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/608,423
(B) FILING DATE: 28-FEB-1996
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/705,484
(B) FILING DATE: 28-AUG-1996
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/743,699
(B) FILING DATE: 06-NOV-1996
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Borucki, Andrea T.
(B) REGISTRATION NUMBER: 33651
(C) REFERENCE/DOCKET NUMBER: 50301E
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 317-337-4846
(B) TELEFAX: 317-337-4847
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1 (TcbAii N-terminus):
Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn
1 5 10
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2 (TcaC N-terminus):
Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Trp
1 5 10
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3 (TcaBi N-terminus):
Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp Ala
1 5 10 15
Leu Val Ala
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4 (TcaAiii N-terminus):
Ala Ser Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn
1 5 10
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5 (TcaBii N-terminus):
Ala Gly Asp Thr Ala Asn Ile Gly Asp
1 5
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Leu Gly Gly Ala Ala Thr Leu Leu Asp Leu Leu Leu Pro Gln Ile
1 5 10 15
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7 (TccB N-terminus):
Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu
1 5 10
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8 (TccA N-terminus):
Met Asn Leu Ala Ser Pro Leu Ile Ser
1 5
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
Met Ile Asn Leu Asp Ile Asn Glu Gln Asn Lys Ile Met Val Val Ser
1 5 10 15
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Ala Ala Lys Asp Val Lys Phe Gly Ser Asp Ala Arg Val Lys Met Leu
1 5 10 15
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..7515
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11 (tcbA gene):
ATG CAA AAC TCA TTA TCA AGC ACT ATC GAT ACT ATT TGT CAG AAA CTG 48 Met Gin Asn Ser Leu Ser Ser Thr He Asp Thr He Cys Gin Lys Leu 1 5 10 15
CAA TTA ACT TGT CCG GCG GAA ATT GCT TTG TAT CCC TTT GAT ACT TTC 96 Gin Leu Thr Cys Pro Ala Glu He Ala Leu Tyr Pro Phe Asp Thr Phe 20 25 30
CGG GAA AAA ACT CGG GGA ATG GTT AAT TGG GGG GAA GCA AAA CGG ATT 144 Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg He 35 40 45 TAT GAA ATT GCA CAA GCG GAA CAG GAT AGA AAC CTA CTT CAT GAA AAA 192 Tyr Glu He Ala Gin Ala Glu Gin Asp Arg Asn Leu Leu His Glu Lys 50 55 60
CGT ATT TTT GCC TAT GCT AAT CCG CTG CTG AAA AAC GCT GTT CGG TTG 240 Arg He Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu 65 70 75 80
GGT ACC CGG CAA ATG TTG GGT TTT ATA CAA GGT TAT AGT GAT CTG TTT 288 Gly Thr Arg Gin Met Leu Gly Phe He Gin Gly Tyr Ser Asp Leu Phe 85 90 95
GGT AAT CGT GCT GAT AAC TAT GCC GCG CCG GGC TCG GTT GCA TCG ATG 336
Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met
100 105 110
TTC TCA CCG GCG GCT TAT TTG ACG GAA TTG TAC CGT GAA GCC AAA AAC 384
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn 115 120 125 TTG CAT GAC AGC AGC TCA ATT TAT TAC CTA GAT AAA CGT CGC CCG GAT 432 Leu His Asp Ser Ser Ser He Tyr Tyr Leu Asp Lys Arg Arg Pro Asp 130 135 140
TTA GCA AGC TTA ATG CTC AGC CAG AAA AAT ATG GAT GAG GAA ATT TCA 480 Leu Ala Ser Leu Met Leu Ser Gin Lys Asn Met Asp Glu Glu He Ser 145 150 155 160
ACG CTG GCT CTC TCT AAT GAA TTG TGC CTT GCC GGG ATC GAA ACA AAA 528 Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly He Glu Thr Lys 165 170 175
ACA GGA AAA TCA CAA GAT GAA GTG ATG GAT ATG TTG TCA ACT TAT CGT 576
Thr Gly Lys Ser Gin Asp Glu Val Met Asp Met Leu Ser Thr Tyr Arg 180 185 190
TTA AGT GGA GAG ACA CCT TAT CAT CAC GCT TAT GAA ACT GTT CGT GAA 624
Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu 195 200 205 ATC GTT CAT GAA CGT GAT CCA GGA TTT CGT CAT TTG TCA CAG GCA CCC 672 He Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gin Ala Pro 210 215 220
ATT GTT GCT GCT AAG CTC GAT CCT GTG ACT TTG TTG GGT ATT AGC TCC 720 He Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly He Ser Ser 225 230 235 240
CAT ATT TCG CCA GAA CTG TAT AAC TTG CTG ATT GAG GAG ATC CCG GAA 768 His He Ser Pro Glu Leu Tyr Asn Leu Leu He Glu Glu He Pro Glu 245 250 255
AAA GAT GAA GCC GCG CTT GAT ACG CTT TAT AAA ACA AAC TTT GGC GAT 816 Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp 260 265 270
ATT ACT ACT GCT CAG TTA ATG TCC CCA AGT TAT CTG GCC CGG TAT TAT 864
He Thr Thr Ala Gin Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr
275 280 285
GGC GTC TCA CCG GAA GAT ATT GCC TAC GTG ACG ACT TCA TTA TCA CAT 912 Gly Val Ser Pro Glu Asp He Ala Tyr Val Thr Thr Ser Leu Ser His 290 295 300
GTT GGA TAT AGC AGT GAT ATT CTG GTT ATT CCG TTG GTC GAT GGT GTG 960
Val Gly Tyr Ser Ser Asp He Leu Val He Pro Leu Val Asp Gly Val 305 310 315 320
GGT AAG ATG GAA GTA GTT CGT GTT ACC CGA ACA CCA TCG GAT AAT TAT 1008
Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr
325 330 335
ACC AGT CAG ACG AAT TAT ATT GAG CTG TAT CCA CAG GGT GGC GAC AAT 1056
Thr Ser Gin Thr Asn Tyr He Glu Leu Tyr Pro Gin Gly Gly Asp Asn
340 345 350 TAT TTG ATC AAA TAC AAT CTA AGC AAT AGT TTT GGT TTG GAT GAT TTT 1104
Tyr Leu He Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe
355 360 365
TAT CTG CAA TAT AAA GAT GGT TCC GCT GAT TGG ACT GAG ATT GCC CAT 1152 Tyr Leu Gin Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu He Ala His 370 375 380
AAT CCC TAT CCT GAT ATG GTC ATA AAT CAA AAG TAT GAA TCA CAG GCG 1200
Asn Pro Tyr Pro Asp Met Val He Asn Gin Lys Tyr Glu Ser Gin Ala 385 390 395 400
ACA ATC AAA CGT AGT GAC TCT GAC AAT ATA CTC AGT ATA GGG TTA CAA 1248
Thr He Lys Arg Ser Asp Ser Asp Asn He Leu Ser He Gly Leu Gin
405 410 415
AGA TGG CAT AGC GGT AGT TAT AAT TTT GCC GCC GCC AAT TTT AAA ATT 1296
Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys He
420 425 430 GAC CAA TAC TCC CCG AAA GCT TTC CTG CTT AAA ATG AAT AAG GCT ATT 1344
Asp Gin Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala He 435 440 445
CGG TTG CTC AAA GCT ACC GGC CTC TCT TTT GCT ACG TTG GAG CGT ATT 1392 Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg He
450 455 460
GTT GAT AGT GTT AAT AGC ACC AAA TCC ATC ACG GTT GAG GTA TTA AAC 1440
Val Asp Ser Val Asn Ser Thr Lys Ser He Thr Val Glu Val Leu Asn 465 470 475 480
AAG GTT TAT CGG GTA AAA TTC TAT ATT GAT CGT TAT GGC ATC AGT GAA 1488
Lys Val Tyr Arg Val Lys Phe Tyr He Asp Arg Tyr Gly He Ser Glu 485 490 495
GAG ACA GCC GCT ATT TTG GCT AAT ATT AAT ATC TCT CAG CAA GCT GTT 1536 Glu Thr Ala Ala He Leu Ala Asn He Asn He Ser Gin Gin Ala Val 500 505 510 GGC AAT CAG CTT AGC CAG TTT GAG CAA CTA TTT AAT CAC CCG CCG CTC 1584 Gly Asn Gin Leu Ser Gin Phe Glu Gin Leu Phe Asn His Pro Pro Leu 515 520 525
AAT GGT ATT CGC TAT GAA ATC AGT GAG GAC AAC TCC AAA CAT CTT CCT 1632 Asn Gly He Arg Tyr Glu He Ser Glu Asp Asn Ser Lys His Leu Pro 530 535 540
AAT CCT GAT CTG AAC CTT AAA CCA GAC AGT ACC GGT GAT GAT CAA CGC 1680 Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gin Arg 545 550 555 560
AAG GCG GTT TTA AAA CGC GCG TTT CAG GTT AAC GCC AGT GAG TTG TAT 1728
Lys Ala Val Leu Lys Arg Ala Phe Gln Val Asn Ala Ser Glu Leu Tyr
565 570 575
CAG ATG TTA TTG ATC ACT GAT CGT AAA GAA GAC GGT GTT ATC AAA AAT 1776
Gln Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile Lys Asn
580 585 590
AAC TTA GAG AAT TTG TCT GAT CTG TAT TTG GTT AGT TTG CTG GCC CAG 1824 Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gin 595 600 605
ATT CAT AAC CTG ACT ATT GCT GAA TTG AAC ATT TTG TTG GTG ATT TGT 1872 He His Asn Leu Thr He Ala Glu Leu Asn He Leu Leu Val He Cys 610 615 620
GGC TAT GGC GAC ACC AAC ATT TAT CAG ATT ACC GAC GAT AAT TTA GCC 1920 Gly Tyr Gly Asp Thr Asn He Tyr Gin He Thr Asp Asp Asn Leu Ala 625 630 635 640
AAA ATA GTG GAA ACA TTG TTG TGG ATC ACT CAA TGG TTG AAG ACC CAA 1968 Lys He Val Glu Thr Leu Leu Trp He Thr Gin Trp Leu Lys Thr Gin 645 650 655 AAA TGG ACA GTT ACC GAC CTG TTT CTG ATG ACC ACG GCC ACT TAC AGC 2016 Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser 660 665 670
ACC ACT TTA ACG CCA GAA ATT AGC AAT CTG ACG GCT ACG TTG TCT TCA 2064 Thr Thr Leu Thr Pro Glu He Ser Asn Leu Thr Ala Thr Leu Ser Ser 675 680 685
ACT TTG CAT GGC AAA GAG AGT CTG ATT GGG GAA GAT CTG AAA AGA GCA 2112 Thr Leu His Gly Lys Glu Ser Leu He Gly Glu Asp Leu Lys Arg Ala 690 695 700
ATG GCG CCT TGC TTC ACT TCG GCT TTG CAT TTG ACT TCT CAA GAA GTT 2160
Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gin Glu Val
705 710 715 720
GCG TAT GAC CTG CTG TTG TGG ATA GAC CAG ATT CAA CCG GCA CAA ATA 2208
Ala Tyr Asp Leu Leu Leu Trp He Asp Gin He Gin Pro Ala Gin He 725 730 735 ACT GTT GAT GGG TTT TGG GAA GAA GTG CAA ACA ACA CCA ACC AGC TTG 2256 Thr Val Asp Gly Phe Trp Glu Glu Val Gin Thr Thr Pro Thr Ser Leu 740 745 750
AAG GTG ATT ACC TTT GCT CAG GTG CTG GCA CAA TTG AGC CTG ATC TAT 2304 Lys Val He Thr Phe Ala Gin Val Leu Ala Gin Leu Ser Leu He Tyr 755 760 765
CGT CGT ATT GGG TTA AGT GAA ACG GAA CTG TCA CTG ATC GTG ACT CAA 2352 Arg Arg He Gly Leu Ser Glu Thr Glu Leu Ser Leu He Val Thr Gin 770 775 780
TCT TCT CTG CTA GTG GCA GGC AAA AGC ATA CTG GAT CAC GGT CTG TTA 2400 Ser Ser Leu Leu Val Ala Gly Lys Ser He Leu Asp His Gly Leu Leu 785 790 795 800
ACC CTG ATG GCC TTG GAA GGT TTT CAT ACC TGG GTT AAT GGC TTG GGG 2448 Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly 805 810 815 CAA CAT GCC TCC TTG ATA TTG GCG GCG TTG AAA GAC GGA GCC TTG ACA 2496 Gin His Ala Ser Leu He Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr 820 825 830
GTT ACC GAT GTA GCA CAA GCT ATG AAT AAG GAG GAA TCT CTC CTA CAA 2544 Val Thr Asp Val Ala Gin Ala Met Asn Lys Glu Glu Ser Leu Leu Gin 835 840 845
ATG GCA GCT AAT CAG GTG GAG AAG GAT CTA ACA AAA CTG ACC AGT TGG 2592 Met Ala Ala Asn Gin Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp 850 855 860
ACA CAG ATT GAC GCT ATT CTG CAA TGG TTA CAG ATG TCT TCG GCC TTG 2640
Thr Gln Ile Asp Ala Ile Leu Gln Trp Leu Gln Met Ser Ser Ala Leu
865 870 875 880
GCG GTT TCT CCA CTG GAT CTG GCA GGG ATG ATG GCC CTG AAA TAT GGG 2688
Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly
885 890 895
ATA GAT CAT AAC TAT GCT GCC TGG CAA GCT GCG GCG GCT GCG CTG ATG 2736 He Asp His Asn Tyr Ala Ala Trp Gin Ala Ala Ala Ala Ala Leu Met 900 905 910
GCT GAT CAT GCT AAT CAG GCA CAG AAA AAA CTG GAT GAG ACG TTC AGT 2784 Ala Asp His Ala Asn Gin Ala Gin Lys Lys Leu Asp Glu Thr Phe Ser 915 920 925
AAG GCA TTA TGT AAC TAT TAT ATT AAT GCT GTT GTC GAT AGT GCT GCT 2832 Lys Ala Leu Cys Asn Tyr Tyr He Asn Ala Val Val Asp Ser Ala Ala 930 935 940
GGA GTA CGT GAT CGT AAC GGT TTA TAT ACC TAT TTG CTG ATT GAT AAT 2880 Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu He Asp Asn 945 950 955 960 CAG GTT TCT GCC GAT GTG ATC ACT TCA CGT ATT GCA GAA GCT ATC GCC 2928 Gin Val Ser Ala Asp Val He Thr Ser Arg He Ala Glu Ala He Ala 965 970 975
GGT ATT CAA CTG TAC GTT AAC CGG GCT TTA AAC CGA GAT GAA GGT CAG 2976 Gly He Gin Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gin 980 985 990
CTT GCA TCG GAC GTT AGT ACC CGT CAG TTC TTC ACT GAC TGG GAA CGT 3024 Leu Ala Ser Asp Val Ser Thr Arg Gin Phe Phe Thr Asp Trp Glu Arg 995 1000 1005
TAC AAT AAA CGT TAC AGT ACT TGG GCT GGT GTC TCT GAA CTG GTC TAT 3072 Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr 1010 1015 1020
TAT CCA GAA AAC TAT GTT GAT CCC ACT CAG CGC ATT GGG CAA ACC AAA 3120 Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gin Arg He Gly Gin Thr Lys 1025 1030 1035 1040 ATG ATG GAT GCG CTG TTG CAA TCC ATC AAC CAG AGC CAG CTA AAT GCG 3168 Met Met Asp Ala Leu Leu Gin Ser He Asn Gin Ser Gin Leu Asn Ala 1045 1050 1055
GAT ACG GTG GAA GAT GCT TTC AAA ACT TAT TTG ACC AGC TTT GAG CAG 3216 Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gin 1060 1065 1070
GTA GCA AAT CTG AAA GTA ATT AGT GCT TAC CAC GAT AAT GTG AAT GTG 3264 Val Ala Asn Leu Lys Val He Ser Ala Tyr His Asp Asn Val Asn Val 1075 1080 1085
GAT CAA GGA TTA ACT TAT TTT ATC GGT ATC GAC CAA GCA GCT CCG GGT 3312 Asp Gin Gly Leu Thr Tyr Phe He Gly He Asp Gin Ala Ala Pro Gly 1090 1095 1100
ACG TAT TAC TGG CGT AGT GTT GAT CAC AGC AAA TGT GAA AAT GGC AAG 3360 Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys 1105 1110 1115 1120 TTT GCC GCT AAT GCT TGG GGT GAG TGG AAT AAA ATT ACC TGT GCT GTC 3408 Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys He Thr Cys Ala Val
1125 1130 1135
AAT CCT TGG AAA AAT ATC ATC CGT CCG GTT GTT TAT ATG TCC CGC TTA 3456 Asn Pro Trp Lys Asn He He Arg Pro Val Val Tyr Met Ser Arg Leu 1140 1145 1150
TAT CTG CTA TGG CTG GAG CAG CAA TCA AAG AAA AGT GAT GAT GGT AAA 3504 Tyr Leu Leu Trp Leu Glu Gin Gin Ser Lys Lys Ser Asp Asp Gly Lys 1155 1160 1165
ACC ACG ATT TAT CAA TAT AAC TTA AAA CTG GCT CAT ATT CGT TAC GAC 3552 Thr Thr He Tyr Gin Tyr Asn Leu Lys Leu Ala His He Arg Tyr Asp 1170 1175 1180 GGT AGT TGG AAT ACA CCA TTT ACT TTT GAT GTG ACA GAA AAG GTA AAA 3600 Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys 1185 1190 1195 1200
AAT TAC ACG TCG AGT ACT GAT GCT GCT GAA TCT TTA GGG TTG TAT TGT 3648 Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys
1205 1210 1215
ACT GGT TAT CAA GGG GAA GAC ACT CTA TTA GTT ATG TTC TAT TCG ATG 3696 Thr Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met 1220 1225 1230
CAG AGT AGT TAT AGC TCC TAT ACC GAT AAT AAT GCG CCG GTC ACT GGG 3744
Gin Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly 1235 1240 1245
CTA TAT ATT TTC GCT GAT ATG TCA TCA GAC AAT ATG ACG AAT GCA CAA 3792
Leu Tyr He Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gin 1250 1255 1260 GCA ACT AAC TAT TGG AAT AAC AGT TAT CCG CAA TTT GAT ACT GTG ATG 3840 Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gin Phe Asp Thr Val Met 1265 1270 1275 1280
GCA GAT CCG GAT AGC GAC AAT AAA AAA GTC ATA ACC AGA AGA GTT AAT 3888 Ala Asp Pro Asp Ser Asp Asn Lys Lys Val He Thr Arg Arg Val Asn
1285 1290 1295
AAC CGT TAT GCG GAG GAT TAT GAA ATT CCT TCC TCT GTG ACA AGT AAC 3936 Asn Arg Tyr Ala Glu Asp Tyr Glu He Pro Ser Ser Val Thr Ser Asn 1300 1305 1310
AGT AAT TAT TCT TGG GGT GAT CAC AGT TTA ACC ATG CTT TAT GGT GGT 3984 Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly 1315 1320 1325
AGT GTT CCT AAT ATT ACT TTT GAA TCG GCG GCA GAA GAT TTA AGG CTA 4032 Ser Val Pro Asn He Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu 1330 1335 1340 TCT ACC AAT ATG GCA TTG AGT ATT ATT CAT AAT GGA TAT GCG GGA ACC 4080 Ser Thr Asn Met Ala Leu Ser He He His Asn Gly Tyr Ala Gly Thr 1345 1350 1355 1360
CGC CGT ATA CAA TGT AAT CTT ATG AAA CAA TAC GCT TCA TTA GGT GAT 4128 Arg Arg He Gin Cys Asn Leu Met Lys Gin Tyr Ala Ser Leu Gly Asp
1365 1370 1375
AAA TTT ATA ATT TAT GAT TCA TCA TTT GAT GAT GCA AAC CGT TTT AAT 4176 Lys Phe He He Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn 1380 1385 1390
CTG GTG CCA TTG TTT AAA TTC GGA AAA GAC GAG AAC TCA GAT GAT AGT 4224 Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser 1395 1400 1405
ATT TGT ATA TAT AAT GAA AAC CCT TCC TCT GAA GAT AAG AAG TGG TAT 4272
He Cys He Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr 1410 1415 1420
TTT TCT TCG AAA GAT GAC AAT AAA ACA GCG GAT TAT AAT GGT GGA ACT 4320 Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr 1425 1430 1435 1440
CAA TGT ATA GAT GCT GGA ACC AGT AAC AAA GAT TTT TAT TAT AAT CTC 4368 Gin Cys He Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu 1445 1450 1455
CAG GAG ATT GAA GTA ATT AGT GTT ACT GGT GGG TAT TGG TCG AGT TAT 4416 Gin Glu He Glu Val He Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr 1460 1465 1470
AAA ATA TCC AAC CCG ATT AAT ATC AAT ACG GGC ATT GAT AGT GCT AAA 4464 Lys He Ser Asn Pro He Asn He Asn Thr Gly He Asp Ser Ala Lys 1475 1480 1485 GTA AAA GTC ACC GTA AAA GCG GGT GGT GAC GAT CAA ATC TTT ACT GCT 4512 Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gin He Phe Thr Ala 1490 1495 1500
GAT AAT AGT ACC TAT GTT CCT CAG CAA CCG GCA CCC AGT TTT GAG GAG 4560 Asp Asn Ser Thr Tyr Val Pro Gin Gin Pro Ala Pro Ser Phe Glu Glu 1505 1510 1515 1520
ATG ATT TAT CAG TTC AAT AAC CTG ACA ATA GAT TGT AAG AAT TTA AAT 4608 Met He Tyr Gin Phe Asn Asn Leu Thr He Asp Cys Lys Asn Leu Asn 1525 1530 1535
TTC ATC GAC AAT CAG GCA CAT ATT GAG ATT GAT TTC ACC GCT ACG GCA 4656 Phe Ile Asp Asn Gln Ala His Ile Glu Ile Asp Phe Thr Ala Thr Ala 1540 1545 1550
CAA GAT GGC CGA TTC TTG GGT GCA GAA ACT TTT ATT ATC CCG GTA ACT 4704 Gin Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe He He Pro Val Thr 1555 1560 1565 AAA AAA GTT CTC GGT ACT GAG AAC GTG ATT GCG TTA TAT AGC GAA AAT 4752 Lys Lys Val Leu Gly Thr Glu Asn Val He Ala Leu Tyr Ser Glu Asn 1570 1575 1580
AAC GGT GTT CAA TAT ATG CAA ATT GGC GCA TAT CGT ACC CGT TTG AAT 4800 Asn Gly Val Gin Tyr Met Gin He Gly Ala Tyr Arg Thr Arg Leu Asn 1585 1590 1595 1600
ACG TTA TTC GCT CAA CAG TTG GTT AGC CGT GCT AAT CGT GGC ATT GAT 4848 Thr Leu Phe Ala Gln Gln Leu Val Ser Arg Ala Asn Arg Gly Ile Asp 1605 1610 1615
GCA GTG CTC AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAA TTA GGA 4896 Ala Val Leu Ser Met Glu Thr Gin Asn He Gin Glu Pro Gin Leu Gly 1620 1625 1630
GCG GGC ACA TAT GTG CAG CTT GTG TTG GAT AAA TAT GAT GAG TCT ATT 4944 Ala Gly Thr Tyr Val Gin Leu Val Leu Asp Lys Tyr Asp Glu Ser He 1635 1640 1645 CAT GGC ACT AAT AAA AGC TTT GCT ATT GAA TAT GTT GAT ATA TTT AAA 4992 His Gly Thr Asn Lys Ser Phe Ala He Glu Tyr Val Asp He Phe Lys 1650 1655 1660
GAG AAC GAT AGT TTT GTG ATT TAT CAA GGA GAA CTT AGC GAA ACA AGT 5040 Glu Asn Asp Ser Phe Val He Tyr Gin Gly Glu Leu Ser Glu Thr Ser 1665 1670 1675 1680
CAA ACT GTT GTG AAA GTT TTC TTA TCC TAT TTT ATA GAG GCG ACT GGA 5088 Gin Thr Val Val Lys Val Phe Leu Ser Tyr Phe He Glu Ala Thr Gly 1685 1690 1695
AAT AAG AAC CAC TTA TGG GTA CGT GCT AAA TAC CAA AAG GAA ACG ACT 5136 Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gln Lys Glu Thr Thr 1700 1705 1710
GAT AAG ATC TTG TTC GAC CGT ACT GAT GAG AAA GAT CCG CAC GGT TGG 5184 Asp Lys Ile Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp 1715 1720 1725
TTT CTC AGC GAC GAT CAC AAG ACC TTT AGT GGT CTC TCT TCC GCA CAG 5232 Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gin 1730 1735 1740
GCA TTA AAG AAC GAC AGT GAA CCG ATG GAT TTC TCT GGC GCC AAT GCT 5280 Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala 1745 1750 1755 1760
CTC TAT TTC TGG GAA CTG TTC TAT TAC ACG CCG ATG ATG ATG GCT CAT 5328 Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His 1765 1770 1775
CGT TTG TTG CAG GAA CAG AAT TTT GAT GCG GCG AAC CAT TGG TTC CGT 5376
Arg Leu Leu Gin Glu Gin Asn Phe Asp Ala Ala Asn His Trp Phe Arg 1780 1785 1790 TAT GTC TGG AGT CCA TCC GGT TAT ATC GTT GAT GGT AAA ATT GCT ATC 5424
Tyr Val Trp Ser Pro Ser Gly Tyr He Val Asp Gly Lys He Ala He 1795 1800 1805
TAC CAC TGG AAC GTG CGA CCG CTG GAA GAA GAC ACC AGT TGG AAT GCA 5472 Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala 1810 1815 1820
CAA CAA CTG GAC TCC ACC GAT CCA GAT GCT GTA GCC CAA GAT GAT CCG 5520
Gin Gin Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gin Asp Asp Pro 1825 1830 1835 1840
ATG CAC TAC AAG GTG GCT ACC TTT ATG GCG ACG TTG GAT CTG CTA ATG 5568
Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met
1845 1850 1855
GCC CGT GGT GAT GCT GCT TAC CGC CAG TTA GAG CGT GAT ACG TTG GCT 5616 Ala Arg Gly Asp Ala Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Ala 1860 1865 1870 GAA GCT AAA ATG TGG TAT ACA CAG GCG CTT AAT CTG TTG GGT GAT GAG 5664 Glu Ala Lys Met Trp Tyr Thr Gin Ala Leu Asn Leu Leu Gly Asp Glu 1875 1880 1885
CCA CAA GTG ATG CTG AGT ACG ACT TGG GCT AAT CCA ACA TTG GGT AAT 5712 Pro Gin Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn 1890 1895 1900
GCT GCT TCA AAA ACC ACA CAG CAG GTT CGT CAG CAA GTG CTT ACC CAG 5760 Ala Ala Ser Lys Thr Thr Gin Gin Val Arg Gin Gin Val Leu Thr Gin 1905 1910 1915 1920
TTG CGT CTC AAT AGC AGG GTA AAA ACC CCG TTG CTA GGA ACA GCC AAT 5808 Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn 1925 1930 1935
TCC CTG ACC GCT TTA TTC CTG CCG CAG GAA AAT AGC AAG CTC AAA GGC 5856 Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn Ser Lys Leu Lys Gly 1940 1945 1950 TAC TGG CGG ACA CTG GCG CAG CGT ATG TTT AAT TTA CGT CAT AAT CTG 5904 Tyr Trp Arg Thr Leu Ala Gin Arg Met Phe Asn Leu Arg His Asn Leu 1955 1960 1965
TCG ATT GAC GGC CAG CCG CTC TCC TTG CCG CTG TAT GCT AAA CCG GCT 5952 Ser He Asp Gly Gin Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala 1970 1975 1980
GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA GCT TCT CAA GGG GGA 6000 Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gin Gly Gly 1985 1990 1995 2000
GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC CGC TTC CCT CAA ATG 6048 Ala Asp Leu Pro Lys Ala Pro Leu Thr He His Arg Phe Pro Gin Met 2005 2010 2015 CTA GAA GGG GCA CGG GGC TTG GTT AAC CAG CTT ATA CAG TTC GGT AGT 6096 Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu He Gin Phe Gly Ser 2020 2025 2030
TCA CTA TTG GGG TAC AGT GAG CGT CAG GAT GCG GAA GCT ATG AGT CAA 6144 Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala Glu Ala Met Ser Gin 2035 2040 2045
CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG ACC AGT ATT CGT ATG 6192 Leu Leu Gin Thr Gin Ala Ser Glu Leu He Leu Thr Ser He Arg Met 2050 2055 2060
CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA AAA ACC GCC TTG CAA 6240 Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gin 2065 2070 2075 2080
GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC AGC TAT AGC CAA CTG 6288 Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp Ser Tyr Ser Gin Leu 2085 2090 2095 TAT GAG GAG AAC ATC AAC GCA GGT GAG CAG CGA GCG CTG GCG TTA CGC 6336 Tyr Glu Glu Asn He Asn Ala Gly Glu Gin Arg Ala Leu Ala Leu Arg 2100 2105 2110
TCA GAA TCT GCT ATT GAG TCT CAG GGA GCG CAG ATT TCC CGT ATG GCA 6384 Ser Glu Ser Ala He Glu Ser Gin Gly Ala Gin He Ser Arg Met Ala 2115 2120 2125
GGC GCG GGT GTT GAT ATG GCA CCA AAT ATC TTC GGC CTG GCT GAT GGC 6432 Gly Ala Gly Val Asp Met Ala Pro Asn He Phe Gly Leu Ala Asp Gly 2130 2135 2140
GGC ATG CAT TAT GGT GCT ATT GCC TAT GCC ATC GCT GAC GGT ATT GAG 6480 Gly Met His Tyr Gly Ala He Ala Tyr Ala He Ala Asp Gly He Glu 2145 2150 2155 2160
TTG AGT GCT TCT GCC AAG ATG GTT GAT GCG GAG AAA GTT GCT CAG TCG 6528 Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gln Ser 2165 2170 2175
GAA ATA TAT CGC CGT CGC CGT CAA GAA TGG AAA ATT CAG CGT GAC AAC 6576 Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg Asp Asn 2180 2185 2190
GCA CAA GCG GAG ATT AAC CAG TTA AAC GCG CAA CTG GAA TCA CTG TCT 6624 Ala Gin Ala Glu He Asn Gin Leu Asn Ala Gin Leu Glu Ser Leu Ser 2195 2200 2205
ATT CGC CGT GAA GCC GCT GAA ATG CAA AAA GAG TAC CTG AAA ACC CAG 6672 He Arg Arg Glu Ala Ala Glu Met Gin Lys Glu Tyr Leu Lys Thr Gin 2210 2215 2220
CAA GCT CAG GCG CAG GCA CAA CTT ACT TTC TTA AGA AGC AAA TTC AGT 6720 Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu Arg Ser Lys Phe Ser 2225 2230 2235 2240
AAT CAA GCG TTA TAT AGT TGG TTA CGA GGG CGT TTG TCA GGT ATT TAT 6768 Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly He Tyr 2245 2250 2255 TTC CAG TTC TAT GAC TTG GCC GTA TCA CGT TGC CTG ATG GCA GAG CAA 6816 Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gin
2260 2265 2270
TCC TAT CAA TGG GAA GCT AAT GAT AAT TCC ATT AGC TTT GTC AAA CCG 6864 Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser He Ser Phe Val Lys Pro 2275 2280 2285
GGT GCA TGG CAA GGA ACT TAC GCC GGC TTA TTG TGT GGA GAA GCT TTG 6912 Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 2290 2295 2300
ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT CTG AAA TGG GAA TCT 6960 He Gin Asn Leu Ala Gin Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser 2305 2310 2315 2320 CGC GCT TTG GAA GTA GAA CGC ACG GTT TCA TTG GCA GTG GTT TAT GAT 7008 Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp 2325 2330 2335
TCA CTG GAA GGT AAT GAT CGT TTT AAT TTA GCG GAA CAA ATA CCT GCA 7056 Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gin He Pro Ala 2340 2345 2350
TTA TTG GAT AAG GGG GAG GGA ACA GCA GGA ACT AAA GAA AAT GGG TTA 7104 Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu 2355 2360 2365
TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GTC AAA TTG TCC GAC TTG 7152
Ser Leu Ala Asn Ala He Leu Ser Ala Ser Val Lys Leu Ser Asp Leu 2370 2375 2380
AAA CTG GGA ACG GAT TAT CCA GAC AGT ATC GTT GGT AGC AAC AAG GTT 7200
Lys Leu Gly Thr Asp Tyr Pro Asp Ser He Val Gly Ser Asn Lys Val
2385 2390 2395 2400 CGT CGT ATT AAG CAA ATC AGT GTT TCG CTA CCT GCA TTG GTT GGG CCT 7248 Arg Arg He Lys Gin He Ser Val Ser Leu Pro Ala Leu Val Gly Pro 2405 2410 2415
TAT CAG GAT GTT CAG GCT ATG CTC AGC TAT GGT GGC AGT ACT CAA TTG 7296 Tyr Gin Asp Val Gin Ala Met Leu Ser Tyr Gly Gly Ser Thr Gin Leu 2420 2425 2430
CCG AAA GGT TGT TCA GCG TTG GCT GTG TCT CAT GGT ACC AAT GAT AGT 7344 Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser 2435 2440 2445
GGT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA TAC CTG CCA TTT GAA 7392 Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu
2450 2455 2460
GGT ATT GCT CTT GAT GAT CAG GGT ACA CTG AAT CTT CAA TTT CCG AAT 7440 Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn 2465 2470 2475 2480
GCT ACC GAC AAG CAG AAA GCA ATA TTG CAA ACT ATG AGC GAT ATT ATT 7488 Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile 2485 2490 2495
TTG CAT ATT CGT TAT ACC ATC CGT TAA 7515 Leu His Ile Arg Tyr Thr Ile Arg *
2500 2505
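The codon rows and residue rows of a listing such as SEQ ID NO: 11 can be cross-checked by translating each codon. The following is a minimal Python sketch and is not part of the original listing: the function name `translate_codons` is hypothetical, and the standard genetic code is assumed to apply to this bacterial gene. The input is the final row of nine codons printed above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third base varies fastest). Assumed to apply here.
BASES = "TCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), ONE_LETTER)}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}

def translate_codons(codon_row):
    """Translate a space-separated codon row into three-letter residues."""
    return " ".join(THREE_LETTER[CODON_TO_AA[c]] for c in codon_row.split())

# Final codons of SEQ ID NO: 11, copied from the listing.
print(translate_codons("TTG CAT ATT CGT TAT ACC ATC CGT TAA"))
```

A row whose translation disagrees with the printed residues would point to a transcription error in one of the two rows.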
(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2504 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12 (TcbA protein)
Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu 1 5 10 15
Gin Leu Thr Cys Pro Ala Glu He Ala Leu Tyr Pro Phe Asp Thr Phe 20 25 30
Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg He 35 40 45 Tyr Glu He Ala Gin Ala Glu Gin Asp Arg Asn Leu Leu His Glu Lys 50 55 60
Arg He Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu 65 70 75 80
Gly Thr Arg Gin Met Leu Gly Phe He Gin Gly Tyr Ser Asp Leu Phe 85 90 95
Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met 100 105 110
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn 115 120 125 Leu His Asp Ser Ser Ser He Tyr Tyr Leu Asp Lys Arg Arg Pro Asp 130 135 140
Leu Ala Ser Leu Met Leu Ser Gin Lys Asn Met Asp Glu Glu He Ser 145 150 155 160
Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly He Glu Thr Lys 165 170 175
Thr Gly Lys Ser Gin Asp Glu Val Met Asp Met Leu Ser Thr Tyr Arg 180 185 190
Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu 195 200 205 He Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gin Ala Pro 210 215 220
He Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly He Ser Ser
225 230 235 240
His He Ser Pro Glu Leu Tyr Asn Leu Leu He Glu Glu He Pro Glu
245 250 255
Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp 260 265 270
He Thr Thr Ala Gin Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr 275 280 285 Gly Val Ser Pro Glu Asp He Ala Tyr Val Thr Thr Ser Leu Ser His 290 295 300
Val Gly Tyr Ser Ser Asp He Leu Val He Pro Leu Val Asp Gly Val 305 310 315 320
Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr 325 330 335
Thr Ser Gin Thr Asn Tyr He Glu Leu Tyr Pro Gin Gly Gly Asp Asn 340 345 350
Tyr Leu He Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe 355 360 365 Tyr Leu Gin Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu He Ala His 370 375 380
Asn Pro Tyr Pro Asp Met Val He Asn Gin Lys Tyr Glu Ser Gin Ala 385 390 395 400
Thr He Lys Arg Ser Asp Ser Asp Asn He Leu Ser He Gly Leu Gin 405 410 415
Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys He 420 425 430
Asp Gin Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala He 435 440 445 Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg He 450 455 460
Val Asp Ser Val Asn Ser Thr Lys Ser He Thr Val Glu Val Leu Asn 465 470 475 480
Lys Val Tyr Arg Val Lys Phe Tyr He Asp Arg Tyr Gly He Ser Glu 485 490 495
Glu Thr Ala Ala He Leu Ala Asn He Asn He Ser Gin Gin Ala Val 500 505 510
Gly Asn Gin Leu Ser Gin Phe Glu Gin Leu Phe Asn His Pro Pro Leu 515 520 525 Asn Gly He Arg Tyr Glu He Ser Glu Asp Asn Ser Lys His Leu Pro 530 535 540
Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gin Arg
545 550 555 560
Lys Ala Val Leu Lys Arg Ala Phe Gin Val Asn Ala Ser Glu Leu Tyr 565 570 575
Gin Met Leu Leu He Thr Asp Arg Lys Glu Asp Gly Val He Lys Asn 580 585 590
Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gin 595 600 605 He His Asn Leu Thr He Ala Glu Leu Asn He Leu Leu Val He Cys 610 615 620
Gly Tyr Gly Asp Thr Asn He Tyr Gin He Thr Asp Asp Asn Leu Ala
625 630 635 640
Lys He Val Glu Thr Leu Leu Trp He Thr Gin Trp Leu Lys Thr Gin
645 650 655
Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser 660 665 670
Thr Thr Leu Thr Pro Glu He Ser Asn Leu Thr Ala Thr Leu Ser Ser 675 680 685 Thr Leu His Gly Lys Glu Ser Leu He Gly Glu Asp Leu Lys Arg Ala 690 695 700
Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gin Glu Val
705 710 715 720
Ala Tyr Asp Leu Leu Leu Trp He Asp Gin He Gin Pro Ala Gin He
725 730 735
Thr Val Asp Gly Phe Trp Glu Glu Val Gin Thr Thr Pro Thr Ser Leu 740 745 750
Lys Val He Thr Phe Ala Gin Val Leu Ala Gin Leu Ser Leu He Tyr 755 760 765
Arg Arg He Gly Leu Ser Glu Thr Glu Leu Ser Leu He Val Thr Gin 770 775 780
Ser Ser Leu Leu Val Ala Gly Lys Ser He Leu Asp His Gly Leu Leu
785 790 795 800 Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly
805 810 815
Gin His Ala Ser Leu He Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr 820 825 830
Val Thr Asp Val Ala Gin Ala Met Asn Lys Glu Glu Ser Leu Leu Gin 835 840 845
Met Ala Ala Asn Gin Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp 850 855 860
Thr Gin He Asp Ala He Leu Gin Trp Leu Gin Met Ser Ser Ala Leu
865 870 875 880 Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly
885 890 895
He Asp His Asn Tyr Ala Ala Trp Gin Ala Ala Ala Ala Ala Leu Met 900 905 910
Ala Asp His Ala Asn Gin Ala Gin Lys Lys Leu Asp Glu Thr Phe Ser 915 920 925
Lys Ala Leu Cys Asn Tyr Tyr He Asn Ala Val Val Asp Ser Ala Ala 930 935 940
Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu He Asp Asn 945 950 955 960 Gin Val Ser Ala Asp Val He Thr Ser Arg He Ala Glu Ala He Ala
965 970 975
Gly He Gin Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gin 980 985 990
Leu Ala Ser Asp Val Ser Thr Arg Gin Phe Phe Thr Asp Trp Glu Arg 995 1000 1005
Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr 1010 1015 1020
Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gin Arg He Gly Gin Thr Lys 1025 1030 1035 1040 Met Met Asp Ala Leu Leu Gin Ser He Asn Gin Ser Gin Leu Asn Ala
1045 1050 1055
Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gin 1060 1065 1070
Val Ala Asn Leu Lys Val He Ser Ala Tyr His Asp Asn Val Asn Val
1075 1080 1085
Asp Gin Gly Leu Thr Tyr Phe He Gly He Asp Gin Ala Ala Pro Gly
1090 1095 1100
Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys 1105 1110 1115 1120
Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys He Thr Cys Ala Val 1125 1130 1135
Asn Pro Trp Lys Asn He He Arg Pro Val Val Tyr Met Ser Arg Leu 1140 1145 1150 Tyr Leu Leu Trp Leu Glu Gin Gin Ser Lys Lys Ser Asp Asp Gly Lys 1155 1160 1165
Thr Thr He Tyr Gin Tyr Asn Leu Lys Leu Ala His He Arg Tyr Asp 1170 1175 1180
Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys 1185 1190 1195 1200
Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys 1205 1210 1215
Thr Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met 1220 1225 1230 Gin Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly 1235 1240 1245
Leu Tyr He Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gin 1250 1255 1260
Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gin Phe Asp Thr Val Met
1265 1270 1275 1280
Ala Asp Pro Asp Ser Asp Asn Lys Lys Val He Thr Arg Arg Val Asn 1285 1290 1295
Asn Arg Tyr Ala Glu Asp Tyr Glu He Pro Ser Ser Val Thr Ser Asn 1300 1305 1310 Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly
1315 1320 1325
Ser Val Pro Asn He Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu 1330 1335 1340
Ser Thr Asn Met Ala Leu Ser He He His Asn Gly Tyr Ala Gly Thr 1345 1350 1355 1360
Arg Arg He Gin Cys Asn Leu Met Lys Gin Tyr Ala Ser Leu Gly Asp 1365 1370 1375
Lys Phe He He Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn 1380 1385 1390 Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser 1395 1400 1405
He Cys He Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr 1410 1415 1420
Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr
1425 1430 1435 1440
Gin Cys He Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu 1445 1450 1455
Gln Glu Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr 1460 1465 1470
Lys He Ser Asn Pro He Asn He Asn Thr Gly He Asp Ser Ala Lys 1475 1480 1485
Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gin He Phe Thr Ala
1490 1495 1500 Asp Asn Ser Thr Tyr Val Pro Gin Gin Pro Ala Pro Ser Phe Glu Glu 1505 1510 1515 1520
Met He Tyr Gin Phe Asn Asn Leu Thr He Asp Cys Lys Asn Leu Asn 1525 1530 1535
Phe He Asp Asn Gin Ala His He Glu He Asp Phe Thr Ala Thr Ala 1540 1545 1550
Gin Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe He He Pro Val Thr 1555 1560 1565
Lys Lys Val Leu Gly Thr Glu Asn Val He Ala Leu Tyr Ser Glu Asn 1570 1575 1580 Asn Gly Val Gin Tyr Met Gin He Gly Ala Tyr Arg Thr Arg Leu Asn 1585 1590 1595 1600
Thr Leu Phe Ala Gin Gin Leu Val Ser Arg Ala Asn Arg Gly He Asp 1605 1610 1615
Ala Val Leu Ser Met Glu Thr Gin Asn He Gin Glu Pro Gin Leu Gly 1620 1625 1630
Ala Gly Thr Tyr Val Gin Leu Val Leu Asp Lys Tyr Asp Glu Ser He 1635 1640 1645
His Gly Thr Asn Lys Ser Phe Ala He Glu Tyr Val Asp He Phe Lys 1650 1655 1660 Glu Asn Asp Ser Phe Val He Tyr Gin Gly Glu Leu Ser Glu Thr Ser 1665 1670 1675 1680
Gin Thr Val Val Lys Val Phe Leu Ser Tyr Phe He Glu Ala Thr Gly 1685 1690 1695
Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gin Lys Glu Thr Thr
1700 1705 1710
Asp Lys He Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp 1715 1720 1725
Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gin 1730 1735 1740 Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala
1745 1750 1755 1760
Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His 1765 1770 1775
Arg Leu Leu Gin Glu Gin Asn Phe Asp Ala Ala Asn His Trp Phe Arg 1780 1785 1790
Tyr Val Trp Ser Pro Ser Gly Tyr He Val Asp Gly Lys He Ala He 1795 1800 1805
Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala 1810 1815 1820 Gin Gin Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gin Asp Asp Pro 1825 1830 1835 1840
Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met 1845 1850 1855
Ala Arg Gly Asp Ala Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Ala 1860 1865 1870
Glu Ala Lys Met Trp Tyr Thr Gin Ala Leu Asn Leu Leu Gly Asp Glu 1875 1880 1885
Pro Gin Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn 1890 1895 1900
Ala Ala Ser Lys Thr Thr Gin Gin Val Arg Gin Gin Val Leu Thr Gin 1905 1910 1915 1920
Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn 1925 1930 1935 Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn Ser Lys Leu Lys Gly 1940 1945 1950
Tyr Trp Arg Thr Leu Ala Gin Arg Met Phe Asn Leu Arg His Asn Leu 1955 1960 1965
Ser He Asp Gly Gin Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala 1970 1975 1980
Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gin Gly Gly 1985 1990 1995 2000
Ala Asp Leu Pro Lys Ala Pro Leu Thr He His Arg Phe Pro Gin Met 2005 2010 2015 Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu He Gin Phe Gly Ser 2020 2025 2030
Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala Glu Ala Met Ser Gin 2035 2040 2045
Leu Leu Gin Thr Gin Ala Ser Glu Leu He Leu Thr Ser He Arg Met 2050 2055 2060
Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gin 2065 2070 2075 2080
Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp Ser Tyr Ser Gin Leu 2085 2090 2095 Tyr Glu Glu Asn He Asn Ala Gly Glu Gin Arg Ala Leu Ala Leu Arg 2100 2105 2110
Ser Glu Ser Ala He Glu Ser Gin Gly Ala Gin He Ser Arg Met Ala 2115 2120 2125
Gly Ala Gly Val Asp Met Ala Pro Asn He Phe Gly Leu Ala Asp Gly 2130 2135 2140
Gly Met His Tyr Gly Ala He Ala Tyr Ala He Ala Asp Gly He Glu 2145 2150 2155 2160
Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gin Ser 2165 2170 2175 Glu He Tyr Arg Arg Arg Arg Gin Glu Trp Lys He Gin Arg Asp Asn 2180 2185 2190
Ala Gin Ala Glu He Asn Gin Leu Asn Ala Gin Leu Glu Ser Leu Ser 2195 2200 2205
He Arg Arg Glu Ala Ala Glu Met Gin Lys Glu Tyr Leu Lys Thr Gin
2210 2215 2220
Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu Arg Ser Lys Phe Ser 2225 2230 2235 2240
Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly He Tyr 2245 2250 2255
Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gin 2260 2265 2270
Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser He Ser Phe Val Lys Pro 2275 2280 2285 Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 2290 2295 2300
He Gin Asn Leu Ala Gin Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser 2305 2310 2315 2320
Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp 2325 2330 2335
Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gin He Pro Ala 2340 2345 2350
Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu 2355 2360 2365 Ser Leu Ala Asn Ala He Leu Ser Ala Ser Val Lys Leu Ser Asp Leu 2370 2375 2380
Lys Leu Gly Thr Asp Tyr Pro Asp Ser He Val Gly Ser Asn Lys Val 2385 2390 2395 2400
Arg Arg He Lys Gin He Ser Val Ser Leu Pro Ala Leu Val Gly Pro 2405 2410 2415
Tyr Gin Asp Val Gin Ala Met Leu Ser Tyr Gly Gly Ser Thr Gin Leu 2420 2425 2430
Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser 2435 2440 2445 Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu 2450 2455 2460
Gly He Ala Leu Asp Asp Gin Gly Thr Leu Asn Leu Gin Phe Pro Asn 2465 2470 2475 2480
Ala Thr Asp Lys Gin Lys Ala He Leu Gin Thr Met Ser Asp He He 2485 2490 2495
Leu His Ile Arg Tyr Thr Ile Arg * 2500 2505
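For comparison with sequence databases, the three-letter residue rows of a listing such as SEQ ID NO: 12 are usually converted to one-letter codes. The sketch below is an illustration only and is not part of the original listing: the function name `to_one_letter` and the `OCR_FIXES` table are hypothetical, and the mapping of "Gin" to Gln and of "He" or "lie" to Ile is an assumption about how scanned text commonly misreads those residue names. Position numbers printed under the residues are skipped.

```python
# Three-letter to one-letter amino acid codes, plus Xaa (unknown) and stop.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X", "*": "*",
}

# Assumed scan misreadings of residue names (hypothetical table).
OCR_FIXES = {"Gin": "Gln", "He": "Ile", "lie": "Ile"}

def to_one_letter(row):
    """Convert a row of three-letter residues to a one-letter string."""
    residues = []
    for token in row.split():
        if token.isdigit():
            continue  # position numbers, not residues
        token = OCR_FIXES.get(token, token)
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)

# First row of SEQ ID NO: 12 as it appears in scanned text.
print(to_one_letter("Met Gin Asn Ser Leu Ser Ser Thr He Asp Thr He Cys Gin Lys Leu 1 5 10 15"))
```

Any token that is neither a number nor a recognised residue name raises a `KeyError`, which flags rows that need manual inspection.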
(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13 (TcdAii N-terminus):
Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Xaa Ala
1 5 10
(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14 (TcdB N-terminus):
Met Gln Asn Ser Gln Thr Phe Ser Val Gly Glu Leu
1 5 10
(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15 (TcaAii N-terminus):
Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr
1 5 10
(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16 (TcbA N-terminus):
Met Gln Asn Ser Leu
1 5
(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17 (TcdAii-PT111 internal peptide):
Ala Phe Asn Ile Asp Asp Val Ser Leu Phe 1 5 10
(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 (TcdAii-PT79 internal peptide):
Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 (TcaBi-PT158 internal peptide):
Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile Gly Ser 1 5 10 15
Leu Gln Leu Phe Ile 20
(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20 (Tc &i^-E 108 internal peptide):
Met Tyr Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro 1 5 10
(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 (TcbAii-PT103 internal peptide):
Gly Ile Asp Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro 1 5 10 15
Gln Leu Gly Ala Gly Thr Tyr Val Gln Leu 20 25
(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22 (TcbAii-PT56 internal peptide):
Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1 5 10 15
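An internal peptide such as SEQ ID NO: 22 can be located within its parent protein by a simple residue-by-residue search. The sketch below is an illustration only and is not part of the original listing: the function name `find_peptide` is hypothetical, the fragment is the row of SEQ ID NO: 12 (TcbA protein) that the printed numbering places at residues 1473 to 1488, and the residue names are written with the Gln/Ile spellings rather than as they appear in scanned text.

```python
def find_peptide(fragment, first_position, peptide):
    """Return the residue number at which `peptide` starts within
    `fragment`, or None if it does not occur.

    `fragment` and `peptide` are space-separated three-letter residues;
    `first_position` is the residue number of the fragment's first residue.
    """
    frag = fragment.split()
    pep = peptide.split()
    for i in range(len(frag) - len(pep) + 1):
        if frag[i:i + len(pep)] == pep:
            return first_position + i
    return None

# Residues 1473-1488 of SEQ ID NO: 12, per the numbering printed in the listing.
tcba_fragment = "Lys Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys"

# SEQ ID NO: 22 internal peptide (15 residues).
seq22 = "Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys"

print(find_peptide(tcba_fragment, 1473, seq22))
```

Under these assumptions the peptide begins one residue into the fragment. The same search can be run against the full protein once its rows have been joined and the position numbers removed.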
(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 (TcbA-PT81(a) internal peptide):
Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys 1 5 10
(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24 (TcbA^- PT81 (b) internal peptide):
Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly
1 5 10 15
Val Gln Tyr Met Gln Ile
20
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6054 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..43
(D) OTHER INFORMATION: /product= "end of TcaAi χχ
(ix) FEATURE:
(A) NAME/KEY: RBS
(B) LOCATION: 51..58
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 65..3634
(D) OTHER INFORMATION: /product= "TcaBi"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
A GTA GCC CAA AAC TTA AGT GCC GCA ATC AGC AAT CGT CAG TAACCGGATA 50 Val Ala Gin Asn Leu Ser Ala Ala He Ser Asn Arg Gin •••
AAGAAGGAAT TGATT ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA 100 Met Ser Glu Ser Leu Phe Thr Gin Thr Leu Lys Glu
1 5 10
GCG CGC CGT GAT GCA TTG GTT GCT CAT TAT ATT GCT ACT CAG GTG CCC 148 Ala Arg Arg Asp Ala Leu Val Ala His Tyr He Ala Thr Gin Val Pro 15 20 25 GCA GAT TTA AAA GAG AGT ATC CAG ACC GCG GAT GAT CTG TAC GAA TAT 196 Ala Asp Leu Lys Glu Ser He Gin Thr Ala Asp Asp Leu Tyr Glu Tyr 30 35 40
CTG TTG CTG GAT ACC AAA ATT AGC GAT CTG GTT ACT ACT TCA CCG CTG 244 Leu Leu Leu Asp Thr Lys He Ser Asp Leu Val Thr Thr Ser Pro Leu 45 50 55 60
TCC GAA GCG ATT GGC AGT CTG CAA TTG TTT ATT CAT CGT GCG ATA GAG 292 Ser Glu Ala He Gly Ser Leu Gin Leu Phe He His Arg Ala He Glu 65 70 75
GGC TAT GAC GGC ACG CTG GCA GAC TCA GCA AAA CCC TAT TTT GCC GAT 340 Gly Tyr Asp Gly Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp 80 85 90
GAA CAG TTT TTA TAT AAC TGG GAT AGT TTT AAC CAC CGT TAT AGC ACT 388 Glu Gin Phe Leu Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr 95 100 105 TGG GCT GGC AAG GAA CGG TTG AAA TTC TAT GCC GGG GAT TAT ATT GAT 436 Trp Ala Gly Lys Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr He Asp 110 115 120
CCA ACA TTG CGA TTG AAT AAG ACC GAG ATA TTT ACC GCA TTT GAA CAA 484 Pro Thr Leu Arg Leu Asn Lys Thr Glu He Phe Thr Ala Phe Glu Gin 125 130 135 140
GGT ATT TCT CAA GGG AAA TTA AAA AGT GAA TTA GTC GAA TCT AAA TTA 532 Gly He Ser Gin Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu 145 150 155
CGT GAT TAT CTA ATT AGT TAT GAC ACT TTA GCC ACC CTT GAT TAT ATT 580
Arg Asp Tyr Leu Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile 160 165 170
ACT GCC TGC CAA GGC AAA GAT AAT AAA ACC ATC TTC TTT ATT GGC CGT 628 Thr Ala Cys Gin Gly Lys Asp Asn Lys Thr He Phe Phe He Gly Arg 175 180 185
ACA CAG AAT GCA CCC TAT GCA TTT TAT TGG CGA AAA TTA ACT TTA GTC 676 Thr Gin Asn Ala Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val 190 195 200
ACT GAT GGC GGT AAG TTG AAA CCA GAT CAA TGG TCA GAG TGG CGA GCA 724 Thr Asp Gly Gly Lys Leu Lys Pro Asp Gin Trp Ser Glu Trp Arg Ala 205 210 215 220
ATT AAT GCC GGG ATT AGT GAG GCA TAT TCA GGG CAT GTC GAG CCT TTC 772 He Asn Ala Gly He Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe 225 230 235 TGG GAA AAT AAC AAG CTG CAC ATC CGT TGG TTT ACT ATC TCG AAA GAA 820 Trp Glu Asn Asn Lys Leu His He Arg Trp Phe Thr He Ser Lys Glu 240 245 250
GAT AAA ATA GAT TTT GTT TAT AAA AAC ATC TGG GTG ATG AGT AGC GAT 868 Asp Lys He Asp Phe Val Tyr Lys Asn He Trp Val Met Ser Ser Asp 255 260 265
TAT AGC TGG GCA TCA AAG AAA AAA ATC TTG GAA CTT TCT TTT ACT GAC 916 Tyr Ser Trp Ala Ser Lys Lys Lys He Leu Glu Leu Ser Phe Thr Asp 270 275 280
TAC AAT AGA GTT GGA GCA ACA GGA TCA TCA AGC CCG ACT GAA GTA GCT 964 Tyr Asn Arg Val Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala 285 290 295 300
TCA CAA TAT GGT TCT GAT GCT CAG ATG AAT ATT TCT GAT GAT GGG ACT 1012 Ser Gin Tyr Gly Ser Asp Ala Gin Met Asn He Ser Asp Asp Gly Thr 305 310 315 GTA CTT ATT TTT CAG AAT GCC GGC GGA GCT ACT CCC AGT ACT GGA GTG 1060 Val Leu He Phe Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val 320 325 330
ACG TTA TGT TAT GAC TCT GGC AAC GTG ATT AAG AAC CTA TCT AGT ACA 1108 Thr Leu Cys Tyr Asp Ser Gly Asn Val He Lys Asn Leu Ser Ser Thr 335 340 345
GGA AGT GCA AAT TTA TCG TCA AAG GAT TAT GCC ACA ACT AAA TTA CGC 1156 Gly Ser Ala Asn Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg 350 355 360
ATG TGT CAT GGA CAA AGT TAC AAT GAT AAT AAC TAC TGC AAT TTT ACA 1204 Met Cys His Gly Gin Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr 365 370 375 380
CTC TCT ATT AAT ACA ATA GAA TTC ACC TCC TAC GGC ACA TTC TCA TCA 1252 Leu Ser He Asn Thr He Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser 385 390 395 GAT GGA AAA CAA TTT ACA CCA CCT TCT GGT TCT GCC ATT GAT TTA CAC 1300 Asp Gly Lys Gin Phe Thr Pro Pro Ser Gly Ser Ala He Asp Leu His 400 405 410
CTC CCT AAT TAT GTA GAT CTC AAC GCG CTA TTA GAT ATT AGC CTC GAT 1348 Leu Pro Asn Tyr Val Asp Leu Asn Ala Leu Leu Asp He Ser Leu Asp 415 420 425
TCA CTA CTT AAT TAT GAC GTT CAG GGG CAG TTT GGC GGA TCT AAT CCG 1396 Ser Leu Leu Asn Tyr Asp Val Gin Gly Gin Phe Gly Gly Ser Asn Pro 430 435 440
GTT GAT AAT TTC AGT GGT CCC TAT GGT ATT TAT CTA TGG GAA ATC TTC 1444 Val Asp Asn Phe Ser Gly Pro Tyr Gly He Tyr Leu Trp Glu He Phe 445 450 455 460 TTC CAT ATT CCG TTC CTT GTT ACG GTC CGT ATG CAA ACC GAA CAA CGT 1492 Phe His He Pro Phe Leu Val Thr Val Arg Met Gin Thr Glu Gin Arg 465 470 475
TAC GAA GAC GCG GAC ACT TGG TAC AAA TAT ATT TTC CGC AGC GCC GGT 1540 Tyr Glu Asp Ala Asp Thr Trp Tyr Lys Tyr Ile Phe Arg Ser Ala Gly 480 485 490
TAT CGC GAT GCT AAT GGC CAG CTC ATT ATG GAT GGC AGT AAA CCA CGT 1588 Tyr Arg Asp Ala Asn Gly Gin Leu He Met Asp Gly Ser Lys Pro Arg 495 500 505
TAT TGG AAT GTG ATG CCA TTG CAA CTG GAT ACC GCA TGG GAT ACC ACA 1636
Tyr Trp Asn Val Met Pro Leu Gin Leu Asp Thr Ala Trp Asp Thr Thr 510 515 520
CAG CCC GCC ACC ACT GAT CCA GAT GTG ATC GCT ATG GCG GAC CCG ATG 1684
Gin Pro Ala Thr Thr Asp Pro Asp Val He Ala Met Ala Asp Pro Met 525 530 535 540 CAT TAC AAG CTG GCG ATA TTC CTG CAT ACC CTT GAT CTA TTG ATT GCC 1732 His Tyr Lys Leu Ala He Phe Leu His Thr Leu Asp Leu Leu He Ala 545 550 555
CGA GGC GAC AGC GCT TAC CGT CAA CTT GAA CGC GAT ACT CTA GTC GAA 1780 Arg Gly Asp Ser Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Val Glu 560 565 570
GCC AAA ATG TAC TAC ATT CAG GCA CAA CAG CTA CTG GGA CCG CGC CCT 1828 Ala Lys Met Tyr Tyr He Gin Ala Gin Gin Leu Leu Gly Pro Arg Pro 575 580 585
GAT ATC CAT ACC ACC AAT ACT TGG CCA AAT CCC ACC TTG AGT AAA GAA 1876 Asp He His Thr Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu 590 595 600
GCT GGC GCT ATT GCC ACA CCG ACA TTC CTC AGT TCA CCG GAG GTG ATG 1924 Ala Gly Ala He Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met 605 610 615 620 ACG TTC GCT GCC TGG CTA AGC GCA GGC GAT ACC GCA AAT ATT GGC GAC 1972
Thr Phe Ala Ala Trp Leu Ser Ala Gly Asp Thr Ala Asn He Gly Asp 625 630 635
GGT GAT TTC TTG CCA CCG TAC AAC GAT GTA CTA CTC GGT TAC TGG GAT 2020 Gly Asp Phe Leu Pro Pro Tyr Asn Asp Val Leu Leu Gly Tyr Trp Asp 640 645 650
AAA CTT GAG TTA CGC CTA TAC AAC CTG CGC CAC AAT CTG AGT CTG GAT 2068
Lys Leu Glu Leu Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp 655 660 665
GGT CAA CCG CTA AAT CTG CCA CTG TAT GCC ACG CCG GTA GAC CCG AAA 2116
Gly Gin Pro Leu Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys
670 675 680
ACC CTG CAA CGC CAG CAA GCC GGA GGG GAC GGT ACA GGC AGT AGT CCG 2164
Thr Leu Gin Arg Gin Gin Ala Gly Gly Asp Gly Thr Gly Ser Ser Pro 685 690 695 700 GCT GGT GGT CAA GGC AGT GTT CAG GGC TGG CGC TAT CCG TTA TTG GTA 2212
Ala Gly Gly Gin Gly Ser Val Gin Gly Trp Arg Tyr Pro Leu Leu Val 705 710 715
GAA CGC GCC CGC TCT GCC GTG AGT TTG TTG ACT CAG TTC GGC AAC AGC 2260 Glu Arg Ala Arg Ser Ala Val Ser Leu Leu Thr Gin Phe Gly Asn Ser 720 725 730
TTA CAA ACA ACG TTA GAA CAT CAG GAT AAT GAA AAA ATG ACG ATA CTG 2308 Leu Gin Thr Thr Leu Glu His Gin Asp Asn Glu Lys Met Thr He Leu 735 740 745
TTG CAG ACT CAA CAG GAA GCC ATC CTG AAA CAT CAG CAC GAT ATA CAA 2356 Leu Gin Thr Gin Gin Glu Ala He Leu Lys His Gin His Asp He Gin 750 755 760 CAA AAT AAT CTA AAA GGA TTA CAA CAC AGC CTG ACC GCA TTA CAG GCT 2404 Gin Asn Asn Leu Lys Gly Leu Gin His Ser Leu Thr Ala Leu Gin Ala 765 770 775 780
AGC CGT GAT GGC GAC ACA TTG CGG CAA AAA CAT TAC AGC GAC CTG ATT 2452 Ser Arg Asp Gly Asp Thr Leu Arg Gin Lys His Tyr Ser Asp Leu He
785 790 795
AAC GGT GGT CTA TCT GCG GCA GAA ATC GCC GGT CTG ACA CTA CGC AGC 2500 Asn Gly Gly Leu Ser Ala Ala Glu He Ala Gly Leu Thr Leu Arg Ser 800 805 810
ACC GCC ATG ATT ACC AAT GGC GTT GCA ACG GGA TTG CTG ATT GCC GGC 2548 Thr Ala Met He Thr Asn Gly Val Ala Thr Gly Leu Leu He Ala Gly 815 820 825
GGA ATC GCC AAC GCG GTA CCT AAC GTC TTC GGG CTG GCT AAC GGT GGA 2596 Gly He Ala Asn Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly 830 835 840 TCG GAA TGG GGA GCG CCA TTA ATT GGC TCC GGG CAA GCA ACC CAA GTT 2644 Ser Glu Trp Gly Ala Pro Leu He Gly Ser Gly Gin Ala Thr Gin Val 845 850 855 860
GGC GCC GGC ATC CAG GAT CAG AGC GCG GGC ATT TCA GAA GTG ACA GCA 2692 Gly Ala Gly He Gin Asp Gin Ser Ala Gly He Ser Glu Val Thr Ala
865 870 875
GGC TAT CAG CGT CGT CAG GAA GAA TGG GCA TTG CAA CGG GAT ATT GCT 2740 Gly Tyr Gin Arg Arg Gin Glu Glu Trp Ala Leu Gin Arg Asp He Ala 880 885 890
GAT AAC GAA ATA ACC CAA CTG GAT GCC CAG ATA CAA AGC CTG CAA GAG 2788 Asp Asn Glu He Thr Gin Leu Asp Ala Gin He Gin Ser Leu Gin Glu 895 900 905
CAA ATC ACG ATG GCA CAA AAA CAG ATC ACG CTC TCT GAA ACC GAA CAA 2836 Gin He Thr Met Ala Gin Lys Gin He Thr Leu Ser Glu Thr Glu Gin 910 915 920 GCG AAT GCC CAA GCG ATT TAT GAC CTG CAA ACC ACT CGT TTT ACC GGG 2884 Ala Asn Ala Gin Ala He Tyr Asp Leu Gin Thr Thr Arg Phe Thr Gly 925 930 935 940
CAG GCA CTG TAT AAC TGG ATG GCC GGT CGT CTC TCC GCG CTC TAT TAC 2932 Gin Ala Leu Tyr Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr
945 950 955
CAA ATG TAT GAT TCC ACT CTG CCA ATC TGT CTC CAG CCA AAA GCC GCA 2980 Gin Met Tyr Asp Ser Thr Leu Pro He Cys Leu Gin Pro Lys Ala Ala 960 965 970
TTA GTA CAG GAA TTA GGC GAG AAA GAG AGC GAC AGT CTT TTC CAG GTT 3028 Leu Val Gin Glu Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gin Val 975 980 985
CCG GTG TGG AAT GAT CTG TGG CAA GGG CTG TTA GCA GGA GAA GGT TTA 3076 Pro Val Trp Asn Asp Leu Trp Gin Gly Leu Leu Ala Gly Glu Gly Leu 990 995 1000 AGT TCA GAG CTA CAG AAA CTG GAT GCC ATC TGG CTT GCA CGT GGT GGT 3124 Ser Ser Glu Leu Gin Lys Leu Asp Ala He Trp Leu Ala Arg Gly Gly
1005 1010 1015 1020
ATT GGG CTA GAA GCC ATC CGC ACC GTG TCG CTG GAT ACC CTG TTT GGC 3172 He Gly Leu Glu Ala He Arg Thr Val Ser Leu Asp Thr Leu Phe Gly 1025 1030 1035
ACA GGG ACG TTA AGT GAA AAT ATC AAT AAA GTG CTT AAC GGG GAA ACG 3220 Thr Gly Thr Leu Ser Glu Asn He Asn Lys Val Leu Asn Gly Glu Thr 1040 1045 1050
GTA TCT CCA TCC GGT GGC GTC ACT CTG GCG CTG ACA GGG GAT ATC TTC 3268 Val Ser Pro Ser Gly Gly Val Thr Leu Ala Leu Thr Gly Asp He Phe 1055 1060 1065 CAA GCA ACA CTG GAT TTG AGT CAG CTA GGT TTG GAT AAC TCT TAC AAC 3316 Gin Ala Thr Leu Asp Leu Ser Gin Leu Gly Leu Asp Asn Ser Tyr Asn 1070 1075 1080
TTG GGT AAC GAG AAG AAA CGT CGT ATT AAA CGT ATC GCC GTC ACC CTG 3364 Leu Gly Asn Glu Lys Lys Arg Arg He Lys Arg He Ala Val Thr Leu 1085 1090 1095 1100
CCA ACA CTT CTG GGG CCA TAT CAA GAT CTT GAA GCC ACA CTG GTA ATG 3412 Pro Thr Leu Leu Gly Pro Tyr Gin Asp Leu Glu Ala Thr Leu Val Met 1105 1110 1115
GGT GCG GAA ATC GCC GCC TTA TCA CAC GGT GTG AAT GAC GGA GGC CGG 3460 Gly Ala Glu He Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg 1120 1125 1130
TTT GTT ACC GAC TTT AAC GAC AGC CGT TTT CTG CCT TTT GAA GGT CGA 3508 Phe Val Thr Asp Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg 1135 1140 1145 GAT GCA ACA ACC GGC ACA CTG GAG CTC AAT ATT TTC CAT GCG GGT AAA 3556 Asp Ala Thr Thr Gly Thr Leu Glu Leu Asn He Phe His Ala Gly Lys 1150 1155 1160
GAG GGA ACG CAA CAC GAG TTG GTC GCG AAT CTG AGT GAC ATC ATT GTG 3604 Glu Gly Thr Gin His Glu Leu Val Ala Asn Leu Ser Asp He He Val 1165 1170 1175 1180
CAT CTG AAT TAC ATC ATT CGA GAC GCG TAA ATTTCTTTTC TTTGTCGATT 3654 His Leu Asn Tyr He He Arg Asp Ala * 1185 1190
ACAGGTCCCT ATCAGGGGCC TGTTATTAAG GAGTACTTTA TGCAGGATTC ACCAGAAGTA 3714
TCGATTACAA CGCTGTCACT TCCCAAAGGT GGCGGTGCTA TCAATGGCAT GGGAGAAGCA 3774
CTGAATGCTG CCGGCCCTGA TGGAATGGCC TCCCTATCTC TGCCATTACC CCTTTCGACC 3834
GGCAGAGGGA CGGCTCCTGG ATTATCGCTG ATTTACAGCA ACAGTGCAGG TAATGGGCCT 3894 TTCGGCATCG GCTGGCAATG CGGTGTTATG TCCATTAGCC GACGCACCCA ACATGGCATT 3954
CCACAATACG GTAATGACGA CACGTTCCTA TCCCCACAAG GCGAGGTCAT GAATATCGCC 4014
CTGAATGACC AAGGGCAACC TGATATCCGT CAAGACGTTA AAACGCTGCA AGGCGTTACC 4074
TTGCCAATTT CCTATACCGT GACCCGCTAT CAAGCCCGCC AGATCCTGGA TTTCAGTAAA 4134
ATCGAATACT GGCAACCTGC CTCCGGTCAA GAAGGACGCG CTTTCTGGCT GATATCGACA 4194 CCGGACGGGC ATCTACACAT CTTAGGGAAA ACCGCGCAGG CTTGTCTGGC AAATCCGCAA 4254
AATGACCAAC AAATCGCCCA GTGGTTGCTG GAAGAAACTG TGACGCCAGC CGGTGAACAT 4314
GTCAGCTATC AATATCGAGC CGAAGATGAA GCCCATTGTG ACGACAATGA AAAAACCGCT 4374
CATCCCAATG TTACCGCACA GCGCTATCTG GTACAGGTGA ACTACAGGCA ACATCAAACC 4434
ACAAGCCAGC CTGTTCGTAC TGGATAACGC ACCTCCCGCA CCGGAAGAGT GGCTGTTTCA 4494 TCTGGTCTTT GACCACGGTG AGCGCGTACC TCACTTCATA CCGTGCCAAC ATGGGATGCA 4554
GGTACAGCGC AATGGTCTGT ACGCCCGGAT ATCTTCTCTC GCTATGAATA TGGTTTTGAA 4614
GTGCGTACTC GCCGCTTATG TCAACAAGTG CTGATGTTTC ACCGCACCGC GCTCATGGCC 4674 GGAGAAGCCA GTACCAATGA CGCCCCGGAA CTGGTTGGAC GCTTAATACT GGAATATGAC 4734
AAAAACGCCA GCGTCACCAC GTTGATTACC ATCCGTCAAT TAAGCCATGA ATCGGACGGG 4794
AGGCCAGTCA CCCAGCCACC ACTAGAACTA GCCTGGCAAC GGTTTGATCT GGAGAAAATC 4854
CCGACATGGC AACGCTTTGA CGCACTAGAT AATTTTAACT CGCAGCAACG TTATCAACTG 4865
GTTGATCTGC GGGGAGAAGG GTTGCCAGGT ATGCTGTATC AAGATCGAGG CGCTTGGTGG 4914 TATAAAGCTC CGCAACGTCA GGAAGACGGA GACAGCAATG CCGTCACTTA CGACAAAATC 4974
GCCCCACTGC CTACCCTACC CAATTTGCAG GATAATGCCT CATTGATGGA TATCAACGGA 5034
GACGGCCAAC TGGATTGGGT TGTTACCGCC TCCGGTATTC GCGGATACCA TAGTCAGCAA 5094
CCCGATGGAA AGTGGACGCA CTTTACGCCA ATCAATGCCT TGCCCGTGGA ATATTTTCAT 5214
CCAAGCATCC AGTTCGCTGA CCTTACCGGG GCAGGCTTAT CTGATTTAGT GTTGATCGGG 5274 CCGAAAAGCG TGCGTCTATA TGCCAACCAG CGAAACGGCT GGCGTAAAGG AGAAGATGTC 5334
CCCCAATCCA CAGGTATCAC CCTGCCTGTC ACAGGGACCG ATGCCCGCAA ACTGGTGGCT 5394
TTCAGTGATA TGCTCGGTTC CGGTCAACAA CATCTOGTGG AAATCAAGGG TAATCGCGTC 5454
ACCTGTTGGC CGAATCTAGG GCATGGCCGT TTCGGTCAAC CACTAACTCT GTCAGGATTT 5514
AGCCAGCCCG AAAATAGCTT CAATCCCGAA CGGCTGTTTC TGGCGGATAT CGACGGCTCC 5574 GGCACCACCG ACCTTATCTA TGCGCAATCC GGCTCTTTGC TCATTTATCT CAACCAAAGT 5634
GGTAATCAGT TTGATGCCCC GTTGACATTA GCGTTGCCAG AAGGCGTACA ATTTGACAAC 5694
ACTTGCCAAC TTCAAGTCGC CGATATTCAG GGATTAGGGA TAGCCAGCTT GATTCTGACT 5754
GTGCCACATA TCGCGCCACA TCACTGGCGT TGTGACCTGT CACTGACCAA ACCCTGGTTG 5814
TTGAATGTAA TGAACAATAA CCGGGGCGCA CATCACACGC TACATTATCG TAGTTCCGCG 5874 CAATTCTGGT TGGATGAAAA ATTACAGCTC ACCAAAGCAG GCAAATCTCC GGCTTGTTAT 5934
CTGCCGTTTC CAATGCATTT GCTATGGTAT ACCGAAATTC AGGATGAAAT CAGCGGCAAC 5994
CGGCTCACCA GTGAAGTCAA CTACAGCCAC GGCGTCTGGG ATGGTAAAGA GCGGGAATTC 6054
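The feature table for SEQ ID NO:25 places the TcaB coding sequence at positions 65 to 3634, and SEQ ID NO:26 reports 1189 amino acids for the encoded protein. The following is a minimal sketch of the arithmetic linking those two figures, assuming 1-based inclusive coordinates and a single terminal stop codon; the function name `cds_summary` is hypothetical and is not part of the listing.

```python
def cds_summary(start: int, end: int, has_stop: bool = True) -> dict:
    """Summarise a coding region given 1-based, inclusive coordinates.

    Assumption: the region is in frame, so its length is a multiple of 3.
    """
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = nucleotides // 3
    # If the region ends with a stop codon, that codon encodes no residue.
    residues = codons - 1 if has_stop else codons
    return {"nucleotides": nucleotides, "codons": codons, "residues": residues}


# CDS 65..3634 of SEQ ID NO:25, which ends in a TAA stop codon.
print(cds_summary(65, 3634))
# Coding regions listed without a stop codon (SEQ ID NO:27 and SEQ ID NO:29).
print(cds_summary(1, 1881, has_stop=False))
print(cds_summary(1, 1689, has_stop=False))
```

Under these assumptions the 3570-nucleotide region gives 1190 codons, that is 1189 residues plus the stop, and the 1881-base-pair region gives 627 codons, which agrees with the 627 amino acids stated for SEQ ID NO:28.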
(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1189 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26 (TcaB protein):
Met Ser Glu Ser Leu Phe Thr Gin Thr Leu Lys Glu Ala Arg Arg Asp 1 5 10 15
Ala Leu Val Ala His Tyr He Ala Thr Gin Val Pro Ala Asp Leu Lys 20 25 30
Glu Ser He Gin Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp 35 40 45
Thr Lys He Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala He 50 55 60 Gly Ser Leu Gin Leu Phe He His Arg Ala He Glu Gly Tyr Asp Gly 65 70 75 80
Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gin Phe Leu 85 90 95
Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys 100 105 110
Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr He Asp Pro Thr Leu Arg 115 120 125
Leu Asn Lys Thr Glu He Phe Thr Ala Phe Glu Gin Gly He Ser Gin 130 135 140 Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu 145 150 155 160
He Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr He Thr Ala Cys Gin 165 170 175
Gly Lys Asp Asn Lys Thr He Phe Phe He Gly Arg Thr Gin Asn Ala 180 185 190
Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly 195 200 205
Lys Leu Lys Pro Asp Gin Trp Ser Glu Trp Arg Ala He Asn Ala Gly 210 215 220 He Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn 225 230 235 240
Lys Leu His He Arg Trp Phe Thr He Ser Lys Glu Asp Lys He Asp 245 250 255
Phe Val Tyr Lys Asn He Trp Val Met Ser Ser Asp Tyr Ser Trp Ala 260 265 270
Ser Lys Lys Lys He Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val 275 280 285
Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gin Tyr Gly
290 295 300 Ser Asp Ala Gin Met Asn He Ser Asp Asp Gly Thr Val Leu He Phe
305 310 315 320
Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr
325 330 335
Asp Ser Gly Asn Val He Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn 340 345 350
Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly 355 360 365
Gin Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser He Asn 370 375 380 Thr He Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gin 385 390 395 400
Phe Thr Pro Pro Ser Gly Ser Ala He Asp Leu His Leu Pro Asn Tyr 405 410 415
Val Asp Leu Asn Ala Leu Leu Asp He Ser Leu Asp Ser Leu Leu Asn
420 425 430
Tyr Asp Val Gin Gly Gin Phe Gly Gly Ser Asn Pro Val Asp Asn Phe 435 440 445
Ser Gly Pro Tyr Gly He Tyr Leu Trp Glu He Phe Phe His He Pro 450 455 460
Phe Leu Val Thr Val Arg Met Gin Thr Glu Gin Arg Tyr Glu Asp Ala 465 470 475 480
Asp Thr Trp Tyr Lys Tyr He Phe Arg Ser Ala Gly Tyr Arg Asp Ala 485 490 495 Asn Gly Gin Leu He Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val 500 505 510
Met Pro Leu Gin Leu Asp Thr Ala Trp Asp Thr Thr Gin Pro Ala Thr 515 520 525
Thr Asp Pro Asp Val He Ala Met Ala Asp Pro Met His Tyr Lys Leu 530 535 540
Ala He Phe Leu His Thr Leu Asp Leu Leu He Ala Arg Gly Asp Ser 545 550 555 560
Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr 565 570 575 Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro Arg Pro Asp Ile His Thr 580 585 590
Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala He 595 600 605
Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala 610 615 620
Trp Leu Ser Ala Gly Asp Thr Ala Asn He Gly Asp Gly Asp Phe Leu 625 630 635 640
Pro Pro Tyr Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu
645 650 655 Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gin Pro Leu 660 665 670
Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gin Arg
675 680 685
Gin Gin Ala Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gin 690 695 700
Gly Ser Val Gin Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg 705 710 715 720
Ser Ala Val Ser Leu Leu Thr Gin Phe Gly Asn Ser Leu Gin Thr Thr 725 730 735 Leu Glu His Gin Asp Asn Glu Lys Met Thr He Leu Leu Gin Thr Gin 740 745 750
Gin Glu Ala He Leu Lys His Gin His Asp He Gin Gin Asn Asn Leu 755 760 765
Lys Gly Leu Gin His Ser Leu Thr Ala Leu Gin Ala Ser Arg Asp Gly
770 775 780
Asp Thr Leu Arg Gin Lys His Tyr Ser Asp Leu He Asn Gly Gly Leu 785 790 795 800
Ser Ala Ala Glu He Ala Gly Leu Thr Leu Arg Ser Thr Ala Met He 805 810 815
Thr Asn Gly Val Ala Thr Gly Leu Leu He Ala Gly Gly He Ala Asn 820 825 830
Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly 835 840 845 Ala Pro Leu He Gly Ser Gly Gin Ala Thr Gin Val Gly Ala Gly He 850 855 860
Gin Asp Gin Ser Ala Gly He Ser Glu Val Thr Ala Gly Tyr Gin Arg 865 870 875 880
Arg Gin Glu Glu Trp Ala Leu Gin Arg Asp He Ala Asp Asn Glu He 885 890 895
Thr Gin Leu Asp Ala Gin He Gin Ser Leu Gin Glu Gin He Thr Met 900 905 910
Ala Gin Lys Gin He Thr Leu Ser Glu Thr Glu Gin Ala Asn Ala Gin
915 920 925 Ala He Tyr Asp Leu Gin Thr Thr Arg Phe Thr Gly Gin Ala Leu Tyr 930 935 940
Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gin Met Tyr Asp
945 950 955 960
Ser Thr Leu Pro He Cys Leu Gin Pro Lys Ala Ala Leu Val Gin Glu
965 970 975
Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gin Val Pro Val Trp Asn 980 985 990
Asp Leu Trp Gin Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu 995 1000 1005 Gin Lys Leu Asp Ala He Trp Leu Ala Arg Gly Gly He Gly Leu Glu 1010 1015 1020
Ala He Arg Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu 1025 1030 1035 1040
Ser Glu Asn He Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser 1045 1050 1055
Gly Gly Val Thr Leu Ala Leu Thr Gly Asp He Phe Gin Ala Thr Leu 1060 1065 1070
Asp Leu Ser Gin Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu 1075 1080 1085 Lys Lys Arg Arg He Lys Arg He Ala Val Thr Leu Pro Thr Leu Leu 1090 1095 1100
Gly Pro Tyr Gin Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu He 1105 1110 1115 1120
Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp 1125 1130 1135
Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr 1140 1145 1150
Gly Thr Leu Glu Leu Asn He Phe His Ala Gly Lys Glu Gly Thr Gin 1155 1160 1165
His Glu Leu Val Ala Asn Leu Ser Asp He He Val His Leu Asn Tyr 1170 1175 1180
He He Arg Asp Ala * 1185 1190
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1881 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1881
(D) OTHER INFORMATION: tcaBi
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27 (tcaBi coding region):
ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA GCG CGC CGT GAT 48 Met Ser Glu Ser Leu Phe Thr Gin Thr Leu Lys Glu Ala Arg Arg Asp 1 5 10 15 GCA TTG GTT GCT CAT TAT ATT GCT ACT CAG GTG CCC GCA GAT TTA AAA 96 Ala Leu Val Ala His Tyr He Ala Thr Gin Val Pro Ala Asp Leu Lys 20 25 30
GAG AGT ATC CAG ACC GCG GAT GAT CTG TAC GAA TAT CTG TTG CTG GAT 144 Glu Ser He Gin Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp 35 40 45
ACC AAA ATT AGC GAT CTG GTT ACT ACT TCA CCG CTG TCC GAA GCG ATT 192 Thr Lys He Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala He 50 55 60
GGC AGT CTG CAA TTG TTT ATT CAT CGT GCG ATA GAG GGC TAT GAC GGC 240
Gly Ser Leu Gin Leu Phe He His Arg Ala He Glu Gly Tyr Asp Gly
65 70 75 80
ACG CTG GCA GAC TCA GCA AAA CCC TAT TTT GCC GAT GAA CAG TTT TTA 288
Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gin Phe Leu
85 90 95 TAT AAC TGG GAT AGT TTT AAC CAC CGT TAT AGC ACT TGG GCT GGC AAG 336 Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys 100 105 110
GAA CGG TTG AAA TTC TAT GCC GGG GAT TAT ATT GAT CCA ACA TTG CGA 384 Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr He Asp Pro Thr Leu Arg 115 120 125
TTG AAT AAG ACC GAG ATA TTT ACC GCA TTT GAA CAA GGT ATT TCT CAA 432 Leu Asn Lys Thr Glu He Phe Thr Ala Phe Glu Gin Gly He Ser Gin 130 135 140
GGG AAA TTA AAA AGT GAA TTA GTC GAA TCT AAA TTA CGT GAT TAT CTA 480 Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu 145 150 155 160
ATT AGT TAT GAC ACT TTA GCC ACC CTT GAT TAT ATT ACT GCC TGC CAA 528 He Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr He Thr Ala Cys Gin 165 170 175
GGC AAA GAT AAT AAA ACC ATC TTC TTT ATT GGC CGT ACA CAG AAT GCA 576 Gly Lys Asp Asn Lys Thr He Phe Phe He Gly Arg Thr Gin Asn Ala 180 185 190
CCC TAT GCA TTT TAT TGG CGA AAA TTA ACT TTA GTC ACT GAT GGC GGT 624 Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly 195 200 205 AAG TTG AAA CCA GAT CAA TGG TCA GAG TGG CGA GCA ATT AAT GCC GGG 672 Lys Leu Lys Pro Asp Gin Trp Ser Glu Trp Arg Ala He Asn Ala Gly 210 215 220
ATT AGT GAG GCA TAT TCA GGG CAT GTC GAG CCT TTC TGG GAA AAT AAC 720 He Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn 225 230 235 240
AAG CTG CAC ATC CGT TGG TTT ACT ATC TCG AAA GAA GAT AAA ATA GAT 768 Lys Leu His He Arg Trp Phe Thr He Ser Lys Glu Asp Lys He Asp 245 250 255
TTT GTT TAT AAA AAC ATC TGG GTG ATG AGT AGC GAT TAT AGC TGG GCA 816 Phe Val Tyr Lys Asn He Trp Val Met Ser Ser Asp Tyr Ser Trp Ala 260 265 270
TCA AAG AAA AAA ATC TTG GAA CTT TCT TTT ACT GAC TAC AAT AGA GTT 864 Ser Lys Lys Lys He Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val 275 280 285 GGA GCA ACA GGA TCA TCA AGC CCG ACT GAA GTA GCT TCA CAA TAT GGT 912 Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gin Tyr Gly 290 295 300
TCT GAT GCT CAG ATG AAT ATT TCT GAT GAT GGG ACT GTA CTT ATT TTT 960 Ser Asp Ala Gin Met Asn He Ser Asp Asp Gly Thr Val Leu He Phe 305 310 315 320
CAG AAT GCC GGC GGA GCT ACT CCC AGT ACT GGA GTG ACG TTA TGT TAT 1008 Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr 325 330 335
GAC TCT GGC AAC GTG ATT AAG AAC CTA TCT AGT ACA GGA AGT GCA AAT 1056 Asp Ser Gly Asn Val He Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn 340 345 350
TTA TCG TCA AAG GAT TAT GCC ACA ACT AAA TTA CGC ATG TGT CAT GGA 1104 Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly 355 360 365 CAA AGT TAC AAT GAT AAT AAC TAC TGC AAT TTT ACA CTC TCT ATT AAT 1152 Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser Ile Asn 370 375 380
ACA ATA GAA TTC ACC TCC TAC GGC ACA TTC TCA TCA GAT GGA AAA CAA 1200 Thr He Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gin 385 390 395 400
TTT ACA CCA CCT TCT GGT TCT GCC ATT GAT TTA CAC CTC CCT AAT TAT 1248 Phe Thr Pro Pro Ser Gly Ser Ala He Asp Leu His Leu Pro Asn Tyr 405 410 415
GTA GAT CTC AAC GCG CTA TTA GAT ATT AGC CTC GAT TCA CTA CTT AAT 1296 Val Asp Leu Asn Ala Leu Leu Asp He Ser Leu Asp Ser Leu Leu Asn 420 425 430
TAT GAC GTT CAG GGG CAG TTT GGC GGA TCT AAT CCG GTT GAT AAT TTC 1344 Tyr Asp Val Gin Gly Gin Phe Gly Gly Ser Asn Pro Val Asp Asn Phe 435 440 445 AGT GGT CCC TAT GGT ATT TAT CTA TGG GAA ATC TTC TTC CAT ATT CCG 1392 Ser Gly Pro Tyr Gly He Tyr Leu Trp Glu He Phe Phe His He Pro
450 455 460
TTC CTT GTT ACG GTC CGT ATG CAA ACC GAA CAA CGT TAC GAA GAC GCG 1440
Phe Leu Val Thr Val Arg Met Gin Thr Glu Gin Arg Tyr Glu Asp Ala
465 470 475 480
GAC ACT TGG TAC AAA TAT ATT TTC CGC AGC GCC GGT TAT CGC GAT GCT 1488
Asp Thr Trp Tyr Lys Tyr He Phe Arg Ser Ala Gly Tyr Arg Asp Ala 485 490 495
AAT GGC CAG CTC ATT ATG GAT GGC AGT AAA CCA CGT TAT TGG AAT GTG 1536
Asn Gly Gin Leu He Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val 500 505 510 ATG CCA TTG CAA CTG GAT ACC GCA TGG GAT ACC ACA CAG CCC GCC ACC 1584
Met Pro Leu Gin Leu Asp Thr Ala Trp Asp Thr Thr Gin Pro Ala Thr 515 520 525
ACT GAT CCA GAT GTG ATC GCT ATG GCG GAC CCG ATG CAT TAC AAG CTG 1632 Thr Asp Pro Asp Val He Ala Met Ala Asp Pro Met His Tyr Lys Leu 530 535 540
GCG ATA TTC CTG CAT ACC CTT GAT CTA TTG ATT GCC CGA GGC GAC AGC 1680
Ala He Phe Leu His Thr Leu Asp Leu Leu He Ala Arg Gly Asp Ser 545 550 555 560
GCT TAC CGT CAA CTT GAA CGC GAT ACT CTA GTC GAA GCC AAA ATG TAC 1728
Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr 565 570 575
TAC ATT CAG GCA CAA CAG CTA CTG GGA CCG CGC CCT GAT ATC CAT ACC 1776
Tyr He Gin Ala Gin Gin Leu Leu Gly Pro Arg Pro Asp He His Thr
580 585 590 ACC AAT ACT TGG CCA AAT CCC ACC TTG AGT AAA GAA GCT GGC GCT ATT 1824
Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala He 595 600 605
GCC ACA CCG ACA TTC CTC AGT TCA CCG GAG GTG ATG ACG TTC GCT GCC 1872 Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala 610 615 620
TGG CTA AGC 1881 Trp Leu Ser 625
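Each nucleotide row in the coding-region listings is paired with its three-letter amino acid translation. As an illustrative sketch only, the pairing can be reproduced with the standard genetic code; the choice of the standard code and the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are assumptions made for this example and do not come from the listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... (third base varies fastest).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}


def translate(dna: str) -> list:
    """Translate an in-frame DNA string into three-letter residue codes."""
    dna = "".join(dna.split()).upper()  # drop the spaces between codons
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, len(dna), 3)]


# First twelve codons of the coding region in SEQ ID NO:27.
first_codons = "ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA"
print(" ".join(translate(first_codons)))
# Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu
```

A check of this kind is also a convenient way to spot transcription errors in the residue rows, since every residue must agree with the codon printed above it.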
(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 627 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28 (TcaBi protein):
Met Ser Glu Ser Leu Phe Thr Gin Thr Leu Lys Glu Ala Arg Arg Asp 1 5 10 15
Ala Leu Val Ala His Tyr He Ala Thr Gin Val Pro Ala Asp Leu Lys 20 25 30
Glu Ser He Gin Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp 35 40 45
Thr Lys He Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala He 50 55 60
Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu Gly Tyr Asp Gly 65 70 75 80
Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gin Phe Leu 85 90 95
Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys 100 105 110 Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr He Asp Pro Thr Leu Arg 115 120 125
Leu Asn Lys Thr Glu He Phe Thr Ala Phe Glu Gin Gly He Ser Gin 130 135 140
Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu 145 150 155 160
He Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr He Thr Ala Cys Gin 165 170 175
Gly Lys Asp Asn Lys Thr He Phe Phe He Gly Arg Thr Gin Asn Ala
180 185 190 Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly 195 200 205
Lys Leu Lys Pro Asp Gin Trp Ser Glu Trp Arg Ala He Asn Ala Gly 210 215 220
He Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn 225 230 235 240
Lys Leu His He Arg Trp Phe Thr He Ser Lys Glu Asp Lys He Asp 245 250 255
Phe Val Tyr Lys Asn He Trp Val Met Ser Ser Asp Tyr Ser Trp Ala 260 265 270 Ser Lys Lys Lys He Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val 275 280 285
Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gin Tyr Gly 290 295 300
Ser Asp Ala Gin Met Asn He Ser Asp Asp Gly Thr Val Leu He Phe 305 310 315 320
Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr 325 330 335
Asp Ser Gly Asn Val He Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn 340 345 350 Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly
355 360 365
Gin Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser He Asn 370 375 380
Thr He Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gin 385 390 395 400
Phe Thr Pro Pro Ser Gly Ser Ala He Asp Leu His Leu Pro Asn Tyr 405 410 415
Val Asp Leu Asn Ala Leu Leu Asp He Ser Leu Asp Ser Leu Leu Asn 420 425 430 Tyr Asp Val Gin Gly Gin Phe Gly Gly Ser Asn Pro Val Asp Asn Phe 435 440 445
Ser Gly Pro Tyr Gly He Tyr Leu Trp Glu He Phe Phe His He Pro 450 455 460
Phe Leu Val Thr Val Arg Met Gin Thr Glu Gin Arg Tyr Glu Asp Ala 465 470 475 480
Asp Thr Trp Tyr Lys Tyr He Phe Arg Ser Ala Gly Tyr Arg Asp Ala 485 490 495
Asn Gly Gin Leu He Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val 500 505 510
Met Pro Leu Gin Leu Asp Thr Ala Trp Asp Thr Thr Gin Pro Ala Thr 515 520 525
Thr Asp Pro Asp Val He Ala Met Ala Asp Pro Met His Tyr Lys Leu 530 535 540 Ala He Phe Leu His Thr Leu Asp Leu Leu He Ala Arg Gly Asp Ser 545 550 555 560
Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr 565 570 575
Tyr He Gin Ala Gin Gin Leu Leu Gly Pro Arg Pro Asp He His Thr 580 585 590
Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala He 595 600 605
Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala 610 615 620
Trp Leu Ser 625
(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1689 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1689
(D) OTHER INFORMATION: tcaBii
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29 (tcaBii coding region):
GCA GGC GAT ACC GCA AAT ATT GGC GAC GGT GAT TTC TTG CCA CCG TAC 48 Ala Gly Asp Thr Ala Asn He Gly Asp Gly Asp Phe Leu Pro Pro Tyr 1 5 10 15
AAC GAT GTA CTA CTC GGT TAC TGG GAT AAA CTT GAG TTA CGC CTA TAC 96 Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu Arg Leu Tyr 20 25 30
AAC CTG CGC CAC AAT CTG AGT CTG GAT GGT CAA CCG CTA AAT CTG CCA 144 Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gin Pro Leu Asn Leu Pro 35 40 45
CTG TAT GCC ACG CCG GTA GAC CCG AAA ACC CTG CAA CGC CAG CAA GCC 192 Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gin Arg Gin Gin Ala 50 55 60
GGA GGG GAC GGT ACA GGC AGT AGT CCG GCT GGT GGT CAA GGC AGT GTT 240
Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gin Gly Ser Val
65 70 75 80
CAG GGC TGG CGC TAT CCG TTA TTG GTA GAA CGC GCC CGC TCT GCC GTG 288
Gin Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg Ser Ala Val
85 90 95 AGT TTG TTG ACT CAG TTC GGC AAC AGC TTA CAA ACA ACG TTA GAA CAT 336 Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr Leu Glu His 100 105 110
CAG GAT AAT GAA AAA ATG ACG ATA CTG TTG CAG ACT CAA CAG GAA GCC 384 Gin Asp Asn Glu Lys Met Thr He Leu Leu Gin Thr Gin Gin Glu Ala 115 120 125
ATC CTG AAA CAT CAG CAC GAT ATA CAA CAA AAT AAT CTA AAA GGA TTA 432 He Leu Lys His Gin His Asp He Gin Gin Asn Asn Leu Lys Gly Leu 130 135 140
CAA CAC AGC CTG ACC GCA TTA CAG GCT AGC CGT GAT GGC GAC ACA TTG 480
Gin His Ser Leu Thr Ala Leu Gin Ala Ser Arg Asp Gly Asp Thr Leu
145 150 155 160
CGG CAA AAA CAT TAC AGC GAC CTG ATT AAC GGT GGT CTA TCT GCG GCA 528
Arg Gin Lys His Tyr Ser Asp Leu He Asn Gly Gly Leu Ser Ala Ala 165 170 175 GAA ATC GCC GGT CTG ACA CTA CGC AGC ACC GCC ATG ATT ACC AAT GGC 576 Glu He Ala Gly Leu Thr Leu Arg Ser Thr Ala Met He Thr Asn Gly 180 185 190
GTT GCA ACG GGA TTG CTG ATT GCC GGC GGA ATC GCC AAC GCG GTA CCT 624 Val Ala Thr Gly Leu Leu He Ala Gly Gly He Ala Asn Ala Val Pro 195 200 205
AAC GTC TTC GGG CTG GCT AAC GGT GGA TCG GAA TGG GGA GCG CCA TTA 672 Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly Ala Pro Leu 210 215 220
ATT GGC TCC GGG CAA GCA ACC CAA GTT GGC GCC GGC ATC CAG GAT CAG 720 He Gly Ser Gly Gin Ala Thr Gin Val Gly Ala Gly He Gin Asp Gin 225 230 235 240
AGC GCG GGC ATT TCA GAA GTG ACA GCA GGC TAT CAG CGT CGT CAG GAA 768 Ser Ala Gly He Ser Glu Val Thr Ala Gly Tyr Gin Arg Arg Gin Glu 245 250 255 GAA TGG GCA TTG CAA CGG GAT ATT GCT GAT AAC GAA ATA ACC CAA CTG 816 Glu Trp Ala Leu Gin Arg Asp He Ala Asp Asn Glu He Thr Gin Leu 260 265 270
GAT GCC CAG ATA CAA AGC CTG CAA GAG CAA ATC ACG ATG GCA CAA AAA 864 Asp Ala Gin He Gin Ser Leu Gin Glu Gin He Thr Met Ala Gin Lys 275 280 285
CAG ATC ACG CTC TCT GAA ACC GAA CAA GCG AAT GCC CAA GCG ATT TAT 912 Gin He Thr Leu Ser Glu Thr Glu Gin Ala Asn Ala Gin Ala He Tyr 290 295 300
GAC CTG CAA ACC ACT CGT TTT ACC GGG CAG GCA CTG TAT AAC TGG ATG 960
Asp Leu Gin Thr Thr Arg Phe Thr Gly Gin Ala Leu Tyr Asn Trp Met 305 310 315 320
GCC GGT CGT CTC TCC GCG CTC TAT TAC CAA ATG TAT GAT TCC ACT CTG 1008
Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gin Met Tyr Asp Ser Thr Leu 325 330 335 CCA ATC TGT CTC CAG CCA AAA GCC GCA TTA GTA CAG GAA TTA GGC GAG 1056 Pro He Cys Leu Gin Pro Lys Ala Ala Leu Val Gin Glu Leu Gly Glu
340 345 350
AAA GAG AGC GAC AGT CTT TTC CAG GTT CCG GTG TGG AAT GAT CTG TGG 1104 Lys Glu Ser Asp Ser Leu Phe Gin Val Pro Val Trp Asn Asp Leu Trp 355 360 365
CAA GGG CTG TTA GCA GGA GAA GGT TTA AGT TCA GAG CTA CAG AAA CTG 1152 Gin Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu Gin Lys Leu 370 375 380
GAT GCC ATC TGG CTT GCA CGT GGT GGT ATT GGG CTA GAA GCC ATC CGC 1200 Asp Ala He Trp Leu Ala Arg Gly Gly He Gly Leu Glu Ala He Arg 385 390 395 400 ACC GTG TCG CTG GAT ACC CTG TTT GGC ACA GGG ACG TTA AGT GAA AAT 1248 Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu Ser Glu Asn 405 410 415
ATC AAT AAA GTG CTT AAC GGG GAA ACG GTA TCT CCA TCC GGT GGC GTC 1296 He Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser Gly Gly Val 420 425 430
ACT CTG GCG CTG ACA GGG GAT ATC TTC CAA GCA ACA CTG GAT TTG AGT 1344 Thr Leu Ala Leu Thr Gly Asp He Phe Gin Ala Thr Leu Asp Leu Ser 435 440 445
CAG CTA GGT TTG GAT AAC TCT TAC AAC TTG GGT AAC GAG AAG AAA CGT 1392
Gin Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu Lys Lys Arg
450 455 460
CGT ATT AAA CGT ATC GCC GTC ACC CTG CCA ACA CTT CTG GGG CCA TAT 1440
Arg He Lys Arg He Ala Val Thr Leu Pro Thr Leu Leu Gly Pro Tyr 465 470 475 480 CAA GAT CTT GAA GCC ACA CTG GTA ATG GGT GCG GAA ATC GCC GCC TTA 1488 Gin Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu He Ala Ala Leu 485 490 495
TCA CAC GGT GTG AAT GAC GGA GGC CGG TTT GTT ACC GAC TTT AAC GAC 1536 Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp Phe Asn Asp 500 505 510
AGC CGT TTT CTG CCT TTT GAA GGT CGA GAT GCA ACA ACC GGC ACA CTG 1584 Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr Gly Thr Leu 515 520 525
GAG CTC AAT ATT TTC CAT GCG GGT AAA GAG GGA ACG CAA CAC GAG TTG 1632 Glu Leu Asn He Phe His Ala Gly Lys Glu Gly Thr Gin His Glu Leu 530 535 540
GTC GCG AAT CTG AGT GAC ATC ATT GTG CAT CTG AAT TAC ATC ATT CGA 1680 Val Ala Asn Leu Ser Asp He He Val His Leu Asn Tyr He He Arg 545 550 555 560 GAC GCG TAA 1689 Asp Ala *
(2) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 562 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30 (TcaBi, protein)
Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu Pro Pro Tyr 1 5 10 15
Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu Arg Leu Tyr 20 25 30
Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu Asn Leu Pro 35 40 45
Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gln Arg Gln Gln Ala 50 55 60
Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln Gly Ser Val 65 70 75 80
Gln Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg Ser Ala Val 85 90 95
Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr Leu Glu His 100 105 110
Gln Asp Asn Glu Lys Met Thr Ile Leu Leu Gln Thr Gln Gln Glu Ala 115 120 125
Ile Leu Lys His Gln His Asp Ile Gln Gln Asn Asn Leu Lys Gly Leu 130 135 140
Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly Asp Thr Leu 145 150 155 160
Arg Gln Lys His Tyr Ser Asp Leu Ile Asn Gly Gly Leu Ser Ala Ala 165 170 175
Glu Ile Ala Gly Leu Thr Leu Arg Ser Thr Ala Met Ile Thr Asn Gly 180 185 190
Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn Ala Val Pro 195 200 205
Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly Ala Pro Leu 210 215 220
Ile Gly Ser Gly Gln Ala Thr Gln Val Gly Ala Gly Ile Gln Asp Gln 225 230 235 240
Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg Arg Gln Glu 245 250 255
Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile Thr Gln Leu 260 265 270
Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met Ala Gln Lys 275 280 285
Gln Ile Thr Leu Ser Glu Thr Glu Gln Ala Asn Ala Gln Ala Ile Tyr 290 295 300
Asp Leu Gln Thr Thr Arg Phe Thr Gly Gln Ala Leu Tyr Asn Trp Met 305 310 315 320
Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gln Met Tyr Asp Ser Thr Leu 325 330 335
Pro Ile Cys Leu Gln Pro Lys Ala Ala Leu Val Gln Glu Leu Gly Glu 340 345 350
Lys Glu Ser Asp Ser Leu Phe Gln Val Pro Val Trp Asn Asp Leu Trp 355 360 365
Gln Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu Gln Lys Leu 370 375 380
Asp Ala Ile Trp Leu Ala Arg Gly Gly Ile Gly Leu Glu Ala Ile Arg 385 390 395 400
Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu Ser Glu Asn 405 410 415
Ile Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser Gly Gly Val 420 425 430
Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu Asp Leu Ser 435 440 445
Gln Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu Lys Lys Arg 450 455 460
Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu Gly Pro Tyr 465 470 475 480
Gln Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu Ile Ala Ala Leu 485 490 495
Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp Phe Asn Asp 500 505 510
Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr Gly Thr Leu 515 520 525
Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln His Glu Leu 530 535 540
Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr Ile Ile Arg 545 550 555 560
Asp Ala
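The length stated for SEQ ID NO: 30 can be cross-checked against the coding sequence printed immediately before it, whose final codon (TAA) ends at nucleotide 1689. The following is a minimal Python sketch of that arithmetic, assuming the coding sequence carries exactly one terminal stop codon; the function name `expected_protein_length` is hypothetical and not part of the listing.

```python
def expected_protein_length(cds_length_bp, has_stop_codon=True):
    """Return the number of residues encoded by a coding sequence.

    cds_length_bp  -- length of the coding sequence in base pairs
    has_stop_codon -- assumption: the last codon is a stop and encodes no residue
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("coding sequence is not a whole number of codons")
    codons = cds_length_bp // 3
    return codons - 1 if has_stop_codon else codons


# 1689 bp -> 563 codons -> 562 residues plus one stop codon
print(expected_protein_length(1689))  # 562
```

The result agrees with the 562 amino acids given under SEQUENCE CHARACTERISTICS for SEQ ID NO: 30.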
(2) INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4458 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..4458
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31 (tcaC gene):
ATG CAG GAT TCA CCA GAA GTA TCG ATT ACA ACG CTG TCA CTT CCC AAA 48 Met Gin Asp Ser Pro Glu Val Ser He Thr Thr Leu Ser Leu Pro Lys 1 5 10 15 GGT GGC GGT GCT ATC AAT GGC ATG GGA GAA GCA CTG AAT GCT GCC GGC 96 Gly Gly Gly Ala He Asn Gly Met Gly Glu Ala Leu Asn Ala Ala Gly 20 25 30
CCT GAT GGA ATG GCC TCC CTA TCT CTG CCA TTA CCC CTT TCG ACC GGC 144 Pro Asp Gly Met Ala Ser Leu Ser Leu Pro Leu Pro Leu Ser Thr Gly 35 40 45
AGA GGG ACG GCT CCT GGA TTA TCG CTG ATT TAC AGC AAC AGT GCA GGT 192 Arg Gly Thr Ala Pro Gly Leu Ser Leu He Tyr Ser Asn Ser Ala Gly
50 55 60
AAT GGG CCT TTC GGC ATC GGC TGG CAA TGC GGT GTT ATG TCC ATT AGC 240
Asn Gly Pro Phe Gly He Gly Trp Gin Cys Gly Val Met Ser He Ser
65 70 75 80
CGA CGC ACC CAA CAT GGC ATT CCA CAA TAC GGT AAT GAC GAC ACG TTC 288 Arg Arg Thr Gin His Gly He Pro Gin Tyr Gly Asn Asp Asp Thr Phe 85 90 95
CTA TCC CCA CAA GGC GAG GTC ATG AAT ATC GCC CTG AAT GAC CAA GGG 336 Leu Ser Pro Gin Gly Glu Val Met Asn He Ala Leu Asn Asp Gin Gly 100 105 110 CAA CCT GAT ATC CGT CAA GAC GTT AAA ACG CTG CAA GGC GTT ACC TTG 384 Gin Pro Asp He Arg Gin Asp Val Lys Thr Leu Gin Gly Val Thr Leu 115 120 125
CCA ATT TCC TAT ACC GTG ACC CGC TAT CAA GCC CGC CAG ATC CTG GAT 432 Pro He Ser Tyr Thr Val Thr Arg Tyr Gin Ala Arg Gin He Leu Asp 130 135 140
TTC AGT AAA ATC GAA TAC TGG CAA CCT GCC TCC GGT CAA GAA GGA CGC 480 Phe Ser Lys He Glu Tyr Trp Gin Pro Ala Ser Gly Gin Glu Gly Arg 145 150 155 160
GCT TTC TGG CTG ATA TCG ACA CCG GAC GGG CAT CTA CAC ATC TTA GGG 528 Ala Phe Trp Leu He Ser Thr Pro Asp Gly His Leu His He Leu Gly 165 170 175
AAA ACC GCG CAG GCT TGT CTG GCA AAT CCG CAA AAT GAC CAA CAA ATC 576 Lys Thr Ala Gln Ala Cys Leu Ala Asn Pro Gln Asn Asp Gln Gln Ile 180 185 190
GCC CAG TGG TTG CTG GAA GAA ACT GTG ACG CCA GCC GGT GAA CAT GTC 624 Ala Gln Trp Leu Leu Glu Glu Thr Val Thr Pro Ala Gly Glu His Val 195 200 205
AGC TAT CAA TAT CGA GCC GAA GAT GAA GCC CAT TGT GAC GAC AAT GAA 672 Ser Tyr Gin Tyr Arg Ala Glu Asp Glu Ala His Cys Asp Asp Asn Glu 210 215 220
AAA ACC GCT CAT CCC AAT GTT ACC GCA CAG CGC TAT CTG GTA CAG GTG 720 Lys Thr Ala His Pro Asn Val Thr Ala Gin Arg Tyr Leu Val Gin Val 225 230 235 240
AAC TAC GGC AAC ATC AAA CCA CAA GCC AGC CTG TTC GTA CTG GAT AAC 768 Asn Tyr Gly Asn He Lys Pro Gin Ala Ser Leu Phe Val Leu Asp Asn 245 250 255
GCA CCT CCC GCA CCG GAA GAG TGG CTG TTT CAT CTG GTC TTT GAC CAC 816
Ala Pro Pro Ala Pro Glu Glu Trp Leu Phe His Leu Val Phe Asp His
260 265 270 GGT GAG CGC GAT ACC TCA CTT CAT ACC GTG CCA ACA TGG GAT GCA GGT 864 Gly Glu Arg Asp Thr Ser Leu His Thr Val Pro Thr Trp Asp Ala Gly 275 280 285
ACA GCG CAA TGG TCT GTA CGC CCG GAT ATC TTC TCT CGC TAT GAA TAT 912 Thr Ala Gin Trp Ser Val Arg Pro Asp He Phe Ser Arg Tyr Glu Tyr 290 295 300
GGT TTT GAA GTG CGT ACT CGC CGC TTA TGT CAA CAA GTG CTG ATG TTT 960 Gly Phe Glu Val Arg Thr Arg Arg Leu Cys Gin Gin Val Leu Met Phe 305 310 315 320
CAC CGC ACC GCG CTC ATG GCC GGA GAA GCC AGT ACC AAT GAC GCC CCG 1008 His Arg Thr Ala Leu Met Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro 325 330 335
GAA CTG GTT GGA CGC TTA ATA CTG GAA TAT GAC AAA AAC GCC AGC GTC 1056
Glu Leu Val Gly Arg Leu Ile Leu Glu Tyr Asp Lys Asn Ala Ser Val 340 345 350
ACC ACG TTG ATT ACC ATC CGT CAA TTA AGC CAT GAA TCG GAC GGG AGG 1104 Thr Thr Leu He Thr He Arg Gin Leu Ser His Glu Ser Asp Gly Arg 355 360 365
CCA GTC ACC CAG CCA CCA CTA GAA CTA GCC TGG CAA CGG TTT GAT CTG 1152 Pro Val Thr Gin Pro Pro Leu Glu Leu Ala Trp Gin Arg Phe Asp Leu 370 375 380
GAG AAA ATC CCG ACA TGG CAA CGC TTT GAC GCA CTA GAT AAT TTT AAC 1200
Glu Lys He Pro Thr Trp Gin Arg Phe Asp Ala Leu Asp Asn Phe Asn 385 390 395 400
TCG CAG CAA CGT TAT CAA CTG GTT GAT CTG CGG GGA GAA GGG TTG CCA 1248 Ser Gin Gin Arg Tyr Gin Leu Val Asp Leu Arg Gly Glu Gly Leu Pro 405 410 415 GGT ATG CTG TAT CAA GAT CGA GGC GCT TGG TGG TAT AAA GCT CCG CAA 1296 Gly Met Leu Tyr Gin Asp Arg Gly Ala Trp Trp Tyr Lys Ala Pro Gin 420 425 430
CGT CAG GAA GAC GGA GAC AGC AAT GCC GTC ACT TAC GAC AAA ATC GCC 1344 Arg Gin Glu Asp Gly Asp Ser Asn Ala Val Thr Tyr Asp Lys He Ala 435 440 445
CCA CTG CCT ACC CTA CCC AAT TTG CAG GAT AAT GCC TCA TTG ATG GAT 1392 Pro Leu Pro Thr Leu Pro Asn Leu Gin Asp Asn Ala Ser Leu Met Asp 450 455 460
ATC AAC GGA GAC GGC CAA CTG GAT TGG GTT GTT ACC GCC TCC GGT ATT 1440 He Asn Gly Asp Gly Gin Leu Asp Trp Val Val Thr Ala Ser Gly He 465 470 475 480
CGC GGA TAC CAT AGT CAG CAA CCC GAT GGA AAG TGG ACG CAC TTT ACG 1488 Arg Gly Tyr His Ser Gin Gin Pro Asp Gly Lys Trp Thr His Phe Thr 485 490 495 CCA ATC AAT GCC TTG CCC GTG GAA TAT TTT CAT CCA AGC ATC CAG TTC 1536 Pro He Asn Ala Leu Pro Val Glu Tyr Phe His Pro Ser He Gin Phe 500 505 510
GCT GAC CTT ACC GGG GCA GGC TTA TCT GAT TTA GTG TTG ATC GGG CCG 1584 Ala Asp Leu Thr Gly Ala Gly Leu Ser Asp Leu Val Leu He Gly Pro 515 520 525
AAA AGC GTG CGT CTA TAT GCC AAC CAG CGA AAC GGC TGG CGT AAA GGA 1632 Lys Ser Val Arg Leu Tyr Ala Asn Gin Arg Asn Gly Trp Arg Lys Gly 530 535 540
GAA GAT GTC CCC CAA TCC ACA GGT ATC ACC CTG CCT GTC ACA GGG ACC 1680
Glu Asp Val Pro Gin Ser Thr Gly He Thr Leu Pro Val Thr Gly Thr
545 550 555 560
GAT GCC CGC AAA CTG GTG GCT TTC AGT GAT ATG CTC GGT TCC GGT CAA 1728
Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gin 565 570 575 CAA CAT CTG GTG GAA ATC AAG GGT AAT CGC GTC ACC TGT TGG CCG AAT 1776 Gin His Leu Val Glu He Lys Gly Asn Arg Val Thr Cys Trp Pro Asn 580 585 590
CTA GGG CAT GGC CGT TTC GGT CAA CCA CTA ACT CTG TCA GGA TTT AGC 1824 Leu Gly His Gly Arg Phe Gly Gin Pro Leu Thr Leu Ser Gly Phe Ser 595 600 605
CAG CCC GAA AAT AGC TTC AAT CCC GAA CGG CTG TTT CTG GCG GAT ATC 1872 Gin Pro Glu Asn Ser Phe Asn Pro Glu Arg Leu Phe Leu Ala Asp He 610 615 620
GAC GGC TCC GGC ACC ACC GAC CTT ATC TAT GCG CAA TCC GGC TCT TTG 1920 Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gln Ser Gly Ser Leu 625 630 635 640
CTC ATT TAT CTC AAC CAA AGT GGT AAT CAG TTT GAT GCC CCG TTG ACA 1968 Leu Ile Tyr Leu Asn Gln Ser Gly Asn Gln Phe Asp Ala Pro Leu Thr 645 650 655
TTA GCG TTG CCA GAA GGC GTA CAA TTT GAC AAC ACT TGC CAA CTT CAA 2016 Leu Ala Leu Pro Glu Gly Val Gin Phe Asp Asn Thr Cys Gin Leu Gin 660 665 670
GTC GCC GAT ATT CAG GGA TTA GGG ATA GCC AGC TTG ATT CTG ACT GTG 2064 Val Ala Asp He Gin Gly Leu Gly He Ala Ser Leu He Leu Thr Val 675 680 685
CCA CAT ATC GCG CCA CAT CAC TGG CGT TGT GAC CTG TCA CTG ACC AAA 2112 Pro His He Ala Pro His His Trp Arg Cys Asp Leu Ser Leu Thr Lys 690 695 700
CCC TGG TTG TTG AAT GTA ATG AAC AAT AAC CGG GGC GCA CAT CAC ACG 2160 Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Gly Ala His His Thr 705 710 715 720 CTA CAT TAT CGT AGT TCC GCG CAA TTC TGG TTG GAT GAA AAA TTA CAG 2208 Leu His Tyr Arg Ser Ser Ala Gin Phe Trp Leu Asp Glu Lys Leu Gin 725 730 735
CTC ACC AAA GCA GGC AAA TCT CCG GCT TGT TAT CTG CCG TTT CCA ATG 2256 Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Met 740 745 750
CAT TTG CTA TGG TAT ACC GAA ATT CAG GAT GAA ATC AGC GGC AAC CGG 2304 His Leu Leu Trp Tyr Thr Glu He Gin Asp Glu He Ser Gly Asn Arg 755 760 765
CTC ACC AGT GAA GTC AAC TAC AGC CAC GGC GTC TGG GAT GGT AAA GAG 2352
Leu Thr Ser Glu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Glu
770 775 780
CGG GAA TTC AGA GGA TTT GGC TGC ATC AAA CAG ACA GAT ACC ACA ACG 2400
Arg Glu Phe Arg Gly Phe Gly Cys He Lys Gin Thr Asp Thr Thr Thr 785 790 795 800 TTT TCT CAC GGC ACC GCC CCC GAA CAG GCG GCA CCG TCG CTG AGT ATT 2448 Phe Ser His Gly Thr Ala Pro Glu Gin Ala Ala Pro Ser Leu Ser He 805 810 815
AGC TGG TTT GCC ACC GGC ATG GAT GAA GTA GAC AGC CAA TTA GCT ACG 2496 Ser Trp Phe Ala Thr Gly Met Asp Glu Val Asp Ser Gin Leu Ala Thr 820 825 830
GAA TAT TGG CAG GCA GAC ACG CAA GCT TAT AGC GGA TTT GAA ACC CGT 2544 Glu Tyr Trp Gin Ala Asp Thr Gin Ala Tyr Ser Gly Phe Glu Thr Arg 835 840 845
TAT ACC GTC TGG GAT CAC ACC AAC CAG ACA GAC CAA GCA TTT ACC CCC 2592
Tyr Thr Val Trp Asp His Thr Asn Gin Thr Asp Gin Ala Phe Thr Pro 850 855 860
AAT GAG ACA CAA CGT AAC TGG CTG ACG CGA GCG CTT AAA GGC CAA CTG 2640
Asn Glu Thr Gin Arg Asn Trp Leu Thr Arg Ala Leu Lys Gly Gin Leu 865 870 875 880 CTA CGC ACT GAG CTC TAC GGT CTG GAC GGA ACA GAT AAG CAA ACA GTG 2688 Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gin Thr Val 885 890 895
CCT TAT ACC GTC AGT GAA TCG CGC TAT CAG GTA CGC TCT ATT CCC GTA 2736 Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gin Val Arg Ser He Pro Val 900 905 910
AAT AAA GAA ACT GAA TTA TCT GCC TGG GTG ACT GCT ATT GAA AAT CGC 2784 Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala He Glu Asn Arg 915 920 925
AGC TAC CAC TAT GAA CGT ATC ATC ACT GAC CCA CAG TTC AGC CAG AGT 2832 Ser Tyr His Tyr Glu Arg He He Thr Asp Pro Gin Phe Ser Gin Ser 930 935 940 ATC AAG TTG CAA CAC GAT ATC TTT GGT CAA TCA CTG CAA AGT GTC GAT 2880 He Lys Leu Gin His Asp He Phe Gly Gin Ser Leu Gin Ser Val Asp 945 950 955 960
ATT GCC TGG CCG CGC CGC GAA AAA CCA GCA GTG AAT CCC TAC CCG CCT 2928 He Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro
965 970 975
ACC CTG CCG GAA ACG CTA TTT GAC AGC AGC TAT GAT GAT CAA CAA CAA 2976 Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gin Gin Gin 980 985 990
CTA TTA CGT CTG GTG AGA CAA AAA AAT AGC TGG CAT CAC CTG ACT GAT 3024
Leu Leu Arg Leu Val Arg Gin Lys Asn Ser Trp His His Leu Thr Asp 995 1000 1005
GGG GAA AAC TGG CGA TTA GGT TTA CCG AAT GCA CAA CGC CGT GAT GTT 3072
Gly Glu Asn Trp Arg Leu Gly Leu Pro Asn Ala Gin Arg Arg Asp Val
1010 1015 1020 TAT ACT TAT GAC CGG AGC AAA ATT CCA ACC GAA GGG ATT TCC CTT GAA 3120 Tyr Thr Tyr Asp Arg Ser Lys He Pro Thr Glu Gly He Ser Leu Glu 1025 1030 1035 1040
ATC TTG CTG AAA GAT GAT GGC CTG CTA GCA GAT GAA AAA GCG GCC GTT 3168 He Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Val
1045 1050 1055
TAT CTG GGA CAA CAA CAG ACG TTT TAC ACC GCC GGT CAA GCG GAA GTC 3216 Tyr Leu Gly Gin Gin Gin Thr Phe Tyr Thr Ala Gly Gin Ala Glu Val 1060 1065 1070
ACT CTA GAA AAA CCC ACG TTA CAA GCA CTG GTC GCG TTC CAA GAA ACC 3264 Thr Leu Glu Lys Pro Thr Leu Gin Ala Leu Val Ala Phe Gin Glu Thr 1075 1080 1085
GCC ATG ATG GAC GAT ACC TCA TTA CAG GCG TAT GAA GGC GTG ATT GAA 3312 Ala Met Met Asp Asp Thr Ser Leu Gin Ala Tyr Glu Gly Val He Glu 1090 1095 1100 GAG CAA GAG TTG AAT ACC GCG CTG ACA CAG GCC GGT TAT CAG CAA GTC 3360 Glu Gin Glu Leu Asn Thr Ala Leu Thr Gin Ala Gly Tyr Gin Gin Val 1105 1110 1115 1120
GCG CGG TTG TTT AAT ACC AGA TCA GAA AGC CCG GTA TGG GCG GCA CGG 3408 Ala Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Val Trp Ala Ala Arg
1125 1130 1135
CAA GGT TAT ACC GAT TAC GGT GAC GCC GCA CAG TTC TGG CGG CCT CAG 3456 Gin Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gin Phe Trp Arg Pro Gin 1140 1145 1150
GCT CAG CGT AAC TCG TTG CTG ACA GGG AAA ACC ACA CTG ACC TGG GAT 3504 Ala Gin Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp 1155 1160 1165
ACC CAT CAT TGT GTA ATA ATA CAG ACT CAA GAT GCC GCT GGA TTA ACG 3552 Thr His His Cys Val He He Gin Thr Gin Asp Ala Ala Gly Leu Thr 1170 1175 1180 ACG CAA GCC CAT TAC GAT TAT CGT TTC CTT ACA CCG GTA CAA CTG ACA 3600 Thr Gin Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gin Leu Thr
1185 1190 1195 1200
GAT ATT AAT GAT AAT CAA CAT ATT GTG ACT CTG GAC GCG CTA GGT CGC 3648 Asp He Asn Asp Asn Gin His He Val Thr Leu Asp Ala Leu Gly Arg 1205 1210 1215
GTA ACC ACC AGC CGG TTC TGG GGC ACA GAG GCA GGA CAA GCC GCA GGC 3696 Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gin Ala Ala Gly 1220 1225 1230
TAT TCC AAC CAG CCC TTC ACA CCA CCG GAC TCC GTA GAT AAA GCG CTG 3744 Tyr Ser Asn Gin Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu 1235 1240 1245 GCA TTA ACC GGC GCA CTC CCT GTT GCC CAA TGT TTA GTC TAT GCC GTT 3792 Ala Leu Thr Gly Ala Leu Pro Val Ala Gin Cys Leu Val Tyr Ala Val 1250 1255 1260
GAT AGC TGG ATG CCG TCG TTA TCT TTG TCT CAG CTT TCT CAG TCA CAA 3840 Asp Ser Trp Met Pro Ser Leu Ser Leu Ser Gin Leu Ser Gin Ser Gin
1265 1270 1275 1280
GAA GAG GCA GAA GCG CTA TGG GCG CAA CTG CGT GCC GCT CAT ATG ATT 3888 Glu Glu Ala Glu Ala Leu Trp Ala Gin Leu Arg Ala Ala His Met He 1285 1290 1295
ACC GAA GAT GGG AAA GTG TGT GCG TTA AGC GGG AAA CGA GGA ACA AGC 3936 Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser 1300 1305 1310
CAT CAG AAC CTG ACG ATT CAA CTT ATT TCG CTA TTG GCA AGT ATT CCC 3984 His Gin Asn Leu Thr He Gin Leu He Ser Leu Leu Ala Ser He Pro 1315 1320 1325 CGT TTA CCG CCA CAT GTA CTG GGG ATC ACC ACT GAT CGC TAT GAT AGC 4032
Arg Leu Pro Pro His Val Leu Gly He Thr Thr Asp Arg Tyr Asp Ser 1330 1335 1340
GAT CCG CAA CAG CAG CAC CAA CAG ACG GTG AGC TTT AGT GAC GGT TTT 4080 Asp Pro Gin Gin Gin His Gin Gin Thr Val Ser Phe Ser Asp Gly Phe
1345 1350 1355 1360
GGC CGG TTA CTC CAG AGT TCA GCT CGT CAT GAG TCA GGT GAT GCC TGG 4128 Gly Arg Leu Leu Gin Ser Ser Ala Arg His Glu Ser Gly Asp Ala Trp 1365 1370 1375
CAA CGT AAA GAG GAT GGC GGG CTG GTC GTG GAT GCA AAT GGC GTT CTG 4176 Gin Arg Lys Glu Asp Gly Gly Leu Val Val Asp Ala Asn Gly Val Leu 1380 1385 1390
GTC AGT GCC CCT ACA GAC ACC CGA TGG GCC GTT TCC GGT CGC ACA GAA 4224 Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr Glu 1395 1400 1405 TAT GAC GAC AAA GGC CAA CCT GTG CGT ACT TAT CAA CCC TAT TTT CTA 4272 Tyr Asp Asp Lys Gly Gin Pro Val Arg Thr Tyr Gin Pro Tyr Phe Leu 1410 1415 1420
AAT GAC TGG CGT TAC GTT AGT GAT GAC AGC GCA CGA GAT GAC CTG TTT 4320 Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe 1425 1430 1435 1440
GCC GAT ACC CAC CTT TAT GAT CCA TTG GGA CGG GAA TAC AAA GTC ATC 4368 Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val He 1445 1450 1455
ACT GCT AAG AAA TAT TTG CGA GAA AAG CTG TAC ACC CCG TGG TTT ATT 4416 Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe He 1460 1465 1470
GTC AGT GAG GAT GAA AAC GAT ACA GCA TCA AGA ACC CCA TAG 4458
Val Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro 1475 1480 1485
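Any row of the tcaC coding sequence can be checked against the translation printed beneath it by translating its codons with the standard genetic code. The sketch below is an illustration in Python, not part of the listing; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical, and the use of the standard code (rather than any alternative table) is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes; "*" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}


def translate(dna):
    """Translate a space-separated or contiguous DNA string into three-letter codes."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence is not a whole number of codons")
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


# First row of SEQ ID NO: 31 (nucleotides 1 to 48)
first_row = "ATG CAG GAT TCA CCA GAA GTA TCG ATT ACA ACG CTG TCA CTT CCC AAA"
print(" ".join(translate(first_row)))
# Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Leu Ser Leu Pro Lys
```

Applied to the final row (nucleotides 4417 to 4458), the same function yields thirteen residues followed by a stop for the TAG codon, consistent with a CDS location of 1..4458.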
(2) INFORMATION FOR SEQ ID NO: 32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1485 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32 (TcaC protein):
Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Leu Ser Leu Pro Lys 1 5 10 15
Gly Gly Gly Ala He Asn Gly Met Gly Glu Ala Leu Asn Ala Ala Gly 20 25 30
Pro Asp Gly Met Ala Ser Leu Ser Leu Pro Leu Pro Leu Ser Thr Gly 35 40 45
Arg Gly Thr Ala Pro Gly Leu Ser Leu He Tyr Ser Asn Ser Ala Gly 50 55 60
Asn Gly Pro Phe Gly He Gly Trp Gin Cys Gly Val Met Ser He Ser 65 70 75 80 Arg Arg Thr Gin His Gly He Pro Gin Tyr Gly Asn Asp Asp Thr Phe
85 90 95
Leu Ser Pro Gin Gly Glu Val Met Asn He Ala Leu Asn Asp Gin Gly 100 105 110
Gin Pro Asp He Arg Gin Asp Val Lys Thr Leu Gin Gly Val Thr Leu 115 120 125
Pro He Ser Tyr Thr Val Thr Arg Tyr Gin Ala Arg Gin He Leu Asp 130 135 140
Phe Ser Lys He Glu Tyr Trp Gin Pro Ala Ser Gly Gin Glu Gly Arg 145 150 155 160 Ala Phe Trp Leu He Ser Thr Pro Asp Gly His Leu His He Leu Gly
165 170 175
Lys Thr Ala Gin Ala Cys Leu Ala Asn Pro Gin Asn Asp Gin Gin He 180 185 190
Ala Gin Trp Leu Leu Glu Glu Thr Val Thr Pro Ala Gly Glu His Val
195 200 205
Ser Tyr Gin Tyr Arg Ala Glu Asp Glu Ala His Cys Asp Asp Asn Glu 210 215 220
Lys Thr Ala His Pro Asn Val Thr Ala Gin Arg Tyr Leu Val Gin Val 225 230 235 240 Asn Tyr Gly Asn He Lys Pro Gin Ala Ser Leu Phe Val Leu Asp Asn
245 250 255
Ala Pro Pro Ala Pro Glu Glu Trp Leu Phe His Leu Val Phe Asp His 260 265 270
Gly Glu Arg Asp Thr Ser Leu His Thr Val Pro Thr Trp Asp Ala Gly 275 280 285
Thr Ala Gin Trp Ser Val Arg Pro Asp He Phe Ser Arg Tyr Glu Tyr
290 295 300
Gly Phe Glu Val Arg Thr Arg Arg Leu Cys Gin Gin Val Leu Met Phe 305 310 315 320
His Arg Thr Ala Leu Met Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro 325 330 335
Glu Leu Val Gly Arg Leu He Leu Glu Tyr Asp Lys Asn Ala Ser Val 340 345 350
Thr Thr Leu Ile Thr Ile Arg Gln Leu Ser His Glu Ser Asp Gly Arg 355 360 365
Pro Val Thr Gln Pro Pro Leu Glu Leu Ala Trp Gln Arg Phe Asp Leu 370 375 380
Glu Lys Ile Pro Thr Trp Gln Arg Phe Asp Ala Leu Asp Asn Phe Asn 385 390 395 400
Ser Gin Gin Arg Tyr Gin Leu Val Asp Leu Arg Gly Glu Gly Leu Pro 405 410 415
Gly Met Leu Tyr Gin Asp Arg Gly Ala Trp Trp Tyr Lys Ala Pro Gin 420 425 430
Arg Gin Glu Asp Gly Asp Ser Asn Ala Val Thr Tyr Asp Lys He Ala
435 440 445 Pro Leu Pro Thr Leu Pro Asn Leu Gin Asp Asn Ala Ser Leu Met Asp 450 455 460
He Asn Gly Asp Gly Gin Leu Asp Trp Val Val Thr Ala Ser Gly He 465 470 475 480
Arg Gly Tyr His Ser Gin Gin Pro Asp Gly Lys Trp Thr His Phe Thr 485 490 495
Pro He Asn Ala Leu Pro Val Glu Tyr Phe His Pro Ser He Gin Phe 500 505 510
Ala Asp Leu Thr Gly Ala Gly Leu Ser Asp Leu Val Leu He Gly Pro 515 520 525 Lys Ser Val Arg Leu Tyr Ala Asn Gin Arg Asn Gly Trp Arg Lys Gly 530 535 540
Glu Asp Val Pro Gin Ser Thr Gly He Thr Leu Pro Val Thr Gly Thr 545 550 555 560
Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gin 565 570 575
Gin His Leu Val Glu He Lys Gly Asn Arg Val Thr Cys Trp Pro Asn 580 585 590
Leu Gly His Gly Arg Phe Gly Gin Pro Leu Thr Leu Ser Gly Phe Ser 595 600 605 Gin Pro Glu Asn Ser Phe Asn Pro Glu Arg Leu Phe Leu Ala Asp He 610 615 620
Asp Gly Ser Gly Thr Thr Asp Leu He Tyr Ala Gin Ser Gly Ser Leu 625 630 635 640
Leu He Tyr Leu Asn Gin Ser Gly Asn Gin Phe Asp Ala Pro Leu Thr 645 650 655
Leu Ala Leu Pro Glu Gly Val Gin Phe Asp Asn Thr Cys Gin Leu Gin 660 665 670
Val Ala Asp He Gin Gly Leu Gly He Ala Ser Leu He Leu Thr Val
675 680 685
Pro His He Ala Pro His His Trp Arg Cys Asp Leu Ser Leu Thr Lys
690 695 700
Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Gly Ala His His Thr
705 710 715 720 Leu His Tyr Arg Ser Ser Ala Gin Phe Trp Leu Asp Glu Lys Leu Gin
725 730 735
Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Met 740 745 750
His Leu Leu Trp Tyr Thr Glu He Gin Asp Glu He Ser Gly Asn Arg 755 760 765
Leu Thr Ser Glu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Glu 770 775 780
Arg Glu Phe Arg Gly Phe Gly Cys He Lys Gin Thr Asp Thr Thr Thr 785 790 795 800 Phe Ser His Gly Thr Ala Pro Glu Gin Ala Ala Pro Ser Leu Ser He
805 810 815
Ser Trp Phe Ala Thr Gly Met Asp Glu Val Asp Ser Gin Leu Ala Thr 820 825 830
Glu Tyr Trp Gin Ala Asp Thr Gin Ala Tyr Ser Gly Phe Glu Thr Arg 835 840 845
Tyr Thr Val Trp Asp His Thr Asn Gin Thr Asp Gin Ala Phe Thr Pro 850 855 860
Asn Glu Thr Gin Arg Asn Trp Leu Thr Arg Ala Leu Lys Gly Gin Leu 865 870 875 880 Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gin Thr Val
885 890 895
Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gin Val Arg Ser He Pro Val
900 905 910
Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala He Glu Asn Arg
915 920 925
Ser Tyr His Tyr Glu Arg Ile Ile Thr Asp Pro Gln Phe Ser Gln Ser 930 935 940
He Lys Leu Gin His Asp He Phe Gly Gin Ser Leu Gin Ser Val Asp 945 950 955 960 He Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro
965 970 975
Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gin Gin Gin 980 985 990
Leu Leu Arg Leu Val Arg Gin Lys Asn Ser Trp His His Leu Thr Asp 995 1000 1005
Gly Glu Asn Trp Arg Leu Gly Leu Pro Asn Ala Gin Arg Arg Asp Val 1010 1015 1020
Tyr Thr Tyr Asp Arg Ser Lys He Pro Thr Glu Gly He Ser Leu Glu 1025 1030 1035 1040 He Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Val
1045 1050 1055
Tyr Leu Gly Gin Gin Gin Thr Phe Tyr Thr Ala Gly Gin Ala Glu Val 1060 1065 1070
Thr Leu Glu Lys Pro Thr Leu Gin Ala Leu Val Ala Phe Gin Glu Thr 1075 1080 1085
Ala Met Met Asp Asp Thr Ser Leu Gin Ala Tyr Glu Gly Val He Glu
1090 1095 1100
Glu Gin Glu Leu Asn Thr Ala Leu Thr Gin Ala Gly Tyr Gin Gin Val 1105 1110 1115 1120
Ala Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Val Trp Ala Ala Arg 1125 1130 1135
Gin Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gin Phe Trp Arg Pro Gin
1140 1145 1150 Ala Gin Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp 1155 1160 1165
Thr His His Cys Val He He Gin Thr Gin Asp Ala Ala Gly Leu Thr 1170 1175 1180
Thr Gin Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gin Leu Thr 1185 1190 1195 1200
Asp He Asn Asp Asn Gin His He Val Thr Leu Asp Ala Leu Gly Arg 1205 1210 1215
Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gin Ala Ala Gly 1220 1225 1230 Tyr Ser Asn Gin Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu 1235 1240 1245
Ala Leu Thr Gly Ala Leu Pro Val Ala Gin Cys Leu Val Tyr Ala Val 1250 1255 1260
Asp Ser Trp Met Pro Ser Leu Ser Leu Ser Gin Leu Ser Gin Ser Gin 1265 1270 1275 1280
Glu Glu Ala Glu Ala Leu Trp Ala Gin Leu Arg Ala Ala His Met He 1285 1290 1295
Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser 1300 1305 1310 His Gin Asn Leu Thr He Gin Leu He Ser Leu Leu Ala Ser He Pro
1315 1320 1325
Arg Leu Pro Pro His Val Leu Gly He Thr Thr Asp Arg Tyr Asp Ser 1330 1335 1340
Asp Pro Gin Gin Gin His Gin Gin Thr Val Ser Phe Ser Asp Gly Phe 1345 1350 1355 1360
Gly Arg Leu Leu Gin Ser Ser Ala Arg His Glu Ser Gly Asp Ala Trp 1365 1370 1375
Gin Arg Lys Glu Asp Gly Gly Leu Val Val Asp Ala Asn Gly Val Leu 1380 1385 1390 Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr Glu
1395 1400 1405
Tyr Asp Asp Lys Gly Gin Pro Val Arg Thr Tyr Gin Pro Tyr Phe Leu 1410 1415 1420
Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe
1425 1430 1435 1440
Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val He 1445 1450 1455
Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe He 1460 1465 1470
Val Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro * 1475 1480 1485
(2) INFORMATION FOR SEQ ID NO: 33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3288 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33 (tcaA gene):
ATG GTG ACT GTT ATG CAA AAT AAA ATA TCA TTT TTA TCA GGT ACA TCC 48 Met Val Thr Val Met Gin Asn Lys He Ser Phe Leu Ser Gly Thr Ser 1 5 10 15
GAA CAG CCC CTG CTT GAC GCC GGT TAT CAA AAC GTA TTT GAT ATC GCA 96 Glu Gin Pro Leu Leu Asp Ala Gly Tyr Gin Asn Val Phe Asp He Ala 20 25 30
TCA ATC AGC CGG GCT ACT TTC GTT CAA TCC GTT CCC ACC CTG CCC GTT 144 Ser He Ser Arg Ala Thr Phe Val Gin Ser Val Pro Thr Leu Pro Val 35 40 45
AAA GAG GCT CAT ACC GTC TAT CGT CAG GCG CGG CAA CGT GCG GAA AAT 192 Lys Glu Ala His Thr Val Tyr Arg Gin Ala Arg Gin Arg Ala Glu Asn 50 55 60 CTG AAA TCC CTC TAC CGA GCC TGG CAA TTG CGT CAG GAG CCG GTT ATT 240 Leu Lys Ser Leu Tyr Arg Ala Trp Gin Leu Arg Gin Glu Pro Val He 65 70 75 80
AAA GGG CTG GCT AAA CTT AAC CTA CAA TCC AAC GTT TCT GTG CTT CAA 288 Lys Gly Leu Ala Lys Leu Asn Leu Gin Ser Asn Val Ser Val Leu Gin
85 90 95
GAT GCT TTG GTA GAG AAT ATT GGC GGT GAT GGG GAT TTC AGC GAT TTA 336 Asp Ala Leu Val Glu Asn He Gly Gly Asp Gly Asp Phe Ser Asp Leu 100 105 110
ATG AAC CGT GCC AGT CAA TAT GCT GAC GCT GCC TCT ATT CAA TCC CTA 384
Met Asn Arg Ala Ser Gin Tyr Ala Asp Ala Ala Ser He Gin Ser Leu 115 120 125
TTT TCA CCG GGC CGT TAT GCT TCC GCA CTC TAC AGA GTT GCT AAA GAT 432
Phe Ser Pro Gly Arg Tyr Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp 130 135 140 CTG CAT AAA TCA GAT TCC AGT TTG CAT ATT GAT AAT CGC CGC GCT GAT 480 Leu His Lys Ser Asp Ser Ser Leu His He Asp Asn Arg Arg Ala Asp 145 150 155 160
CTG AAG GAT CTG ATA TTA AGC GAA ACG ACG ATG AAT AAA GAG GTC ACT 528 Leu Lys Asp Leu Ile Leu Ser Glu Thr Thr Met Asn Lys Glu Val Thr
165 170 175
TCC CTT GAT ATC TTG TTG GAT GTG CTA CAA AAA GGC GGT AAA GAT ATT 576 Ser Leu Asp He Leu Leu Asp Val Leu Gin Lys Gly Gly Lys Asp He
180 185 190
ACT GAG CTG TCC GGC GCA TTC TTC CCA ATG ACG TTA CCT TAT GAC GAT 624
Thr Glu Leu Ser Gly Ala Phe Phe Pro Met Thr Leu Pro Tyr Asp Asp 195 200 205
CAT CTG TCG CAA ATC GAT TCC GCT TTA TCG GCA CAA GCC AGA ACG CTG 672
His Leu Ser Gin He Asp Ser Ala Leu Ser Ala Gin Ala Arg Thr Leu 210 215 220
AAC GGT GTG TGG AAT ACT TTG ACA GAT ACC ACG GCA CAA GCG GTT TCA 720
Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gin Ala Val Ser
225 230 235 240 GAA CAA ACC AGT AAT ACG AAT ACA CGC AAA CTG TTC GCT GCC CAA GAT 768 Glu Gin Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gin Asp 245 250 255
GGT AAT CAA GAT ACA TTT TTT TCC GGA AAC ACT TTT TAT TTC AAA GCG 816 Gly Asn Gin Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala 260 265 270
GTG GGA TTC AGC GGG CAA CCT ATG GTT TAC CTG TCA CAG TAC ACC AGC 864 Val Gly Phe Ser Gly Gin Pro Met Val Tyr Leu Ser Gin Tyr Thr Ser 275 280 285
GGG AAC GGC ATT GTC GGC GCA CAA TTG ATT GCA GGT AAT CCA GAC CAA 912
Gly Asn Gly He Val Gly Ala Gin Leu He Ala Gly Asn Pro Asp Gin 290 295 300
GCC GCC GCC GCA ATA GTC GCA CCG TTG AAA CTC ACT TGG TCA ATG GCA 960
Ala Ala Ala Ala He Val Ala Pro Leu Lys Leu Thr Trp Ser Met Ala 305 310 315 320 AAA CAG TGT TAC TAC CTC GTC GCT CCC GAT GGT ACA ACG ATG GGA GAC 1008 Lys Gin Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Met Gly Asp 325 330 335
GGT AAT GTT CTG ACC GGC TGT TTC TTA AGA GGC AAC AGC CCA ACT AAC 1056 Gly Asn Val Leu Thr Gly Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn 340 345 350
CCG GAT AAA GAC GGT ATT TTT GCT CAG GTA GCC AAC AAA TCA GGC AGT 1104 Pro Asp Lys Asp Gly He Phe Ala Gin Val Ala Asn Lys Ser Gly Ser 355 360 365
ACT CAG CCT TTG CCA AGC TTC CAT CTG CCG GTC ACA CTG GAA CAC AGC 1152 Thr Gin Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser 370 375 380
GAG AAT AAA GAT CAG TAC TAT CTG AAA ACA GAG CAG GGT TAT ATC ACG 1200 Glu Asn Lys Asp Gin Tyr Tyr Leu Lys Thr Glu Gin Gly Tyr He Thr 385 390 395 400 GTA GAT AGT TCC GGA CAG TCA AAT TGG AAA AAC GCG CTG GTT ATC AAT 1248 Val Asp Ser Ser Gly Gin Ser Asn Trp Lys Asn Ala Leu Val He Asn 405 410 415
GGG ACA AAA GAC AAG GGG CTG TTA TTA ACC TTT TGC AGC GAT AGC TCA 1296 Gly Thr Lys Asp Lys Gly Leu Leu Leu Thr Phe Cys Ser Asp Ser Ser 420 425 430
GGC ACT CCG ACA AAC CCT GAT GAT GTG ATT CCT CCC GCT ATC AAT GAT 1344 Gly Thr Pro Thr Asn Pro Asp Asp Val He Pro Pro Ala He Asn Asp 435 440 445
ATT CCA TCG CCG CCA GCC CGC GAA ACA CTG TCA CTG ACG CCG GTC AGT 1392 He Pro Ser Pro Pro Ala Arg Glu Thr Leu Ser Leu Thr Pro Val Ser 450 455 460
TAT CAA TTG ATG ACC AAT CCG GCA CCG ACA GAA GAT GAT ATT ACC AAC 1440
Tyr Gln Leu Met Thr Asn Pro Ala Pro Thr Glu Asp Asp Ile Thr Asn 465 470 475 480
CAT TAT GGT TTT AAC GGC GCT AGC TTA CGG GCT TCT CCA TTG TCA ACC 1488 His Tyr Gly Phe Asn Gly Ala Ser Leu Arg Ala Ser Pro Leu Ser Thr
485 490 495
AGC GAG TTG ACC AGC AAA CTG AAT TCT ATC GAT ACT TTC TGT GAG AAG 1536 Ser Glu Leu Thr Ser Lys Leu Asn Ser He Asp Thr Phe Cys Glu Lys 500 505 510
ACC CGG TTA AGC TTC AAT CAG TTA ATG GAT TTG ACC GCT CAG CAA TCT 1584 Thr Arg Leu Ser Phe Asn Gin Leu Met Asp Leu Thr Ala Gin Gin Ser 515 520 525
TAC AGT CAA AGC AGC ATT GAT GCG AAA GCA GCC AGC CGC TAT GTT CGT 1632 Tyr Ser Gin Ser Ser He Asp Ala Lys Ala Ala Ser Arg Tyr Val Arg 530 535 540 TTT GGG GAA ACC ACC CCA ACC CGC GTC AAT GTC TAC GGT GCC GCT TAT 1680 Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr Gly Ala Ala Tyr 545 550 555 560
CTG AAC AGC ACA CTG GCA GAC GCG GCT GAT GGT CAA TAT CTG TGG ATT 1728 Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gin Tyr Leu Trp He
565 570 575
CAG ACT GAT GGC AAG AGC CTA AAT TTC ACT GAC GAT ACG GTA GTC GCC 1776 Gin Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp Thr Val Val Ala 580 585 590
TTA GCC GGT CGC GCT GAA AAG CTG GTA CGT TTA TCA TCC CAG ACC GGG 1824 Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser Ser Gin Thr Gly 595 600 605
CTA TCA TTT GAA GAA TTG GAC TGG CTG ATT GCC AAT GCC AGT CGT AGT 1872 Leu Ser Phe Glu Glu Leu Asp Trp Leu He Ala Asn Ala Ser Arg Ser 610 615 620 GTG CCG GAC CAC CAC GAC AAA ATT GTG CTG GAT AAG CCG GTC CTT GAA 1920 Val Pro Asp His His Asp Lys He Val Leu Asp Lys Pro Val Leu Glu 625 630 635 640
GCA CTG GCA GAG TAT GTC AGC CTA AAA CAG CGC TAT GGG CTT GAT GCC 1968 Ala Leu Ala Glu Tyr Val Ser Leu Lys Gin Arg Tyr Gly Leu Asp Ala
645 650 655
AAT ACC TTT GCG ACC TTC ATT AGT GCA GTA AAT CCT TAT ACG CCA GAT 2016 Asn Thr Phe Ala Thr Phe He Ser Ala Val Asn Pro Tyr Thr Pro Asp 660 665 670
CAG ACA CCC AGT TTC TAT GAA ACC GCT TTC CGC TCT GCC GAC GGT AAT 2064 Gin Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser Ala Asp Gly Asn 675 680 685
CAT GTC ATT GCG CTA GGT ACA GAG GTG AAA TAT GCA GAA AAT GAG CAG 2112 His Val He Ala Leu Gly Thr Glu Val Lys Tyr Ala Glu Asn Glu Gin 690 695 700 GAT GAG TTA GCC GCC ATA TGC TGC AAA GCA TTG GGT GTC ACC AGT GAT 2160 Asp Glu Leu Ala Ala He Cys Cys Lys Ala Leu Gly Val Thr Ser Asp 705 710 715 720
GAA CTG CTC CGT ATT GGT CGC TAT TGC TTC GGT AAT GCA GGC AGT TTT 2208 Glu Leu Leu Arg He Gly Arg Tyr Cys Phe Gly Asn Ala Gly Ser Phe
725 730 735
ACC TTG GAT GAA TAT ACC GCC AGT CAG TTG TAT CGC TTC GGC GCC ATT 2256 Thr Leu Asp Glu Tyr Thr Ala Ser Gin Leu Tyr Arg Phe Gly Ala He 740 745 750
CCC CGT TTG TTT GGG CTG ACA TTT GCC CAA GCC GAA ATT TTA TGG CGT 2304 Pro Arg Leu Phe Gly Leu Thr Phe Ala Gln Ala Glu Ile Leu Trp Arg 755 760 765
CTG ATG GAA GGC GGA AAA GAT ATC TTA TTG CAA CAG TTA GGT CAG GCA 2352 Leu Met Glu Gly Gly Lys Asp Ile Leu Leu Gln Gln Leu Gly Gln Ala 770 775 780
AAA TCC CTG CAA CCA CTG GCT ATT TTA CGC CGT ACC GAG CAG GTG CTG 2400 Lys Ser Leu Gin Pro Leu Ala He Leu Arg Arg Thr Glu Gin Val Leu 785 790 795 800
GAT TGG ATG TCG TCC GTA AAT CTA AGT CTG ACT TAT CTG CAA GGG ATG 2448 Asp Trp Met Ser Ser Val Asn Leu Ser Leu Thr Tyr Leu Gin Gly Met 805 810 815
GTA AGT ACG CAA TGG AGC GGT ACC GCC ACC GCT GAG ATG TTC AAT TTC 2496 Val Ser Thr Gin Trp Ser Gly Thr Ala Thr Ala Glu Met Phe Asn Phe 820 825 830
TTG GAA AAC GTT TGT GAC AGC GTG AAT AGT CAA GCT GCC ACT AAA GAA 2544 Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gln Ala Ala Thr Lys Glu 835 840 845
ACA ATG GAT TCG GCG TTA CAG CAG AAA GTG CTG CGG GCG CTA AGC GCC 2592 Thr Met Asp Ser Ala Leu Gln Gln Lys Val Leu Arg Ala Leu Ser Ala 850 855 860
GGT TTC GGC ATT AAG AGC AAT GTG ATG GGT ATC GTC ACC TTC TGG CTG 2640 Gly Phe Gly He Lys Ser Asn Val Met Gly He Val Thr Phe Trp Leu 865 870 875 880
GAG AAA ATC ACA ATC GGT AGT GAT AAT CCT TTT ACA TTG GCA AAC TAC 2688 Glu Lys He Thr He Gly Ser Asp Asn Pro Phe Thr Leu Ala Asn Tyr 885 890 895
TGG CAT GAT ATT CAA ACC CTG TTT AGC CAT GAC AAT GCC ACG TTA GAG 2736 Trp His Asp He Gin Thr Leu Phe Ser His Asp Asn Ala Thr Leu Glu 900 905 910
TCC TTA CAA ACC GAC ACT TCT CTG GTA ATT GCT ACT CAG CAA CTT AGC 2784 Ser Leu Gin Thr Asp Thr Ser Leu Val He Ala Thr Gin Gin Leu Ser 915 920 925 CAG CTA GTG TTA ATT GTG AAA TGG CTG AGC CTG ACC GAG CAG GAT CTG 2832 Gin Leu Val Leu He Val Lys Trp Leu Ser Leu Thr Glu Gin Asp Leu 930 935 940
CAA TTA CTG ACA ACC TAT CCC GAA CGT TTA ATC AAC GGC ATC ACG AAT 2880 Gin Leu Leu Thr Thr Tyr Pro Glu Arg Leu He Asn Gly He Thr Asn 945 950 955 960
GTT CCT GTA CCC AAT CCG GAG CTA TTA CTC ACG CTA TCA CGT TTT AAG 2928 Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu Ser Arg Phe Lys 965 970 975
CAG TGG GAA ACT CAA GTC ACC GTT TCC CGT GAT GAA GCG ATG CGC TGT 2976 Gin Trp Glu Thr Gin Val Thr Val Ser Arg Asp Glu Ala Met Arg Cys 980 985 990
TTC GAT CAA TTA AAT GCC AAT GAT ATG ACG ACT GAA AAT GCA GGT TCA 3024 Phe Asp Gin Leu Asn Ala Asn Asp Met Thr Thr Glu Asn Ala Gly Ser 995 1000 1005 CTG ATC GCC ACA TTG TAT GAG ATG GAT AAA GGT ACG GGA GCG CAA GTT 3072 Leu He Ala Thr Leu Tyr Glu Met Asp Lys Gly Thr Gly Ala Gin Val 1010 1015 1020
AAT ACC TTG CTA TTA GGT GAA AAT AAC TGG CCG AAA AGT TTT ACC TCT 3120 Asn Thr Leu Leu Leu Gly Glu Asn Asn Trp Pro Lys Ser Phe Thr Ser 1025 1030 1035 1040
CTC TGG CAA CTT CTG ACC TGG TTA CGC GTC GGG CAA AGA CTG AAT GTC 3168 Leu Trp Gin Leu Leu Thr Trp Leu Arg Val Gly Gin Arg Leu Asn Val 1045 1050 1055
GGT AGT ACC ACT CTG GGC AAT CTG TTG TCC ATG ATG CAA GCA GAC CCT 3216 Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met Gin Ala Asp Pro 1060 1065 1070 GCT GCC GAG AGT AGC GCT TTA TTG GCA TCA GTA GCC CAA AAC TTA AGT 3264 Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala Gin Asn Leu Ser 1075 1080 1085
GCC GCA ATC AGC AAT CGT CAG TAA 3288 Ala Ala He Ser Asn Arg Gin ••• 1090 1095
(2) INFORMATION FOR SEQ ID NO: 34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1095 amino acids
(B) TYPE: amino acid
(C) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34 (TcaA protein)
Features
From  To   Description
254   267  SEQ ID NO: 15
254   492  TcaAii peptide
Met Val Thr Val Met Gin Asn Lys He Ser Phe Leu Ser Gly Thr Ser 1 5 10 15
Glu Gin Pro Leu Leu Asp Ala Gly Tyr Gin Asn Val Phe Asp He Ala 20 25 30
Ser He Ser Arg Ala Thr Phe Val Gin Ser Val Pro Thr Leu Pro Val 35 40 45
Lys Glu Ala His Thr Val Tyr Arg Gin Ala Arg Gin Arg Ala Glu Asn
50 55 60 Leu Lys Ser Leu Tyr Arg Ala Trp Gin Leu Arg Gin Glu Pro Val He 65 70 75 80
Lys Gly Leu Ala Lys Leu Asn Leu Gin Ser Asn Val Ser Val Leu Gin 85 90 95
Asp Ala Leu Val Glu Asn He Gly Gly Asp Gly Asp Phe Ser Asp Leu 100 105 110
Met Asn Arg Ala Ser Gin Tyr Ala Asp Ala Ala Ser He Gin Ser Leu 115 120 125
Phe Ser Pro Gly Arg Tyr Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp 130 135 140 Leu His Lys Ser Asp Ser Ser Leu His He Asp Asn Arg Arg Ala Asp 145 150 155 160
Leu Lys Asp Leu He Leu Ser Glu Thr Thr Met Asn Lys Glu Val Thr 165 170 175
Ser Leu Asp He Leu Leu Asp Val Leu Gin Lys Gly Gly Lys Asp He 180 185 190
Thr Glu Leu Ser Gly Ala Phe Phe Pro Met Thr Leu Pro Tyr Asp Asp
195 200 205
His Leu Ser Gin He Asp Ser Ala Leu Ser Ala Gin Ala Arg Thr Leu 210 215 220
Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gin Ala Val Ser 225 230 235 240
Glu Gin Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gin Asp 245 250 255
Gly Asn Gin Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala 260 265 270 Val Gly Phe Ser Gly Gin Pro Met Val Tyr Leu Ser Gin Tyr Thr Ser 275 280 285
Gly Asn Gly He Val Gly Ala Gin Leu He Ala Gly Asn Pro Asp Gin 290 295 300
Ala Ala Ala Ala He Val Ala Pro Leu Lys Leu Thr Trp Ser Met Ala 305 310 315 320
Lys Gin Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Met Gly Asp 325 330 335
Gly Asn Val Leu Thr Gly Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn 340 345 350 Pro Asp Lys Asp Gly He Phe Ala Gin Val Ala Asn Lys Ser Gly Ser 355 360 365
Thr Gin Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser 370 375 380
Glu Asn Lys Asp Gin Tyr Tyr Leu Lys Thr Glu Gin Gly Tyr He Thr 385 390 395 400
Val Asp Ser Ser Gly Gin Ser Asn Trp Lys Asn Ala Leu Val He Asn 405 410 415
Gly Thr Lys Asp Lys Gly Leu Leu Leu Thr Phe Cys Ser Asp Ser Ser
420 425 430 Gly Thr Pro Thr Asn Pro Asp Asp Val He Pro Pro Ala He Asn Asp
435 440 445
He Pro Ser Pro Pro Ala Arg Glu Thr Leu Ser Leu Thr Pro Val Ser 450 455 460
Tyr Gin Leu Met Thr Asn Pro Ala Pro Thr Glu Asp Asp He Thr Asn
465 470 475 480
His Tyr Gly Phe Asn Gly Ala Ser Leu Arg Ala Ser Pro Leu Ser Thr 485 490 495
Ser Glu Leu Thr Ser Lys Leu Asn Ser He Asp Thr Phe Cys Glu Lys 500 505 510 Thr Arg Leu Ser Phe Asn Gin Leu Met Asp Leu Thr Ala Gin Gin Ser 515 520 525
Tyr Ser Gin Ser Ser He Asp Ala Lys Ala Ala Ser Arg Tyr Val Arg 530 535 540
Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr Gly Ala Ala Tyr
545 550 555 560
Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gin Tyr Leu Trp He 565 570 575
Gln Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp Thr Val Val Ala 580 585 590
Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser Ser Gin Thr Gly 595 600 605
Leu Ser Phe Glu Glu Leu Asp Trp Leu He Ala Asn Ala Ser Arg Ser 610 615 620 Val Pro Asp His His Asp Lys He Val Leu Asp Lys Pro Val Leu Glu 625 630 635 640
Ala Leu Ala Glu Tyr Val Ser Leu Lys Gin Arg Tyr Gly Leu Asp Ala 645 650 655
Asn Thr Phe Ala Thr Phe He Ser Ala Val Asn Pro Tyr Thr Pro Asp 660 665 670
Gin Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser Ala Asp Gly Asn 675 680 685
His Val He Ala Leu Gly Thr Glu Val Lys Tyr Ala Glu Asn Glu Gin 690 695 700 Asp Glu Leu Ala Ala He Cys Cys Lys Ala Leu Gly Val Thr Ser Asp 705 710 715 720
Glu Leu Leu Arg He Gly Arg Tyr Cys Phe Gly Asn Ala Gly Ser Phe 725 730 735
Thr Leu Asp Glu Tyr Thr Ala Ser Gin Leu Tyr Arg Phe Gly Ala He 740 745 750
Pro Arg Leu Phe Gly Leu Thr Phe Ala Gin Ala Glu He Leu Trp Arg 755 760 765
Leu Met Glu Gly Gly Lys Asp He Leu Leu Gin Gin Leu Gly Gin Ala 770 775 780 Lys Ser Leu Gin Pro Leu Ala He Leu Arg Arg Thr Glu Gin Val Leu 785 790 795 800
Asp Trp Met Ser Ser Val Asn Leu Ser Leu Thr Tyr Leu Gin Gly Met
805 810 815
Val Ser Thr Gin Trp Ser Gly Thr Ala Thr Ala Glu Met Phe Asn Phe
820 825 830
Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gin Ala Ala Thr Lys Glu 835 840 845
Thr Met Asp Ser Ala Leu Gin Gin Lys Val Leu Arg Ala Leu Ser Ala 850 855 860 Gly Phe Gly He Lys Ser Asn Val Met Gly He Val Thr Phe Trp Leu 865 870 875 880
Glu Lys He Thr He Gly Ser Asp Asn Pro Phe Thr Leu Ala Asn Tyr 885 890 895
Trp His Asp He Gin Thr Leu Phe Ser His Asp Asn Ala Thr Leu Glu 900 905 910
Ser Leu Gin Thr Asp Thr Ser Leu Val He Ala Thr Gin Gin Leu Ser 915 920 925
Gin Leu Val Leu He Val Lys Trp Leu Ser Leu Thr Glu Gin Asp Leu 930 935 940 Gin Leu Leu Thr Thr Tyr Pro Glu Arg Leu He Asn Gly He Thr Asn 945 950 955 960
Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu Ser Arg Phe Lys 965 970 975
Gin Trp Glu Thr Gin Val Thr Val Ser Arg Asp Glu Ala Met Arg Cys 980 985 990
Phe Asp Gin Leu Asn Ala Asn Asp Met Thr Thr Glu Asn Ala Gly Ser 995 1000 1005
Leu He Ala Thr Leu Tyr Glu Met Asp Lys Gly Thr Gly Ala Gin Val 1010 1015 1020
Asn Thr Leu Leu Leu Gly Glu Asn Asn Trp Pro Lys Ser Phe Thr Ser 1025 1030 1035 1040
Leu Trp Gin Leu Leu Thr Trp Leu Arg Val Gly Gin Arg Leu Asn Val 1045 1050 1055 Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met Gin Ala Asp Pro 1060 1065 1070
Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala Gin Asn Leu Ser 1075 1080 1085
Ala Ala He Ser Asn Arg Gin ••• 1090 1095
(2) INFORMATION FOR SEQ ID NO: 35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 603 amino acids
(B) TYPE: amino acid
(C) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35 (TcaAiii protein):
Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr 1 5 10 15
Phe Cys Glu Lys Thr Arg Leu Ser Phe Asn Gin Leu Met Asp Leu Thr 20 25 30
Ala Gin Gin Ser Tyr Ser Gin Ser Ser He Asp Ala Lys Ala Ala Ser 35 40 45
Arg Tyr Val Arg Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr 50 55 60
Gly Ala Ala Tyr Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gin
65 70 75 80 Tyr Leu Trp He Gin Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp
85 90 95
Thr Val Val Ala Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser
100 105 110
Ser Gin Thr Gly Leu Ser Phe Glu Glu Leu Asp Trp Leu He Ala Asn
115 120 125
Ala Ser Arg Ser Val Pro Asp His His Asp Lys He Val Leu Asp Lys 130 135 140
Pro Val Leu Glu Ala Leu Ala Glu Tyr Val Ser Leu Lys Gin Arg Tyr
145 150 155 160
Gly Leu Asp Ala Asn Thr Phe Ala Thr Phe Ile Ser Ala Val Asn Pro 165 170 175
Tyr Thr Pro Asp Gin Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser 180 185 190
Ala Asp Gly Asn His Val He Ala Leu Gly Thr Glu Val Lys Tyr Ala 195 200 205 Glu Asn Glu Gin Asp Glu Leu Ala Ala He Cys Cys Lys Ala Leu Gly 210 215 220
Val Thr Ser Asp Glu Leu Leu Arg He Gly Arg Tyr Cys Phe Gly Asn 225 230 235 240
Ala Gly Arg Phe Thr Leu Asp Glu Tyr Thr Ala Ser Gin Leu Tyr Arg 245 250 255
Phe Gly Ala He Pro Arg Leu Phe Gly Leu Thr Phe Ala Gin Ala Glu 260 265 270
He Leu Trp Arg Leu Met Glu Gly Gly Lys Asp He Leu Leu Gin Gin 275 280 285 Xxx Gly Gin Ala Lys Ser Leu Gin Pro Leu Ala He Leu Arg Arg Thr 290 295 300
Glu Gin Val Leu Asp Trp Met Ser Pro Val Asn Leu Ser Leu Thr Tyr 305 310 315 320
Leu Gin Gly Met Val Ser Thr Gin Trp Ser Gly Thr Ala Thr Ala Glu 325 330 335
Met Phe Asn Phe Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gin Ala 340 345 350
Xxx Thr Lys Glu Thr Met Asp Ser Ala Leu Gin Gin Lys Val Leu Arg 355 360 365 Ala Leu Ser Ala Gly Phe Gly He Lys Ser Asn Val Met Gly He Val 370 375 380
Thr Phe Trp Leu Glu Lys He Thr He Gly Arg Asp Asn Pro Phe Thr
385 390 395 400
Leu Ala Asn Tyr Trp His Asp He Gin Thr Leu Phe Ser His Asp Asn
405 410 415
Ala Thr Leu Glu Ser Leu Gin Thr Asp Thr Ser Leu Val He Ala Thr 420 425 430
Gin Gin Leu Ser Gin Leu Val Leu He Val Lys Trp Val Ser Leu Thr
435 440 445 Glu Gin Asp Leu Gin Leu Leu Thr Thr Tyr Pro Glu Arg Leu He Asn 450 455 460
Gly He Thr Asn Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu
465 470 475 480
Ser Arg Phe Lys Gin Trp Glu Thr Gin Val Thr Val Ser Arg Asp Glu
485 490 495
Ala Met Arg Cys Phe Asp Gin Leu Asn Ala Asn Asp Met Thr Thr Glu 500 505 510
Asn Ala Gly Ser Leu He Ala Thr Leu Tyr Glu Met Asp Lys Gly Thr 515 520 525 Gly Ala Gin Val Asn Thr Leu Leu Leu Gly Glu Asn Asn Trp Pro Lys 530 535 540
Ser Phe Thr Ser Leu Trp Gin Leu Leu Thr Trp Leu Arg Val Gly Gin
545 550 555 560 Arg Leu Asn Val Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met
565 570 575
Gin Ala Asp Pro Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala 580 585 590
Gin Asn Leu Ser Ala Ala He Ser Asn Arg Gin * 595 600
(2) INFORMATION FOR SEQ ID NO: 36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2557 base pairs
(B) TYPE: nucleic acid
(C) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36 (tcdA internal fragment):
GAATTCGGCT TGCGTTTAAT ATTGATGATG TCTCGCTCTT CCGCCTGCTT AAAATTACCG 60
ACCATGATAA TAAAGATGGA AAAATTAAAA ATAACCTAAA GAATCTTTCC AATTTATATA 120
TTGGAAAATT ACTGGCAGAT ATTCATCAAT TAACCATTGA TGAACTGGAT TTATTACTGA 180
TTGCCGTAGG TGAAGGAAAA ACTAATTTAT CCGCTATCAG TGATAAGCAA TTGGCTACCC 240
TGATCAGAAA ACTCAATACT ATTACCAGCT GGCTACATAC ACAGAAGTGG AGTGTATTCC 300
AGCTATTTAT CATGACCTCC ACCAGCTATA ACAAAACGCT AACGCCTGAA ATTAAGAATT 360
TGCTGGATAC CGTCTACCAC GGTTTACAAG GTTTTGATAA AGACAAAGCA GATTTGCTAC 420
ATGTCATGGC GCCCTATATT GCGGCCACCT TGCAATTATC ATCGGAAAAT GTCGCCCACT 480
CGGTACTCCT TTGGGCAGAT AAGTTACAGC CCGGCGACGG CGCAATGACA GCAGAGGGAN 540
TCTGGGACTG GTTGAATACT AAGTATACGC CGGGTTCATC GGAAGCCGTA GAAACGCAGG 600
AACATATCGT TCAGTATTGT CAGGCTCTGG CACAATTGGA AATGGTTTAC CATTCCACCG 660
GCATCAACGA AAACGCCTTC CGTCTATTTG TGACAAAACC AGAGATGTTT GGCGCTGCAA 720
CTGGAGCAGC GCCCGCGCAT GATGCCCTTT CACTGATTAT GCTGACACGT TTTGCGGATT 780
GGGTGAACGC ACTAGGCGAA AAAGCGTCCT CGGTGCTAGC GGCATTTGAA GCTAACTCGT 840
TAACGGCAGA ACAACTGGCT GATGCCATGA ATCTTGATGC TAATTTGCTG TTGCAAGCCA 900
GTATTCAAGC ACAAAATCAT CAACATCTTC CCCCAGTAAC TCCAGAAAAT GCGTTCTCCT 960
GTTGGACATC TATCAATACT ATCCTGCAAT GGGTTAATGT CGCACAACAA TTGAAATGTC 1020
GCCCCACAGG GCGTTTCCGC TTTGGTCGGG CTGGATTATA TTCAATCAAT GAAAGAGACA 1080
CCGACCTATG CCCAGTGGGA AAACGCGGCA GGCGTATTAA CCGCCGGGTT GAATTCAACA 1140
ACAGGCTAAT ACATTACAAC GCTTTTCTGG ATGAATCTCG CAGTGCCGCA TTAAGCACCT 1200
ACTATATCCG TCAAGTCGCC AAGGCAGCGG CGGCTATTAA AAGCCGTGAT GACTTGTATC 1260
AATACTTACT GATTGATAAT CAGGTTTCTG CGGCAATAAA AACCACCCGG ATCGCCGAAG 1320
CCATTGCCAG TATTCAACTG TACGTCAACC GGGCATTGGA AAATGTGGAA GAAAATGCCA 1380
ATTCGGGGGT TATCAGCCGC CAATTCTTTA TCGACTGGGA CAAATACAAT AAACGCTACA 1440
GCACTTGGGC GGGTGTTTCT CAATTAGTTT ACTACCCGGA AAACTATATT GATCCGACCA 1500
TGCGTATCGG ACAAACCAAA ATGATGGACG CATTACTGCA ATCCGTCAGC CAAAGCCAAT 1560
TAAACGCCGA TACCGTCGAA GATGCCTTTA TGTCTTATCT GACATCGTTT GAACAAGTGG 1620
CTAATCTTAA AGTTATTAGC GCATATCACG ATAATATTAA TAACGATCAA GGGCTGACCT 1680
ATTTTATCGG ACTCAGTGAA ACTGATGCCG GTGAATATTA TTGGCGCAGT GTCGATCACA 1740
GTAAATTCAA CGACGGTAAA TTCGCGGCTA ATGCCTGGAG TGAATGGCAT AAAATTGATT 1800
GTCCAATTAA CCCTTATAAA AGCACTATCC GTCCAGTGAT ATATAAATCC CGCCTGTATC 1860
TGCTCTGGTT GGAACAAAAG GAGATCACCA AACAGACAGG AAATAGTAAA GATGGCTATC 1920
AAACTGAAAC GGATTATCGT TATGAACTAA AATTGGCGCA TATCCGCTAT GATGGCACTT 1980
GGAATACGCC AATCACCTTT GATGTCAATA AAAAAATATC CGAGCTAAAA CTGGAAAAAA 2040
ATAGAGCGCC CGGACTCTAT TGTGCCGGTT ATCAAGGTGA AGATACGTTG CTGGTGATGT 2100
TTTATAACCA ACAAGACACA CTAGATAGTT ATAAAAACGC TTCAATGCAA GGACTATATA 2160
TCTTTGCTGA TATGGCATCC AAAGATATGA CCCCAGAACA GAGCAATGTT TATCGGGATA 2220
ATAGCTATCA ACAATTTGAT ACCAATAATG TCAGAAGAGT GAATAACCGC TATGCAGAGG 2280
ATTATGAGAT TCCTTCTTCG GTAAGTAGCC GTAAAGACTA TGGTTGGGGA GATTATTACC 2340
TCAGCATGGT ATATAACGGA GATATTCCAA CTATCAATTA CAAAGCCGCA TCAAGTGATT 2400
TAAAAATTTA TATTTCACCA AAATTAAGAA TTATTCATAA TGGATATGAA GGACAGAAGC 2460
GCAATCAATG CAATTTGATG AATAAATATG GCAAACTAGG TGATAAATTT ATTGTGTATA 2520
CCAGCCTGGG CGTTAATCCG AATAATAAGC CGAATTC 2557
(2) INFORMATION FOR SEQ ID NO: 37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 845 amino acids
(B) TYPE: amino acid
(C) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein (partial)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37 (TcdA internal peptide):
Ala Phe Asn He Asp Asp Val Ser Leu Phe Arg Leu Leu Lys He Thr 1 5 10 15 Asp His Asp Asn Lys Asp Gly Lys He Lys Asn Asn Leu Lys Asn Leu 20 25 30
Ser Asn Leu Tyr He Gly Lys Leu Leu Ala Asp He His Gin Leu Thr 35 40 45
He Asp Glu Leu Asp Leu Leu Leu He Ala Val Gly Glu Gly Lys Thr 50 55 60
Asn Leu Ser Ala He Ser Asp Lys Gin Leu Ala Thr Leu He Arg Lys 65 70 75 80
Leu Asn Thr He Thr Ser Trp Leu His Thr Gin Lys Trp Ser Val Phe 85 90 95 Gin Leu Phe He Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr Pro 100 105 110
Glu He Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gin Gly Phe 115 120 125
Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr He Ala 130 135 140
Ala Thr Leu Gin Leu Ser Ser Glu Asn Val Ala His Ser Val Leu Leu 145 150 155 160
Trp Ala Asp Lys Leu Gin Pro Gly Asp Gly Ala Met Thr Ala Glu Gly 165 170 175 Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu Ala 180 185 190
Val Glu Thr Gin Glu His He Val Gin Tyr Cys Gin Ala Leu Ala Gin
195 200 205
Leu Glu Met Val Tyr His Ser Thr Gly He Asn Glu Asn Ala Phe Arg 210 215 220
Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala Ala 225 230 235 240
Pro Ala His Asp Ala Leu Ser Leu He Met Leu Thr Arg Phe Ala Asp 245 250 255
Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala Phe 260 265 270 Glu Ala Asn Ser Leu Thr Ala Glu Gin Leu Ala Asp Ala Met Asn Leu 275 280 285
Asp Ala Asn Leu Leu Leu Gin Ala Ser He Gin Ala Gin Asn His Gin
290 295 300
His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr Ser
305 310 315 320
He Asn Thr He Leu Gin Trp Val Asn Val Ala Gin Gin Leu Lys Cys 325 330 335
Arg Pro Thr Gly Arg Phe Arg Phe Gly Arg Ala Gly Leu Tyr Ser He 340 345 350 Asn Glu Arg Asp Thr Asp Leu Cys Pro Val Gly Lys Arg Gly Arg Arg 355 360 365
He Asn Arg Arg Val Glu Phe Asn Asn Arg Leu He His Tyr Asn Ala
370 375 380
Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr He Arg 385 390 395 400
Gin Val Ala Lys Ala Ala Ala Ala He Lys Ser Arg Asp Asp Leu Tyr 405 410 415
Gin Tyr Leu Leu He Asp Asn Gin Val Ser Ala Ala He Lys Thr Thr
420 425 430 Arg He Ala Glu Ala He Ala Ser He Gin Leu Tyr Val Asn Arg Ala 435 440 445
Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val He Ser Arg Gin
450 455 460
Phe Phe He Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala
465 470 475 480
Gly Val Ser Gin Leu Val Tyr Tyr Pro Glu Asn Tyr He Asp Pro Thr 485 490 495
Met Arg He Gly Gin Thr Lys Met Met Asp Ala Leu Leu Gin Ser Val 500 505 510 Ser Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser 515 520 525
Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val He Ser Ala
530 535 540
Tyr His Asp Asn He Asn Asn Asp Gin Gly Leu Thr Tyr Phe He Gly
545 550 555 560
Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His 565 570 575
Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 580 585 590
His Lys He Asp Cys Pro He Asn Pro Tyr Lys Ser Thr He Arg Pro 595 600 605
Val He Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin Lys Glu 610 615 620
He Thr Lys Gin Thr Gly Asn Ser Lys Asp Gly Tyr Gin Thr Glu Thr 625 630 635 640
Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His He Arg Tyr Asp Gly Thr 645 650 655
Trp Asn Thr Pro He Thr Phe Asp Val Asn Lys Lys He Ser Glu Leu 660 665 670
Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gin 675 680 685
Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gin Gin Asp Thr Leu 690 695 700 Asp Ser Tyr Lys Asn Ala Ser Met Gin Gly Leu Tyr He Phe Ala Asp 705 710 715 720
Met Ala Ser Lys Asp Met Thr Pro Glu Gin Ser Asn Val Tyr Arg Asp 725 730 735
Asn Ser Tyr Gin Gin Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn 740 745 750
Arg Tyr Ala Glu Asp Tyr Glu He Pro Ser Ser Val Ser Ser Arg Lys 755 760 765
Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp 770 775 780 He Pro Thr He Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys He Tyr 785 790 795 800
He Ser Pro Lys Leu Arg He He His Asn Gly Tyr Glu Gly Gin Lys 805 810 815
Arg Asn Gin Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys 820 825 830
Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn 835 840 845
(2) INFORMATION FOR SEQ ID NO: 38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38 (TcdAii-PK71 internal peptide):
Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly 1 5 10 15
Lys
(2) INFORMATION FOR SEQ ID NO: 39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39 (TcdAii-PK44 internal peptide):
Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala 1 5 10 15
Ile Ser Pro Ala Lys 20
(2) INFORMATION FOR SEQ ID NO: 40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40 (TcbAiii N-terminus):
Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln 1 5 10
(2) INFORMATION FOR SEQ ID NO: 41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41 (TcdAiii N-terminus):
Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln 1 5 10
(2) INFORMATION FOR SEQ ID NO: 42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42 (TcdA-PK57 internal peptide):
Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Glu Val Tyr 1 5 10 15
Ala Gly Leu Glu
(2) INFORMATION FOR SEQ ID NO: 43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43 (TcdAiii-PK20 internal peptide):
Ile Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys 1 5 10
(2) INFORMATION FOR SEQ ID NO: 44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
Asp Asp Ser Gly Asp Asp Asp Lys Val Thr Asn Thr Asp Ile His Arg 1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
Asp Val Xaa Gly Ser Glu Lys Ala Asn Glu Lys Leu Lys 1 5 10
(2) INFORMATION FOR SEQ ID NO: 46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7551 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46 (tcdA):
ATG AAC GAG TCT GTA AAA GAG ATA CCT GAT GTA TTA AAA AGC CAG TGT Met Asn Glu Ser Val Lys Glu He Pro Asp Val Leu Lys Ser Gin Cys 1 5 10 15
ACA CTC TCT TTG TCC AAT GAG CTG TTA TTG GAA AGC ATT AAA ACT GAA Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys Thr Glu 165 170 175
TCT AAA CTG GAA AAC TAT ACT AAA GTG ATG GAA ATG CTC TCC ACT TTC 576 Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe 180 185 190
CGT CCT TCC GGC GCA ACG CCT TAT CAT GAT GCT TAT GAA AAT GTG CGT 624 Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg 195 200 205
GAA GTT ATC CAG CTA CAA GAT CCT GGA CTT GAG CAA CTC AAT GCA TCA 672 Glu Val Ile Gln Leu Gln Asp Pro Gly Leu Glu Gln Leu Asn Ala Ser 210 215 220
CCG GCA ATT GCC GGG TTG ATG CAT CAA GCC TCC CTA TTG GGT ATT AAC 720 Pro Ala Ile Ala Gly Leu Met His Gln Ala Ser Leu Leu Gly Ile Asn 225 230 235 240
GCT TCA ATC TCG CCT GAG CTA TTT AAT ATT CTG ACG GAG GAG ATT ACC 768 Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu Ile Thr 245 250 255
GAA GGT AAT GCT GAG GAA CTT TAT AAG AAA AAT TTT GGT AAT ATC GAA 816 Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn Ile Glu 260 265 270
CCG GCC TCA TTG GCT ATG CCG GAA TAC CTT AAA CGT TAT TAT AAT TTA 864 Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu 275 280 285
AGC GAT GAA GAA CTT AGT CAG TTT ATT GGT AAA GCC AGC AAT TTT GGT 912 Ser Asp Glu Glu Leu Ser Gin Phe He Gly Lys Ala Ser Asn Phe Gly 290 295 300
CAA CAG GAA TAT AGT AAT AAC CAA CTT ATT ACT CCG GTA GTC AAC AGC 960 Gin Gin Glu Tyr Ser Asn Asn Gin Leu He Thr Pro Val Val Asn Ser 305 310 315 320
AGT GAT GGC ACG GTT AAG GTA TAT CGG ATC ACC CGC GAA TAT ACA ACC 1008 Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr 325 330 335
AAT GCT TAT CAA ATG GAT GTG GAG CTA TTT CCC TTC GGT GGT GAG AAT 1056 Asn Ala Tyr Gin Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn 340 345 350 TAT CGG TTA GAT TAT AAA TTC AAA AAT TTT TAT AAT GCC TCT TAT TTA 1104 Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu 355 360 365
TCC ATC AAG TTA AAT GAT AAA AGA GAA CTT GTT CGA ACT GAA GGC GCT 1152 Ser He Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala 370 375 380
CCT CAA GTC AAT ATA GAA TAC TCC GCA AAT ATC ACA TTA AAT ACC GCT 1200 Pro Gin Val Asn He Glu Tyr Ser Ala Asn He Thr Leu Asn Thr Ala 385 390 395 400
GAT ATC AGT CAA CCT TTT GAA ATT GGC CTG ACA CGA GTA CTT CCT TCC 1248
Asp He Ser Gin Pro Phe Glu He Gly Leu Thr Arg Val Leu Pro Ser
405 410 415
GGT TCT TGG GCA TAT GCC GCC GCA AAA TTT ACC GTT GAA GAG TAT AAC 1296
Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn
420 425 430 CAA TAC TCT TTT CTG CTA AAA CTT AAC AAG GCT ATT CGT CTA TCA CGT 1344 Gin Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala He Arg Leu Ser Arg 435 440 445
GCG ACA GAA TTG TCA CCC ACG ATT CTG GAA GGC ATT GTG CGC AGT GTT 1392 Ala Thr Glu Leu Ser Pro Thr He Leu Glu Gly He Val Arg Ser Val 450 455 460
AAT CTA CAA CTG GAT ATC AAC ACA GAC GTA TTA GGT AAA GTT TTT CTG 1440 Asn Leu Gin Leu Asp He Asn Thr Asp Val Leu Gly Lys Val Phe Leu 465 470 475 480
ACT AAA TAT TAT ATG CAG CGT TAT GCT ATT CAT GCT GAA ACT GCC CTG 1488 Thr Lys Tyr Tyr Met Gin Arg Tyr Ala He His Ala Glu Thr Ala Leu 485 490 495
ATA CTA TGC AAC GCG CCT ATT TCA CAA CGT TCA TAT GAT AAT CAA CCT 1536
Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg Ser Tyr Asp Asn Gln Pro
500 505 510
AGC CAA TTT GAT CGC CTG TTT AAT ACG CCA TTA CTG AAC GGA CAA TAT 1584 Ser Gin Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gin Tyr
515 520 525
TTT TCT ACC GGC GAT GAG GAG ATT GAT TTA AAT TCA GGT AGC ACC GGC 1632
Phe Ser Thr Gly Asp Glu Glu He Asp Leu Asn Ser Gly Ser Thr Gly 530 535 540
GAT TGG CGA AAA ACC ATA CTT AAG CGT GCA TTT AAT ATT GAT GAT GTC 1680
Asp Trp Arg Lys Thr He Leu Lys Arg Ala Phe Asn He Asp Asp Val
545 550 555 560
TCG CTC TTC CGC CTG CTT AAA ATT ACC GAC CAT GAT AAT AAA GAT GGA 1728 Ser Leu Phe Arg Leu Leu Lys He Thr Asp His Asp Asn Lys Asp Gly 565 570 575 AAA ATT AAA AAT AAC CTA AAG AAT CTT TCC AAT TTA TAT ATT GGA AAA 1776 Lys He Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr He Gly Lys 580 585 590
TTA CTG GCA GAT ATT CAT CAA TTA ACC ATT GAT GAA CTG GAT TTA TTA 1824 Leu Leu Ala Asp He His Gin Leu Thr He Asp Glu Leu Asp Leu Leu 595 600 605
CTG ATT GCC GTA GGT GAA GGA AAA ACT AAT TTA TCC GCT ATC AGT GAT 1872 Leu Ile Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala Ile Ser Asp 610 615 620
AAG CAA TTG GCT ACC CTG ATC AGA AAA CTC AAT ACT ATT ACC AGC TGG 1920 Lys Gin Leu Ala Thr Leu He Arg Lys Leu Asn Thr He Thr Ser Trp 625 630 635 640
CTA CAT ACA CAG AAG TGG AGT GTA TTC CAG CTA TTT ATC ATG ACC TCC 1968 Leu His Thr Gin Lys Trp Ser Val Phe Gin Leu Phe He Met Thr Ser 645 650 655
GAA ATT AAG AAT TTG CTG GAT 2016 Glu He Lys Asn Leu Leu Asp 670
GAT AAA GAC AAA GCA GAT TTG 2064 Asp Lys Asp Lys Ala Asp Leu 685
GCC ACC TTG CAA TTA TCA TCG 2112 Ala Thr Leu Gin Leu Ser Ser 700
TGG GCA GAT AAG TTA CAG CCC 2160 Trp Ala Asp Lys Leu Gin Pro
715 720
GGC GAC GGC GCA ATG ACA GCA GAA AAA TTC TGG GAC TGG TTG AAT ACT 2208 Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr 725 730 735 AAG TAT ACG CCG GGT TCA TCG GAA GCC GTA GAA ACG CAG GAA CAT ATC 2256 Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gin Glu His He 740 745 750
GTT CAG TAT TGT CAG GCT CTG GCA CAA TTG GAA ATG GTT TAC CAT TCC 2304 Val Gin Tyr Cys Gin Ala Leu Ala Gin Leu Glu Met Val Tyr His Ser 755 760 765
ACC GGC ATC AAC GAA AAC GCC TTC CGT CTA TTT GTG ACA AAA CCA GAG 2352 Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu 770 775 780
ATG TTT GGC GCT GCA ACT GGA GCA GCG CCC GCG CAT GAT GCC CTT TCA 2400 Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser 785 790 795 800
CTG ATT ATG CTG ACA CGT TTT GCG GAT TGG GTG AAC GCA CTA GGC GAA 2448 Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu 805 810 815
AAA GCG TCC TCG GTG CTA GCG GCA TTT GAA GCT AAC TCG TTA ACG GCA 2496 Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala 820 825 830
GAA CAA CTG GCT GAT GCC ATG AAT CTT GAT GCT AAT TTG CTG TTG CAA 2544 Glu Gin Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gin 835 840 845
GCC AGT ATT CAA GCA CAA AAT CAT CAA CAT CTT CCC CCA GTA ACT CCA 2592 Ala Ser He Gin Ala Gin Asn His Gin His Leu Pro Pro Val Thr Pro 850 855 860
GAA AAT GCG TTC TCC TGT TGG ACA TCT ATC AAT ACT ATC CTG CAA TGG 2640 Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gln Trp 865 870 875 880
GTT AAT GTC GCA CAA CAA TTG AAT GTC GCC CCA CAC GGC GTT TCC GCT 2688 Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val Ser Ala 885 890 895
TTG GTC GGG CTG GAT TAT ATT CAA TCA ATG AAA GAG ACA CCG ACC TAT 2736 Leu Val Gly Leu Asp Tyr Ile Gln Ser Met Lys Glu Thr Pro Thr Tyr 900 905 910
GCC CAG TGG GAA AAC GCG GCA GGC GTA TTA ACC GCC GGG TTG AAT TCA 2784 Ala Gln Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser 915 920 925
CAA CAG GCT AAT ACA TTA CAC GCT TTT CTG GAT GAA TCT CGC AGT GCC 2832 Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala 930 935 940
GCA TTA AGC ACC TAC TAT ATC CGT CAA GTC GCC AAG GCA GCG GCG GCT 2880 Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val Ala Lys Ala Ala Ala Ala 945 950 955 960
ATT AAA AGC CGT GAT GAC TTG TAT CAA TAC TTA CTG ATT GAT AAT CAG 2928 Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp Asn Gln 965 970 975
GTT TCT GCG GCA ATA AAA ACC ACC CGG ATC GCC GAA GCC ATT GCC AGT 2976 Val Ser Ala Ala He Lys Thr Thr Arg He Ala Glu Ala He Ala Ser 980 985 990
ATT CAA CTG TAC GTC AAC CGG GCA TTG GAA AAT GTG GAA GAA AAT GCC 3024
He Gin Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala 995 1000 1005
AAT TCG GGG GTT ATC AGC CGC CAA TTC TTT ATC GAC TGG GAC AAA TAC 3072
Asn Ser Gly Val He Ser Arg Gin Phe Phe He Asp Trp Asp Lys Tyr
1010 1015 1020
AAT AAA CGC TAC AGC ACT TGG GCG GGT GTT TCT CAA TTA GTT TAC TAC 3120 Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val Tyr Tyr 1025 1030 1035 1040
CCG GAA AAC TAT ATT GAT CCG ACC ATG CGT ATC GGA CAA ACC AAA ATG 3168 Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln Thr Lys Met 1045 1050 1055
ATG GAC GCA TTA CTG CAA TCC GTC AGC CAA AGC CAA TTA AAC GCC GAT 3216 Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn Ala Asp 1060 1065 1070
ACC GTC GAA GAT GCC TTT ATG TCT TAT CTG ACA TCG TTT GAA CAA GTG 3264 Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gin Val 1075 1080 1085
GCT AAT CTT AAA GTT ATT AGC GCA TAT CAC GAT AAT ATT AAT AAC GAT 3312 Ala Asn Leu Lys Val He Ser Ala Tyr His Asp Asn He Asn Asn Asp 1090 1095 1100 CAA GGG CTG ACC TAT TTT ATC GGA CTC AGT GAA ACT GAT GCC GGT GAA 3360 Gin Gly Leu Thr Tyr Phe He Gly Leu Ser Glu Thr Asp Ala Gly Glu 1105 1110 1115 1120
TAT TAT TGG CGC AGT GTC GAT CAC AGT AAA TTC AAC GAC GGT AAA TTC 3408 Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe
1125 1130 1135
GCG GCT AAT GCC TGG AGT GAA TGG CAT AAA ATT GAT TGT CCA ATT AAC 3456 Ala Ala Asn Ala Trp Ser Glu Trp His Lys He Asp Cys Pro He Asn 1140 1145 1150
CCT TAT AAA AGC ACT ATC CGT CCA GTG ATA TAT AAA TCC CGC CTG TAT 3504 Pro Tyr Lys Ser Thr He Arg Pro Val He Tyr Lys Ser Arg Leu Tyr 1155 1160 1165
CTG CTC TGG TTG GAA CAA AAG GAG ATC ACC AAA CAG ACA GGA AAT AGT 3552 Leu Leu Trp Leu Glu Gin Lys Glu He Thr Lys Gin Thr Gly Asn Ser 1170 1175 1180 AAA GAT GGC TAT CAA ACT GAA ACG GAT TAT CGT TAT GAA CTA AAA TTG 3600 Lys Asp Gly Tyr Gin Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu 1185 1190 1195 1200
GCG CAT ATC CGC TAT GAT GGC ACT TGG AAT ACG CCA ATC ACC TTT GAT 3648 Ala His He Arg Tyr Asp Gly Thr Trp Asn Thr Pro He Thr Phe Asp
1205 1210 1215
GTC AAT AAA AAA ATA TCC GAG CTA AAA CTG GAA AAA AAT AGA GCG CCC 3696 Val Asn Lys Lys He Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 1220 1225 1230
GGA CTC TAT TGT GCC GGT TAT CAA GGT GAA GAT ACG TTG CTG GTG ATG 3744 Gly Leu Tyr Cys Ala Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met 1235 1240 1245
TTT TAT AAC CAA CAA GAC ACA CTA GAT AGT TAT AAA AAC GCT TCA ATG 3792 Phe Tyr Asn Gln Gln Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met 1250 1255 1260
CAA GGA CTA TAT ATC TTT GCT GAT ATG GCA TCC AAA GAT ATG ACC CCA 3840 Gln Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro 1265 1270 1275 1280
GAA CAG AGC AAT GTT TAT CGG GAT AAT AGC TAT CAA CAA TTT GAT ACC 3888 Glu Gin Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gin Gin Phe Asp Thr
1285 1290 1295
AAT AAT GTC AGA AGA GTG AAT AAC CGC TAT GCA GAG GAT TAT GAG ATT 3936 Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu He 1300 1305 1310
CCT TCC TCG GTA AGT AGC CGT AAA GAC TAT GGT TGG GGA GAT TAT TAC 3984 Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 1315 1320 1325
CTC AGC ATG GTA TAT AAC GGA GAT ATT CCA ACT ATC AAT TAC AAA GCC 4032 Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala 1330 1335 1340
GCA TCA AGT GAT TTA AAA ATC TAT ATC TCA CCA AAA TTA AGA ATT ATT 4080 Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile
1345 1350 1355 1360
CAT AAT GGA TAT GAA GGA CAG AAG CGC AAT CAA TGC AAT CTG ATG AAT 4128 His Asn Gly Tyr Glu Gly Gin Lys Arg Asn Gin Cys Asn Leu Met Asn 1365 1370 1375
AAA TAT GGC AAA CTA GGT GAT AAA TTT ATT GTT TAT ACT AGC TTG GGG 4176 Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly 1380 1385 1390
GTC AAT CCA AAT AAC TCG TCA AAT AAG CTC ATG TTT TAC CCC GTC TAT 4224 Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 1395 1400 1405 CAA TAT AGC GGA AAC ACC AGT GGA CTC AAT CAA GGG AGA CTA CTA TTC 4272
Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gin Gly Arg Leu Leu Phe 1410 1415 1420
CAC CGT GAC ACC ACT TAT CCA TCT AAA GTA GAA GCT TGG ATT CCT GGA 4320 His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp He Pro Gly
1425 1430 1435 1440
GCA AAA CGT TCT CTA ACC AAC CAA AAT GCC GCC ATT GGT GAT GAT TAT 4368 Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala Ala He Gly Asp Asp Tyr 1445 1450 1455
GCT ACA GAC TCT CTG AAT AAA CCG GAT GAT CTT AAG CAA TAT ATC TTT 4416 Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gln Tyr Ile Phe 1460 1465 1470
ATG ACT GAC AGT AAA GGG ACT GCT ACT GAT GTC TCA GGC CCA GTA GAG 4464 Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu 1475 1480 1485
ATT AAT ACT GCA ATT TCT CCA GCA AAA GTT CAG ATA ATA GTC AAA GCG 4512 Ile Asn Thr Ala Ile Ser Pro Ala Lys Val Gln Ile Ile Val Lys Ala 1490 1495 1500
GGT GGC AAG GAG CAA ACT TTT ACC GCA GAT AAA GAT GTC TCC ATT CAG 4560 Gly Gly Lys Glu Gin Thr Phe Thr Ala Asp Lys Asp Val Ser He Gin 1505 1510 1515 1520
CCA TCA CCT AGC TTT GAT GAA ATG AAT TAT CAA TTT AAT GCC CTT GAA 4608 Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gin Phe Asn Ala Leu Glu 1525 1530 1535
ATA GAC GGT TCT GGT CTG AAT TTT ATT AAC AAC TCA GCC AGT ATT GAT 4656 He Asp Gly Ser Gly Leu Asn Phe He Asn Asn Ser Ala Ser He Asp 1540 1545 1550
GTT ACT TTT ACC GCA TTT GCG GAG GAT GGC CGC AAA CTG GGT TAT GAA 4704 Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu 1555 1560 1565
AGT TTC AGT ATT CCT GTT ACC CTC AAG GTA AGT ACC GAT AAT GCC CTG 4752 Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu 1570 1575 1580
ACC CTG CAC CAT AAT GAA AAT GGT GCG CAA TAT ATG CAA TGG CAA TCC 4800 Thr Leu His His Asn Glu Asn Gly Ala Gin Tyr Met Gin Trp Gin Ser 1585 1590 1595 1600
TAT CGT ACC CGC CTG AAT ACT CTA TTT GCC CGC CAG TTG GTT GCA CGC 4848 Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Ala Arg 1605 1610 1615
GCC ACC ACC GGA ATC GAT ACA ATT CTG AGT ATG GAA ACT CAG AAT ATT 4896 Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn Ile 1620 1625 1630
CAG GAA CCG CAG TTA GGC AAA GGT TTC TAT GCT ACG TTC GTG ATA CCT 4944
Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro 1635 1640 1645
CCC TAT AAC CTA TCA ACT CAT GGT GAT GAA CGT TGG TTT AAG CTT TAT 4992 Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr 1650 1655 1660
ATC AAA CAT GTT GTT GAT AAT AAT TCA CAT ATT ATC TAT TCA GGC CAG 5040 He Lys His Val Val Asp Asn Asn Ser His He He Tyr Ser Gly Gin 1665 1670 1675 1680
CTA ACA GAT ACA AAT ATA AAC ATC ACA TTA TTT ATT CCT CTT GAT GAT 5088
Leu Thr Asp Thr Asn He Asn He Thr Leu Phe He Pro Leu Asp Asp 1685 1690 1695
GTC CCA TTG AAT CAA GAT TAT CAC GCC AAG GTT TAT ATG ACC TTC AAG 5136
Val Pro Leu Asn Gin Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys
1700 1705 1710
AAA TCA CCA TCA GAT GGT ACC TGG TGG GGC CCT CAC TTT GTT AGA GAT 5184
Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp
1715 1720 1725
GAT AAA GGA ATA GTA ACA ATA AAC CCT AAA TCC ATT TTG ACC CAT TTT 5232 Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe 1730 1735 1740
GAG AGC GTC AAT GTC CTG AAT AAT ATT AGT AGC GAA CCA ATG GAT TTC 5280 Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe 1745 1750 1755 1760
AGC GGC GCT AAC AGC CTC TAT TTC TGG GAA CTG TTC TAC TAT ACC CCG 5328 Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 1765 1770 1775
ATG CTG GTT GCT CAA CGT TTG CTG CAT GAA CAG AAC TTC GAT GAA GCC 5376 Met Leu Val Ala Gln Arg Leu Leu His Glu Gln Asn Phe Asp Glu Ala 1780 1785 1790
AAC CGT TGG CTG AAA TAT GTC TGG AGT CCA TCC GGT TAT ATT GTC CAC 5424 Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val His 1795 1800 1805
GGC CAG ATT CAG AAC TAC CAG TGG AAC GTC CGC CCG TTA CTG GAA GAC 5472 Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu Glu Asp 1810 1815 1820
ACC AGT TGG AAC AGT GAT CCT TTG GAT TCC GTC GAT CCT GAC GCG GTA 5520 Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val 1825 1830 1835 1840
GCA CAG CAC GAT CCA ATG CAC TAC AAA GTT TCA ACT TTT ATG CGT ACC 5568 Ala Gln His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr 1845 1850 1855
TTG GAT CTA TTG ATA GCA CGC GGC GAC CAT GCT TAT CGC CAA CTG GAA 5616 Leu Asp Leu Leu He Ala Arg Gly Asp His Ala Tyr Arg Gin Leu Glu 1860 1865 1870
CGA GAT ACA CTC AAC GAA GCG AAG ATG TGG TAT ATG CAA GCG CTG CAT 5664 Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gin Ala Leu His 1875 1880 1885
CTA TTA GGT GAC AAA CCT TAT CTA CCG CTG AGT ACG ACA TGG AGT GAT 5712 Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp 1890 1895 1900
CCA CGA CTA GAC AGA GCC GCG GAT ATC ACT ACC CAA AAT GCT CAC GAC 5760 Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp 1905 1910 1915 1920
AGC GCA ATA GTC GCT CTG CGG CAG AAT ATA CCT ACA CCG GCA CCT TTA 5808
Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu 1925 1930 1935
TCA TTG CGC AGC GCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC 5856
Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile 1940 1945 1950
AAT GAA GTG ATG ATG AAT TAC TGG CAG ACA TTA GCT CAG AGA GTA TAC 5904 Asn Glu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr 1955 1960 1965
AAT CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA 5952
Asn Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Tyr Leu Pro 1970 1975 1980
ATC TAT GCC ACA CCG GCC GAT CCG AAA GCG TTA CTC AGC GCC GCC GTT 6000
He Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val 1985 1990 1995 2000
GCC ACT TCT CAA GGT GGA GGC AAG CTA CCG GAA TCA TTT ATG TCC CTG 6048 Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu 2005 2010 2015
TGG CGT TTC CCG CAC ATG CTG GAA AAT GCG CGC GGC ATG GTT AGC CAG 6096
Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln 2020 2025 2030
CTC ACC CAG TTC GGC TCC ACG TTA CAA AAT ATT ATC GAA CGT CAG GAC 6144 Leu Thr Gin Phe Gly Ser Thr Leu Gin Asn He He Glu Arg Gin Asp
2035 2040 2045
GCG GAA GCG CTC AAT GCG TTA TTA CAA AAT CAG GCC GCC GAG CTG ATA 6192 Ala Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu Leu He 2050 2055 2060
TTG ACT AAC CTG AGC ATT CAG GAC AAA ACC ATT GAA GAA TTG GAT GCC 6240 Leu Thr Asn Leu Ser He Gin Asp Lys Thr He Glu Glu Leu Asp Ala 2065 2070 2075 2080
GAG AAA ACG GTG TTG GAA AAA TCC AAA GCG GGA GCA CAA TCG CGC TTT 6288 Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gin Ser Arg Phe 2085 2090 2095 GAT AGC TAC GGC AAA CTG TAC GAT GAG AAT ATC AAC GCC GGT GAA AAC 6336 Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn He Asn Ala Gly Glu Asn 2100 2105 2110
CAA GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT 6384 Gin Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val 2115 2120 2125
CAG GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC 6432 Gin Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn He 2130 2135 2140
TTC GGC TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG 6480 Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala He Ala Glu Ala 2145 2150 2155 2160
ACA GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG 6528 Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala 2165 2170 2175
GAT AAA ATT AGC CAA TCT GAA ACC TAC CGT CGT CGC CGT CAG GAG TGG 6576 Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp 2180 2185 2190
GAG ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT 6624 Glu He Gin Arg Asn Asn Ala Glu Ala Glu Leu Lys Gin He Asp Ala 2195 2200 2205
CAG CTC AAA TCA CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA 6672 Gin Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gin Lys 2210 2215 2220
ACC AGT CTG AAA ACC CAA CAA GAA CAG ACC CAA TCT CAA TTG GCC TTC 6720 Thr Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu Ala Phe 2225 2230 2235 2240 CTG CAA CGT AAG TTC AGC AAT CAG GCG TTA TAC AAC TGG CTG CGT GGT 6768 Leu Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu Arg Gly 2245 2250 2255
CGA CTG GCG GCG ATT TAC TTC CAG TTC TAC GAT TTG GCC GTC GCG CGT 6816 Arg Leu Ala Ala He Tyr Phe Gin Phe Tyr Asp Leu Ala Val Ala Arg 2260 2265 2270
TGC CTG ATG GCA GAA CAA GCT TAC CGT TGG GAA CTC AAT GAT GAC TCT 6864 Cys Leu Met Ala Glu Gin Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser 2275 2280 2285
GCC CGC TTC ATT AAA CCG GGC GCC TGG CAG GGA ACC TAT GCC GGT CTG 6912 Ala Arg Phe He Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu 2290 2295 2300
CTT GCA GGT GAA ACC TTG ATG CTG AGT CTG GCA CAA ATG GAA GAC GCT 6960 Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gin Met Glu Asp Ala 2305 2310 2315 2320
CAT CTG AAA CGC GAT AAA CGC GCA TTA GAG GTT GAA CGC ACA GTA TCG 7008 His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser 2325 2330 2335
CTG GCC GAA GTT TAT GCA GGA TTA CCA AAA GAT AAC GGT CCA TTT TCC 7056 Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser 2340 2345 2350
CTG GCT CAG GAA ATT GAC AAG CTG GTG AGT CAA GGT TCA GGC AGT GCC 7104 Leu Ala Gin Glu He Asp Lys Leu Val Ser Gin Gly Ser Gly Ser Ala 2355 2360 2365
GGC AGT GGT AAT AAT AAT TTG GCG TTC GGC GCC GGC ACG GAC ACT AAA 7152 Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys 2370 2375 2380
ACC TCT TTG CAG GCA TCA GTT TCA TTC GCT GAT TTG AAA ATT CGT GAA 7200 Thr Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys He Arg Glu 2385 2390 2395 2400
GAT TAC CCG GCA TCG CTT GGC AAA ATT CGA CGT ATC AAA CAG ATC AGC 7248 Asp Tyr Pro Ala Ser Leu Gly Lys He Arg Arg He Lys Gin He Ser 2405 2410 2415
GTC ACT TTG CCC GCG CTA CTG GGA CCG TAT CAG GAT GTA CAG GCA ATA 7296 Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gin Asp Val Gin Ala He 2420 2425 2430
TTG TCT TAC GGC GAT AAA GCC GGA TTA GCT AAC GGC TGT GAA GCG CTG 7344 Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu 2435 2440 2445
GCA GTT TCT CAC GGT ATG AAT GAC AGC GGC CAA TTC CAG CTC GAT TTC 7392 Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe 2450 2455 2460
AAC GAT GGC AAA TTC CTG CCA TTC GAA GGC ATC GCC ATT GAT CAA GGC 7440 Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly He Ala He Asp Gin Gly 2465 2470 2475 2480 ACG CTG ACA CTG AGC TTC CCA AAT GCA TCT ATG CCG GAG AAA GGT AAA 7488 Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys
2485 2490 2495
CAA GCC ACT ATG TTA AAA ACC CTG AAC GAT ATC ATT TTG CAT ATT CGC 7536 Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg 2500 2505 2510
TAC ACC ATT AAA TAA 7551 Tyr Thr Ile Lys • • • 2516
(2) INFORMATION FOR SEQ ID NO: 47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2516 amino acids
(B) TYPE: amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47 (TcdA):
Description TcdA proteins TcdAii peptide. TcdAii N-terminus (SEQ ID NO:13) (SEQ ID NO:38) (SEQ ID NO:17) (SEQ ID NO:23; 12/13) (SEQ ID NO:18) (SEQ ID NO:39) (SEQ ID NO:21; 19/23) (SEQ ID NO:41) TcdAiii peptide (SEQ ID NO:42) (SEQ ID NO:43)
Met Asn Glu Ser Val Lys Glu He Pro Asp Val Leu Lys Ser Gin Cys 1 5 10 15
Gly Phe Asn Cys Leu Thr Asp He Ser His Ser Ser Phe Asn Glu Phe 20 25 30
Arg Gin Gin Val Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu
35 40 45
TyrTT-.-, Asp Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr Glu Ala 50 55 60
Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val His Leu
65 70 75 80
Ala He Leu Ala Pro Asn Ala Glu Leu He Gly Tyr Asn Asn Gin Phe 85 90 95
Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser Ser Met 100 105 110
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn 115 120 125
Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg Pro Asp 130 135 140
Leu Lys Ser Met Ala Leu Ser Gin Gin Asn Met Asp He Glu Leu Ser 145 150 155 160
Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser He Lys Thr Glu
165 170 175
Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe 180 185 190
Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg 195 200 205
Glu Val He Gin Leu Gin Asp Pro Gly Leu Glu Gin Leu Asn Ala Ser 210 215 220
Pro Ala He Ala Gly Leu Met His Gin Ala Ser Leu Leu Gly He Asn 225 230 235 240 Ala Ser He Ser Pro Glu Leu Phe Asn He Leu Thr Glu Glu He Thr
245 250 255
Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn He Glu 260 265 270
Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu 275 280 285
Ser Asp Glu Glu Leu Ser Gin Phe He Gly Lys Ala Ser Asn Phe Gly 290 295 300
Gin Gin Glu Tyr Ser Asn Asn Gin Leu He Thr Pro Val Val Asn Ser
305 310 315 320
Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr
325 330 335
Asn Ala Tyr Gin Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn 340 345 350
Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu 355 360 365
Ser He Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala 370 375 380
Pro Gin Val Asn He Glu Tyr Ser Ala Asn He Thr Leu Asn Thr Ala
385 390 395 400 Asp He Ser Gin Pro Phe Glu He Gly Leu Thr Arg Val Leu Pro Ser
405 410 415
Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn
420 425 430
Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala Ile Arg Leu Ser Arg
435 440 445
Ala Thr Glu Leu Ser Pro Thr He Leu Glu Gly He Val Arg Ser Val 450 455 460
Asn Leu Gin Leu Asp He Asn Thr Asp Val Leu Gly Lys Val Phe Leu 465 470 475 480 Thr Lys Tyr Tyr Met Gin Arg Tyr Ala He His Ala Glu Thr Ala Leu
485 490 495
He Leu Cys Asn Ala Pro He Ser Gin Arg Ser Tyr Asp Asn Gin Pro 500 505 510
Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gln Tyr 515 520 525
Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu Asn Ser Gly Ser Thr Gly 530 535 540
Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala Phe Asn Ile Asp Asp Val
545 550 555 560
Ser Leu Phe Arg Leu Leu Lys He Thr Asp His Asp Asn Lys Asp Gly
565 570 575
Lys He Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr He Gly Lys
580 585 590
Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Asp Leu Leu 595 600 605
Leu Ile Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala Ile Ser Asp 610 615 620
Lys Gin Leu Ala Thr Leu He Arg Lys Leu Asn Thr He Thr Ser Trp 625 630 635 640
Leu His Thr Gin Lys Trp Ser Val Phe Gin Leu Phe He Met Thr Ser 645 650 655
Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu He Lys Asn Leu Leu Asp
660 665 670
Thr Val Tyr His Gly Leu Gln Gly Phe Asp Lys Asp Lys Ala Asp Leu 675 680 685
Leu His Val Met Ala Pro Tyr Ile Ala Ala Thr Leu Gln Leu Ser Ser 690 695 700
Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gin Pro 705 710 715 720
Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr 725 730 735
Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gin Glu His He 740 745 750 Val Gin Tyr Cys Gin Ala Leu Ala Gin Leu Glu Met Val Tyr His Ser 755 760 765
Thr Gly He Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu
770 775 780
Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser
785 790 795 800
Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu 805 810 815
Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala 820 825 830 Glu Gin Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gin 835 840 845
Ala Ser He Gin Ala Gin Asn His Gin His Leu Pro Pro Val Thr Pro
850 855 860
Glu Asn Ala Phe Ser Cys Trp Thr Ser He Asn Thr He Leu Gin Trp
865 870 875 880
Val Asn Val Ala Gin Gin Leu Asn Val Ala Pro Gin Gly Val Ser Ala 885 890 895
Leu Val Gly Leu Asp Tyr He Gin Ser Met Lys Glu Thr Pro Thr Tyr 900 905 910 Ala Gin Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser 915 920 925
Gin Gin Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala
930 935 940
Ala Leu Ser Thr Tyr Tyr He Arg Gin Val Ala Lys Ala Ala Ala Ala
945 950 955 960
He Lys Ser Arg Asp Asp Leu Tyr Gin Tyr Leu Leu He Asp Asn Gin
965 970 975
Val Ser Ala Ala He Lys Thr Thr Arg He Ala Glu Ala He Ala Ser 980 985 990
He Gin Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala 995 1000 1005
Asn Ser Gly Val He Ser Arg Gin Phe Phe He Asp Trp Asp Lys Tyr 1010 1015 1020 Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gin Leu Val Tyr Tyr
1025 1030 1035 1040
Pro Glu Asn Tyr He Asp Pro Thr Met Arg He Gly Gin Thr Lys Met 1045 1050 1055
Met Asp Ala Leu Leu Gin Ser Val Ser Gin Ser Gin Leu Asn Ala Asp 1060 1065 1070
Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gin Val 1075 1080 1085
Ala Asn Leu Lys Val He Ser Ala Tyr His Asp Asn He Asn Asn Asp 1090 1095 1100
Gin Gly Leu Thr Tyr Phe He Gly Leu Ser Glu Thr Asp Ala Gly Glu 1105 1110 1115 1120
Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe 1125 1130 1135
Ala Ala Asn Ala Trp Ser Glu Trp His Lys He Asp Cys Pro He Asn 1140 1145 1150
Pro Tyr Lys Ser Thr He Arg Pro Val He Tyr Lys Ser Arg Leu Tyr 1155 1160 1165
Leu Leu Trp Leu Glu Gln Lys Glu Ile Thr Lys Gln Thr Gly Asn Ser 1170 1175 1180
Lys Asp Gly Tyr Gln Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu 1185 1190 1195 1200
Ala His He Arg Tyr Asp Gly Thr Trp Asn Thr Pro He Thr Phe Asp 1205 1210 1215
Val Asn Lys Lys He Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 1220 1225 1230
Gly Leu Tyr Cys Ala Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met 1235 1240 1245
Phe Tyr Asn Gin Gin Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met 1250 1255 1260 Gin Gly Leu Tyr He Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro
1265 1270 1275 1280
Glu Gin Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gin Gin Phe Asp Thr 1285 1290 1295
Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile
1300 1305 1310
Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 1315 1320 1325
Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala 1330 1335 1340
Ala Ser Ser Asp Leu Lys He Tyr He Ser Pro Lys Leu Arg He He 1345 1350 1355 1360
His Asn Gly Tyr Glu Gly Gln Lys Arg Asn Gln Cys Asn Leu Met Asn 1365 1370 1375
Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly 1380 1385 1390
Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 1395 1400 1405
Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gin Gly Arg Leu Leu Phe 1410 1415 1420
His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile Pro Gly 1425 1430 1435 1440
Ala Lys Arg Ser Leu Thr Asn Gln Asn Ala Ala Ile Gly Asp Asp Tyr 1445 1450 1455
Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gln Tyr Ile Phe 1460 1465 1470
Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu 1475 1480 1485
He Asn Thr Ala He Ser Pro Ala Lys Val Gin He He Val Lys Ala 1490 1495 1500
Gly Gly Lys Glu Gin Thr Phe Thr Ala Asp Lys Asp Val Ser He Gin 1505 1510 1515 1520
Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala Leu Glu 1525 1530 1535
Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile Asp 1540 1545 1550
Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu 1555 1560 1565
Ser Phe Ser He Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu 1570 1575 1580
Thr Leu His His Asn Glu Asn Gly Ala Gin Tyr Met Gin Trp Gin Ser 1585 1590 1595 1600
Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gin Leu Val Ala Arg 1605 1610 1615 Ala Thr Thr Gly He Asp Thr He Leu Ser Met Glu Thr Gin Asn He 1620 1625 1630
Gin Glu Pro Gin Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val He Pro 1635 1640 1645
Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr 1650 1655 1660
Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gln 1665 1670 1675 1680
Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp 1685 1690 1695
Val Pro Leu Asn Gin Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys 1700 1705 1710
Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp 1715 1720 1725
Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe 1730 1735 1740
Glu Ser Val Asn Val Leu Asn Asn He Ser Ser Glu Pro Met Asp Phe 1745 1750 1755 1760
Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 1765 1770 1775
Met Leu Val Ala Gin Arg Leu Leu His Glu Gin Asn Phe Asp Glu Ala 1780 1785 1790
Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr He Val His 1795 1800 1805 Gly Gin He Gin Asn Tyr Gin Trp Asn Val Arg Pro Leu Leu Glu Asp 1810 1815 1820
Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val 1825 1830 1835 1840
Ala Gin His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr 1845 1850 1855
Leu Asp Leu Leu He Ala Arg Gly Asp His Ala Tyr Arg Gin Leu Glu 1860 1865 1870
Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala Leu His 1875 1880 1885
Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp 1890 1895 1900
Pro Arg Leu Asp Arg Ala Ala Asp He Thr Thr Gin Asn Ala His Asp 1905 1910 1915 1920
Ser Ala He Val Ala Leu Arg Gin Asn He Pro Thr Pro Ala Pro Leu 1925 1930 1935
Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gin He 1940 1945 1950
Asn Glu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr 1955 1960 1965 Asn Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Tyr Leu Pro 1970 1975 1980
He Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val 1985 1990 1995 2000
Ala Thr Ser Gin Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu 2005 2010 2015
Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gin 2020 2025 2030
Leu Thr Gin Phe Gly Ser Thr Leu Gin Asn He He Glu Arg Gin Asp 2035 2040 2045 Ala Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu Leu He 2050 2055 2060
Leu Thr Asn Leu Ser He Gin Asp Lys Thr He Glu Glu Leu Asp Ala 2065 2070 2075 2080
Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe 2085 2090 2095
Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn He Asn Ala Gly Glu Asn 2100 2105 2110
Gin Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val 2115 2120 2125
Gin Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn He 2130 2135 2140
Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala He Ala Glu Ala 2145 2150 2155 2160
Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala 2165 2170 2175
Asp Lys He Ser Gin Ser Glu Thr Tyr Arg Arg Arg Arg Gin Glu Trp 2180 2185 2190
Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala 2195 2200 2205
Gin Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gin Lys 2210 2215 2220
Thr Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu Ala Phe 2225 2230 2235 2240
Leu Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu Arg Gly 2245 2250 2255
Arg Leu Ala Ala He Tyr Phe Gin Phe Tyr Asp Leu Ala Val Ala Arg 2260 2265 2270
Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser 2275 2280 2285
Ala Arg Phe He Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu 2290 2295 2300
Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gin Met Glu Asp Ala 2305 2310 2315 2320 His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser
2325 2330 2335
Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser 2340 2345 2350
Leu Ala Gin Glu He Asp Lys Leu Val Ser Gin Gly Ser Gly Ser Ala 2355 2360 2365
Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys 2370 2375 2380
Thr Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys He Arg Glu 2385 2390 2395 2400 Asp Tyr Pro Ala Ser Leu Gly Lys He Arg Arg He Lys Gin He Ser
2405 2410 2415
Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gin Asp Val Gin Ala He 2420 2425 2430
Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu
2435 2440 2445
Ala Val Ser His Gly Met Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe 2450 2455 2460
Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly He Ala He Asp Gin Gly 2465 2470 2475 2480
Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys 2485 2490 2495
Gin Ala Thr Met Leu Lys Thr Leu Asn Asp He He Leu His He Arg 2500 2505 2510 Tyr Thr He Lys 2516
(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5547 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48 (tcdAii coding region):
CTG ATA GGC TAT AAC AAT CAA TTT AGC GGT AGA GCC AGT CAA TAT GTT 48
Leu He Gly Tyr Asn Asn Gin Phe Ser Gly Arg Ala Ser Gin Tyr Val
1 5 10 15 GCG CCG GGT ACC GTT TCT TCC ATG TTC TCC CCC GCC GCT TAT TTG ACT 96
Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr 20 25 30
GAA CTT TAT CGT GAA GCA CGC AAT TTA CAC GCA AGT GAC TCC GTT TAT 144 Glu Leu Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr 35 40 45
TAT CTG GAT ACC CGC CGC CCA GAT CTC AAA TCA ATG GCG CTC AGT CAG 192
Tyr Leu Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gin 50 55 60
CAA AAT ATG GAT ATA GAA TTA TCC ACA CTC TCT TTG TCC AAT GAG CTG 240
Gin Asn Met Asp He Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu 65 70 75 80
TTA TTG GAA AGC ATT AAA ACT GAA TCT AAA CTG GAA AAC TAT ACT AAA 288
Leu Leu Glu Ser He Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys
85 90 95 GTG ATG GAA ATG CTC TCC ACT TTC CGT CCT TCC GGC GCA ACG CCT TAT 336
Val Met Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr 100 105 110
CAT GAT GCT TAT GAA AAT GTG CGT GAA GTT ATC CAG CTA CAA GAT CCT 384 His Asp Ala Tyr Glu Asn Val Arg Glu Val He Gin Leu Gin Asp Pro 115 120 125
GGA CTT GAG CAA CTC AAT GCA TCA CCG GCA ATT GCC GGG TTG ATG CAT 432
Gly Leu Glu Gin Leu Asn Ala Ser Pro Ala He Ala Gly Leu Met His 130 135 140
CAA GCC TCC CTA TTG GGT ATT AAC GCT TCA ATC TCG CCT GAG CTA TTT 480
Gin Ala Ser Leu Leu Gly He Asn Ala Ser He Ser Pro Glu Leu Phe 145 150 155 160
AAT ATT CTG ACG GAG GAG ATT ACC GAA GGT AAT GCT GAG GAA CTT TAT 528
Asn Ile Leu Thr Glu Glu Ile Thr Glu Gly Asn Ala Glu Glu Leu Tyr
165 170 175
AAG AAA AAT TTT GGT AAT ATC GAA CCG GCC TCA TTG GCT ATG CCG GAA 576
Lys Lys Asn Phe Gly Asn He Glu Pro Ala Ser Leu Ala Met Pro Glu
180 185 190 TAC CTT AAA CGT TAT TAT AAT TTA AGC GAT GAA GAA CTT AGT CAG TTT 624 Tyr Leu Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gin Phe 195 200 205
ATT GGT AAA GCC AGC AAT TTT GGT CAA CAG GAA TAT AGT AAT AAC CAA 672 He Gly Lys Ala Ser Asn Phe Gly Gin Gin Glu Tyr Ser Asn Asn Gin 210 215 220
CTT ATT ACT CCG GTA GTC AAC AGC AGT GAT GGC ACG GTT AAG GTA TAT 720 Leu He Thr Pro Val Val Asn Ser Ser Asp Gly Thr Val Lys Val Tyr 225 230 235 240
CGG ATC ACC CGC GAA TAT ACA ACC AAT GCT TAT CAA ATG GAT GTG GAG 768
Arg He Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gin Met Asp Val Glu 245 250 255
CTA TTT CCC TTC GGT GGT GAG AAT TAT CGG TTA GAT TAT AAA TTC AAA 816 Leu Phe Pro Phe Gly Gly Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys 260 265 270
AAT TTT TAT AAT GCC TCT TAT TTA TCC ATC AAG TTA AAT GAT AAA AGA 864 Asn Phe Tyr Asn Ala Ser Tyr Leu Ser Ile Lys Leu Asn Asp Lys Arg 275 280 285
GAA CTT GTT CGA ACT GAA GGC GCT CCT CAA GTC AAT ATA GAA TAC TCC 912 Glu Leu Val Arg Thr Glu Gly Ala Pro Gin Val Asn He Glu Tyr Ser 290 295 300
GCA AAT ATC ACA TTA AAT ACC GCT GAT ATC AGT CAA CCT TTT GAA ATT 960 Ala Asn He Thr Leu Asn Thr Ala Asp He Ser Gin Pro Phe Glu He 305 310 315 320
GGC CTG ACA CGA GTA CTT CCT TCC GGT TCT TGG GCA TAT GCC GCC GCA 1008 Gly Leu Thr Arg Val Leu Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala 325 330 335
AAA TTT ACC GTT GAA GAG TAT AAC CAA TAC TCT TTT CTG CTA AAA CTT 1056
Lys Phe Thr Val Glu Glu Tyr Asn Gin Tyr Ser Phe Leu Leu Lys Leu 340 345 350 AAC AAG GCT ATT CGT CTA TCA CGT GCG ACA GAA TTG TCA CCC ACG ATT 1104
Asn Lys Ala He Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr He
355 360 365
CTG GAA GGC ATT GTG CGC AGT GTT AAT CTA CAA CTG GAT ATC AAC ACA 1152 Leu Glu Gly He Val Arg Ser Val Asn Leu Gin Leu Asp He Asn Thr
370 375 380
GAC GTA TTA GGT AAA GTT TTT CTG ACT AAA TAT TAT ATG CAG CGT TAT 1200
Asp Val Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Met Gln Arg Tyr 385 390 395 400
GCT ATT CAT GCT GAA ACT GCC CTG ATA CTA TGC AAC GCG CCT ATT TCA 1248
Ala He His Ala Glu Thr Ala Leu He Leu Cys Asn Ala Pro He Ser 405 410 415
CAA CGT TCA TAT GAT AAT CAA CCT AGC CAA TTT GAT CGC CTG TTT AAT 1296 Gln Arg Ser Tyr Asp Asn Gln Pro Ser Gln Phe Asp Arg Leu Phe Asn 420 425 430
ACG CCA TTA CTG AAC GGA CAA TAT TTT TCT ACC GGC GAT GAG GAG ATT 1344 Thr Pro Leu Leu Asn Gly Gln Tyr Phe Ser Thr Gly Asp Glu Glu Ile
435 440 445
GAT TTA AAT TCA GGT AGC ACC GGC GAT TGG CGA AAA ACC ATA CTT AAG 1392 Asp Leu Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr He Leu Lys 450 455 460
CGT GCA TTT AAT ATT GAT GAT GTC TCG CTC TTC CGC CTG CTT AAA ATT 1440 Arg Ala Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile 465 470 475 480
ACC GAC CAT GAT AAT AAA GAT GGA AAA ATT AAA AAT AAC CTA AAG AAT 1488 Thr Asp His Asp Asn Lys Asp Gly Lys He Lys Asn Asn Leu Lys Asn 485 490 495 CTT TCC AAT TTA TAT ATT GGA AAA TTA CTG GCA GAT ATT CAT CAA TTA 1536 Leu Ser Asn Leu Tyr He Gly Lys Leu Leu Ala Asp He His Gin Leu 500 505 510
ACC ATT GAT GAA CTG GAT TTA TTA CTG ATT GCC GTA GGT GAA GGA AAA 1584 Thr He Asp Glu Leu Asp Leu Leu Leu He Ala Val Gly Glu Gly Lys 515 520 525
ACT AAT TTA TCC GCT ATC AGT GAT AAG CAA TTG GCT ACC CTG ATC AGA 1632 Thr Asn Leu Ser Ala He Ser Asp Lys Gin Leu Ala Thr Leu He Arg 530 535 540
AAA CTC AAT ACT ATT ACC AGC TGG CTA CAT ACA CAG AAG TGG AGT GTA 1680
Lys Leu Asn Thr He Thr Ser Trp Leu His Thr Gin Lys Trp Ser Val
545 550 555 560
TTC CAG CTA TTT ATC ATG ACC TCC ACC AGC TAT AAC AAA ACG CTA ACG 1728
Phe Gin Leu Phe He Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr
565 570 575 CCT GAA ATT AAG AAT TTG CTG GAT ACC GTC TAC CAC GGT TTA CAA GGT 1776 Pro Glu He Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gin Gly 580 585 590
TTT GAT AAA GAC AAA GCA GAT TTG CTA CAT GTC ATG GCG CCC TAT ATT 1824 Phe Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr He 595 600 605
GCG GCC ACC TTG CAA TTA TCA TCG GAA AAT GTC GCC CAC TCG GTA CTC 1872 Ala Ala Thr Leu Gin Leu Ser Ser Glu Asn Val Ala His Ser Val Leu 610 615 620
CTT TGG GCA GAT AAG TTA CAG CCC GGC GAC GGC GCA ATG ACA GCA GAA 1920 Leu Trp Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu
625 630 635 640
AAA TTC TGG GAC TGG TTG AAT ACT AAG TAT ACG CCG GGT TCA TCC GAA 1968 Lys Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu 645 650 655 GCC GTA GAA ACG CAG GAA CAT ATC GTT CAG TAT TGT CAG GCT CTG GCA 2016 Ala Val Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala 660 665 670
CAA TTG GAA ATG GTT TAC CAT TCC ACC GGC ATC AAC GAA AAC GCC TTC 2064 Gin Leu Glu Met Val Tyr His Ser Thr Gly He Asn Glu Asn Ala Phe 675 680 685
CGT CTA TTT GTG ACA AAA CCA GAG ATG TTT GGC GCT GCA ACT GGA GCA 2112 Arg Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala 690 695 700
GCG CCC GCG CAT GAT GCC CTT TCA CTG ATT ATG CTG ACA CGT TTT GCG 2160 Ala Pro Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala 705 710 715 720
GAT TGG GTG AAC GCA CTA GGC GAA AAA GCG TCC TCG GTG CTA GCG GCA 2208
Asp Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala 725 730 735
TTT GAA GCT AAC TCG TTA ACG GCA GAA CAA CTG GCT GAT GCC ATG AAT 2256 Phe Glu Ala Asn Ser Leu Thr Ala Glu Gin Leu Ala Asp Ala Met Asn
740 745 750
CTT GAT GCT AAT TTG CTG TTG CAA GCC AGT ATT CAA GCA CAA AAT CAT 2304
Leu Asp Ala Asn Leu Leu Leu Gin Ala Ser He Gin Ala Gin Asn His 755 760 765
CAA CAT CTT CCC CCA GTA ACT CCA GAA AAT GCG TTC TCC TGT TGG ACA 2352
Gin His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr
770 775 780
TCT ATC AAT ACT ATC CTG CAA TGG GTT AAT GTC GCA CAA CAA TTG AAT 2400 Ser He Asn Thr He Leu Gin Trp Val Asn Val Ala Gin Gin Leu Asn 785 790 795 800 GTC GCC CCA CAG GGC GTT TCC GCT TTG GTC GGG CTG GAT TAT ATT CAA 2448 Val Ala Pro Gin Gly Val Ser Ala Leu Val Gly Leu Asp Tyr He Gin 805 810 815
TCA ATG AAA GAG ACA CCG ACC TAT GCC CAG TGG GAA AAC GCG GCA GGC 2496 Ser Met Lys Glu Thr Pro Thr Tyr Ala Gin Trp Glu Asn Ala Ala Gly 820 825 830
GTA TTA ACC GCC GGG TTG AAT TCA CAA CAG GCT AAT ACA TTA CAC GCT 2544 Val Leu Thr Ala Gly Leu Asn Ser Gin Gin Ala Asn Thr Leu His Ala 835 840 845
TTT CTG GAT GAA TCT CGC AGT GCC GCA TTA AGC ACC TAC TAT ATC CGT 2592
Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr He Arg
850 855 860
CAA GTC GCC AAG GCA GCG GCG GCT ATT AAA AGC CGT GAT GAC TTG TAT 2640
Gin Val Ala Lys Ala Ala Ala Ala He Lys Ser Arg Asp Asp Leu Tyr
865 870 875 880 CAA TAC TTA CTG ATT GAT AAT CAG GTT TCT GCG GCA ATA AAA ACC ACC 2688 Gin Tyr Leu Leu He Asp Asn Gin Val Ser Ala Ala He Lys Thr Thr 885 890 895
CGG ATC GCC GAA GCC ATT GCC AGT ATT CAA CTG TAC GTC AAC CGG GCA 2736 Arg Ile Ala Glu Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala 900 905 910
TTG GAA AAT GTG GAA GAA AAT GCC AAT TCG GGG GTT ATC AGC CGC CAA 2784 Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val He Ser Arg Gin 915 920 925
TTC TTT ATC GAC TGG GAC AAA TAC AAT AAA CGC TAC AGC ACT TGG GCG 2832 Phe Phe He Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala 930 935 940
GGT GTT TCT CAA TTA GTT TAC TAC CCG GAA AAC TAT ATT GAT CCG ACC 2880 Gly Val Ser Gln Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr 945 950 955 960 ATG CGT ATC GGA CAA ACC AAA ATG ATG GAC GCA TTA CTG CAA TCC GTC 2928 Met Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val 965 970 975
AGC CAA AGC CAA TTA AAC GCC GAT ACC GTC GAA GAT GCC TTT ATG TCT 2976 Ser Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser 980 985 990
TAT CTG ACA TCG TTT GAA CAA GTG GCT AAT CTT AAA GTT ATT AGC GCA 3024 Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val He Ser Ala 995 1000 1005
TAT CAC GAT AAT ATT AAT AAC GAT CAA GGG CTG ACC TAT TTT ATC GGA 3072
Tyr His Asp Asn He Asn Asn Asp Gin Gly Leu Thr Tyr Phe He Gly
1010 1015 1020 CTC AGT GAA ACT GAT GCC GGT GAA TAT TAT TGG CGC AGT GTC GAT CAC 3120
Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His 1025 1030 1035 1040
AGT AAA TTC AAC GAC GGT AAA TTC GCG GCT AAT GCC TGG AGT GAA TGG 3168 Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp
1045 1050 1055
CAT AAA ATT GAT TGT CCA ATT AAC CCT TAT AAA AGC ACT ATC CGT CCA 3216 His Lys He Asp Cys Pro He Asn Pro Tyr Lys Ser Thr He Arg Pro 1060 1065 1070
GTG ATA TAT AAA TCC CGC CTG TAT CTG CTC TGG TTG GAA CAA AAG GAG 3264 Val He Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin Lys Glu 1075 1080 1085
ATC ACC AAA CAG ACA GGA AAT AGT AAA GAT GGC TAT CAA ACT GAA ACG 3312 Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr 1090 1095 1100 GAT TAT CGT TAT GAA CTA AAA TTG GCG CAT ATC CGC TAT GAT GGC ACT 3360 Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr 1105 1110 1115 1120
TGG AAT ACG CCA ATC ACC TTT GAT GTC AAT AAA AAA ATA TCC GAG CTA 3408 Trp Asn Thr Pro He Thr Phe Asp Val Asn Lys Lys He Ser Glu Leu
1125 1130 1135
AAA CTG GAA AAA AAT AGA GCG CCC GGA CTC TAT TGT GCC GGT TAT CAA 3456 Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gin 1140 1145 1150
GGT GAA GAT ACG TTG CTG GTG ATG TTT TAT AAC CAA CAA GAC ACA CTA 3504 Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gin Gin Asp Thr Leu 1155 1160 1165
GAT AGT TAT AAA AAC GCT TCA ATG CAA GGA CTA TAT ATC TTT GCT GAT 3552 Asp Ser Tyr Lys Asn Ala Ser Met Gin Gly Leu Tyr He Phe Ala Asp 1170 1175 1180 ATG GCA TCC AAA GAT ATG ACC CCA GAA CAG AGC AAT GTT TAT CGG GAT 3600
Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp 1185 1190 1195 1200
AAT AGC TAT CAA CAA TTT GAT ACC AAT AAT GTC AGA AGA GTG AAT AAC 3648 Asn Ser Tyr Gin Gin Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn
1205 1210 1215
CGC TAT GCA GAG GAT TAT GAG ATT CCT TCC TCG GTA AGT AGC CGT AAA 3696
Arg Tyr Ala Glu Asp Tyr Glu He Pro Ser Ser Val Ser Ser Arg Lys 1220 1225 1230
GAC TAT GGT TGG GGA GAT TAT TAC CTC AGC ATG GTA TAT AAC GGA GAT 3744
Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp
1235 1240 1245
ATT CCA ACT ATC AAT TAC AAA GCC GCA TCA AGT GAT TTA AAA ATC TAT 3792 He Pro Thr He Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys He Tyr 1250 1255 1260 ATC TCA CCA AAA TTA AGA ATT ATT CAT AAT GGA TAT GAA GGA CAG AAG 3840
Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys
1265 1270 1275 1280
CGC AAT CAA TGC AAT CTG ATG AAT AAA TAT GGC AAA CTA GGT GAT AAA 3888 Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys
1285 1290 1295
TTT ATT GTT TAT ACT AGC TTG GGG GTC AAT CCA AAT AAC TCG TCA AAT 3936 Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 1300 1305 1310
AAG CTC ATG TTT TAC CCC GTC TAT CAA TAT AGC GGA AAC ACC AGT GGA 3984 Lys Leu Met Phe Tyr Pro Val Tyr Gin Tyr Ser Gly Asn Thr Ser Gly 1315 1320 1325
CTC AAT CAA GGG AGA CTA CTA TTC CAC CGT GAC ACC ACT TAT CCA TCT 4032 Leu Asn Gin Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser 1330 1335 1340
AAA GTA GAA GCT TGG ATT CCT GGA GCA AAA CGT TCT CTA ACC AAC CAA 4080 Lys Val Glu Ala Trp He Pro Gly Ala Lys Arg Ser Leu Thr Asn Gin 1345 1350 1355 1360
AAT GCC GCC ATT GGT GAT GAT TAT GCT ACA GAC TCT CTG AAT AAA CCG 4128 Asn Ala Ala He Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro 1365 1370 1375
GAT GAT CTT AAG CAA TAT ATC TTT ATG ACT GAC AGT AAA GGG ACT GCT 4176 Asp Asp Leu Lys Gln Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala 1380 1385 1390
ACT GAT GTC TCA GGC CCA GTA GAG ATT AAT ACT GCA ATT TCT CCA GCA 4224
Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala Ile Ser Pro Ala
1395 1400 1405 AAA GTT CAG ATA ATA GTC AAA GCG GGT GGC AAG GAG CAA ACT TTT ACC 4272
Lys Val Gin He He Val Lys Ala Gly Gly Lys Glu Gin Thr Phe Thr
1410 1415 1420
GCA GAT AAA GAT GTC TCC ATT CAG CCA TCA CCT AGC TTT GAT GAA ATG 4320 Ala Asp Lys Asp Val Ser Ile Gln Pro Ser Pro Ser Phe Asp Glu Met
1425 1430 1435 1440
AAT TAT CAA TTT AAT GCC CTT GAA ATA GAC GGT TCT GGT CTG AAT TTT 4368
Asn Tyr Gin Phe Asn Ala Leu Glu He Asp Gly Ser Gly Leu Asn Phe 1445 1450 1455
ATT AAC AAC TCA GCC AGT ATT GAT GTT ACT TTT ACC GCA TTT GCG GAG 4416
He Asn Asn Ser Ala Ser He Asp Val Thr Phe Thr Ala Phe Ala Glu 1460 1465 1470
GAT GGC CGC AAA CTG GGT TAT GAA AGT TTC AGT ATT CCT GTT ACC CTC 4464 Asp Gly Arg Lys Leu Gly Tyr Glu Ser Phe Ser He Pro Val Thr Leu 1475 1480 1485 AAG GTA AGT ACC GAT AAT GCC CTG ACC CTG CAC CAT AAT GAA AAT GGT 4512
Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly
1490 1495 1500
GCG CAA TAT ATG CAA TGG CAA TCC TAT CGT ACC CGC CTG AAT ACT CTA 4560 Ala Gin Tyr Met Gin Trp Gin Ser Tyr Arg Thr Arg Leu Asn Thr Leu 1505 1510 1515 1520
TTT GCC CGC CAG TTG GTT GCA CGC GCC ACC ACC GGA ATC GAT ACA ATT 4608
Phe Ala Arg Gin Leu Val Ala Arg Ala Thr Thr Gly He Asp Thr He 1525 1530 1535
CTG AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAG TTA GGC AAA GGT 4656
Leu Ser Met Glu Thr Gin Asn He Gin Glu Pro Gin Leu Gly Lys Gly
1540 1545 1550
TTC TAT GCT ACG TTC GTG ATA CCT CCC TAT AAC CTA TCA ACT CAT GGT 4704 Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly 1555 1560 1565 GAT GAA CGT TGG TTT AAG CTT TAT ATC AAA CAT GTT GTT GAT AAT AAT 4752 Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn
1570 1575 1580
TCA CAT ATT ATC TAT TCA GGC CAG CTA ACA GAT ACA AAT ATA AAC ATC 4800 Ser His Ile Ile Tyr Ser Gly Gln Leu Thr Asp Thr Asn Ile Asn Ile 1585 1590 1595 1600
ACA TTA TTT ATT CCT CTT GAT GAT GTC CCA TTG AAT CAA GAT TAT CAC 4848 Thr Leu Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gln Asp Tyr His 1605 1610 1615
GCC AAG GTT TAT ATG ACC TTC AAG AAA TCA CCA TCA GAT GGT ACC TGG 4896 Ala Lys Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp 1620 1625 1630 TGG GGC CCT CAC TTT GTT AGA GAT GAT AAA GGA ATA GTA ACA ATA AAC 4944
Trp Gly Pro His Phe Val Arg Asp Asp Lys Gly He Val Thr He Asn
1635 1640 1645
CCT AAA TCC ATT TTG ACC CAT TTT GAG AGC GTC AAT GTC CTG AAT AAT 4992 Pro Lys Ser He Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn 1650 1655 1660
ATT AGT AGC GAA CCA ATG GAT TTC AGC GGC GCT AAC AGC CTC TAT TTC 5040
He Ser Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe 1665 1670 1675 1680
TGG GAA CTG TTC TAC TAT ACC CCG ATG CTG GTT GCT CAA CGT TTG CTG 5088
Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gin Arg Leu Leu 1685 1690 1695
CAT GAA CAG AAC TTC GAT GAA GCC AAC CGT TGG CTG AAA TAT GTC TGG 5136 His Glu Gln Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp 1700 1705 1710 AGT CCA TCC GGT TAT ATT GTC CAC GGC CAG ATT CAG AAC TAC CAG TGG 5184 Ser Pro Ser Gly Tyr Ile Val His Gly Gln Ile Gln Asn Tyr Gln Trp 1715 1720 1725
AAC GTC CGC CCG TTA CTG GAA GAC ACC AGT TGG AAC AGT GAT CCT TTG 5232 Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu 1730 1735 1740
GAT TCC GTC GAT CCT GAC GCG GTA GCA CAG CAC GAT CCA ATG CAC TAC 5280 Asp Ser Val Asp Pro Asp Ala Val Ala Gin His Asp Pro Met His Tyr 1745 1750 1755 1760
AAA GTT TCA ACT TTT ATG CGT ACC TTG GAT CTA TTG ATA GCA CGC GGC 5328 Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu He Ala Arg Gly 1765 1770 1775
GAC CAT GCT TAT CGC CAA CTG GAA CGA GAT ACA CTC AAC GAA GCG AAG 5376 Asp His Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys 1780 1785 1790 ATG TGG TAT ATG CAA GCG CTG CAT CTA TTA GGT GAC AAA CCT TAT CTA 5424 Met Trp Tyr Met Gin Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu 1795 1800 1805
CCG CTG AGT ACG ACA TGG AGT GAT CCA CGA CTA GAC AGA GCC GCG GAT 5472 Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp 1810 1815 1820
ATC ACT ACC CAA AAT GCT CAC GAC AGC GCA ATA GTC GCT CTG CGG CAG 5520 He Thr Thr Gin Asn Ala His Asp Ser Ala He Val Ala Leu Arg Gin 1825 1830 1835 1840
AAT ATA CCT ACA CCG GCA CCT TTA TCA 5547 Asn He Pro Thr Pro Ala Pro Leu Ser 1845 1849
(2) INFORMATION FOR SEQ ID NO: 49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1849 amino acids
(B) TYPE: amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49 (TcdAii):
Leu He Gly Tyr Asn Asn Gin Phe Ser Gly Arg Ala Ser Gin Tyr Val 1 5 10 15
Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr 20 25 30
Glu Leu Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr 35 40 45 Tyr Leu Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gln 50 55 60
Gin Asn Met Asp He Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu 65 70 75 80
Leu Leu Glu Ser He Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys 85 90 95
Val Met Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr 100 105 110
His Asp Ala Tyr Glu Asn Val Arg Glu Val Ile Gln Leu Gln Asp Pro 115 120 125 Gly Leu Glu Gln Leu Asn Ala Ser Pro Ala Ile Ala Gly Leu Met His 130 135 140
Gin Ala Ser Leu Leu Gly He Asn Ala Ser He Ser Pro Glu Leu Phe 145 150 155 160
Asn He Leu Thr Glu Glu He Thr Glu Gly Asn Ala Glu Glu Leu Tyr 165 170 175
Lys Lys Asn Phe Gly Asn Ile Glu Pro Ala Ser Leu Ala Met Pro Glu 180 185 190
Tyr Leu Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gin Phe
195 200 205 He Gly Lys Ala Ser Asn Phe Gly Gin Gin Glu Tyr Ser Asn Asn Gin
210 215 220
Leu He Thr Pro Val Val Asn Ser Ser Asp Gly Thr Val Lys Val Tyr
225 230 235 240
Arg Ile Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gln Met Asp Val Glu
245 250 255
Leu Phe Pro Phe Gly Gly Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys
260 265 270
Asn Phe Tyr Asn Ala Ser Tyr Leu Ser He Lys Leu Asn Asp Lys Arg
275 280 285
Glu Leu Val Arg Thr Glu Gly Ala Pro Gin Val Asn He Glu Tyr Ser
290 295 300 Ala Asn He Thr Leu Asn Thr Ala Asp He Ser Gin Pro Phe Glu He 305 310 315 320
Gly Leu Thr Arg Val Leu Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala 325 330 335
Lys Phe Thr Val Glu Glu Tyr Asn Gin Tyr Ser Phe Leu Leu Lys Leu 340 345 350
Asn Lys Ala He Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr He 355 360 365
Leu Glu Gly He Val Arg Ser Val Asn Leu Gin Leu Asp He Asn Thr 370 375 380 Asp Val Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Met Gin Arg Tyr 385 390 395 400
Ala He His Ala Glu Thr Ala Leu He Leu Cys Asn Ala Pro He Ser 405 410 415
Gin Arg Ser Tyr Asp Asn Gin Pro Ser Gin Phe Asp Arg Leu Phe Asn 420 425 430
Thr Pro Leu Leu Asn Gly Gin Tyr Phe Ser Thr Gly Asp Glu Glu He 435 440 445
Asp Leu Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr He Leu Lys
450 455 460 Arg Ala Phe Asn He Asp Asp Val Ser Leu Phe Arg Leu Leu Lys He 465 470 475 480
Thr Asp His Asp Asn Lys Asp Gly Lys He Lys Asn Asn Leu Lys Asn 485 490 495
Leu Ser Asn Leu Tyr He Gly Lys Leu Leu Ala Asp He His Gin Leu 500 505 510
Thr He Asp Glu Leu Asp Leu Leu Leu He Ala Val Gly Glu Gly Lys 515 520 525
Thr Asn Leu Ser Ala He Ser Asp Lys Gin Leu Ala Thr Leu He Arg 530 535 540 Lys Leu Asn Thr He Thr Ser Trp Leu His Thr Gin Lys Trp Ser Val
545 550 555 560
Phe Gin Leu Phe He Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr 565 570 575
Pro Glu He Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gin Gly
580 585 590
Phe Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr He 595 600 605
Ala Ala Thr Leu Gin Leu Ser Ser Glu Asn Val Ala His Ser Val Leu 610 615 620 Leu Trp Ala Asp Lys Leu Gin Pro Gly Asp Gly Ala Met Thr Ala Glu 625 630 635 640
Lys Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu
645 650 655
Ala Val Glu Thr Gin Glu His He Val Gin Tyr Cys Gin Ala Leu Ala
660 665 670
Gin Leu Glu Met Val Tyr His Ser Thr Gly He Asn Glu Asn Ala Phe
675 680 685
Arg Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala 690 695 700
Ala Pro Ala His Asp Ala Leu Ser Leu He Met Leu Thr Arg Phe Ala 705 710 715 720
Asp Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala 725 730 735
Phe Glu Ala Asn Ser Leu Thr Ala Glu Gin Leu Ala Asp Ala Met Asn 740 745 750
Leu Asp Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His 755 760 765 Gln His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr 770 775 780
Ser He Asn Thr He Leu Gin Trp Val Asn Val Ala Gin Gin Leu Asn 785 790 795 800
Val Ala Pro Gin Gly Val Ser Ala Leu Val Gly Leu Asp Tyr He Gin 805 810 815
Ser Met Lys Glu Thr Pro Thr Tyr Ala Gln Trp Glu Asn Ala Ala Gly 820 825 830
Val Leu Thr Ala Gly Leu Asn Ser Gln Gln Ala Asn Thr Leu His Ala 835 840 845 Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg 850 855 860
Gln Val Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr 865 870 875 880
Gln Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr 885 890 895
Arg He Ala Glu Ala He Ala Ser He Gin Leu Tyr Val Asn Arg Ala 900 905 910
Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val He Ser Arg Gin
915 920 925 Phe Phe He Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala
930 935 940
Gly Val Ser Gin Leu Val Tyr Tyr Pro Glu Asn Tyr He Asp Pro Thr
945 950 955 960
Met Arg He Gly Gin Thr Lys Met Met Asp Ala Leu Leu Gin Ser Val
965 970 975
Ser Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser 980 985 990
Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala 995 1000 1005 Tyr His Asp Asn Ile Asn Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly 1010 1015 1020
Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His 1025 1030 1035 1040
Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 1045 1050 1055
His Lys He Asp Cys Pro He Asn Pro Tyr Lys Ser Thr He Arg Pro 1060 1065 1070
Val He Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin Lys Glu 1075 1080 1085
He Thr Lys Gin Thr Gly Asn Ser Lys Asp Gly Tyr Gin Thr Glu Thr 1090 1095 1100
Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His He Arg Tyr Asp Gly Thr 1105 1110 1115 1120 Trp Asn Thr Pro He Thr Phe Asp Val Asn Lys Lys He Ser Glu Leu
1125 1130 1135
Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gin 1140 1145 1150
Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gin Gin Asp Thr Leu 1155 1160 1165
Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp 1170 1175 1180
Met Ala Ser Lys Asp Met Thr Pro Glu Gin Ser Asn Val Tyr Arg Asp 1185 1190 1195 1200 Asn Ser Tyr Gin Gin Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn
1205 1210 1215
Arg Tyr Ala Glu Asp Tyr Glu He Pro Ser Ser Val Ser Ser Arg Lys 1220 1225 1230
Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp
1235 1240 1245
He Pro Thr He Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys He Tyr 1250 1255 1260
He Ser Pro Lys Leu Arg He He His Asn Gly Tyr Glu Gly Gin Lys 1265 1270 1275 1280 Arg Asn Gin Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys
1285 1290 1295
Phe He Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 1300 1305 1310
Lys Leu Met Phe Tyr Pro Val Tyr Gin Tyr Ser Gly Asn Thr Ser Gly 1315 1320 1325
Leu Asn Gin Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser 1330 1335 1340
Lys Val Glu Ala Trp He Pro Gly Ala Lys Arg Ser Leu Thr Asn Gin 1345 1350 1355 1360 Asn Ala Ala He Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro
1365 1370 1375
Asp Asp Leu Lys Gln Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala 1380 1385 1390
Thr Asp Val Ser Gly Pro Val Glu He Asn Thr Ala He Ser Pro Ala
1395 1400 1405
Lys Val Gin He He Val Lys Ala Gly Gly Lys Glu Gin Thr Phe Thr 1410 1415 1420
Ala Asp Lys Asp Val Ser He Gin Pro Ser Pro Ser Phe Asp Glu Met 1425 1430 1435 1440
Asn Tyr Gin Phe Asn Ala Leu Glu He Asp Gly Ser Gly Leu Asn Phe 1445 1450 1455
Ile Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu 1460 1465 1470 Asp Gly Arg Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu 1475 1480 1485
Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly 1490 1495 1500
Ala Gin Tyr Met Gin Trp Gin Ser Tyr Arg Thr Arg Leu Asn Thr Leu 1505 1510 1515 1520
Phe Ala Arg Gin Leu Val Ala Arg Ala Thr Thr Gly He Asp Thr He 1525 1530 1535
Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly Lys Gly 1540 1545 1550 Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly
1555 1560 1565
Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn 1570 1575 1580
Ser His He He Tyr Ser Gly Gin Leu Thr Asp Thr Asn He Asn He 1585 1590 1595 1600
Thr Leu Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gln Asp Tyr His 1605 1610 1615
Ala Lys Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp 1620 1625 1630 Trp Gly Pro His Phe Val Arg Asp Asp Lys Gly Ile Val Thr Ile Asn
1635 1640 1645
Pro Lys Ser He Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn 1650 1655 1660
He Ser Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe 1665 1670 1675 1680
Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gin Arg Leu Leu 1685 1690 1695
His Glu Gln Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp 1700 1705 1710 Ser Pro Ser Gly Tyr Ile Val His Gly Gln Ile Gln Asn Tyr Gln Trp 1715 1720 1725
Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu 1730 1735 1740
Asp Ser Val Asp Pro Asp Ala Val Ala Gln His Asp Pro Met His Tyr 1745 1750 1755 1760
Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly 1765 1770 1775
Asp His Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys 1780 1785 1790
Met Trp Tyr Met Gin Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu 1795 1800 1805
Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp 1810 1815 1820 He Thr Thr Gin Asn Ala His Asp Ser Ala He Val Ala Leu Arg Gin 1825 1830 1835 1840
Asn He Pro Thr Pro Ala Pro Leu Ser 1845 1849
(2) INFORMATION FOR SEQ ID NO: 50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1740 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50 (tcdAiii coding region):
TTG CGC AGC GCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC AAT 48 Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gin He Asn 1 5 10 15
GAA GTG ATG ATG AAT TAC TGG CAG ACA TTA GCT CAG AGA GTA TAC AAT 96 Glu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr Asn 20 25 30
CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA ATC 144
Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Tyr Leu Pro He 35 40 45
TAT GCC ACA CCG GCC GAT CCG AAA GCG TTA CTC AGC GCC GCC GTT GCC 192
Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ala 50 55 60 ACT TCT CAA GGT GGA GGC AAG CTA CCG GAA TCA TTT ATG TCC CTG TGG 240 Thr Ser Gin Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu Trp 65 70 75 80
CGT TTC CCG CAC ATG CTG GAA AAT GCG CGC GGC ATG GTT AGC CAG CTC 288 Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gin Leu
85 90 95
ACC CAG TTC GGC TCC ACG TTA CAA AAT ATT ATC GAA CGT CAG GAC GCG 336 Thr Gin Phe Gly Ser Thr Leu Gin Asn He He Glu Arg Gin Asp Ala 100 105 110
GAA GCG CTC AAT GCG TTA TTA CAA AAT CAG GCC GCC GAG CTG ATA TTG 384 Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu Leu He Leu 115 120 125
ACT AAC CTG AGC ATT CAG GAC AAA ACC ATT GAA GAA TTG GAT GCC GAG 432 Thr Asn Leu Ser He Gin Asp Lys Thr He Glu Glu Leu Asp Ala Glu 130 135 140 AAA ACG GTG TTG GAA AAA TCC AAA GCG GGA GCA CAA TCG CGC TTT GAT 480 Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gin Ser Arg Phe Asp 145 150 155 160
AGC TAC GGC AAA CTG TAC GAT GAG AAT ATC AAC GCC GGT GAA AAC CAA 528
Ser Tyr Gly Lys Leu Tyr Asp Glu Asn He Asn Ala Gly Glu Asn Gin 165 170 175
GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT CAG 576 Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val Gin 180 185 190
GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC TTC 624 Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn He Phe 195 200 205
GGC TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG ACA 672 Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala He Ala Glu Ala Thr 210 215 220
GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG GAT 720 Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala Asp 225 230 235 240 AAA ATT AGC CAA TCT GAA ACC TAC CGT CGT CGC CGT CAG GAG TGG GAG 768 Lys He Ser Gin Ser Glu Thr Tyr Arg Arg Arg Arg Gin Glu Trp Glu 245 250 255
ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT CAG 816 Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala Gln 260 265 270
CTC AAA TCA CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA ACC 864 Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gin Lys Thr 275 280 285
AGT CTG AAA ACC CAA CAA GAA CAG ACC CAA TCT CAA TTG GCC TTC CTG 912
Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu Ala Phe Leu
290 295 300
CAA CGT AAG TTC AGC AAT CAG GCG TTA TAC AAC TGG CTG CGT GGT CGA 960
Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu Arg Gly Arg
305 310 315 320 CTG GCG GCG ATT TAC TTC CAG TTC TAC GAT TTG GCC GTC GCG CGT TGC 1008 Leu Ala Ala He Tyr Phe Gin Phe Tyr Asp Leu Ala Val Ala Arg Cys 325 330 335
CTG ATG GCA GAA CAA GCT TAC CGT TGG GAA CTC AAT GAT GAC TCT GCC 1056 Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser Ala 340 345 350
CGC TTC ATT AAA CCG GGC GCC TGG CAG GGA ACC TAT GCC GGT CTG CTT 1104 Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu 355 360 365
GCA GGT GAA ACC TTG ATG CTG AGT CTG GCA CAA ATG GAA GAC GCT CAT 1152 Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gin Met Glu Asp Ala His 370 375 380
CTG AAA CGC GAT AAA CGC GCA TTA GAG GTT GAA CGC ACA GTA TCG CTG 1200 Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu 385 390 395 400 GCC GAA GTT TAT GCA GGA TTA CCA AAA GAT AAC GGT CCA TTT TCC CTG 1248 Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu 405 410 415
GCT CAG GAA ATT GAC AAG CTG GTG AGT CAA GGT TCA GGC AGT GCC GGC 1296 Ala Gin Glu He Asp Lys Leu Val Ser Gin Gly Ser Gly Ser Ala Gly 420 425 430
AGT GGT AAT AAT AAT TTG GCG TTC GGC GCC GGC ACG GAC ACT AAA ACC 1344 Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr 435 440 445
TCT TTG CAG GCA TCA GTT TCA TTC GCT GAT TTG AAA ATT CGT GAA GAT 1392
Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys He Arg Glu Asp
450 455 460 TAC CCG GCA TCG CTT GGC AAA ATT CGA CGT ATC AAA CAG ATC AGC GTC 1440
Tyr Pro Ala Ser Leu Gly Lys He Arg Arg He Lys Gin He Ser Val
465 470 475 480
ACT TTG CCC GCG CTA CTG GGA CCG TAT CAG GAT GTA CAG GCA ATA TTG 1488 Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gin Asp Val Gin Ala He Leu
485 490 495
TCT TAC GGC GAT AAA GCC GGA TTA GCT AAC GGC TGT GAA GCG CTG GCA 1536
Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala 500 505 510
GTT TCT CAC GGT ATG AAT GAC AGC GGC CAA TTC CAG CTC GAT TTC AAC 1584
Val Ser His Gly Met Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe Asn
515 520 525
GAT GGC AAA TTC CTG CCA TTC GAA GGC ATC GCC ATT GAT CAA GGC ACG 1632 Asp Gly Lys Phe Leu Pro Phe Glu Gly He Ala He Asp Gin Gly Thr 530 535 540 CTG ACA CTG AGC TTC CCA AAT GCA TCT ATG CCG GAG AAA GGT AAA CAA 1680 Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys Gin 545 550 555 560
GCC ACT ATG TTA AAA ACC CTG AAC GAT ATC ATT TTG CAT ATT CGC TAC 1728 Ala Thr Met Leu Lys Thr Leu Asn Asp He He Leu His He Arg Tyr
565 570 575
ACC ATT AAA TAA 1740 Thr He Lys 579
(2) INFORMATION FOR SEQ ID NO: 51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 amino acids
(B) TYPE: amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51 (TcdAiii):
Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile Asn 1 5 10 15
Glu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr Asn 20 25 30 Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Tyr Leu Pro He 35 40 45
Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ala 50 55 60
Thr Ser Gin Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu Trp 65 70 75 80
Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gin Leu 85 90 95
Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp Ala 100 105 110
Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile Leu 115 120 125
Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala Glu 130 135 140
Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe Asp 145 150 155 160 Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn Gln 165 170 175
Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val Gin 180 185 190
Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn He Phe 195 200 205
Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala He Ala Glu Ala Thr 210 215 220
Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala Asp
225 230 235 240 Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp Glu
245 250 255
He Gin Arg Asn Asn Ala Glu Ala Glu Leu Lys Gin He Asp Ala Gin 260 265 270
Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gin Lys Thr 275 280 285
Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu Ala Phe Leu 290 295 300
Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly Arg 305 310 315 320 Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg Cys
325 330 335
Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser Ala
340 345 350
Arg Phe He Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu
355 360 365
Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gin Met Glu Asp Ala His 370 375 380
Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu
385 390 395 400 Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu
405 410 415
Ala Gin Glu He Asp Lys Leu Val Ser Gin Gly Ser Gly Ser Ala Gly 420 425 430
Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr 435 440 445
Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys He Arg Glu Asp 450 455 460
Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser Val 465 470 475 480 Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile Leu
485 490 495
Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala
500 505 510
Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn
515 520 525
Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly Thr
530 535 540
Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys Gin 545 550 555 560
Ala Thr Met Leu Lys Thr Leu Asn Asp He He Leu His He Arg Tyr 565 570 575
Thr He Lys 579
(2) INFORMATION FOR SEQ ID NO: 52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52 (tcbAii coding region):
TTT ATA CAA GGT TAT AGT GAT CTG TTT GGT AAT CGT GCT GAT AAC TAT 48 Phe He Gin Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn Tyr 1 5 10 15
GCC GCG CCG GGC TCG GTT GCA TCG ATG TTC TCA CCG GCG GCT TAT TTG 96 Ala Ala Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala Tyr Leu 20 25 30
ACG GAA TTG TAC CGT GAA GCC AAA AAC TTG CAT GAC AGC AGC TCA ATT 144 Thr Glu Leu Tyr Arg Glu Ala Lys Asn Leu His Asp Ser Ser Ser He 35 40 45
CCG GAT TTA GCA AGC TTA ATG CTC AGC 192 Pro Asp Leu Ala Ser Leu Met Leu Ser 60
ATT TCA ACG CTG GCT CTC TCT AAT GAA 240 He Ser Thr Leu Ala Leu Ser Asn Glu 75 80
ACA AAA ACA GGA AAA TCA CAA GAT GAA 288 Thr Lys Thr Gly Lys Ser Gin Asp Glu 90 95
TAT CGT TTA AGT GGA GAG ACA CCT TAT 336 Tyr Arg Leu Ser Gly Glu Thr Pro Tyr
105 110
CAT CAC GCT TAT GAA ACT GTT CGT GAA ATC GTT CAT GAA CGT GAT CCA 384 His His Ala Tyr Glu Thr Val Arg Glu Ile Val His Glu Arg Asp Pro 115 120 125 GGA TTT CGT CAT TTG TCA CAG GCA CCC ATT GTT GCT GCT AAG CTC GAT 432 Gly Phe Arg His Leu Ser Gln Ala Pro Ile Val Ala Ala Lys Leu Asp 130 135 140
CCT GTG ACT TTG TTG GGT ATT AGC TCC CAT ATT TCG CCA GAA CTG TAT 480
528
576
624
672
720
768
816
864
TCC GCT GAT TGG ACT GAG ATT GCC CAT AAT CCC TAT CCT GAT ATG GTC 912 Ser Ala Asp Trp Thr Glu Ile Ala His Asn Pro Tyr Pro Asp Met Val 290 295 300 ATA AAT CAA AAG TAT GAA TCA CAG GCG ACA ATC AAA CGT AGT GAC TCT 960 Ile Asn Gln Lys Tyr Glu Ser Gln Ala Thr Ile Lys Arg Ser Asp Ser 305 310 315 320
GAC AAT ATA CTC AGT ATA GGG TTA CAA AGA TGG CAT AGC GGT AGT TAT 1008 Asp Asn He Leu Ser He Gly Leu Gin Arg Trp His Ser Gly Ser Tyr
325 330 335
AAT TTT GCC GCC GCC AAT TTT AAA ATT GAC CAA TAC TCC CCG AAA GCT 1056 Asn Phe Ala Ala Ala Asn Phe Lys Ile Asp Gln Tyr Ser Pro Lys Ala 340 345 350
TTC CTG CTT AAA ATG AAT AAG GCT ATT CGG TTG CTC AAA GCT ACC GGC 1104 Phe Leu Leu Lys Met Asn Lys Ala He Arg Leu Leu Lys Ala Thr Gly 355 360 365
CTC TCT TTT GCT ACG TTG GAG CGT ATT GTT GAT AGT GTT AAT AGC ACC 1152 Leu Ser Phe Ala Thr Leu Glu Arg He Val Asp Ser Val Asn Ser Thr 370 375 380 AAA TCC ATC ACG GTT GAG GTA TTA AAC AAG GTT TAT CGG GTA AAA TTC 1200
Lys Ser He Thr Val Glu Val Leu Asn Lys Val Tyr Arg Val Lys Phe 385 390 395 400
TAT ATT GAT CGT TAT GGC ATC AGT GAA GAG ACA GCC GCT ATT TTG GCT 1248 Tyr He Asp Arg Tyr Gly He Ser Glu Glu Thr Ala Ala He Leu Ala
405 410 415
AAT ATT AAT ATC TCT CAG CAA GCT GTT GGC AAT CAG CTT AGC CAG TTT 1296
Asn He Asn He Ser Gin Gin Ala Val Gly Asn Gin Leu Ser Gin Phe 420 425 430
GAG CAA CTA TTT AAT CAC CCG CCG CTC AAT GGT ATT CGC TAT GAA ATC 1344 Glu Gln Leu Phe Asn His Pro Pro Leu Asn Gly Ile Arg Tyr Glu Ile 435 440 445
AGT GAG GAC AAC TCC AAA CAT CTT CCT AAT CCT GAT CTG AAC CTT AAA 1392 Ser Glu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys 450 455 460
CCA GAC AGT ACC GGT GAT GAT CAA CGC AAG GCG GTT TTA AAA CGC GCG 1440 Pro Asp Ser Thr Gly Asp Asp Gin Arg Lys Ala Val Leu Lys Arg Ala 465 470 475 480
TTT CAG GTT AAC GCC AGT GAG TTG TAT CAG ATG TTA TTG ATC ACT GAT 1488 Phe Gin Val Asn Ala Ser Glu Leu Tyr Gin Met Leu Leu He Thr Asp 485 490 495
CGT AAA GAA GAC GGT GTT ATC AAA AAT AAC TTA GAG AAT TTG TCT GAT 1536 Arg Lys Glu Asp Gly Val He Lys Asn Asn Leu Glu Asn Leu Ser Asp 500 505 510
CTG TAT TTG GTT AGT TTG CTG GCC CAG ATT CAT AAC CTG ACT ATT GCT 1584 Leu Tyr Leu Val Ser Leu Leu Ala Gin He His Asn Leu Thr He Ala 515 520 525 GAA TTG AAC ATT TTG TTG GTG ATT TGT GGC TAT GGC GAC ACC AAC ATT 1632 Glu Leu Asn He Leu Leu Val He Cys Gly Tyr Gly Asp Thr Asn He 530 535 540
TAT CAG ATT ACC GAC GAT AAT TTA GCC AAA ATA GTG GAA ACA TTG TTG 1680 Tyr Gin He Thr Asp Asp Asn Leu Ala Lys He Val Glu Thr Leu Leu 545 550 555 560
TGG ATC ACT CAA TGG TTG AAG ACC CAA AAA TGG ACA GTT ACC GAC CTG 1728 Trp He Thr Gin Trp Leu Lys Thr Gin Lys Trp Thr Val Thr Asp Leu 565 570 575
TTT CTG ATG ACC ACG GCC ACT TAC AGC ACC ACT TTA ACG CCA GAA ATT 1776
Phe Leu Met Thr Thr Ala Thr Tyr Ser Thr Thr Leu Thr Pro Glu He 580 585 590
AGC AAT CTG ACG GCT ACG TTG TCT TCA ACT TTG CAT GGC AAA GAG AGT 1824
Ser Asn Leu Thr Ala Thr Leu Ser Ser Thr Leu His Gly Lys Glu Ser
595 600 605 CTG ATT GGG GAA GAT CTG AAA AGA GCA ATG GCG CCT TGC TTC ACT TCG 1872 Leu He Gly Glu Asp Leu Lys Arg Ala Met Ala Pro Cys Phe Thr Ser 610 615 620
GCT TTG CAT TTG ACT TCT CAA GAA GTT GCG TAT GAC CTG CTG TTG TGG 1920 Ala Leu His Leu Thr Ser Gin Glu Val Ala Tyr Asp Leu Leu Leu Trp 625 630 635 640
ATA GAC CAG ATT CAA CCG GCA CAA ATA ACT GTT GAT GGG TTT TGG GAA 1968 He Asp Gin He Gin Pro Ala Gin He Thr Val Asp Gly Phe Trp Glu 645 650 655
GAA GTG CAA ACA ACA CCA ACC AGC TTG AAG GTG ATT ACC TTT GCT CAG 2016 Glu Val Gln Thr Thr Pro Thr Ser Leu Lys Val Ile Thr Phe Ala Gln 660 665 670
GTG CTG GCA CAA TTG AGC CTG ATC TAT CGT CGT ATT GGG TTA AGT GAA 2064 Val Leu Ala Gin Leu Ser Leu He Tyr Arg Arg He Gly Leu Ser Glu 675 680 685 ACG GAA CTG TCA CTG ATC GTG ACT CAA TCT TCT CTG CTA GTG GCA GGC 2112 Thr Glu Leu Ser Leu He Val Thr Gin Ser Ser Leu Leu Val Ala Gly 690 695 700
AAA AGC ATA CTG GAT CAC GGT CTG TTA ACC CTG ATG GCC TTG GAA GGT 2160 Lys Ser He Leu Asp His Gly Leu Leu Thr Leu Met Ala Leu Glu Gly 705 710 715 720
AAA ACT TAT TTG ACC AGC TTT GAG CAG GTA GCA AAT CTG AAA GTA ATT 2976 Lys Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile 980 985 990
AGT GCT TAC CAC GAT AAT GTG AAT GTG GAT CAA GGA TTA ACT TAT TTT 3024 Ser Ala Tyr His Asp Asn Val Asn Val Asp Gln Gly Leu Thr Tyr Phe 995 1000 1005
ATC GGT ATC GAC CAA GCA GCT CCG GGT ACG TAT TAC TGG CGT AGT GTT 3072
He Gly He Asp Gin Ala Ala Pro Gly Thr Tyr Tyr Trp Arg Ser Val
1010 1015 1020
GAT CAC AGC AAA TGT GAA AAT GGC AAG TTT GCC GCT AAT GCT TGG GGT 3120
Asp His Ser Lys Cys Glu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly 1025 1030 1035 1040
GAG TGG AAT AAA ATT ACC TGT GCT GTC AAT CCT TGG AAA AAT ATC ATC 3168
Glu Trp Asn Lys He Thr Cys Ala Val Asn Pro Trp Lys Asn He He 1045 1050 1055 CGT CCG GTT GTT TAT ATG TCC CGC TTA TAT CTG CTA TGG CTG GAG CAG 3216 Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin 1060 1065 1070
CAA TCA AAG AAA AGT GAT GAT GGT AAA ACC ACG ATT TAT CAA TAT AAC 3264 Gin Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr He Tyr Gin Tyr Asn 1075 1080 1085
TTA AAA CTG GCT CAT ATT CGT TAC GAC GGT AGT TGG AAT ACA CCA TTT 3312 Leu Lys Leu Ala His He Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe 1090 1095 1100
ACT TTT GAT GTG ACA GAA AAG GTA AAA AAT TAC ACG TCG AGT ACT GAT 3360 Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp 1105 1110 1115 1120
GCT GCT GAA TCT TTA GGG TTG TAT TGT ACT GGT TAT CAA GGG GAA GAC 3408 Ala Ala Glu Ser Leu Gly Leu Tyr Cys Thr Gly Tyr Gin Gly Glu Asp 1125 1130 1135 ACT CTA TTA GTT ATG TTC TAT TCG ATG CAG AGT AGT TAT AGC TCC TAT 3456 Thr Leu Leu Val Met Phe Tyr Ser Met Gin Ser Ser Tyr Ser Ser Tyr 1140 1145 1150
ACC GAT AAT AAT GCG CCG GTC ACT GGG CTA TAT ATT TTC GCT GAT ATG 3504 Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr He Phe Ala Asp Met 1155 1160 1165
TCA TCA GAC AAT ATG ACG AAT GCA CAA GCA ACT AAC TAT TGG AAT AAC 3552 Ser Ser Asp Asn Met Thr Asn Ala Gln Ala Thr Asn Tyr Trp Asn Asn 1170 1175 1180
AGT TAT CCG CAA TTT GAT ACT GTG ATG GCA GAT CCG GAT AGC GAC AAT 3600 Ser Tyr Pro Gln Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn 1185 1190 1195 1200
AAA AAA GTC ATA ACC AGA AGA GTT AAT AAC CGT TAT GCG GAG GAT TAT 3648 Lys Lys Val He Thr Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr 1205 1210 1215 GAA ATT CCT TCC TCT GTG ACA AGT AAC AGT AAT TAT TCT TGG GGT GAT 3696
Glu He Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp 1220 1225 1230
CAC AGT TTA ACC ATG CTT TAT GGT GGT AGT GTT CCT AAT ATT ACT TTT 3744 His Ser Leu Thr Met Leu Tyr Gly Gly Ser Val Pro Asn He Thr Phe 1235 1240 1245
GAA TCG GCG GCA GAA GAT TTA AGG CTA TCT ACC AAT ATG GCA TTG AGT 3792
Glu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser 1250 1255 1260
ATT ATT CAT AAT GGA TAT GCG GGA ACC CGC CGT ATA CAA TGT AAT CTT 3840
He He His Asn Gly Tyr Ala Gly Thr Arg Arg He Gin Cys Asn Leu
1265 1270 1275 1280
ATG AAA CAA TAC GCT TCA TTA GGT GAT AAA TTT ATA ATT TAT GAT TCA 3888 Met Lys Gln Tyr Ala Ser Leu Gly Asp Lys Phe Ile Ile Tyr Asp Ser 1285 1290 1295
TCA TTT GAT GAT GCA AAC CGT TTT AAT CTG GTG CCA TTG TTT AAA TTC 3936 Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe 1300 1305 1310
GGA AAA GAC GAG AAC TCA GAT GAT AGT ATT TGT ATA TAT AAT GAA AAC 3984 Gly Lys Asp Glu Asn Ser Asp Asp Ser He Cys He Tyr Asn Glu Asn 1315 1320 1325
CCT TCC TCT GAA GAT AAG AAG TGG TAT TTT TCT TCG AAA GAT GAC AAT 4032
Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn 1330 1335 1340
AAA ACA GCG GAT TAT AAT GGT GGA ACT CAA TGT ATA GAT GCT GGA ACC 4080
Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cys He Asp Ala Gly Thr 1345 1350 1355 1360 AGT AAC AAA GAT TTT TAT TAT AAT CTC CAG GAG ATT GAA GTA ATT AGT 4128 Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gin Glu He Glu Val He Ser 1365 1370 1375
GTT ACT GGT GGG TAT TGG TCG AGT TAT AAA ATA TCC AAC CCG ATT AAT 4176 Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys Ile Ser Asn Pro Ile Asn 1380 1385 1390
ATC AAT ACG GGC ATT GAT AGT GCT AAA GTA AAA GTC ACC GTA AAA GCG 4224 He Asn Thr Gly He Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 1395 1400 1405
GGT GGT GAC GAT CAA ATC TTT ACT GCT GAT AAT AGT ACC TAT GTT CCT 4272
Gly Gly Asp Asp Gin He Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 1410 1415 1420
CAG CAA CCG GCA CCC AGT TTT GAG GAG ATG ATT TAT CAG TTC AAT AAC 4320
Gin Gin Pro Ala Pro Ser Phe Glu Glu Met He Tyr Gin Phe Asn Asn 1425 1430 1435 1440 CTG ACA ATA GAT TGT AAG AAT TTA AAT TTC ATC GAC AAT CAG GCA CAT 4368 Leu Thr He Asp Cys Lys Asn Leu Asn Phe He Asp Asn Gin Ala His 1445 1450 1455
ATT GAG ATT GAT TTC ACC GCT ACG GCA CAA GAT GGC CGA TTC TTG GGT 4416 Ile Glu Ile Asp Phe Thr Ala Thr Ala Gln Asp Gly Arg Phe Leu Gly 1460 1465 1470
GCA GAA ACT TTT ATT ATC CCG GTA ACT AAA AAA GTT CTC GGT ACT GAG 4464 Ala Glu Thr Phe He He Pro Val Thr Lys Lys Val Leu Gly Thr Glu 1475 1480 1485
AAC GTG ATT GCG TTA TAT AGC GAA AAT AAC GGT GTT CAA TAT ATG CAA 4512 Asn Val He Ala Leu Tyr Ser Glu Asn Asn Gly Val Gin Tyr Met Gin 1490 1495 1500
ATT GGC GCA TAT CGT ACC CGT TTG AAT ACG TTA TTC GCT CAA CAG TTG 4560 He Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gin Gin Leu 1505 1510 1515 1520 GTT AGC CGT GCT AAT CGT GGC ATT GAT GCA GTG CTC AGT ATG GAA ACT 4608 Val Ser Arg Ala Asn Arg Gly He Asp Ala Val Leu Ser Met Glu Thr 1525 1530 1535
CAG AAT ATT CAG GAA CCG CAA TTA GGA GCG GGC ACA TAT GTG CAG CTT 4656 Gin Asn He Gin Glu Pro Gin Leu Gly Ala Gly Thr Tyr Val Gin Leu 1540 1545 1550
GTG TTG GAT AAA TAT GAT GAG TCT ATT CAT GGC ACT AAT AAA AGC TTT 4704 Val Leu Asp Lys Tyr Asp Glu Ser He His Gly Thr Asn Lys Ser Phe 1555 1560 1565
GCT ATT GAA TAT GTT GAT ATA TTT AAA GAG AAC GAT AGT TTT GTG ATT 4752
Ala He Glu Tyr Val Asp He Phe Lys Glu Asn Asp Ser Phe Val He
1570 1575 1580 TAT CAA GGA GAA CTT AGC GAA ACA AGT CAA ACT GTT GTG AAA GTT TTC 4800
Tyr Gin Gly Glu Leu Ser Glu Thr Ser Gin Thr Val Val Lys Val Phe 1585 1590 1595 1600
TTA TCC TAT TTT ATA GAG GCG ACT GGA AAT AAG AAC CAC TTA TGG GTA 4848 Leu Ser Tyr Phe He Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val
1605 1610 1615
CGT GCT AAA TAC CAA AAG GAA ACG ACT GAT AAG ATC TTG TTC GAC CGT 4896
Arg Ala Lys Tyr Gin Lys Glu Thr Thr Asp Lys He Leu Phe Asp Arg 1620 1625 1630
ACT GAT GAG AAA GAT CCG CAC GGT TGG TTT CTC AGC GAC GAT CAC AAG 4944
Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys
1635 1640 1645
ACC TTT AGT GGT CTC TCT TCC GCA CAG GCA TTA AAG AAC GAC AGT GAA 4992 Thr Phe Ser Gly Leu Ser Ser Ala Gin Ala Leu Lys Asn Asp Ser Glu 1650 1655 1660 CCG ATG GAT TTC TCT GGC GCC AAT GCT CTC TAT TTC TGG GAA CTG TTC 5040 Pro Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe 1665 1670 1675 1680
TAT TAC ACG CCG ATG ATG ATG GCT CAT CGT TTG TTG CAG GAA CAG AAT 5088 Tyr Tyr Thr Pro Met Met Met Ala His Arg Leu Leu Gin Glu Gin Asn
1685 1690 1695
TTT GAT GCG GCG AAC CAT TGG TTC CGT TAT GTC TGG AGT CCA TCC GGT 5136 Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly 1700 1705 1710
TAT ATC GTT GAT GGT AAA ATT GCT ATC TAC CAC TGG AAC GTG CGA CCG 5184 Tyr Ile Val Asp Gly Lys Ile Ala Ile Tyr His Trp Asn Val Arg Pro 1715 1720 1725
CTG GAA GAA GAC ACC AGT TGG AAT GCA CAA CAA CTG GAC TCC ACC GAT 5232 Leu Glu Glu Asp Thr Ser Trp Asn Ala Gln Gln Leu Asp Ser Thr Asp 1730 1735 1740
CCA GAT GCT GTA GCC CAA GAT GAT CCG ATG CAC TAC AAG GTG GCT ACC 5280 Pro Asp Ala Val Ala Gln Asp Asp Pro Met His Tyr Lys Val Ala Thr 1745 1750 1755 1760
TTT ATG GCG ACG TTG GAT CTG CTA ATG GCC CGT GGT GAT GCT GCT TAC 5328 Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr
1765 1770 1775
CGC CAG TTA GAG CGT GAT ACG TTG GCT GAA GCT AAA ATG TGG TAT ACA 5376 Arg Gin Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr Thr 1780 1785 1790
CAG GCG CTT AAT CTG TTG GGT GAT GAG CCA CAA GTG ATG CTG AGT ACG 5424 Gin Ala Leu Asn Leu Leu Gly Asp Glu Pro Gin Val Met Leu Ser Thr 1795 1800 1805
ACT TGG GCT AAT CCA ACA TTG GGT AAT GCT GCT TCA AAA ACC ACA CAG 5472 Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gin 1810 1815 1820 CAG GTT CGT CAG CAA GTG CTT ACC CAG TTG CGT CTC AAT AGC AGG GTA 5520 Gin Val Arg Gin Gin Val Leu Thr Gin Leu Arg Leu Asn Ser Arg Val 1825 1830 1835 1840
AAA ACC CCG TTG 5532 Lys Thr Pro Leu 1844
(2) INFORMATION FOR SEQ ID NO: 53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1844 amino acids
(B) TYPE: amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53 (TcbAii):
Phe He Gin Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn Tyr 1 5 10 15
Ala Ala Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala Tyr Leu 20 25 30
Thr Glu Leu Tyr Arg Glu Ala Lys Asn Leu His Asp Ser Ser Ser He
35 40 45 Tyr Tyr Leu Asp Lys Arg Arg Pro Asp Leu Ala Ser Leu Met Leu Ser 50 55 60
Gin Lys Asn Met Asp Glu Glu He Ser Thr Leu Ala Leu Ser Asn Glu
65 70 75 80
Leu Cys Leu Ala Gly He Glu Thr Lys Thr Gly Lys Ser Gin Asp Glu 85 90 95
Val Met Asp Met Leu Ser Thr Tyr Arg Leu Ser Gly Glu Thr Pro Tyr 100 105 110
His His Ala Tyr Glu Thr Val Arg Glu He Val His Glu Arg Asp Pro 115 120 125 Gly Phe Arg His Leu Ser Gin Ala Pro He Val Ala Ala Lys Leu Asp 130 135 140
Pro Val Thr Leu Leu Gly He Ser Ser His He Ser Pro Glu Leu Tyr
145 150 155 160
Asn Leu Leu He Glu Glu He Pro Glu Lys Asp Glu Ala Ala Leu Asp
165 170 175
Thr Leu Tyr Lys Thr Asn Phe Gly Asp He Thr Thr Ala Gin Leu Met 180 185 190
Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr Gly Val Ser Pro Glu Asp He
195 200 205 Ala Tyr Val Thr Thr Ser Leu Ser His Val Gly Tyr Ser Ser Asp He 210 215 220
Leu Val He Pro Leu Val Asp Gly Val Gly Lys Met Glu Val Val Arg 225 230 235 240
Val Thr Arg Thr Pro Ser Asp Asn Tyr Thr Ser Gin Thr Asn Tyr He 245 250 255
Glu Leu Tyr Pro Gln Gly Gly Asp Asn Tyr Leu Ile Lys Tyr Asn Leu 260 265 270
Ser Asn Ser Phe Gly Leu Asp Asp Phe Tyr Leu Gin Tyr Lys Asp Gly 275 280 285
Ser Ala Asp Trp Thr Glu He Ala His Asn Pro Tyr Pro Asp Met Val 290 295 300 He Asn Gin Lys Tyr Glu Ser Gin Ala Thr He Lys Arg Ser Asp Ser 305 310 315 320
Asp Asn He Leu Ser He Gly Leu Gin Arg Trp His Ser Gly Ser Tyr 325 330 335
Asn Phe Ala Ala Ala Asn Phe Lys He Asp Gin Tyr Ser Pro Lys Ala 340 345 350
Phe Leu Leu Lys Met Asn Lys Ala He Arg Leu Leu Lys Ala Thr Gly 355 360 365
Leu Ser Phe Ala Thr Leu Glu Arg He Val Asp Ser Val Asn Ser Thr 370 375 380 Lys Ser He Thr Val Glu Val Leu Asn Lys Val Tyr Arg Val Lys Phe 385 390 395 400
Tyr He Asp Arg Tyr Gly He Ser Glu Glu Thr Ala Ala He Leu Ala 405 410 415
Asn He Asn He Ser Gin Gin Ala Val Gly Asn Gin Leu Ser Gin Phe 420 425 430
Glu Gin Leu Phe Asn His Pro Pro Leu Asn Gly He Arg Tyr Glu He 435 440 445
Ser Glu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys
450 455 460 Pro Asp Ser Thr Gly Asp Asp Gin Arg Lys Ala Val Leu Lys Arg Ala 465 470 475 480
Phe Gin Val Asn Ala Ser Glu Leu Tyr Gin Met Leu Leu He Thr Asp 485 490 495
Arg Lys Glu Asp Gly Val Ile Lys Asn Asn Leu Glu Asn Leu Ser Asp 500 505 510
Leu Tyr Leu Val Ser Leu Leu Ala Gin He His Asn Leu Thr He Ala 515 520 525
Glu Leu Asn He Leu Leu Val He Cys Gly Tyr Gly Asp Thr Asn He 530 535 540 Tyr Gin He Thr Asp Asp Asn Leu Ala Lys He Val Glu Thr Leu Leu
545 550 555 560
Trp He Thr Gin Trp Leu Lys Thr Gin Lys Trp Thr Val Thr Asp Leu
565 570 575
Phe Leu Met Thr Thr Ala Thr Tyr Ser Thr Thr Leu Thr Pro Glu He
580 585 590
Ser Asn Leu Thr Ala Thr Leu Ser Ser Thr Leu His Gly Lys Glu Ser 595 600 605
Leu Ile Gly Glu Asp Leu Lys Arg Ala Met Ala Pro Cys Phe Thr Ser 610 615 620
Ala Leu His Leu Thr Ser Gln Glu Val Ala Tyr Asp Leu Leu Leu Trp 625 630 635 640
He Asp Gin He Gin Pro Ala Gin He Thr Val Asp Gly Phe Trp Glu 645 650 655
Glu Val Gin Thr Thr Pro Thr Ser Leu Lys Val He Thr Phe Ala Gin
660 665 670
Val Leu Ala Gin Leu Ser Leu He Tyr Arg Arg He Gly Leu Ser Glu
675 680 685
Thr Glu Leu Ser Leu He Val Thr Gin Ser Ser Leu Leu Val Ala Gly 690 695 700
Lys Ser He Leu Asp His Gly Leu Leu Thr Leu Met Ala Leu Glu Gly 705 710 715 720
Phe His Thr Trp Val Asn Gly Leu Gly Gln His Ala Ser Leu Ile Leu 725 730 735
Ala Ala Leu Lys Asp Gly Ala Leu Thr Val Thr Asp Val Ala Gln Ala 740 745 750
Met Asn Lys Glu Glu Ser Leu Leu Gin Met Ala Ala Asn Gin Val Glu 755 760 765
Lys Asp Leu Thr Lys Leu Thr Ser Trp Thr Gin He Asp Ala He Leu 770 775 780
Gin Trp Leu Gin Met Ser Ser Ala Leu Ala Val Ser Pro Leu Asp Leu 785 790 795 800
Ala Gly Met Met Ala Leu Lys Tyr Gly He Asp His Asn Tyr Ala Ala
805 810 815 Trp Gin Ala Ala Ala Ala Ala Leu Met Ala Asp His Ala Asn Gin Ala 820 825 830
Gin Lys Lys Leu Asp Glu Thr Phe Ser Lys Ala Leu Cys Asn Tyr Tyr 835 840 845
Ile Asn Ala Val Val Asp Ser Ala Ala Gly Val Arg Asp Arg Asn Gly 850 855 860
Leu Tyr Thr Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Asp Val Ile 865 870 875 880
Thr Ser Arg He Ala Glu Ala He Ala Gly He Gin Leu Tyr Val Asn 885 890 895 Arg Ala Leu Asn Arg Asp Glu Gly Gin Leu Ala Ser Asp Val Ser Thr 900 905 910
Arg Gin Phe Phe Thr Asp Trp Glu Arg Tyr Asn Lys Arg Tyr Ser Thr
915 920 925
Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro Glu Asn Tyr Val Asp
930 935 940
Pro Thr Gin Arg He Gly Gin Thr Lys Met Met Asp Ala Leu Leu Gin 945 950 955 960
Ser He Asn Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe 965 970 975 Lys Thr Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val He 980 985 990
Ser Ala Tyr His Asp Asn Val Asn Val Asp Gin Gly Leu Thr Tyr Phe 995 1000 1005
He Gly He Asp Gin Ala Ala Pro Gly Thr Tyr Tyr Trp Arg Ser Val
1010 1015 1020
Asp His Ser Lys Cys Glu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly 1025 1030 1035 1040
Glu Trp Asn Lys He Thr Cys Ala Val Asn Pro Trp Lys Asn He He 1045 1050 1055
Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin 1060 1065 1070
Gln Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr Ile Tyr Gln Tyr Asn 1075 1080 1085
Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe 1090 1095 1100
Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp 1105 1110 1115 1120
Ala Ala Glu Ser Leu Gly Leu Tyr Cys Thr Gly Tyr Gin Gly Glu Asp 1125 1130 1135
Thr Leu Leu Val Met Phe Tyr Ser Met Gin Ser Ser Tyr Ser Ser Tyr 1140 1145 1150
Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr He Phe Ala Asp Met 1155 1160 1165 Ser Ser Asp Asn Met Thr Asn Ala Gin Ala Thr Asn Tyr Trp Asn Asn
1170 1175 1180
Ser Tyr Pro Gin Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn 1185 1190 1195 1200
Lys Lys Val He Thr Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr 1205 1210 1215
Glu He Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp 1220 1225 1230
His Ser Leu Thr Met Leu Tyr Gly Gly Ser Val Pro Asn He Thr Phe 1235 1240 1245 Glu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser 1250 1255 1260
He He His Asn Gly Tyr Ala Gly Thr Arg Arg He Gin Cys Asn Leu 1265 1270 1275 1280
Met Lys Gin Tyr Ala Ser Leu Gly Asp Lys Phe He He Tyr Asp Ser 1285 1290 1295
Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe 1300 1305 1310
Gly Lys Asp Glu Asn Ser Asp Asp Ser Ile Cys Ile Tyr Asn Glu Asn 1315 1320 1325
Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn 1330 1335 1340
Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cys He Asp Ala Gly Thr 1345 1350 1355 1360
Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gin Glu He Glu Val He Ser 1365 1370 1375
Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys He Ser Asn Pro He Asn 1380 1385 1390
He Asn Thr Gly He Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 1395 1400 1405
Gly Gly Asp Asp Gin He Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 1410 1415 1420
Gin Gin Pro Ala Pro Ser Phe Glu Glu Met He Tyr Gin Phe Asn Asn 1425 1430 1435 1440 Leu Thr He Asp Cys Lys Asn Leu Asn Phe He Asp Asn Gin Ala His
1445 1450 1455
He Glu He Asp Phe Thr Ala Thr Ala Gin Asp Gly Arg Phe Leu Gly 1460 1465 1470
Ala Glu Thr Phe He He Pro Val Thr Lys Lys Val Leu Gly Thr Glu 1475 1480 1485
Asn Val He Ala Leu Tyr Ser Glu Asn Asn Gly Val Gin Tyr Met Gin 1490 1495 1500
He Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gin Gin Leu
1505 1510 1515 1520 Val Ser Arg Ala Asn Arg Gly He Asp Ala Val Leu Ser Met Glu Thr 1525 1530 1535
Gin Asn He Gin Glu Pro Gin Leu Gly Ala Gly Thr Tyr Val Gin Leu 1540 1545 1550
Val Leu Asp Lys Tyr Asp Glu Ser He His Gly Thr Asn Lys Ser Phe 1555 1560 1565
Ala He Glu Tyr Val Asp He Phe Lys Glu Asn Asp Ser Phe Val He 1570 1575 1580
Tyr Gin Gly Glu Leu Ser Glu Thr Ser Gin Thr Val Val Lys Val Phe 1585 1590 1595 1600 Leu Ser Tyr Phe He Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val
1605 1610 1615
Arg Ala Lys Tyr Gln Lys Glu Thr Thr Asp Lys Ile Leu Phe Asp Arg 1620 1625 1630
Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys 1635 1640 1645
Thr Phe Ser Gly Leu Ser Ser Ala Gin Ala Leu Lys Asn Asp Ser Glu 1650 1655 1660
Pro Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe 1665 1670 1675 1680 Tyr Tyr Thr Pro Met Met Met Ala His Arg Leu Leu Gin Glu Gin Asn
1685 1690 1695
Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly 1700 1705 1710
Tyr He Val Asp Gly Lys He Ala He Tyr His Trp Asn Val Arg Pro 1715 1720 1725
Leu Glu Glu Asp Thr Ser Trp Asn Ala Gln Gln Leu Asp Ser Thr Asp 1730 1735 1740
Pro Asp Ala Val Ala Gin Asp Asp Pro Met His Tyr Lys Val Ala Thr 1745 1750 1755 1760 Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr
1765 1770 1775
Arg Gin Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr Thr 1780 1785 1790
Gin Ala Leu Asn Leu Leu Gly Asp Glu Pro Gin Val Met Leu Ser Thr 1795 1800 1805
Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gin 1810 1815 1820
Gin Val Arg Gin Gin Val Leu Thr Gin Leu Arg Leu Asn Ser Arg Val 1825 1830 1835 1840
Lys Thr Pro Leu 1844
(2) INFORMATION FOR SEQ ID NO: 54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1722 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54 (tcbAiii coding region):
CTA GGA ACA GCC AAT TCC CTG ACC GCT TTA TTC CTG CCG CAG GAA AAT 48 Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn 1 5 10 15
AGC AAG CTC AAA GGC TAC TGG CGG ACA CTG GCG CAG CGT ATG TTT AAT 96 Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gin Arg Met Phe Asn 20 25 30
TTA CGT CAT AAT CTG TCG ATT GAC GGC CAG CCG CTC TCC TTG CCG CTG 144 Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Ser Leu Pro Leu 35 40 45
TAT GCT AAA CCG GCT GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA 192 Tyr Ala Lys Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser 50 55 60
GCT TCT CAA GGG GGA GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC 240 Ala Ser Gln Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His 65 70 75 80
CGG GGC TTG GTT AAC CAG CTT 288 Arg Gly Leu Val Asn Gin Leu 90 95
TAC AGT GAG CGT CAG GAT GCG 336 Tyr Ser Glu Arg Gin Asp Ala 110
CAA GCC AGC GAG TTA ATA CTG 384 Gin Ala Ser Glu Leu He Leu 125
TTG GCA GAG CTG GAT TCG GAA 432 Leu Ala Glu Leu Asp Ser Glu
140
AAA ACC GCC TTG CAA GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC 480 Lys Thr Ala Leu Gin Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp 145 150 155 160
GAG CAG CGA 528 Glu Gin Arg
175
GGA GCG CAG 576 Gly Ala Gin 190
AAT ATC TTC 624 Asn He Phe
TAT GCC ATC 672 Tyr Ala He
GAT GCG GAG 720 Asp Ala Glu
240
AAA GTT GCT CAG TCG GAA ATA TAT CGC CGT CGC CGT CAA GAA TGG AAA 768 Lys Val Ala Gln Ser Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys 245 250 255
ATT CAG CGT GAC AAC GCA CAA GCG GAG ATT AAC CAG TTA AAC GCG CAA 816 Ile Gln Arg Asp Asn Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln 260 265 270
CTG GAA TCA CTG TCT ATT CGC CGT GAA GCC GCT GAA ATG CAA AAA GAG 864 Leu Glu Ser Leu Ser He Arg Arg Glu Ala Ala Glu Met Gin Lys Glu
275 280 285
TAC CTG AAA ACC CAG CAA GCT CAG GCG CAG GCA CAA CTT ACT TTC TTA 912
Tyr Leu Lys Thr Gin Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu 290 295 300
AGA AGC AAA TTC AGT AAT CAA GCG TTA TAT AGT TGG TTA CGA GGG CGT 960
Arg Ser Lys Phe Ser Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg 305 310 315 320
TTG TCA GGT ATT TAT TTC CAG TTC TAT GAC TTG GCC GTA TCA CGT TGC 1008 Leu Ser Gly He Tyr Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys 325 330 335 CTG ATG GCA GAG CAA TCC TAT CAA TGG GAA GCT AAT GAT AAT TCC ATT 1056
Leu Met Ala Glu Gin Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser He 340 345 350
AGC TTT GTC AAA CCG GGT GCA TGG CAA GGA ACT TAC GCC GGC TTA TTG 1104 Ser Phe Val Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu
355 360 365
TGT GGA GAA GCT TTG ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT 1152
Cys Gly Glu Ala Leu He Gin Asn Leu Ala Gin Met Glu Glu Ala Tyr 370 375 380
CTG AAA TGG GAA TCT CGC GCT TTG GAA GTA GAA CGC ACG GTT TCA TTG 1200
Leu Lys Trp Glu Ser Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu 385 390 395 400
GCA GTG GTT TAT GAT TCA CTG GAA GGT AAT GAT CGT TTT AAT TTA GCG 1248 Ala Val Val Tyr Asp Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala 405 410 415
GAA CAA ATA CCT GCA TTA TTG GAT AAG GGG GAG GGA ACA GCA GGA ACT 1296 Glu Gln Ile Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr 420 425 430
AAA GAA AAT GGG TTA TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GTC 1344 Lys Glu Asn Gly Leu Ser Leu Ala Asn Ala He Leu Ser Ala Ser Val 435 440 445
AAA TTG TCC GAC TTG AAA CTG GGA ACG GAT TAT CCA GAC AGT ATC GTT 1392 Lys Leu Ser Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val 450 455 460
GGT AGC AAC AAG GTT CGT CGT ATT AAG CAA ATC AGT GTT TCG CTA CCT 1440 Gly Ser Asn Lys Val Arg Arg He Lys Gin He Ser Val Ser Leu Pro 465 470 475 480
GCA TTG GTT GGG CCT TAT CAG GAT GTT CAG GCT ATG CTC AGC TAT GGT 1488 Ala Leu Val Gly Pro Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly 485 490 495
GGC AGT ACT CAA TTG CCG AAA GGT TGT TCA GCG TTG GCT GTG TCT CAT 1536 Gly Ser Thr Gin Leu Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His 500 505 510
GGT ACC AAT GAT AGT GGT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA 1584
Gly Thr Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys
515 520 525
TAC CTG CCA TTT GAA GGT ATT GCT CTT GAT GAT CAG GGT ACA CTG AAT 1632
Tyr Leu Pro Phe Glu Gly He Ala Leu Asp Asp Gin Gly Thr Leu Asn
530 535 540
CTT CAA TTT CCG AAT GCT ACC GAC AAG CAG AAA GCA ATA TTG CAA ACT 1680 Leu Gin Phe Pro Asn Ala Thr Asp Lys Gin Lys Ala He Leu Gin Thr 545 550 555 560 ATG AGC GAT ATT ATT TTG CAT ATT CGT TAT ACC ATC CGT TAA 1722 Met Ser Asp He He Leu His He Arg Tyr Thr He Arg ••• 565 570 573
(2) INFORMATION FOR SEQ ID NO: 55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 573 amino acids
(B) TYPE: amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55 (TcbAiii):
Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn
1 5 10 15
Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gin Arg Met Phe Asn 20 25 30
Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Ser Leu Pro Leu
35 40 45 Tyr Ala Lys Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser 50 55 60
Ala Ser Gin Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr He His 65 70 75 80
Arg Phe Pro Gin Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu 85 90 95
He Gin Phe Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala 100 105 110
Glu Ala Met Ser Gin Leu Leu Gin Thr Gin Ala Ser Glu Leu He Leu 115 120 125
Thr Ser Ile Arg Met Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu 130 135 140
Lys Thr Ala Leu Gin Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp
145 150 155 160
Ser Tyr Ser Gln Leu Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg 165 170 175
Ala Leu Ala Leu Arg Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln 180 185 190
He Ser Arg Met Ala Gly Ala Gly Val Asp Met Ala Pro Asn He Phe 195 200 205
Gly Leu Ala Asp Gly Gly Met His Tyr Gly Ala He Ala Tyr Ala He 210 215 220
Ala Asp Gly He Glu Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu 225 230 235 240
Lys Val Ala Gin Ser Glu He Tyr Arg Arg Arg Arg Gin Glu Trp Lys
245 250 255 He Gin Arg Asp Asn Ala Gin Ala Glu He Asn Gin Leu Asn Ala Gin 260 265 270
Leu Glu Ser Leu Ser He Arg Arg Glu Ala Ala Glu Met Gin Lys Glu 275 280 285
Tyr Leu Lys Thr Gin Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu 290 295 300
Arg Ser Lys Phe Ser Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg 305 310 315 320
Leu Ser Gly He Tyr Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys 325 330 335 Leu Met Ala Glu Gin Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser He
340 345 350
Ser Phe Val Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu
355 360 365
Cys Gly Glu Ala Leu He Gin Asn Leu Ala Gin Met Glu Glu Ala Tyr 370 375 380
Leu Lys Trp Glu Ser Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu 385 390 395 400
Ala Val Val Tyr Asp Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala
405 410 415 Glu Gin He Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr 420 425 430
Lys Glu Asn Gly Leu Ser Leu Ala Asn Ala He Leu Ser Ala Ser Val 435 440 445
Lys Leu Ser Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp Ser He Val
450 455 460
Gly Ser Asn Lys Val Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro 465 470 475 480
Ala Leu Val Gly Pro Tyr Gin Asp Val Gin Ala Met Leu Ser Tyr Gly 485 490 495 Gly Ser Thr Gin Leu Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His 500 505 510
Gly Thr Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys 515 520 525
Tyr Leu Pro Phe Glu Gly He Ala Leu Asp Asp Gin Gly Thr Leu Asn 530 535 540
Leu Gin Phe Pro Asn Ala Thr Asp Lys Gin Lys Ala He Leu Gin Thr 545 550 555 560
Met Ser Asp He He Leu His He Arg Tyr Thr He Arg 565 570 573
(2) INFORMATION FOR SEQ ID NO: 56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2994 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56 (tccA):
1 ATG AAT CAA CTC GCC AGT CCC CTG ATT TCC CGC ACC GAA GAG ATC CAC 48
1 Met Asn Gln Leu Ala Ser Pro Leu Ile Ser Arg Thr Glu Glu Ile His 16
49 AAC TTA CCC GGT AAA TTG ACC GAT CTT GGT TAT ACC TCA GTG TTT GAT 96
17 Asn Leu Pro Gly Lys Leu Thr Asp Leu Gly Tyr Thr Ser Val Phe Asp 32
97 GTG GTA CGT ATG CCG CGT GAG CGT TTT ATT CGT GAG CAT CGT GCT GAT 144 33 Val Val Arg Met Pro Arg Glu Arg Phe He Arg Glu His Arg Ala Asp 48
145 CTC GGG CGC AGT GCT GAA AAA ATG TAT GAC CTG GCA GTG GGC TAT GCT 192
49 Leu Gly Arg Ser Ala Glu Lys Met Tyr Asp Leu Ala Val Gly Tyr Ala 64
193 CAT CAG GTG TTA CAC CAT TTT CGC CGT AAT TCT CTT AGT GAA GCT GTT 240
65 His Gin Val Leu His His Phe Arg Arg Asn Ser Leu Ser Glu Ala Val 80
241 CAG TTT GGC TTG AGA AGT CCG TTC TCC GTA TCA GGC CCG GAT TAC GCC 288
81 Gin Phe Gly Leu Arg Ser Pro Phe Ser Val Ser Gly Pro Asp Tyr Ala 96
289 AAT CAG TTT CTT GAT GCA AAC ACG GGT TGG AAA GAT AAA GCA CCA AGT 336
97 Asn Gin Phe Leu Asp Ala Asn Thr Gly Trp Lys Asp Lys Ala Pro Ser 112
337 GGA TCA CCG GAA GCC AAT GAT GCG CCG GTA GCC TAT CTG ACT CAT ATT 384
113 Gly Ser Pro Glu Ala Asn Asp Ala Pro Val Ala Tyr Leu Thr His Ile 128
385 TAT CAA TTG GCC CTT GAA CAG GAA AAG AAT GGC GCC ACT ACC ATT ATG 432
129 Tyr Gln Leu Ala Leu Glu Gln Glu Lys Asn Gly Ala Thr Thr Ile Met 144
433 AAT ACG CTG GCG GAG CGT CGC CCC GAT CTG GGT GCT TTG TTA ATT AAT 480
145 Asn Thr Leu Ala Glu Arg Arg Pro Asp Leu Gly Ala Leu Leu He Asn 160
481 GAT AAA GCA ATC AAT GAG GTG ATA CCG CAA TTG CAG TTG GTC AAT GAA 528
161 Asp Lys Ala He Asn Glu Val He Pro Gin Leu Gin Leu Val Asn Glu 176
529 ATT CTG TCC AAA GCT ATT CAG AAG AAA CTG AGT TTG ACT GAT CTG GAA 576
177 He Leu Ser Lys Ala He Gin Lys Lys Leu Ser Leu Thr Asp Leu Glu 192
577 GCG GTA AAC GCC AGA CTT TCC ACT ACC CGT TAC CCG AAT AAT CTG CCG 624
193 Ala Val Asn Ala Arg Leu Ser Thr Thr Arg Tyr Pro Asn Asn Leu Pro 208
625 TAT CAT TAT GGT CAT CAG CAG ATT CAG ACA GCT CAA TCG GTA TTG GGT 672 209 Tyr His Tyr Gly His Gin Gin He Gin Thr Ala Gin Ser Val Leu Gly 224
673 ACT ACG TTG CAA GAT ATC ACT TTG CCA CAG ACG CTG GAT CTG CCG CAA 720
225 Thr Thr Leu Gin Asp He Thr Leu Pro Gin Thr Leu Asp Leu Pro Gin 240
721 AAC TTC TGG GCA ACA GCA AAA GGA AAA CTG AGC GAT ACG ACT GCC AGT 768
241 Asn Phe Trp Ala Thr Ala Lys Gly Lys Leu Ser Asp Thr Thr Ala Ser 256
769 GCT TTG ACC CGA CTG CAA ATC ATG GCG AGT CAG TTT TCG CCA GAG CAG 816 257 Ala Leu Thr Arg Leu Gin He Met Ala Ser Gin Phe Ser Pro Glu Gin 272
817 CAG AAA ATC ATT ACG GAG ACT GTC GGT CAG GAT TTC TAT CAG CTT AAC 864
273 Gin Lys He He Thr Glu Thr Val Gly Gin Asp Phe Tyr Gin Leu Asn 288
865 TAT GGT GAC AGT TCG CTT ACT GTG AAT AGT TTC AGC GAC ATG ACC ATA 912 289 Tyr Gly Asp Ser Ser Leu Thr Val Asn Ser Phe Ser Asp Met Thr He 304
913 ATG ACT GAT CGA ACA AGT TTG ACT GTA CCC CAG GTA GAA CTG ATG TTG 960 305 Met Thr Asp Arg Thr Ser Leu Thr Val Pro Gin Val Glu Leu Met Leu 320
961 TGT TCA ACT GTC GGA GGT TCT ACG GTT GTT AAG TCT GAT AAT GTG AGT 1008
321 Cys Ser Thr Val Gly Gly Ser Thr Val Val Lys Ser Asp Asn Val Ser 336
1009 TCT GGT GAC ACG ACA GCG ACG CCA TTT GCG TAT GGC GCC CGC TTT ATT 1056 337 Ser Gly Asp Thr Thr Ala Thr Pro Phe Ala Tyr Gly Ala Arg Phe He 352
1057 CAT GCC GGT AAG CCG GAG GCG ATT ACC CTG AGT CGC AGT GGT GCG GAG 1104 353 His Ala Gly Lys Pro Glu Ala He Thr Leu Ser Arg Ser Gly Ala Glu 368
1105 GCG CAT TTT GCT CTG ACG GTT AAC AAT CTG ACA GAT GAC AAG TTG GAC 1152 369 Ala His Phe Ala Leu Thr Val Asn Asn Leu Thr Asp Asp Lys Leu Asp 384
1153 CGT ATT AAC CGC ACA GTG CGC CTG CAA AAA TGG CTG AAT CTG CCT TAT 1200 385 Arg He Asn Arg Thr Val Arg Leu Gin Lys Trp Leu Asn Leu Pro Tyr 400
1201 GAG GAT ATT GAC CTG TTA GTG ACT TCT GCT ATG GAT GCG GAA ACA GGA 1248 401 Glu Asp He Asp Leu Leu Val Thr Ser Ala Met Asp Ala Glu Thr Gly 416
1249 AAT ACC GCG CTG TCG ATG AAC GAC AAT ACG CTG CGT ATG TTG GGA GTG 1296 417 Asn Thr Ala Leu Ser Met Asn Asp Asn Thr Leu Arg Met Leu Gly Val 432
1297 TTC AAA CAT TAT CAG GCG AAG TAT GGT GTT AGC GCT AAA CAA TTT GCT 1344
433 Phe Lys His Tyr Gln Ala Lys Tyr Gly Val Ser Ala Lys Gln Phe Ala 448
1345 GGC TGG CTG CGC GTA GTG GCC CCG TTT GCC ATT ACA CCG GCA ACG CCG 1392
449 Gly Trp Leu Arg Val Val Ala Pro Phe Ala Ile Thr Pro Ala Thr Pro 464
1393 TTT TTA GAC CAA GTG TTT AAC TCC GTC GGC ACC TTT GAT ACA CCG TTT 1440
465 Phe Leu Asp Gln Val Phe Asn Ser Val Gly Thr Phe Asp Thr Pro Phe 480
1441 GTG ATA GAT AAT CAG GAT TTT GTC TAT ACA TTG ACC ACC GGG GGC GAT 1488
481 Val Ile Asp Asn Gln Asp Phe Val Tyr Thr Leu Thr Thr Gly Gly Asp 496
1489 GGG GCG CGT GTT AAG CAT ATC AGC ACG GCA CTG GGC CTC AAT CAT CGT 1536
497 Gly Ala Arg Val Lys His Ile Ser Thr Ala Leu Gly Leu Asn His Arg 512
1537 CAG TTC CTG TTA TTG GCG GAT AAT ATT GCC CGT CAA CAG GGG AAT GTC 1584
513 Gln Phe Leu Leu Leu Ala Asp Asn Ile Ala Arg Gln Gln Gly Asn Val 528
1585 ACG CAA AGC ACA CTC AAC TGT AAT CTG TTT GTG GTG TCA GCT TTC TAC 1632
529 Thr Gln Ser Thr Leu Asn Cys Asn Leu Phe Val Val Ser Ala Phe Tyr 544
1633 CGT CTG GCT AAT TTG GCG CGC ACA TTG GGG ATA AAT CCA GAG TCT TTC 1680
545 Arg Leu Ala Asn Leu Ala Arg Thr Leu Gly Ile Asn Pro Glu Ser Phe 560
1681 TGT GCC TTG GTT GAT CGA TTA GAT GCA GGT ACA GGC ATC GTC TGG CAG 1728
561 Cys Ala Leu Val Asp Arg Leu Asp Ala Gly Thr Gly Ile Val Trp Gln 576
1729 CAA TTG GCA GGG AAA CCC ACA ATC ACG GTA CCA CAA AAA GAT TCC CCG 1776
577 Gln Leu Ala Gly Lys Pro Thr Ile Thr Val Pro Gln Lys Asp Ser Pro 592
1777 CTG GCG GCG GAT ATT CTG AGT TTG CTG CAA GCG CTA AGT GCG ATT GCT 1824
593 Leu Ala Ala Asp Ile Leu Ser Leu Leu Gln Ala Leu Ser Ala Ile Ala 608
1825 CAA TGG CAA CAA CAG CAC GAT TTA GAA TTT TCA GCA CTG CTT TTG CTG 1872
609 Gln Trp Gln Gln Gln His Asp Leu Glu Phe Ser Ala Leu Leu Leu Leu 624
1873 TTG AGT GAC AAC CCT ATT TCT ACC TCG CAG GGC ACT GAC GAT CAA TTG 1920
625 Leu Ser Asp Asn Pro Ile Ser Thr Ser Gln Gly Thr Asp Asp Gln Leu 640
1921 AAC TTT ATC CGT CAA GTG TGG CAG AAC CTA GGC AGT ACG TTT GTG GGT 1968
641 Asn Phe Ile Arg Gln Val Trp Gln Asn Leu Gly Ser Thr Phe Val Gly 656
1969 GCA ACA TTG TTG TCC CGC AGT GGG GCA CCA TTA GTC GAT ACC AAC GGC 2016
657 Ala Thr Leu Leu Ser Arg Ser Gly Ala Pro Leu Val Asp Thr Asn Gly 672
2017 CAC GCT ATT GAC TGG TTT GCT CTG CTC TCA GCA GGT AAT AGT CCG CTT 2064
673 His Ala Ile Asp Trp Phe Ala Leu Leu Ser Ala Gly Asn Ser Pro Leu 688
2065 ATC GAT AAG GTT GGT CTG GTG ACT GAT GCT GGC ATA CAA AGT GTT ATA 2112
689 Ile Asp Lys Val Gly Leu Val Thr Asp Ala Gly Ile Gln Ser Val Ile 704
2113 GCA ACG GTG GTC AAT ACA CAA AGC TTA TCT GAT GAA GAT AAG AAG CTG 2160
705 Ala Thr Val Val Asn Thr Gln Ser Leu Ser Asp Glu Asp Lys Lys Leu 720
2161 GCA ATC ACT ACT CTG ACT AAT ACG TTG AAT CAG GTA CAG AAA ACT CAA 2208
721 Ala Ile Thr Thr Leu Thr Asn Thr Leu Asn Gln Val Gln Lys Thr Gln 736
2209 CAG GGC GTG GCC GTC AGT CTG TTG GCG CAG ACT CTG AAC GTG AGT CAG 2256
737 Gln Gly Val Ala Val Ser Leu Leu Ala Gln Thr Leu Asn Val Ser Gln 752
2257 TCA CTG CCT GCG TTA TTG TTG CGC TGG AGT GGA CAA ACA ACC TAC CAG 2304
753 Ser Leu Pro Ala Leu Leu Leu Arg Trp Ser Gly Gln Thr Thr Tyr Gln 768
2305 TGG TTG AGT GCG ACT TGG GCA TTG AAG GAT GCC GTT AAG ACT GCC GCC 2352
769 Trp Leu Ser Ala Thr Trp Ala Leu Lys Asp Ala Val Lys Thr Ala Ala 784
2353 GAT ATT CCC GCT GAC TAT CTG CGT CAA TTA CGT GAA GTG GTA CGC CGC 2400
785 Asp Ile Pro Ala Asp Tyr Leu Arg Gln Leu Arg Glu Val Val Arg Arg 800
2401 TCC TTG TTG ACC CAA CAA TTC ACG CTG AGT CCT GCA ATG GTG CAA ACC 2448
801 Ser Leu Leu Thr Gln Gln Phe Thr Leu Ser Pro Ala Met Val Gln Thr 816
2449 TTG CTG GAC TAT CCA GCC TAT TTT GGC GCT TCC GCA GAA ACA GTG ACC 2496
817 Leu Leu Asp Tyr Pro Ala Tyr Phe Gly Ala Ser Ala Glu Thr Val Thr 832
2497 GAT ATC AGT TTG TGG ATG CTT TAT ACC CTG AGC TGT TAT AGC GAT TTA 2544
833 Asp Ile Ser Leu Trp Met Leu Tyr Thr Leu Ser Cys Tyr Ser Asp Leu 848
2545 TTG CTC CAA ATG GGT GAA GCT GGT GGT ACC GAA GAT GAT GTA CTG GCC 2592
849 Leu Leu Gln Met Gly Glu Ala Gly Gly Thr Glu Asp Asp Val Leu Ala 864
2593 TAC TTA CGC ACA GCT AAT GCT ACC ACA CCG TTG AGC CAA TCT GAT GCT 2640
865 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Leu Ser Gln Ser Asp Ala 880
2641 GCA CAG ACG TTG GCA ACG CTA TTG GGT TGG GAG GTT AAC GAG TTG CAA 2688
881 Ala Gln Thr Leu Ala Thr Leu Leu Gly Trp Glu Val Asn Glu Leu Gln 896
2689 GCC GCT TGG TCG GTA TTG GGC GGG ATT GCC AAA ACC ACA CCG CAA CTG 2736
897 Ala Ala Trp Ser Val Leu Gly Gly Ile Ala Lys Thr Thr Pro Gln Leu 912
2737 GAT GCG CTT CTG CGT TTG CAA CAG GCA CAG AAC CAA ACT GGT CTT GGC 2784
913 Asp Ala Leu Leu Arg Leu Gln Gln Ala Gln Asn Gln Thr Gly Leu Gly 928
2785 GTT ACA CAG CAA CAG CAA GGC TAT CTC CTG AGT CGT GAC AGT GAT TAT 2832
929 Val Thr Gln Gln Gln Gln Gly Tyr Leu Leu Ser Arg Asp Ser Asp Tyr 944
2833 ACC CTT TGG CAA AGC ACC GGT CAG GCG CTG GTG GCT GGC GTA TCC CAT 2880
945 Thr Leu Trp Gln Ser Thr Gly Gln Ala Leu Val Ala Gly Val Ser His 960
2881 GTC AAG GGC AGT AAC TGA GCATGGCAGA GCTCACTACC TGAGTGGATT TGATTT 2934
961 Val Lys Gly Ser Asn End 965
2935 TTCCGTATGG CCTAATGAGG CTATTTCTAA ACCGCCATTT AAGTAAGGCA GATAATTATG 2994
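The coding rows above pair each codon with a three-letter residue code, so any row can be checked mechanically against the standard genetic code. The following Python sketch is not part of the specification: the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative assumptions, and the example codons are copied from the row beginning at base 2881 above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes; "End" marks a stop codon, as in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}

def translate(codon_row: str) -> list[str]:
    """Translate a whitespace-separated row of codons into three-letter codes."""
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in codon_row.split()]

# Final coding row of the listing above (bases 2881 to 2898).
print(" ".join(translate("GTC AAG GGC AGT AAC TGA")))
```

Comparing the output of such a helper with the residue row printed beneath each codon row is one way to spot transcription errors in either row.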
(2) INFORMATION FOR SEQ ID NO: 57
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 965 amino acids
(B) TYPE: amino acid
(C) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 (TccA peptide)
Features
From To Description
1 10 SEQ ID NO: 8
1 Met Asn Gln Leu Ala Ser Pro Leu Ile Ser Arg Thr Glu Glu Ile His 16
17 Asn Leu Pro Gly Lys Leu Thr Asp Leu Gly Tyr Thr Ser Val Phe Asp 32
33 Val Val Arg Met Pro Arg Glu Arg Phe Ile Arg Glu His Arg Ala Asp 48
49 Leu Gly Arg Ser Ala Glu Lys Met Tyr Asp Leu Ala Val Gly Tyr Ala 64
65 His Gln Val Leu His His Phe Arg Arg Asn Ser Leu Ser Glu Ala Val 80
81 Gln Phe Gly Leu Arg Ser Pro Phe Ser Val Ser Gly Pro Asp Tyr Ala 96
97 Asn Gln Phe Leu Asp Ala Asn Thr Gly Trp Lys Asp Lys Ala Pro Ser 112
113 Gly Ser Pro Glu Ala Asn Asp Ala Pro Val Ala Tyr Leu Thr His Ile 128
129 Tyr Gln Leu Ala Leu Glu Gln Glu Lys Asn Gly Ala Thr Thr Ile Met 144
145 Asn Thr Leu Ala Glu Arg Arg Pro Asp Leu Gly Ala Leu Leu Ile Asn 160
161 Asp Lys Ala Ile Asn Glu Val Ile Pro Gln Leu Gln Leu Val Asn Glu 176
177 Ile Leu Ser Lys Ala Ile Gln Lys Lys Leu Ser Leu Thr Asp Leu Glu 192
193 Ala Val Asn Ala Arg Leu Ser Thr Thr Arg Tyr Pro Asn Asn Leu Pro 208
209 Tyr His Tyr Gly His Gln Gln Ile Gln Thr Ala Gln Ser Val Leu Gly 224
225 Thr Thr Leu Gln Asp Ile Thr Leu Pro Gln Thr Leu Asp Leu Pro Gln 240
241 Asn Phe Trp Ala Thr Ala Lys Gly Lys Leu Ser Asp Thr Thr Ala Ser 256
257 Ala Leu Thr Arg Leu Gln Ile Met Ala Ser Gln Phe Ser Pro Glu Gln 272
273 Gln Lys Ile Ile Thr Glu Thr Val Gly Gln Asp Phe Tyr Gln Leu Asn 288
289 Tyr Gly Asp Ser Ser Leu Thr Val Asn Ser Phe Ser Asp Met Thr Ile 304
305 Met Thr Asp Arg Thr Ser Leu Thr Val Pro Gln Val Glu Leu Met Leu 320
321 Cys Ser Thr Val Gly Gly Ser Thr Val Val Lys Ser Asp Asn Val Ser 336
337 Ser Gly Asp Thr Thr Ala Thr Pro Phe Ala Tyr Gly Ala Arg Phe Ile 352
353 His Ala Gly Lys Pro Glu Ala Ile Thr Leu Ser Arg Ser Gly Ala Glu 368
369 Ala His Phe Ala Leu Thr Val Asn Asn Leu Thr Asp Asp Lys Leu Asp 384
385 Arg Ile Asn Arg Thr Val Arg Leu Gln Lys Trp Leu Asn Leu Pro Tyr 400
401 Glu Asp Ile Asp Leu Leu Val Thr Ser Ala Met Asp Ala Glu Thr Gly 416
417 Asn Thr Ala Leu Ser Met Asn Asp Asn Thr Leu Arg Met Leu Gly Val 432
433 Phe Lys His Tyr Gln Ala Lys Tyr Gly Val Ser Ala Lys Gln Phe Ala 448
449 Gly Trp Leu Arg Val Val Ala Pro Phe Ala Ile Thr Pro Ala Thr Pro 464
465 Phe Leu Asp Gln Val Phe Asn Ser Val Gly Thr Phe Asp Thr Pro Phe 480
481 Val Ile Asp Asn Gln Asp Phe Val Tyr Thr Leu Thr Thr Gly Gly Asp 496
497 Gly Ala Arg Val Lys His Ile Ser Thr Ala Leu Gly Leu Asn His Arg 512
513 Gln Phe Leu Leu Leu Ala Asp Asn Ile Ala Arg Gln Gln Gly Asn Val 528
529 Thr Gln Ser Thr Leu Asn Cys Asn Leu Phe Val Val Ser Ala Phe Tyr 544
545 Arg Leu Ala Asn Leu Ala Arg Thr Leu Gly Ile Asn Pro Glu Ser Phe 560
561 Cys Ala Leu Val Asp Arg Leu Asp Ala Gly Thr Gly Ile Val Trp Gln 576
577 Gln Leu Ala Gly Lys Pro Thr Ile Thr Val Pro Gln Lys Asp Ser Pro 592
593 Leu Ala Ala Asp Ile Leu Ser Leu Leu Gln Ala Leu Ser Ala Ile Ala 608
609 Gln Trp Gln Gln Gln His Asp Leu Glu Phe Ser Ala Leu Leu Leu Leu 624
625 Leu Ser Asp Asn Pro Ile Ser Thr Ser Gln Gly Thr Asp Asp Gln Leu 640
641 Asn Phe Ile Arg Gln Val Trp Gln Asn Leu Gly Ser Thr Phe Val Gly 656
657 Ala Thr Leu Leu Ser Arg Ser Gly Ala Pro Leu Val Asp Thr Asn Gly 672
673 His Ala Ile Asp Trp Phe Ala Leu Leu Ser Ala Gly Asn Ser Pro Leu 688
689 Ile Asp Lys Val Gly Leu Val Thr Asp Ala Gly Ile Gln Ser Val Ile 704
705 Ala Thr Val Val Asn Thr Gln Ser Leu Ser Asp Glu Asp Lys Lys Leu 720
721 Ala Ile Thr Thr Leu Thr Asn Thr Leu Asn Gln Val Gln Lys Thr Gln 736
737 Gln Gly Val Ala Val Ser Leu Leu Ala Gln Thr Leu Asn Val Ser Gln 752
753 Ser Leu Pro Ala Leu Leu Leu Arg Trp Ser Gly Gln Thr Thr Tyr Gln 768
769 Trp Leu Ser Ala Thr Trp Ala Leu Lys Asp Ala Val Lys Thr Ala Ala 784
785 Asp Ile Pro Ala Asp Tyr Leu Arg Gln Leu Arg Glu Val Val Arg Arg 800
801 Ser Leu Leu Thr Gln Gln Phe Thr Leu Ser Pro Ala Met Val Gln Thr 816
817 Leu Leu Asp Tyr Pro Ala Tyr Phe Gly Ala Ser Ala Glu Thr Val Thr 832
833 Asp Ile Ser Leu Trp Met Leu Tyr Thr Leu Ser Cys Tyr Ser Asp Leu 848
849 Leu Leu Gln Met Gly Glu Ala Gly Gly Thr Glu Asp Asp Val Leu Ala 864
865 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Leu Ser Gln Ser Asp Ala 880
881 Ala Gln Thr Leu Ala Thr Leu Leu Gly Trp Glu Val Asn Glu Leu Gln 896
897 Ala Ala Trp Ser Val Leu Gly Gly Ile Ala Lys Thr Thr Pro Gln Leu 912
913 Asp Ala Leu Leu Arg Leu Gln Gln Ala Gln Asn Gln Thr Gly Leu Gly 928
929 Val Thr Gln Gln Gln Gln Gly Tyr Leu Leu Ser Arg Asp Ser Asp Tyr 944
945 Thr Leu Trp Gln Ser Thr Gly Gln Ala Leu Val Ala Gly Val Ser His 960
961 Val Lys Gly Ser Asn 965
(2) INFORMATION FOR SEQ ID NO: 58
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4932 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58 (tccB)
1 ATG TTA TCG ACA ATG GAA AAA CAA CTG AAT GAA TCC CAG CGT GAT GCG 48
1 Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp Ala 16
49 TTG GTG ACT GGC TAT ATG AAT TTT GTG GCG CCG ACG TTG AAA GGC GTC 96
17 Leu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Gly Val 32
97 AGT GGT CAG CCG GTG ACG GTG GAA GAT TTA TAC GAA TAT TTG CTG ATT 144
33 Ser Gly Gln Pro Val Thr Val Glu Asp Leu Tyr Glu Tyr Leu Leu Ile 48
145 GAC CCG GAA GTG GCT GAT GAG GTT GAG ACG AGT CGG GTA GCA CAA GCG 192
49 Asp Pro Glu Val Ala Asp Glu Val Glu Thr Ser Arg Val Ala Gln Ala 64
193 ATT GCC AGC ATA CAG CAA TAT ATG ACT CGT CTG GTC AAC GGC TCT GAA 240
65 Ile Ala Ser Ile Gln Gln Tyr Met Thr Arg Leu Val Asn Gly Ser Glu 80
241 CCG GGG CGT CAG GCG ATG GAG CCT TCT ACA GCT AAC GAA TGG CGT GAT 288
81 Pro Gly Arg Gln Ala Met Glu Pro Ser Thr Ala Asn Glu Trp Arg Asp 96
289 AAT GAT AAC CAA TAT GCT ATC TGG GCT GCG GGG GCT GAG GTT CGA AAT 336
97 Asn Asp Asn Gln Tyr Ala Ile Trp Ala Ala Gly Ala Glu Val Arg Asn 112
337 TAC GCT GAA AAC TAT ATT TCA CCC ATC ACC CGG CAG GAA AAA AGC CAT 384
113 Tyr Ala Glu Asn Tyr Ile Ser Pro Ile Thr Arg Gln Glu Lys Ser His 128
385 TAT TTC TCG GAG CTG GAG ACG ACT TTA AAT CAG AAT CGA CTC GAT CCG 432
129 Tyr Phe Ser Glu Leu Glu Thr Thr Leu Asn Gln Asn Arg Leu Asp Pro 144
433 GAT CGT GTG CAG GAT GCT GTT TTG GCG TAT CTC AAT GAG TTT GAG GCA 480
145 Asp Arg Val Gln Asp Ala Val Leu Ala Tyr Leu Asn Glu Phe Glu Ala 160
481 GTG AGT AAT CTA TAT GTG CTC AGT GGT TAT ATT AAT CAG GAT AAA TTT 528
161 Val Ser Asn Leu Tyr Val Leu Ser Gly Tyr Ile Asn Gln Asp Lys Phe 176
529 GAC CAA GCT ATC TAC TAC TTT ATT GGT CGC ACT ACC ACT AAA CCG TAT 576
177 Asp Gln Ala Ile Tyr Tyr Phe Ile Gly Arg Thr Thr Thr Lys Pro Tyr 192
577 CGC TAC TAC TGG CGT CAG ATG GAT TTG AGT AAG AAC CGT CAA GAT CCG 624
193 Arg Tyr Tyr Trp Arg Gln Met Asp Leu Ser Lys Asn Arg Gln Asp Pro 208
625 GCA GGG AAT CCG GTG ACG CCA AAT TGC TGG AAT GAT TGG CAG GAA ATC 672
209 Ala Gly Asn Pro Val Thr Pro Asn Cys Trp Asn Asp Trp Gln Glu Ile 224
673 ACT TTG CCG CTG TCT GGT GAT ACG GTG CTG GAG CAT ACA GTT CGC CCG 720
225 Thr Leu Pro Leu Ser Gly Asp Thr Val Leu Glu His Thr Val Arg Pro 240
721 GTA TTT TAT AAT GAT CGA CTA TAT GTG GCT TGG GTT GAG CGT GAC CCG 768
241 Val Phe Tyr Asn Asp Arg Leu Tyr Val Ala Trp Val Glu Arg Asp Pro 256
769 GCA GTA CAG AAG GAT GCT GAC GGT AAA AAC ATC GGT AAA ACC CAT GCC 816
257 Ala Val Gln Lys Asp Ala Asp Gly Lys Asn Ile Gly Lys Thr His Ala 272
817 TAC AAC ATA AAG TTT GGT TAT AAA CGT TAT GAT GAT ACT TGG ACA GCG 864
273 Tyr Asn Ile Lys Phe Gly Tyr Lys Arg Tyr Asp Asp Thr Trp Thr Ala 288
865 CCG AAT ACG ACC ACG TTA ATG ACA CAA CAA GCA GGG GAA AGT TCA GAA 912
289 Pro Asn Thr Thr Thr Leu Met Thr Gln Gln Ala Gly Glu Ser Ser Glu 304
913 ACA CAG CGA TCC AGC CTG CTG ATT GAT GAA TCT AGC ACC ACA TTG CGC 960
305 Thr Gln Arg Ser Ser Leu Leu Ile Asp Glu Ser Ser Thr Thr Leu Arg 320
961 CAA GTT AAT CTG TTG GCT ACC ACC GAT TTT AGT ATC GAT CCG ACG GAG 1008
321 Gln Val Asn Leu Leu Ala Thr Thr Asp Phe Ser Ile Asp Pro Thr Glu 336
1009 GAA ACG GAC AGT AAC CCG TAT GGC CGC CTA ATG TTG GGG GTG TTT GTC 1056
337 Glu Thr Asp Ser Asn Pro Tyr Gly Arg Leu Met Leu Gly Val Phe Val 352
1057 CGT CAA TTT GAA GGT GAT GGG GCC AAT AGA AAA AAT AAA CCC GTT GTT 1104
353 Arg Gln Phe Glu Gly Asp Gly Ala Asn Arg Lys Asn Lys Pro Val Val 368
1105 TAT GGT TAT CTC TAT TGT GAC TCA GCT TTC AAT CGT CAT GTT CTC AGG 1152
369 Tyr Gly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg His Val Leu Arg 384
1153 CCG TTA AGT AAG AAC TTT TTG TTC AGT ACT TAC CGT GAT GAA ACG GAT 1200
385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Glu Thr Asp 400
1201 GGT CAA AAC AGC TTG CAA TTT GCG GTA TAC GAT AAA AAG TAT GTA ATT 1248
401 Gly Gln Asn Ser Leu Gln Phe Ala Val Tyr Asp Lys Lys Tyr Val Ile 416
1249 ACT AAG GTT GTT ACA GGT GCA ACG GAA GAT CCC GAA AAT ACA GGA TGG 1296
417 Thr Lys Val Val Thr Gly Ala Thr Glu Asp Pro Glu Asn Thr Gly Trp 432
1297 GTA AGT AAA GTT GAT GAC TTG AAA CAA GGC ACT ACT GGG GCC TAT GTG 1344
433 Val Ser Lys Val Asp Asp Leu Lys Gln Gly Thr Thr Gly Ala Tyr Val 448
1345 TAT ATC GAT CAA GAT GGC CTG ACG CTT CAT ATA CAA ACC ACA ACT AAT 1392
449 Tyr Ile Asp Gln Asp Gly Leu Thr Leu His Ile Gln Thr Thr Thr Asn 464
1393 GGG GAT TTT ATT AAC CGT CAT ACG TTT GGA TAT AAC GAT CTT GTA TAT 1440
465 Gly Asp Phe Ile Asn Arg His Thr Phe Gly Tyr Asn Asp Leu Val Tyr 480
1441 GAT TCT AAG TCT GGT TAT GGT TTC ACG TGG TCA GGA AAT GAA GGT TTT 1488
481 Asp Ser Lys Ser Gly Tyr Gly Phe Thr Trp Ser Gly Asn Glu Gly Phe 496
1489 TAT CTG GAT TAC CAT GAT GGA AAT TAT TAC ACC TTT CAT AAT GCA ATA 1536
497 Tyr Leu Asp Tyr His Asp Gly Asn Tyr Tyr Thr Phe His Asn Ala Ile 512
1537 ATC AAC TAC TAT CCG TCT GGA TAT GGT GGT GGA TCT GTT CCT AAT GGA 1584
513 Ile Asn Tyr Tyr Pro Ser Gly Tyr Gly Gly Gly Ser Val Pro Asn Gly 528
1585 ACG TGG GCG TTA GAG CAA AGG ATT AAT GAG GGA TGG GCT ATT GCT CCC 1632
529 Thr Trp Ala Leu Glu Gln Arg Ile Asn Glu Gly Trp Ala Ile Ala Pro 544
1633 CTG CTT GAT ACT CTC CAT ACT GTT ACT GTG AAG GGC AGT TAT ATC GCT 1680
545 Leu Leu Asp Thr Leu His Thr Val Thr Val Lys Gly Ser Tyr Ile Ala 560
1681 TGG GAA GGG GAA ACA CCT ACC GGT TAT AAT CTG TAT ATT CCA GAT GGT 1728
561 Trp Glu Gly Glu Thr Pro Thr Gly Tyr Asn Leu Tyr Ile Pro Asp Gly 576
1729 ACC GTG TTG CTA GAT TGG TTT GAT AAA ATA AAT TTT GCT ATT GGT CTT 1776
577 Thr Val Leu Leu Asp Trp Phe Asp Lys Ile Asn Phe Ala Ile Gly Leu 592
1777 AAT AAG CTT GAG TCT GTA TTT ACG TCG CCA GAT TGG CCA ACA CTA ACC 1824
593 Asn Lys Leu Glu Ser Val Phe Thr Ser Pro Asp Trp Pro Thr Leu Thr 608
1825 ACT ATC AAA AAT TTC AGT AAA ATC GCC GAT AAC CGC AAA TTC TAT CAG 1872
609 Thr Ile Lys Asn Phe Ser Lys Ile Ala Asp Asn Arg Lys Phe Tyr Gln 624
1873 GAA ATC AAT GCT GAG ACG GCG GAT GGA CGC AAC CTG TTT AAA CGT TAC 1920
625 Glu Ile Asn Ala Glu Thr Ala Asp Gly Arg Asn Leu Phe Lys Arg Tyr 640
1921 AGT ACT CAA ACT TTC GGA CTT ACC AGC GGT GCG ACT TAT TCT ACA ACT 1968
641 Ser Thr Gln Thr Phe Gly Leu Thr Ser Gly Ala Thr Tyr Ser Thr Thr 656
1969 TAT ACT TTG TCT GAG GCG GAT TTC TCC ACT GAT CCG GAC AAA AAC TAC 2016
657 Tyr Thr Leu Ser Glu Ala Asp Phe Ser Thr Asp Pro Asp Lys Asn Tyr 672
2017 CTA CAG GTT TGT TTG AAT GTC GTG TGG GAT CAT TAT GAC CGC CCG TCA 2064
673 Leu Gln Val Cys Leu Asn Val Val Trp Asp His Tyr Asp Arg Pro Ser 688
2065 GGG AAA AAA GGG GCT TAT TCT TGG GTC AGT AAG TGG TTT AAC GTC TAT 2112
689 Gly Lys Lys Gly Ala Tyr Ser Trp Val Ser Lys Trp Phe Asn Val Tyr 704
2113 GTT GCG TTG CAA GAT AGC AAA GCT CCG GAT GCC ATT CCT CGA TTA GTT 2160
705 Val Ala Leu Gln Asp Ser Lys Ala Pro Asp Ala Ile Pro Arg Leu Val 720
2161 TCC CGT TAC GAT AGT AAA CGT GGT CTG GTG CAA TAT CTG GAC TTC TGG 2208
721 Ser Arg Tyr Asp Ser Lys Arg Gly Leu Val Gln Tyr Leu Asp Phe Trp 736
2209 ACC TCA TCA TTA CCC GCG AAA ACC CGT CTT AAC ACC ACC TTT GTG CGT 2256
737 Thr Ser Ser Leu Pro Ala Lys Thr Arg Leu Asn Thr Thr Phe Val Arg 752
2257 ACT TTG ATT GAG AAG GCT AAT CTG GGG CTG GAT AGT TTG CTG GAT TAC 2304
753 Thr Leu Ile Glu Lys Ala Asn Leu Gly Leu Asp Ser Leu Leu Asp Tyr 768
2305 ACC TTG CAG GCA GAT CCT TCT CTG GAA GCA GAT TTA GTG ACT GAC GGC 2352
769 Thr Leu Gln Ala Asp Pro Ser Leu Glu Ala Asp Leu Val Thr Asp Gly 784
2353 AAA AGC GAA CCA ATG GAC TTT AAT GGT TCA AAC GGT CTC TAT TTC TGG 2400
785 Lys Ser Glu Pro Met Asp Phe Asn Gly Ser Asn Gly Leu Tyr Phe Trp 800
2401 GAA TTG TTC TTT CAC CTG CCG TTT TTG GTT GCT ACA CGC TTT GCC AAC 2448
801 Glu Leu Phe Phe His Leu Pro Phe Leu Val Ala Thr Arg Phe Ala Asn 816
2449 GAA CAG CAA TTT TCG CCG GCA CAA AAG AGT TTG CAT TAC ATC TTT GAC 2496
817 Glu Gln Gln Phe Ser Pro Ala Gln Lys Ser Leu His Tyr Ile Phe Asp 832
2497 CCG GCG ATG AAA AAC AAG CCA CAC AAT GCC CCG GCT TAT TGG AAT GTA 2544
833 Pro Ala Met Lys Asn Lys Pro His Asn Ala Pro Ala Tyr Trp Asn Val 848
2545 CGT CCG TTG GTT GAA GGA AAC AGC GAT TTG TCA CGT CAT TTG GAC GAT 2592
849 Arg Pro Leu Val Glu Gly Asn Ser Asp Leu Ser Arg His Leu Asp Asp 864
2593 TCT ATA GAC CCA GAT ACT CAA GCT TAT GCT CAT CCG GTG ATA TAC CAG 2640
865 Ser Ile Asp Pro Asp Thr Gln Ala Tyr Ala His Pro Val Ile Tyr Gln 880
2641 AAA GCG GTG TTT ATT GCC TAT GTC AGT AAC CTG ATT GCT CAG GGA GAT 2688
881 Lys Ala Val Phe Ile Ala Tyr Val Ser Asn Leu Ile Ala Gln Gly Asp 896
2689 ATG TGG TAT CGC CAA TTG ACT CGT GAC GGT CTG ACT CAG GCC CGT GTC 2736
897 Met Trp Tyr Arg Gln Leu Thr Arg Asp Gly Leu Thr Gln Ala Arg Val 912
2737 TAT TAC AAT CTG GCC GCT GAA TTG CTA GGG CCT CGT CCG GAT GTA TCG 2784
913 Tyr Tyr Asn Leu Ala Ala Glu Leu Leu Gly Pro Arg Pro Asp Val Ser 928
2785 CTG AGT AGC ATT TGG ACG CCG CAA ACC CTG GAT ACC TTA GCA GCC GGG 2832
929 Leu Ser Ser Ile Trp Thr Pro Gln Thr Leu Asp Thr Leu Ala Ala Gly 944
2833 CAA AAA GCG GTT TTA CGT GAT TTT GAG CAC CAG TTG GCT AAT AGT GAT 2880
945 Gln Lys Ala Val Leu Arg Asp Phe Glu His Gln Leu Ala Asn Ser Asp 960
2881 ACC GCT TTA CCC GCA TTG CCG GGC CGC AAT GTC AGC TAC TTG AAA CTG 2928
961 Thr Ala Leu Pro Ala Leu Pro Gly Arg Asn Val Ser Tyr Leu Lys Leu 976
2929 GCA GAT AAT GGC TAC TTT AAT GAA CCG CTC AAT GTT CTG ATG TTG TCT 2976
977 Ala Asp Asn Gly Tyr Phe Asn Glu Pro Leu Asn Val Leu Met Leu Ser 992
2977 CAC TGG GAT ACG TTG GAT GCA CGG TTA TAC AAT CTG CGT CAT AAC CTG 3024
993 His Trp Asp Thr Leu Asp Ala Arg Leu Tyr Asn Leu Arg His Asn Leu 1008
3025 ACC GTT GAT GGC AAG CCG CTT TCG CTG CCG CTG TAT GCT GCG CCT GTT 3072
1009 Thr Val Asp Gly Lys Pro Leu Ser Leu Pro Leu Tyr Ala Ala Pro Val 1024
3073 GAT CCG GTA GCG TTG TTG GCT CAG CGT GCT CAG TCC GGC ACG TTG ACG 3120
1025 Asp Pro Val Ala Leu Leu Ala Gln Arg Ala Gln Ser Gly Thr Leu Thr 1040
3121 AAT GGC GTC AGT GGC GCC ATG TTG ACG GTG CCG CCA TAC CGT TTC AGC 3168
1041 Asn Gly Val Ser Gly Ala Met Leu Thr Val Pro Pro Tyr Arg Phe Ser 1056
3169 GCT ATG TTG CCG CGA GCT TAC AGC GCC GTG GGT ACG TTG ACC AGT TTT 3216
1057 Ala Met Leu Pro Arg Ala Tyr Ser Ala Val Gly Thr Leu Thr Ser Phe 1072
3217 GGT CAG AAC CTG CTT AGT TTG TTG GAA CGT AGC GAA CGA GCC TGT CAA 3264
1073 Gly Gln Asn Leu Leu Ser Leu Leu Glu Arg Ser Glu Arg Ala Cys Gln 1088
3265 GAA GAG TTG GCG CAA CAG CAA CTG TTG GAT ATG TCC AGC TAT GCC ATC 3312
1089 Glu Glu Leu Ala Gln Gln Gln Leu Leu Asp Met Ser Ser Tyr Ala Ile 1104
3313 ACG TTG CAA CAA CAG GCG CTG GAT GGA TTG GCG GCA GAT CGT CTG GCG 3360
1105 Thr Leu Gln Gln Gln Ala Leu Asp Gly Leu Ala Ala Asp Arg Leu Ala 1120
3361 CTG CTA GCT AGT CAG GCT ACG GCA CAA CAG CGT CAT GAC CAT TAT TAC 3408
1121 Leu Leu Ala Ser Gln Ala Thr Ala Gln Gln Arg His Asp His Tyr Tyr 1136
3409 ACT CTG TAT CAG AAC AAC ATC TCC AGT GCG GAA CAA CTG GTG ATG GAC 3456
1137 Thr Leu Tyr Gln Asn Asn Ile Ser Ser Ala Glu Gln Leu Val Met Asp 1152
3457 ACC CAA ACG TCA GCA CAA TCC CTG ATT TCT TCT TCC ACT GGT GTA CAA 3504
1153 Thr Gln Thr Ser Ala Gln Ser Leu Ile Ser Ser Ser Thr Gly Val Gln 1168
3505 ACT GCC AGT GGG GCA CTG AAA GTG ATC CCG AAT ATC TTT GGT TTG GCT 3552
1169 Thr Ala Ser Gly Ala Leu Lys Val Ile Pro Asn Ile Phe Gly Leu Ala 1184
3553 GAT GGC GGC TCG CGC TAT GAA GGA GTA ACG GAA GCG ATT GCC ATC GGG 3600
1185 Asp Gly Gly Ser Arg Tyr Glu Gly Val Thr Glu Ala Ile Ala Ile Gly 1200
3601 TTA ATG GCT GCC GGA CAA GCC ACC AGC GTG GTG GCC GAG CGT CTG GCA 3648
1201 Leu Met Ala Ala Gly Gln Ala Thr Ser Val Val Ala Glu Arg Leu Ala 1216
3649 ACC ACG GAG AAT TAC CGC CGC CGC CGT GAA GAG TGG CAA ATC CAA TAC 3696
1217 Thr Thr Glu Asn Tyr Arg Arg Arg Arg Glu Glu Trp Gln Ile Gln Tyr 1232
3697 CAG CAG GCA CAG TCT GAG GTC GAC GCA TTA CAG AAA CAG TTG GAT GCG 3744
1233 Gln Gln Ala Gln Ser Glu Val Asp Ala Leu Gln Lys Gln Leu Asp Ala 1248
3745 CTG GCA GTG CGC GAG AAA GCA GCT CAA ACT TCC CTG CAA CAG GCG AAG 3792
1249 Leu Ala Val Arg Glu Lys Ala Ala Gln Thr Ser Leu Gln Gln Ala Lys 1264
3793 GCA CAG CAG GTA CAA ATT CGG ACC ATG CTG ACT TAC TTA ACT ACT CGT 3840
1265 Ala Gln Gln Val Gln Ile Arg Thr Met Leu Thr Tyr Leu Thr Thr Arg 1280
3841 TTC ACC CAG GCG ACT CTG TAC CAG TGG CTG AGT GGT CAA TTA TCC GCG 3888
1281 Phe Thr Gln Ala Thr Leu Tyr Gln Trp Leu Ser Gly Gln Leu Ser Ala 1296
3889 TTG TAT TAT CAA GCG TAT GAT GCC GTG GTT GCT CTC TGC CTC TCC GCC 3936
1297 Leu Tyr Tyr Gln Ala Tyr Asp Ala Val Val Ala Leu Cys Leu Ser Ala 1312
3937 CAA GCT TGC TGG CAG TAT GAA TTG GGT GAT TAC GCT ACC ACT TTT ATC 3984
1313 Gln Ala Cys Trp Gln Tyr Glu Leu Gly Asp Tyr Ala Thr Thr Phe Ile 1328
3985 CAG ACC GGT ACC TGG AAC GAC CAT TAC CGT GGT TTG CAA GTG GGG GAG 4032
1329 Gln Thr Gly Thr Trp Asn Asp His Tyr Arg Gly Leu Gln Val Gly Glu 1344
4033 ACA CTG CAA CTC AAT TTG CAT CAG ATG GAA GCG GCC TAT TTA GTT CGT 4080
1345 Thr Leu Gln Leu Asn Leu His Gln Met Glu Ala Ala Tyr Leu Val Arg 1360
4081 CAC GAA CGC CGT CTT AAT GTG ATC CGT ACT GTG TCG CTC AAA AGC CTA 4128
1361 His Glu Arg Arg Leu Asn Val Ile Arg Thr Val Ser Leu Lys Ser Leu 1376
4129 TTG GGT GAT GAT GGT TTT GGT AAG TTA AAA ACC GAA GGC AAA GTC GAC 4176
1377 Leu Gly Asp Asp Gly Phe Gly Lys Leu Lys Thr Glu Gly Lys Val Asp 1392
4177 TTT CCA TTA AGC GAA AAG CTG TTT GAC AAC GAC TAT CCG GGG CAC TAT 4224
1393 Phe Pro Leu Ser Glu Lys Leu Phe Asp Asn Asp Tyr Pro Gly His Tyr 1408
4225 TTG CGC CAG ATT AAA ACT GTG TCA GTG ACG TTG CCG ACG TTA GTC GGG 4272
1409 Leu Arg Gln Ile Lys Thr Val Ser Val Thr Leu Pro Thr Leu Val Gly 1424
4273 CCG TAT CAA AAC GTG AAG GCA ACG CTC ACT CAG ACC AGC AGC AGT ATA 4320
1425 Pro Tyr Gln Asn Val Lys Ala Thr Leu Thr Gln Thr Ser Ser Ser Ile 1440
4321 TTG TTA GCA GCA GAT ATC AAT GGT GTT AAA CGT CTC AAT GAT CCG ACA 4368
1441 Leu Leu Ala Ala Asp Ile Asn Gly Val Lys Arg Leu Asn Asp Pro Thr 1456
4369 GGT AAA GAG GGT GAT GCG ACG CAT ATT GTC ACC AAT CTG CGT GCC AGC 4416
1457 Gly Lys Glu Gly Asp Ala Thr His Ile Val Thr Asn Leu Arg Ala Ser 1472
4417 CAG CAG GTG GCG CTC TCT TCT GGC ATT AAT GAT GCC GGT AGC TTT GAG 4464
1473 Gln Gln Val Ala Leu Ser Ser Gly Ile Asn Asp Ala Gly Ser Phe Glu 1488
4465 TTG CGT TTG GAA GAT GAG CGC TAT CTA TCA TTT GAG GGG ACT GGA GCT 4512
1489 Leu Arg Leu Glu Asp Glu Arg Tyr Leu Ser Phe Glu Gly Thr Gly Ala 1504
4513 GTT TCC AAA TGG ACT CTT AAC TTC CCG CGT TCT GTG GAT GAG CAT ATT 4560
1505 Val Ser Lys Trp Thr Leu Asn Phe Pro Arg Ser Val Asp Glu His Ile 1520
4561 GAC GAT AAG ACA TTG AAA GCG GAT GAG ATG CAG GCC GCA CTG TTG GCG 4608
1521 Asp Asp Lys Thr Leu Lys Ala Asp Glu Met Gln Ala Ala Leu Leu Ala 1536
4609 AAT ATG GAT GAT GTG CTG GTG CAG GTG CAT TAT ACC GCC TGC GAC GGC 4656
1537 Asn Met Asp Asp Val Leu Val Gln Val His Tyr Thr Ala Cys Asp Gly 1552
4657 GGC GCC AGT TTC GCA AAC CAG GTC AAG AAA ACA CTC TCT TAA CATTAACTTT 4708
1553 Gly Ala Ser Phe Ala Asn Gln Val Lys Lys Thr Leu Ser End 1565
4709 TAACTAATCC CTCCCACTCT GTTCGCCAGA GTGGGAGAAG GTTTGTCATA TCTAAAATCA 4768
4769 ATCTTGCGAT CTTTCTCCAT TTCATTGGAA GGGAAGCTGT AAAACAAATA AGGAATATGA 4828
4829 TATG 4932
(2) INFORMATION FOR SEQ ID NO: 59
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1565 amino acids
(B) TYPE: amino acid
(C) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59 (TccB peptide)
Features
From To Description
1 Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp Ala 16
17 Leu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Gly Val 32
33 Ser Gly Gln Pro Val Thr Val Glu Asp Leu Tyr Glu Tyr Leu Leu Ile 48
49 Asp Pro Glu Val Ala Asp Glu Val Glu Thr Ser Arg Val Ala Gln Ala 64
65 Ile Ala Ser Ile Gln Gln Tyr Met Thr Arg Leu Val Asn Gly Ser Glu 80
81 Pro Gly Arg Gln Ala Met Glu Pro Ser Thr Ala Asn Glu Trp Arg Asp 96
97 Asn Asp Asn Gln Tyr Ala Ile Trp Ala Ala Gly Ala Glu Val Arg Asn 112
113 Tyr Ala Glu Asn Tyr Ile Ser Pro Ile Thr Arg Gln Glu Lys Ser His 128
129 Tyr Phe Ser Glu Leu Glu Thr Thr Leu Asn Gln Asn Arg Leu Asp Pro 144
145 Asp Arg Val Gln Asp Ala Val Leu Ala Tyr Leu Asn Glu Phe Glu Ala 160
161 Val Ser Asn Leu Tyr Val Leu Ser Gly Tyr Ile Asn Gln Asp Lys Phe 176
177 Asp Gln Ala Ile Tyr Tyr Phe Ile Gly Arg Thr Thr Thr Lys Pro Tyr 192
193 Arg Tyr Tyr Trp Arg Gln Met Asp Leu Ser Lys Asn Arg Gln Asp Pro 208
209 Ala Gly Asn Pro Val Thr Pro Asn Cys Trp Asn Asp Trp Gln Glu Ile 224
225 Thr Leu Pro Leu Ser Gly Asp Thr Val Leu Glu His Thr Val Arg Pro 240
241 Val Phe Tyr Asn Asp Arg Leu Tyr Val Ala Trp Val Glu Arg Asp Pro 256
257 Ala Val Gln Lys Asp Ala Asp Gly Lys Asn Ile Gly Lys Thr His Ala 272
273 Tyr Asn Ile Lys Phe Gly Tyr Lys Arg Tyr Asp Asp Thr Trp Thr Ala 288
289 Pro Asn Thr Thr Thr Leu Met Thr Gln Gln Ala Gly Glu Ser Ser Glu 304
305 Thr Gln Arg Ser Ser Leu Leu Ile Asp Glu Ser Ser Thr Thr Leu Arg 320
321 Gln Val Asn Leu Leu Ala Thr Thr Asp Phe Ser Ile Asp Pro Thr Glu 336
337 Glu Thr Asp Ser Asn Pro Tyr Gly Arg Leu Met Leu Gly Val Phe Val 352
353 Arg Gln Phe Glu Gly Asp Gly Ala Asn Arg Lys Asn Lys Pro Val Val 368
369 Tyr Gly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg His Val Leu Arg 384
385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Glu Thr Asp 400
401 Gly Gln Asn Ser Leu Gln Phe Ala Val Tyr Asp Lys Lys Tyr Val Ile 416
417 Thr Lys Val Val Thr Gly Ala Thr Glu Asp Pro Glu Asn Thr Gly Trp 432
433 Val Ser Lys Val Asp Asp Leu Lys Gln Gly Thr Thr Gly Ala Tyr Val 448
449 Tyr Ile Asp Gln Asp Gly Leu Thr Leu His Ile Gln Thr Thr Thr Asn 464
465 Gly Asp Phe Ile Asn Arg His Thr Phe Gly Tyr Asn Asp Leu Val Tyr 480
481 Asp Ser Lys Ser Gly Tyr Gly Phe Thr Trp Ser Gly Asn Glu Gly Phe 496
497 Tyr Leu Asp Tyr His Asp Gly Asn Tyr Tyr Thr Phe His Asn Ala Ile 512
513 Ile Asn Tyr Tyr Pro Ser Gly Tyr Gly Gly Gly Ser Val Pro Asn Gly 528
529 Thr Trp Ala Leu Glu Gln Arg Ile Asn Glu Gly Trp Ala Ile Ala Pro 544
545 Leu Leu Asp Thr Leu His Thr Val Thr Val Lys Gly Ser Tyr Ile Ala 560
561 Trp Glu Gly Glu Thr Pro Thr Gly Tyr Asn Leu Tyr Ile Pro Asp Gly 576
577 Thr Val Leu Leu Asp Trp Phe Asp Lys Ile Asn Phe Ala Ile Gly Leu 592
593 Asn Lys Leu Glu Ser Val Phe Thr Ser Pro Asp Trp Pro Thr Leu Thr 608
609 Thr Ile Lys Asn Phe Ser Lys Ile Ala Asp Asn Arg Lys Phe Tyr Gln 624
625 Glu Ile Asn Ala Glu Thr Ala Asp Gly Arg Asn Leu Phe Lys Arg Tyr 640
641 Ser Thr Gln Thr Phe Gly Leu Thr Ser Gly Ala Thr Tyr Ser Thr Thr 656
657 Tyr Thr Leu Ser Glu Ala Asp Phe Ser Thr Asp Pro Asp Lys Asn Tyr 672
673 Leu Gln Val Cys Leu Asn Val Val Trp Asp His Tyr Asp Arg Pro Ser 688
689 Gly Lys Lys Gly Ala Tyr Ser Trp Val Ser Lys Trp Phe Asn Val Tyr 704
705 Val Ala Leu Gln Asp Ser Lys Ala Pro Asp Ala Ile Pro Arg Leu Val 720
721 Ser Arg Tyr Asp Ser Lys Arg Gly Leu Val Gln Tyr Leu Asp Phe Trp 736
737 Thr Ser Ser Leu Pro Ala Lys Thr Arg Leu Asn Thr Thr Phe Val Arg 752
753 Thr Leu Ile Glu Lys Ala Asn Leu Gly Leu Asp Ser Leu Leu Asp Tyr 768
769 Thr Leu Gln Ala Asp Pro Ser Leu Glu Ala Asp Leu Val Thr Asp Gly 784
785 Lys Ser Glu Pro Met Asp Phe Asn Gly Ser Asn Gly Leu Tyr Phe Trp 800
801 Glu Leu Phe Phe His Leu Pro Phe Leu Val Ala Thr Arg Phe Ala Asn 816
817 Glu Gln Gln Phe Ser Pro Ala Gln Lys Ser Leu His Tyr Ile Phe Asp 832
833 Pro Ala Met Lys Asn Lys Pro His Asn Ala Pro Ala Tyr Trp Asn Val 848
849 Arg Pro Leu Val Glu Gly Asn Ser Asp Leu Ser Arg His Leu Asp Asp 864
865 Ser Ile Asp Pro Asp Thr Gln Ala Tyr Ala His Pro Val Ile Tyr Gln 880
881 Lys Ala Val Phe Ile Ala Tyr Val Ser Asn Leu Ile Ala Gln Gly Asp 896
897 Met Trp Tyr Arg Gln Leu Thr Arg Asp Gly Leu Thr Gln Ala Arg Val 912
913 Tyr Tyr Asn Leu Ala Ala Glu Leu Leu Gly Pro Arg Pro Asp Val Ser 928
929 Leu Ser Ser Ile Trp Thr Pro Gln Thr Leu Asp Thr Leu Ala Ala Gly 944
945 Gln Lys Ala Val Leu Arg Asp Phe Glu His Gln Leu Ala Asn Ser Asp 960
961 Thr Ala Leu Pro Ala Leu Pro Gly Arg Asn Val Ser Tyr Leu Lys Leu 976
977 Ala Asp Asn Gly Tyr Phe Asn Glu Pro Leu Asn Val Leu Met Leu Ser 992
993 His Trp Asp Thr Leu Asp Ala Arg Leu Tyr Asn Leu Arg His Asn Leu 1008
1009 Thr Val Asp Gly Lys Pro Leu Ser Leu Pro Leu Tyr Ala Ala Pro Val 1024
1025 Asp Pro Val Ala Leu Leu Ala Gln Arg Ala Gln Ser Gly Thr Leu Thr 1040
1041 Asn Gly Val Ser Gly Ala Met Leu Thr Val Pro Pro Tyr Arg Phe Ser 1056
1057 Ala Met Leu Pro Arg Ala Tyr Ser Ala Val Gly Thr Leu Thr Ser Phe 1072
1073 Gly Gln Asn Leu Leu Ser Leu Leu Glu Arg Ser Glu Arg Ala Cys Gln 1088
1089 Glu Glu Leu Ala Gln Gln Gln Leu Leu Asp Met Ser Ser Tyr Ala Ile 1104
1105 Thr Leu Gln Gln Gln Ala Leu Asp Gly Leu Ala Ala Asp Arg Leu Ala 1120
1121 Leu Leu Ala Ser Gln Ala Thr Ala Gln Gln Arg His Asp His Tyr Tyr 1136
1137 Thr Leu Tyr Gln Asn Asn Ile Ser Ser Ala Glu Gln Leu Val Met Asp 1152
1153 Thr Gln Thr Ser Ala Gln Ser Leu Ile Ser Ser Ser Thr Gly Val Gln 1168
1169 Thr Ala Ser Gly Ala Leu Lys Val Ile Pro Asn Ile Phe Gly Leu Ala 1184
1185 Asp Gly Gly Ser Arg Tyr Glu Gly Val Thr Glu Ala Ile Ala Ile Gly 1200
1201 Leu Met Ala Ala Gly Gln Ala Thr Ser Val Val Ala Glu Arg Leu Ala 1216
1217 Thr Thr Glu Asn Tyr Arg Arg Arg Arg Glu Glu Trp Gln Ile Gln Tyr 1232
1233 Gln Gln Ala Gln Ser Glu Val Asp Ala Leu Gln Lys Gln Leu Asp Ala 1248
1249 Leu Ala Val Arg Glu Lys Ala Ala Gln Thr Ser Leu Gln Gln Ala Lys 1264
1265 Ala Gln Gln Val Gln Ile Arg Thr Met Leu Thr Tyr Leu Thr Thr Arg 1280
1281 Phe Thr Gln Ala Thr Leu Tyr Gln Trp Leu Ser Gly Gln Leu Ser Ala 1296
1297 Leu Tyr Tyr Gln Ala Tyr Asp Ala Val Val Ala Leu Cys Leu Ser Ala 1312
1313 Gln Ala Cys Trp Gln Tyr Glu Leu Gly Asp Tyr Ala Thr Thr Phe Ile 1328
1329 Gln Thr Gly Thr Trp Asn Asp His Tyr Arg Gly Leu Gln Val Gly Glu 1344
1345 Thr Leu Gln Leu Asn Leu His Gln Met Glu Ala Ala Tyr Leu Val Arg 1360
1361 His Glu Arg Arg Leu Asn Val Ile Arg Thr Val Ser Leu Lys Ser Leu 1376
1377 Leu Gly Asp Asp Gly Phe Gly Lys Leu Lys Thr Glu Gly Lys Val Asp 1392
1393 Phe Pro Leu Ser Glu Lys Leu Phe Asp Asn Asp Tyr Pro Gly His Tyr 1408
1409 Leu Arg Gln Ile Lys Thr Val Ser Val Thr Leu Pro Thr Leu Val Gly 1424
1425 Pro Tyr Gln Asn Val Lys Ala Thr Leu Thr Gln Thr Ser Ser Ser Ile 1440
1441 Leu Leu Ala Ala Asp Ile Asn Gly Val Lys Arg Leu Asn Asp Pro Thr 1456
1457 Gly Lys Glu Gly Asp Ala Thr His Ile Val Thr Asn Leu Arg Ala Ser 1472
1473 Gln Gln Val Ala Leu Ser Ser Gly Ile Asn Asp Ala Gly Ser Phe Glu 1488
1489 Leu Arg Leu Glu Asp Glu Arg Tyr Leu Ser Phe Glu Gly Thr Gly Ala 1504
1505 Val Ser Lys Trp Thr Leu Asn Phe Pro Arg Ser Val Asp Glu His Ile 1520
1521 Asp Asp Lys Thr Leu Lys Ala Asp Glu Met Gln Ala Ala Leu Leu Ala 1536
1537 Asn Met Asp Asp Val Leu Val Gln Val His Tyr Thr Ala Cys Asp Gly 1552
1553 Gly Ala Ser Phe Ala Asn Gln Val Lys Lys Thr Leu Ser 1565
(2) INFORMATION FOR SEQ ID NO: 60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3132 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60 (tccC)
1 ATG AGT CCG TCT GAG ACT ACT CTT TAT ACT CAA ACC CCA ACA GTC AGC 48
1 Met Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gln Thr Pro Thr Val Ser 16
49 GTG TTA GAT AAT CGC GGT CTG TCC ATT CGT GAT ATT GGT TTT CAC CGT 96
17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg 32
97 ATT GTA ATC GGG GGG GAT ACT GAC ACC CGC GTC ACC CGT CAC CAG TAT 144
33 Ile Val Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gln Tyr 48
145 GAT GCC CGT GGA CAC CTG AAC TAC AGT ATT GAC CCA CGC TTG TAT GAT 192
49 Asp Ala Arg Gly His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp 64
193 GCA AAG CAG GCT GAT AAC TCA GTA AAG CCT AAT TTT GTC TGG CAG CAT 240
65 Ala Lys Gln Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gln His 80
241 GAT CTG GCC GGT CAT GCC CTG CGG ACA GAG AGT GTC GAT GCT GGT CGT 288
81 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg 96
289 ACT GTT GCA TTG AAT GAT ATT GAA GGT CGT TCG GTA ATG ACA ATG AAT 336
97 Thr Val Ala Leu Asn Asp Ile Glu Gly Arg Ser Val Met Thr Met Asn 112
337 GCG ACC GGT GTT CGT CAG ACC CGT CGC TAT GAA GGC AAC ACC TTG CCC 384
113 Ala Thr Gly Val Arg Gln Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro 128
385 GGT CGC TTG TTA TCT GTG AGC GAG CAA GTT TTC AAC CAA GAG AGT GCT 432
129 Gly Arg Leu Leu Ser Val Ser Glu Gln Val Phe Asn Gln Glu Ser Ala 144
433 AAA GTG ACA GAG CGC TTT ATC TGG GCT GGG AAT ACA ACC TCG GAG AAA 480
145 Lys Val Thr Glu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Glu Lys 160
481 GAG TAT AAC CTC TCC GGT CTG TGT ATA CGC CAC TAC GAC ACA GCG GGA 528
161 Glu Tyr Asn Leu Ser Gly Leu Cys Ile Arg His Tyr Asp Thr Ala Gly 176
529 GTG ACC CGG TTG ATG AGT CAG TCA CTG GCG GGC GCC ATG CTA TCC CAA 576
177 Val Thr Arg Leu Met Ser Gln Ser Leu Ala Gly Ala Met Leu Ser Gln 192
577 TCT CAC CAA TTG CTG GCG GAA GGG CAG GAG GCT AAC TGG AGC GGT GAC 624
193 Ser His Gln Leu Leu Ala Glu Gly Gln Glu Ala Asn Trp Ser Gly Asp 208
625 GAC GAA ACT GTC TGG CAG GGA ATG CTG GCA AGT GAG GTC TAT ACG ACA 672
209 Asp Glu Thr Val Trp Gln Gly Met Leu Ala Ser Glu Val Tyr Thr Thr 224
673 CAA AGT ACC ACT AAT GCC ATC GGG GCT TTA CTG ACC CAA ACC GAT GCG 720
225 Gln Ser Thr Thr Asn Ala Ile Gly Ala Leu Leu Thr Gln Thr Asp Ala 240
721 AAA GGC AAT ATT CAG CGT CTG GCT TAT GAC ATT GCC GGT CAG TTA AAA 768
241 Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Ile Ala Gly Gln Leu Lys 256
769 GGG AGT TGG TTG ACG GTG AAA GGC CAG AGT GAA CAG GTG ATT GTT AAG 816
257 Gly Ser Trp Leu Thr Val Lys Gly Gln Ser Glu Gln Val Ile Val Lys 272
817 TCC CTG AGC TGG TCA GCC GCA GGT CAT AAA TTG CGT GAA GAG CAC GGT 864
273 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly 288
865 AAC GGC GTG GTT ACG GAG TAC AGT TAT GAG CCG GAA ACT CAA CGT CTG 912
289 Asn Gly Val Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu 304
913 ATA GGT ATC ACC ACC CGG CGT GCC GAA GGG AGT CAA TCA GGA GCC AGA 960
305 Ile Gly Ile Thr Thr Arg Arg Ala Glu Gly Ser Gln Ser Gly Ala Arg 320
961 GTA TTG CAG GAT CTA CGC TAT AAG TAT GAT CCG GTG GGG AAT GTT ATC 1008
321 Val Leu Gln Asp Leu Arg Tyr Lys Tyr Asp Pro Val Gly Asn Val Ile 336
1009 AGT ATC CAT AAT GAT GCC GAA GCT ACC CGC TTT TGG CGT AAT CAG AAA 1056
337 Ser Ile His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys 352
1057 GTG GAG CCG GAG AAT CGC TAT GTT TAT GAT TCT CTG TAT CAG CTT ATG 1104
353 Val Glu Pro Glu Asn Arg Tyr Val Tyr Asp Ser Leu Tyr Gln Leu Met 368
1105 AGT GCG ACA GGG CGT GAA ATG GCT AAT ATC GGT CAG CAA AGC AAC CAA 1152
369 Ser Ala Thr Gly Arg Glu Met Ala Asn Ile Gly Gln Gln Ser Asn Gln 384
1153 CTT CCC TCA CCC GTT ATA CCT GTT CCT ACT GAC GAC AGC ACT TAT ACC 1200
385 Leu Pro Ser Pro Val Ile Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr 400
1201 AAT TAC CTT CGT ACC TAT ACT TAT GAC CGT GGC GGT AAT TTG GTT CAA 1248
401 Asn Tyr Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gln 416
1249 ATC CGA CAC AGT TCA CCC GCG ACT CAA AAT AGT TAC ACC ACA GAT ATC 1296
417 Ile Arg His Ser Ser Pro Ala Thr Gln Asn Ser Tyr Thr Thr Asp Ile 432
1297 ACC GTT TCA AGC CGC AGT AAC CGG GCG GTA TTG AGT ACA TTA ACG ACA 1344
433 Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr 448
1345 GAT CCA ACC CGA GTG GAT GCG CTA TTT GAT TCC GGC GGT CAT CAG AAG 1392
449 Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gln Lys 464
1393 ATG TTA ATA CCG GGG CAA AAT CTG GAT TGG AAT ATT CGG GGT GAA TTG 1440
465 Met Leu Ile Pro Gly Gln Asn Leu Asp Trp Asn Ile Arg Gly Glu Leu 480
1441 CAA CGA GTC ACA CCG GTG AGC CGT GAA AAT AGC AGT GAC AGT GAA TGG 1488
481 Gln Arg Val Thr Pro Val Ser Arg Glu Asn Ser Ser Asp Ser Glu Trp 496
1489 TAT CGC TAT AGC AGT GAT GGC ATG CGG CTG CTA AAA GTG AGT GAA CAG 1536
497 Tyr Arg Tyr Ser Ser Asp Gly Met Arg Leu Leu Lys Val Ser Glu Gln 512
1537 CAG ACG GGC AAC AGT ACT CAA GTA CAA CGG GTG ACT TAT CTG CCG GGA 1584
513 Gln Thr Gly Asn Ser Thr Gln Val Gln Arg Val Thr Tyr Leu Pro Gly 528
1585 TTA GAG CTA CGG ACA ACT GGG GTT GCA GAT AAA ACA ACC GAA GAT TTG 1632
529 Leu Glu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Glu Asp Leu 544
1633 CAG GTG ATT ACG GTA GGT GAA GCG GGT CGC GCA CAG GTA AGG GTA TTG 1680
545 Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu 560
1681 CAC TGG GAA AGT GGT AAG CCG ACA GAT ATT GAC AAC AAT CAG GTG CGC 1728
561 His Trp Glu Ser Gly Lys Pro Thr Asp Ile Asp Asn Asn Gln Val Arg 576
1729 TAC AGC TAC GAT AAT CTG CTT GGC TCC AGC CAG CTT GAA CTG GAT AGC 1776
577 Tyr Ser Tyr Asp Asn Leu Leu Gly Ser Ser Gln Leu Glu Leu Asp Ser 592
1777 GAA GGG CAG ATT CTC AGT CAG GAA GAG TAT TAT CCG TAT GGC GGT ACG 1824
593 Glu Gly Gln Ile Leu Ser Gln Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr 608
1825 GCG ATA TGG GCG GCG AGA AAT CAG ACA GAA GCC AGC TAC AAA TTT ATT 1872
609 Ala Ile Trp Ala Ala Arg Asn Gln Thr Glu Ala Ser Tyr Lys Phe Ile 624
1873 CGT TAC TCC GGT AAA GAG CGG GAT GCC ACT GGA TTG TAT TAT TAC GGC 1920
625 Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly 640
1921 TAC CGT TAT TAT CAA CCT TGG GTG GGT CGA TGG TTG AGT GCT GAT CCG 1968
641 Tyr Arg Tyr Tyr Gln Pro Trp Val Gly Arg Trp Leu Ser Ala Asp Pro 656
1969 GCG GGA ACC GTG GAT GGG CTG AAT TTG TAC CGA ATG GTG AGG AAT AAC 2016
657 Ala Gly Thr Val Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn 672
2017 CCC ATC ACA TTG ACT GAC CAT GAC GGA TTA GCA CCG TCT CCA AAT AGA 2064
673 Pro Ile Thr Leu Thr Asp His Asp Gly Leu Ala Pro Ser Pro Asn Arg 688
2065 AAT CGA AAT ACA TTT TGG TTT GCT TCA TTT TTG TTT CGT AAA CCT GAT 2112
689 Asn Arg Asn Thr Phe Trp Phe Ala Ser Phe Leu Phe Arg Lys Pro Asp 704
2113 GAG GGA ATG TCC GCG TCA ATG AGA CGG GGA CAA AAA ATT GGC AGA GCC 2160
705 Glu Gly Met Ser Ala Ser Met Arg Arg Gly Gln Lys Ile Gly Arg Ala 720
2161 ATT GCC GGC GGG ATT GCG ATT GGC GGT CTT GCG GCT ACC ATT GCC GCT 2208
721 Ile Ala Gly Gly Ile Ala Ile Gly Gly Leu Ala Ala Thr Ile Ala Ala 736
2209 ACG GCT GGC GCG GCT ATC CCC GTC ATT CTG GGG GTT GCG GCC GTA GGC 2256
737 Thr Ala Gly Ala Ala Ile Pro Val Ile Leu Gly Val Ala Ala Val Gly 752
2257 GCG GGG ATT GGC GCG TTG ATG GGA TAT AAC GTC GGT AGC CTG CTG GAA 2304
753 Ala Gly Ile Gly Ala Leu Met Gly Tyr Asn Val Gly Ser Leu Leu Glu 768
2305 AAA GGC GGG GCA TTA CTT GCT CGA CTC GTA CAG GGG AAA TCG ACG TTA 2352
769 Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gln Gly Lys Ser Thr Leu 784
2353 GTA CAG TCG GCG GCT GGC GCG GCT GCC GGA GCG AGT TCA GCC GCG GCT 2400
785 Val Gln Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala 800
2401 TAT GGC GCA CGG GCA CAA GGT GTC GGT GTT GCA TCA GCC GCC GGG GCG 2448
801 Tyr Gly Ala Arg Ala Gln Gly Val Gly Val Ala Ser Ala Ala Gly Ala 816
2449 GTA ACA GGG GCT GTG GGA TCA TGG ATA AAT AAT GCT GAT CGG GGG ATT 2496
817 Val Thr Gly Ala Val Gly Ser Trp Ile Asn Asn Ala Asp Arg Gly Ile 832
2497 GGC GGC GCT ATT GGG GCC GGG AGT GCG GTA GGC ACC ATT GAT ACT ATG 2544
833 Gly Gly Ala Ile Gly Ala Gly Ser Ala Val Gly Thr Ile Asp Thr Met 848
2545 TTA GGG ACT GCC TCT ACC CTT ACC CAT GAA GTC GGG GCA GCG GCG GGT 2592
849 Leu Gly Thr Ala Ser Thr Leu Thr His Glu Val Gly Ala Ala Ala Gly 864
2593 GGG GCG GCG GGT GGG ATG ATC ACC GGT ACG CAA GGG AGT ACT CGG GCA 2640
865 Gly Ala Ala Gly Gly Met Ile Thr Gly Thr Gln Gly Ser Thr Arg Ala 880
2641 GGT ATC CAT GCC GGT ATT GGC ACC TAT TAT GGC TCC TGG ATT GGT TTT 2688
881 Gly Ile His Ala Gly Ile Gly Thr Tyr Tyr Gly Ser Trp Ile Gly Phe 896
2689 GGT TTA GAT GTC GCT AGT AAC CCC GCC GGA CAT TTA GCG AAT TAC GCA 2736
897 Gly Leu Asp Val Ala Ser Asn Pro Ala Gly His Leu Ala Asn Tyr Ala 912
2737 GTG GGT TAT GCC GCT GGT TTG GGT GCT GAA ATG GCT GTC AAC AGA ATA 2784
913 Val Gly Tyr Ala Ala Gly Leu Gly Ala Glu Met Ala Val Asn Arg Ile 928
2785 ATG GGT GGT GGA TTT TTG AGT AGG CTC TTA GGC CGG GTT GTC AGC CCA 2832
929 Met Gly Gly Gly Phe Leu Ser Arg Leu Leu Gly Arg Val Val Ser Pro 944
2833 TAT GCC GCC GGT TTA GCC AGA CAA TTA GTA CAT TTC AGT GTC GCC AGA 2880
945 Tyr Ala Ala Gly Leu Ala Arg Gln Leu Val His Phe Ser Val Ala Arg 960
2881 CCT GTC TTT GAG CCG ATA TTT AGT GTT CTC GGC GGG CTT GTC GGT GGT 2928
961 Pro Val Phe Glu Pro Ile Phe Ser Val Leu Gly Gly Leu Val Gly Gly 976
2929 ATT GGA ACT GGC CTG CAC AGA GTG ATG GGA AGA GAG AGT TGG ATT TCC 2976
977 Ile Gly Thr Gly Leu His Arg Val Met Gly Arg Glu Ser Trp Ile Ser 992
2977 AGA GCG TTA AGT GCT GCC GGT AGT GGT ATA GAT CAT GTC GCT GGC ATG 3024
993 Arg Ala Leu Ser Ala Ala Gly Ser Gly Ile Asp His Val Ala Gly Met 1008
3025 ATT GGT AAT CAG ATC AGA GGC AGG GTC TTG ACC ACA ACC GGG ATC GCT 3072
1009 Ile Gly Asn Gln Ile Arg Gly Arg Val Leu Thr Thr Thr Gly Ile Ala 1024
3073 AAT GCG ATA GAC TAT GGC ACC AGT GCT GTG GGA GCC GCA CGA CGA GTT 3120
1025 Asn Ala Ile Asp Tyr Gly Thr Ser Ala Val Gly Ala Ala Arg Arg Val 1040
3121 TTT TCT TTG TAA 3132
1041 Phe Ser Leu End 1043
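The paired nucleotide and amino acid rows of SEQ ID NO: 60 can be cross-checked against the standard genetic code. The following is a minimal sketch for that purpose, not part of the listing itself; the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are illustrative assumptions. It translates a codon row into three-letter residue names and confirms that 1043 codons plus one stop codon account for the stated 3132 base pairs.

```python
# Illustrative sketch: the helper names below are assumptions, not from the patent.
# Standard genetic code laid out in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter residue names; "End" marks a stop codon,
# matching the notation used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string (whitespace ignored) into three-letter residues."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


# First and last codon rows of SEQ ID NO: 60 as printed in the listing.
first_row = "ATG AGT CCG TCT GAG ACT ACT CTT TAT ACT CAA ACC CCA ACA GTC AGC"
last_row = "TTT TCT TTG TAA"

print(" ".join(translate(first_row)))
print(" ".join(translate(last_row)))
print((1043 + 1) * 3)  # codons for 1043 residues plus one stop codon
```

Running the same check on any other codon row is a quick way to spot transcription errors in either the DNA line or the residue line beneath it.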
(2) INFORMATION FOR SEQ ID NO: 61
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1043 amino acids
(B) TYPE: amino acid
(C) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61 (TccC peptide)
1 Met Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gln Thr Pro Thr Val Ser 16
17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg 32
33 Ile Val Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gln Tyr 48
49 Asp Ala Arg Gly His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp 64
65 Ala Lys Gln Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gln His 80
81 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg 96
97 Thr Val Ala Leu Asn Asp Ile Glu Gly Arg Ser Val Met Thr Met Asn 112
113 Ala Thr Gly Val Arg Gln Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro 128
129 Gly Arg Leu Leu Ser Val Ser Glu Gln Val Phe Asn Gln Glu Ser Ala 144
145 Lys Val Thr Glu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Glu Lys 160
161 Glu Tyr Asn Leu Ser Gly Leu Cys Ile Arg His Tyr Asp Thr Ala Gly 176
177 Val Thr Arg Leu Met Ser Gln Ser Leu Ala Gly Ala Met Leu Ser Gln 192
193 Ser His Gln Leu Leu Ala Glu Gly Gln Glu Ala Asn Trp Ser Gly Asp 208
209 Asp Glu Thr Val Trp Gln Gly Met Leu Ala Ser Glu Val Tyr Thr Thr 224
225 Gln Ser Thr Thr Asn Ala Ile Gly Ala Leu Leu Thr Gln Thr Asp Ala 240
241 Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Ile Ala Gly Gln Leu Lys 256
257 Gly Ser Trp Leu Thr Val Lys Gly Gln Ser Glu Gln Val Ile Val Lys 272
273 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly 288
289 Asn Gly Val Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu 304
305 Ile Gly Ile Thr Thr Arg Arg Ala Glu Gly Ser Gln Ser Gly Ala Arg 320
321 Val Leu Gln Asp Leu Arg Tyr Lys Tyr Asp Pro Val Gly Asn Val Ile 336
337 Ser Ile His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys 352
353 Val Glu Pro Glu Asn Arg Tyr Val Tyr Asp Ser Leu Tyr Gln Leu Met 368
369 Ser Ala Thr Gly Arg Glu Met Ala Asn Ile Gly Gln Gln Ser Asn Gln 384
385 Leu Pro Ser Pro Val Ile Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr 400
401 Asn Tyr Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gln 416
417 Ile Arg His Ser Ser Pro Ala Thr Gln Asn Ser Tyr Thr Thr Asp Ile 432
433 Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr 448
449 Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gln Lys 464
465 Met Leu Ile Pro Gly Gln Asn Leu Asp Trp Asn Ile Arg Gly Glu Leu 480
481 Gln Arg Val Thr Pro Val Ser Arg Glu Asn Ser Ser Asp Ser Glu Trp 496
497 Tyr Arg Tyr Ser Ser Asp Gly Met Arg Leu Leu Lys Val Ser Glu Gln 512
513 Gln Thr Gly Asn Ser Thr Gln Val Gln Arg Val Thr Tyr Leu Pro Gly 528
529 Leu Glu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Glu Asp Leu 544
545 Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu 560
561 His Trp Glu Ser Gly Lys Pro Thr Asp Ile Asp Asn Asn Gln Val Arg 576
577 Tyr Ser Tyr Asp Asn Leu Leu Gly Ser Ser Gln Leu Glu Leu Asp Ser 592
593 Glu Gly Gln Ile Leu Ser Gln Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr 608
609 Ala Ile Trp Ala Ala Arg Asn Gln Thr Glu Ala Ser Tyr Lys Phe Ile 624
625 Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly 640
641 Tyr Arg Tyr Tyr Gln Pro Trp Val Gly Arg Trp Leu Ser Ala Asp Pro 656
657 Ala Gly Thr Val Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn 672
673 Pro Ile Thr Leu Thr Asp His Asp Gly Leu Ala Pro Ser Pro Asn Arg 688
689 Asn Arg Asn Thr Phe Trp Phe Ala Ser Phe Leu Phe Arg Lys Pro Asp 704
705 Glu Gly Met Ser Ala Ser Met Arg Arg Gly Gln Lys Ile Gly Arg Ala 720
721 Ile Ala Gly Gly Ile Ala Ile Gly Gly Leu Ala Ala Thr Ile Ala Ala 736
737 Thr Ala Gly Ala Ala Ile Pro Val Ile Leu Gly Val Ala Ala Val Gly 752
753 Ala Gly Ile Gly Ala Leu Met Gly Tyr Asn Val Gly Ser Leu Leu Glu 768
769 Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gln Gly Lys Ser Thr Leu 784
785 Val Gln Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala 800
801 Tyr Gly Ala Arg Ala Gln Gly Val Gly Val Ala Ser Ala Ala Gly Ala 816
817 Val Thr Gly Ala Val Gly Ser Trp Ile Asn Asn Ala Asp Arg Gly Ile 832
833 Gly Gly Ala Ile Gly Ala Gly Ser Ala Val Gly Thr Ile Asp Thr Met 848
849 Leu Gly Thr Ala Ser Thr Leu Thr His Glu Val Gly Ala Ala Ala Gly 864
865 Gly Ala Ala Gly Gly Met Ile Thr Gly Thr Gln Gly Ser Thr Arg Ala 880
881 Gly Ile His Ala Gly Ile Gly Thr Tyr Tyr Gly Ser Trp Ile Gly Phe 896
897 Gly Leu Asp Val Ala Ser Asn Pro Ala Gly His Leu Ala Asn Tyr Ala 912
913 Val Gly Tyr Ala Ala Gly Leu Gly Ala Glu Met Ala Val Asn Arg Ile 928
929 Met Gly Gly Gly Phe Leu Ser Arg Leu Leu Gly Arg Val Val Ser Pro 944
945 Tyr Ala Ala Gly Leu Ala Arg Gln Leu Val His Phe Ser Val Ala Arg 960
961 Pro Val Phe Glu Pro Ile Phe Ser Val Leu Gly Gly Leu Val Gly Gly 976
977 Ile Gly Thr Gly Leu His Arg Val Met Gly Arg Glu Ser Trp Ile Ser 992
993 Arg Ala Leu Ser Ala Ala Gly Ser Gly Ile Asp His Val Ala Gly Met 1008
1009 Ile Gly Asn Gln Ile Arg Gly Arg Val Leu Thr Thr Thr Gly Ile Ala 1024
1025 Asn Ala Ile Asp Tyr Gly Thr Ser Ala Val Gly Ala Ala Arg Arg Val 1040
1041 Phe Ser Leu 1043
(2) INFORMATION FOR SEQ ID NO: 62: TcaAlv
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: TcaAlv
Asn Ile Gly Gly Asp
1 5
(2) INFORMATION FOR SEQ ID NO: 63: TcaAii-syn
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: TcaAii-syn
Cys Leu Arg Gly Asn Ser Pro Thr Asn Pro Asp Lys Asp Gly Ile
1 5 10 15
Phe Ala Gln Val Ala
20
(2) INFORMATION FOR SEQ ID NO: 64: TcaAiii-syn
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: TcaAiii-syn
Cys Tyr Thr Pro Asp Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe
1 5 10 15
Arg Ser Ala Asp Gly
20
(2) INFORMATION FOR SEQ ID NO: 65: TcaBi-syn
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: TcaBi-syn
His Gly Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu
1 5 10 15
Ser Ile Asn Thr
19
(2) INFORMATION FOR SEQ ID NO: 66: TcaBii-syn
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: TcaBii-syn
Cys Val Asp Pro Lys Thr Leu Gln Arg Gln Gln Ala Gly Gly Asp
1 5 10 15
Gly Thr Gly Ser Ser
20
(2) INFORMATION FOR SEQ ID NO: 67: TcaC-syn
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: TcaC-syn
Cys Tyr Lys Ala Pro Gln Arg Gln Glu Asp Gly Asp Ser Asn Ala
1 5 10 15
Val Thr Tyr Asp Lys
20
(2) INFORMATION FOR SEQ ID NO: 68: TcbAii-syn
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: TcbAii-syn
Cys Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe
1 5 10 15
Ser Ser Lys Asp Asp
20
(2) INFORMATION FOR SEQ ID NO: 69: TcbAiii-syn
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: TcbAiii-syn
Cys Phe Asp Ser Tyr Ser Gln Leu Tyr Glu Glu Asn Ile Asn Ala
1 5 10 15
Gly Glu Gln Arg Ala
20
(2) INFORMATION FOR SEQ ID NO: 70: TcdAii-syn
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: TcdAii-syn
Cys Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val
1 5 10 15
Tyr Gln Tyr Ser Gly Asn Thr
20
(2) INFORMATION FOR SEQ ID NO: 71: TcdAiii-syn
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: TcdAiii-syn
Val Ser Gln Gly Ser Gly Ser Ala Gly Ser Gly Asn Asn Asn Leu
1 5 10 15
Ala Phe Gly Ala Gly
20
(2) INFORMATION FOR SEQ ID NO: 72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 160 kDa - Hb
Met Gln Asp Ser Pro Glu Val Ala Ile Thr Thr Leu
1 5 10
(2) INFORMATION FOR SEQ ID NO: 73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 170 kDa - WIR
Met Gln Arg Ser Ser Glu Val Ser
1 5
(2) INFORMATION FOR SEQ ID NO: 74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 180 kDa - H9
Met Gln Asp Ile Pro Glu Val Gln Leu Asn
1 5 10
(2) INFORMATION FOR SEQ ID NO: 75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 170 kDa - Hm
Met Gln Asp Ser Pro Glu Val Ser Val Thr Gln Asn
1 5 10
(2) INFORMATION FOR SEQ ID NO: 76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 74 kDa - H9
Ser Glu Ser Leu Phe Thr Gln Ser Leu Lys Glu Ala Arg Arg Asp
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 71 kDa - Hb
Met Asn Leu Ile Glu Ala Lys Leu Gln Glu Asn Arg Asp Ala
1 5 10
(2) INFORMATION FOR SEQ ID NO: 78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 170 kDa - H9
Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 109 kDa - Hm
Met Leu Asp Ile Met Glu Lys Gln Leu Asn Glu Ser Glu Arg Asp
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 170 kDa - X-1
Met Gln Asp Ser Arg Glu Val Ser
1 5
(2) INFORMATION FOR SEQ ID NO: 81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 69 kDa - H9
Leu Arg Ser Ala Xxx Ser Ala Leu Thr Thr Leu Leu
1 5 10
(2) INFORMATION FOR SEQ ID NO: 82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 64 kDa - HP88
Leu Lys Leu Ala Asp Asn Gly Tyr Phe Asn Glu Pro Leu Asn Val
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 70 kDa - NC-1
Leu Lys Leu Ala Asp Asn Ser Tyr Phe Asn Glu Pro Leu Asn
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 84:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 60 kDa - WIR
Ser Lys Asp Glu Ser Lys Ala Asp Ser Gln Leu Val Tyr His Thr
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 85:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 58 kDa - NC-1
Met Lys Lys Arg Gly Leu Thr Thr Asn Ala Gly Ala Pro Val
1 5 10
(2) INFORMATION FOR SEQ ID NO: 86:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 60 kDa - WX-12
Met Leu Asn Pro Ile Val Arg Lys Phe Glu Tyr Gly Glu His Thr
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 87:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 60 kDa - Hm
Ala Glu Ile Tyr Asn Lys Asp Gly Asn Lys Leu Asp Leu Tyr Gly
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 88:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULAR TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 140 kDa - Hm
Asn Leu Ile Glu Ala Thr Leu Glu Gln Asn Leu Arg Asp Ala
1 5 10 15
Claims (99)
1. A composition, comprising an effective amount of a Photorhabdus protein toxin that has functional activity against an insect.
2. The composition of Claim 1, wherein the Photorhabdus toxin is produced by a purified culture of Photorhabdus, a transgenic plant, baculovirus, or heterologous microbial host.
3. The composition of Claim 2, wherein the Photorhabdus toxin is produced by a purified culture of Photorhabdus luminescens.
4. The composition of Claim 2, wherein the toxin is produced from a purified culture of Photorhabdus luminescens strain designated ATCC 55397.
5. The composition of Claim 2, wherein the toxin is produced by a purified culture of Photorhabdus luminescens strain designated W-14.
6. The composition of Claim 1, wherein the toxin is produced by a purified culture of Photorhabdus strain designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-9, WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1, W30, WIR, B2, ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC# 43951, ATCC# 43952, DEP1, DEP2, DEP3, P. zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL55, GL217, or GL257.
7. The composition of Claim 2, wherein the toxin is produced from a purified culture of Photorhabdus luminescens strain designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-9, WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1, W30, WIR, B2, ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC# 43951, ATCC# 43952, DEP1, DEP2, DEP3, P. zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL55, GL217, or GL257.
8. The composition of Claim 1, wherein the toxin is represented by the amino acid sequence SEQ ID NO: 12.
9. The composition of Claim 6, wherein the composition is a mixture of one or more toxins produced from purified cultures of Photorhabdus.
10. The composition of Claim 1 or 6, wherein the insect is of the order Lepidoptera, Coleoptera, Hymenoptera, Diptera, Dictyoptera, Acarina or Homoptera.
11. The composition of Claim 1 or 6, wherein the insect species is from order Coleoptera and is Southern Corn Rootworm, Western Corn Rootworm, Colorado Potato Beetle, Mealworm, Boll Weevil or Turf Grub.
12. The composition of Claim 1 or 6, wherein the insect species is from order Lepidoptera and is Beet Armyworm, Black Cutworm, Cabbage Looper, Codling Moth, Corn Earworm, European Corn Borer, Tobacco Hornworm, or Tobacco Budworm.
13. The composition of Claim 1 or 6, wherein the toxin is formulated as a sprayable insecticide.
14. The composition of Claim 1 or Claim 6, wherein the toxin is formulated as a bait matrix and delivered in an above ground or below ground bait station.
15. A method of controlling an insect, comprising orally delivering to an insect an effective amount of a protein toxin that has functional activity against an insect, wherein the protein is produced by a purified bacterial culture of the genus Photorhabdus.
16. The method of Claim 15, wherein the bacterium is a purified culture of Photorhabdus luminescens.
17. The method of Claim 15, wherein the toxin is produced from a purified culture of Photorhabdus luminescens strain designated ATCC 55397.
18. The method of Claim 16, wherein the toxin is produced from a purified culture of Photorhabdus luminescens strain designated W-14.
19. The method of Claim 15, wherein the toxin is produced from a purified culture of Photorhabdus strains designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-9, WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1, W30, WIR, B2, ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC# 43951, ATCC# 43952, DEP1, DEP2, DEP3, P. zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL155, GL217, or GL257.
20. The method of Claim 15, wherein the toxin is produced from a purified culture of Photorhabdus luminescens strains designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-9, WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1, W30, WIR, B2, ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC# 43951, ATCC# 43952, DEP1, DEP2, DEP3, P. zealandrica, P. hepialus, HB-Arg, HB Oswego, HB Oswego, HB Lewiston, K-122, HMGD, Indicus, GD, PWH-5, Megidis, HF-85, A. Cows, MP1, MP2, MP3, MP4, MP5, GL98, GL101, GL138, GL155, GL217, or GL257.
21. The method of Claim 19, wherein a mixture of one or more toxins is produced from a purified culture of Photorhabdus and said toxins are orally delivered to an insect.
22. The method of Claim 15, wherein the toxin is produced by a prokaryotic host transformed with a gene encoding the toxin.
23. The method of Claim 15, wherein the toxin is produced by a eukaryotic host transformed with a gene encoding the toxin.
24. The method of Claim 23, wherein the eukaryotic host is baculovirus.
25. The method of Claim 15 or 19, wherein the insect is of the order Lepidoptera, Coleoptera, Hymenoptera, Diptera, Dictyoptera, Acarina or Homoptera.
26. The method of Claim 15 or 19, wherein the insect species is from order Coleoptera and is Southern Corn Rootworm, Western Corn Rootworm, Colorado Potato Beetle, Mealworm, Boll Weevil or Turf Grub.
27. The method of Claim 15 or 19, wherein the insect species is from order Lepidoptera and is Beet Armyworm, Black Cutworm, Cabbage Looper, Codling Moth, Corn Earworm, European Corn Borer, Tobacco Hornworm, or Tobacco Budworm.
28. The method of Claim 15 or 19, wherein the toxin is formulated as a sprayable insecticide.
29. The method of Claim 15 or Claim 19, wherein the toxin is formulated as a bait matrix and delivered in an above ground or below ground bait station.
30. A method of isolating a gene coding for a protein subunit, comprising the steps of: constructing at least one RNA or DNA oligonucleotide molecule that corresponds to at least a part of a DNA coding region of an amino acid sequence selected from a group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 62, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 82, SEQ ID NO: 83, SEQ ID NO: 84, SEQ ID NO: 85, SEQ ID NO: 86, SEQ ID NO: 87, and SEQ ID NO: 88, wherein the nucleotide molecule is used to isolate genetic material from Photorhabdus or Photorhabdus luminescens.
31. A method for expressing a protein produced by a purified bacterial culture of the genus Photorhabdus in a prokaryotic or eukaryotic host in an effective amount so that the protein has functional activity against an insect, wherein the method comprises: constructing a chimeric DNA construct having 5' to 3' a promoter, a DNA sequence encoding a protein, a transcription terminator, and then transferring the chimeric DNA construct into the host.
32. The method of Claim 31, wherein the protein has functional activity against insects selected from a group consisting of Coleoptera , Lepidoptera , Diptera , Homoptera , Hymenoptera , Dictyoptera , and Acarina .
33. The method of Claim 31, wherein the protein encoded by the DNA sequence has an N-termmal am o acid sequence selected from the group consisting of SEQ ID NO:l, SEQ ID NO: 2, SEQ ID N0:3, SEQ ID NO:4, SEQ ID NO : 5 , SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO : 8 , SEQ ID NO : 9 , SEQ ID NO: 10, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO.17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 62, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 82, SEQ ID NO: 83, SEQ ID NO: 84, SEQ ID NO: 85, SEQ ID NO: 86, SEQ ID NO: 87, and SEQ ID NO: 88.
34. The method of Claim 31, wherein the protein encoded by the DNA sequence includes the amino acid sequence selected from the group consisting of SEQ ID NO:12, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.
35. A chimeric DNA construct, adapted for expression in a prokaryotic or eukaryotic host comprising, 5' to 3', a transcriptional promoter active in the host; a DNA sequence encoding a Photorhabdus protein that has functional activity against an insect; and a transcriptional terminator.
36. A chimeric DNA construct of Claim 35, wherein the protein encoded by the DNA sequence has an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
37. The chimeric DNA construct of Claim 35, wherein the protein encoded by the DNA sequence has an amino acid sequence selected from the group consisting of SEQ ID NO:12, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.
38. The chimeric DNA construct of Claim 35, wherein the DNA sequence encoding the Photorhabdus luminescens protein is selected from the group comprising SEQ ID NO:11, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, and SEQ ID NO:60.
39. The chimeric DNA construct of Claim 35, wherein the host is baculovirus or a plant cell.
40. An isolated and substantially purified preparation comprising, a DNA molecule capable of encoding an effective amount of a protein that is produced by a bacterium of the genus Photorhabdus and that has functional activity against an insect.
41. The preparation of Claim 40, wherein the bacterium is Photorhabdus luminescens.
42. A purified preparation comprising, a protein produced by Photorhabdus or Photorhabdus luminescens having an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
43. A purified protein preparation comprising, a protein that has an N-terminal amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:62, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, and SEQ ID NO:88.
44. A purified protein preparation comprising, a protein selected from the group of SEQ ID NO:12, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.
45. A purified DNA preparation comprising, a DNA sequence selected from the group consisting of SEQ ID NO:11, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58 and SEQ ID NO:60, wherein the DNA sequence is isolated from its native host.
46. A purified protein preparation comprising, a Photorhabdus luminescens protein with at least one subunit having an approximate molecular weight between 18 kDa to about 230 kDa; between about 160 kDa to about 230 kDa; 100 kDa to 160 kDa; about 80 kDa to about 100 kDa; or about 50 kDa to about 80 kDa.
47. A purified protein preparation comprising, a Photorhabdus luminescens protein with at least one subunit having an approximate molecular weight of about 280 kDa.
48. A substantially pure microorganism culture comprising, ATCC 55397.
49. The culture of Claim 48, wherein the culture is a derivative of ATCC 55397 that produces a protein toxin that has functional activity against an insect.
50. A transgenic plant comprising in its genome, a chimeric artificial gene construction imbuing the plant with an ability to express an effective amount of a Photorhabdus protein that has functional activity against an insect.
51. The transgenic plant of Claim 50, wherein the plant is transformed using acceleration of genetic material coated onto microparticles directly into cells, Agrobacteria, whiskers, or electroporation techniques.
52. The transgenic plant of Claim 50, wherein the selectable marker is selected from the group consisting of kanamycin, neomycin, glyphosate, hygromycin, methotrexate, phosphinothricin (bialophos), chlorosulfuron, bromoxynil, dalapon and the like.
53. The transgenic plant of Claim 50, wherein the promoter is selected from the group consisting of octopine synthase, nopaline synthase, mannopine synthase, 35S, 19S, 35T, ribulose-1,6-bisphosphate (RUBP) carboxylase small subunit (ssu), beta-conglycinin, phaseolin, alcohol dehydrogenase (ADH), heat-shock, ubiquitin, zein, oleosin, napin, or acyl carrier protein (ACP).
54. The transgenic plant of Claim 50, wherein embryogenic tissue, callus tissue type I or II, hypocotyl, meristem, or plant tissue during dedifferentiation is used in preparing the transgenic plant.
55. The transgenic plant of Claim 50, wherein the chimeric gene is a DNA sequence which encodes a Photorhabdus protein that has functional activity against an insect and at least one codon of the gene has been modified so that the codon is a plant preferred codon.
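To make the notion of modifying a codon into a plant preferred codon concrete, the sketch below swaps codons for synonymous ones from a preference table while checking that the encoded amino acid sequence is unchanged. The `PREFERRED` table, the function names `translate` and `recode`, and the example DNA string are all hypothetical placeholders chosen for illustration; they are not measured plant codon usage values and are not drawn from any sequence in the specification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO)
}

# Hypothetical preference table: amino acid -> codon assumed to be favoured
# by the host plant. Placeholder values only, not real usage data.
PREFERRED = {"K": "AAG", "L": "CTC"}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    dna = dna.upper()
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, preferred):
    """Replace codons with preferred synonymous codons, leaving others as-is."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        aa = CODON_TO_AA[codon]
        new = preferred.get(aa, codon)
        if CODON_TO_AA[new] != aa:
            # Guard against a preference table that would change the protein.
            raise ValueError(f"{new} does not encode {aa}")
        out.append(new)
    return "".join(out)


if __name__ == "__main__":
    original = "ATGAAATTACCG"  # hypothetical fragment encoding M-K-L-P
    modified = recode(original, PREFERRED)
    print(original, "->", modified, translate(modified))
```

Because only synonymous substitutions are allowed, the recoded gene still encodes the same protein, which is the property the claim relies on.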
56. A method of controlling an insect comprising orally delivering to an insect an effective amount of a protein toxin, wherein the protein is produced by a transgenic plant, on which said insect feeds.
57. A composition of matter, comprising a purified DNA sequence from a purified bacterial culture from the genus Photorhabdus.
58. A substantially pure microorganism culture comprising, H9.
59. A substantially pure microorganism culture comprising, Hb.
60. A substantially pure microorganism culture comprising, Hm.
61. A substantially pure microorganism culture comprising, HP88.
62. A substantially pure microorganism culture comprising, NC-1.
63. A substantially pure microorganism culture comprising, W30.
64. A substantially pure microorganism culture comprising, WIR.
65. A substantially pure microorganism culture comprising, B2.
66. A substantially pure microorganism culture comprising, P. zealandrica.
67. A substantially pure microorganism culture comprising, P. hepialus.
68. A substantially pure microorganism culture comprising, HB-Arg.
69. A substantially pure microorganism culture comprising, HB Oswego.
70. A substantially pure microorganism culture comprising, HB Lewiston.
71. A substantially pure microorganism culture comprising, K-122.
72. A substantially pure microorganism culture comprising, HMGD.
73. A substantially pure microorganism culture comprising, Indicus.
74. A substantially pure microorganism culture comprising, GD.
75. A substantially pure microorganism culture comprising, PWH-5.
76. A substantially pure microorganism culture comprising, Megidis.
77. A substantially pure microorganism culture comprising, HF-85.
78. A substantially pure microorganism culture comprising, A. Cows.
79. A substantially pure microorganism culture comprising, MP1.
80. A substantially pure microorganism culture comprising, MP2.
81. A substantially pure microorganism culture comprising, MP3.
82. A substantially pure microorganism culture comprising, MP4.
83. A substantially pure microorganism culture comprising, MP5.
84. A substantially pure microorganism culture comprising, GL98.
85. A substantially pure microorganism culture comprising, GL155.
86. A substantially pure microorganism culture comprising, GL101.
87. A substantially pure microorganism culture comprising, GL138.
88. A substantially pure microorganism culture comprising, GL217.
89. A substantially pure microorganism culture comprising, GL257.
90. A method of making an antibody against a protein fragment that is part of a protein having functional activity, where the protein is produced by bacteria of the Enterobacteracaea family, wherein the method comprises:
a) isolating a fragment of the protein, where the protein fragment is at least six amino acids;
b) immunizing a mammalian species with the protein fragment; and
c) harvesting serum containing antibody or antibody from the spleen of the mammalian species, where the antibody harvested is antibody to the protein fragment having functional activity.
91. The method of Claim 1, wherein the protein fragment is selected from the group consisting of SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, and SEQ ID NO:71.
92. The method of Claim 90, wherein the bacteria is from the genus Photorhabdus.
93. The method of Claim 90, wherein the bacteria is from the genus Photorhabdus luminescens.
94. A method of selecting a DNA fragment which encodes a portion of a protein that has functional activity, where the protein is produced from a bacteria of the Enterobacteracaea family, wherein the method comprises:
a) isolating a fragment of the DNA sequence having at least 30 nucleotides;
b) tagging the DNA fragment with a radioactive or chemical agent;
c) hybridizing the DNA fragment to a DNA library, where the DNA library is an Enterobacteracaea cDNA or Enterobacteracaea genomic library; and
d) selecting the fragment that is hybridized to the DNA in the library that encodes for the protein that has functional activity.
95. The method of Claim 94, wherein the bacteria is from the genus Photorhabdus.
96. The method of Claim 95, wherein the bacteria is from the genus Photorhabdus luminescens.
97. A method of selecting a DNA fragment which encodes a portion of a protein that has functional activity, where the protein is produced from a bacteria of the Enterobacteracaea family, wherein the method comprises:
a) isolating at least two primers, where a primer is a fragment of DNA having at least twelve nucleotides;
b) using the primers from step a), amplifying a DNA fragment from Enterobacteracaea by using primers with polymerase chain reaction technology and purifying the DNA fragment;
c) tagging the purified DNA fragment with a radioactive or chemical agent;
d) hybridizing the purified DNA fragment to a DNA library, where the DNA library is an Enterobacteracaea cDNA or Enterobacteracaea genomic library; and
e) selecting a DNA fragment that is equal or larger in size to the purified DNA fragment from the library, where the selected DNA fragment or portion thereof encodes for a protein that has functional activity.
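The primer-based amplification in steps a) and b) can be pictured with a small in-silico sketch that enforces the twelve-nucleotide minimum primer length and returns the region a primer pair would bracket on a template strand. This is a simplified exact-match model under stated assumptions: the names `reverse_complement` and `amplify`, the template, and both primers are hypothetical, and real polymerase chain reaction behaviour (mismatch tolerance, annealing temperature, multiple binding sites) is not modelled.

```python
MIN_PRIMER_LENGTH = 12  # minimum primer length stated in step a)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplify(template, forward, reverse):
    """Return the exact-match amplicon bracketed by two primers, or None.

    `forward` must match the given strand; `reverse` must match the
    opposite strand, so its reverse complement is searched downstream.
    """
    for primer in (forward, reverse):
        if len(primer) < MIN_PRIMER_LENGTH:
            raise ValueError("primers must be at least twelve nucleotides long")
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    # Hypothetical template and primers, for illustration only.
    template = "GGG" + "ATGCGTACCTGA" + "TTTTTTTTTT" + "CCATGGTTAACG" + "AAA"
    fwd = "ATGCGTACCTGA"
    rev = reverse_complement("CCATGGTTAACG")
    product = amplify(template, fwd, rev)
    print(product, len(product))
```

In the claimed method the amplified and purified fragment is then tagged and used as a hybridization probe; the sketch stops at producing the fragment.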
98. The method of Claim 97, wherein the bacteria is from the genus Photorhabdus.
99. The method of Claim 98, wherein the bacteria is from the genus Photorhabdus luminescens.
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US70548496A | 1996-08-29 | 1996-08-29 | |
US08705484 | 1996-08-29 | ||
US74369996A | 1996-11-06 | 1996-11-06 | |
WO618003 | 1996-11-06 | ||
PCT/US1996/018003 WO1997017432A1 (en) | 1995-11-06 | 1996-11-06 | Insecticidal protein toxins from photorhabdus |
US08743699 | 1996-11-06 | ||
PCT/US1997/007657 WO1998008932A1 (en) | 1996-08-29 | 1997-05-05 | Insecticidal protein toxins from photorhabdus |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU97125/01A Division AU9712501A (en) | 1996-08-29 | 2001-12-06 | Insecticidal protein toxins from photorhabdus |
Publications (1)
Publication Number | Publication Date |
---|---|
AU2829997A (en) | 1998-03-19 |
Family
ID=27107515
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU28299/97A Abandoned AU2829997A (en) | 1996-08-29 | 1997-05-05 | Insecticidal protein toxins from (photorhabdus) |
Country Status (11)
Country | Link |
---|---|
EP (1) | EP0970185A4 (en) |
JP (1) | JP2000515024A (en) |
KR (1) | KR20000037116A (en) |
AR (1) | AR007133A1 (en) |
AU (1) | AU2829997A (en) |
CA (1) | CA2263819A1 (en) |
IL (1) | IL128590A0 (en) |
SK (1) | SK24699A3 (en) |
TR (1) | TR199901126T2 (en) |
TW (1) | TW509722B (en) |
WO (1) | WO1998008932A1 (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU675335B2 (en) * | 1993-06-25 | 1997-01-30 | Commonwealth Scientific And Industrial Research Organisation | Toxin gene from xenorhabdus nematophilus |
WO1995003695A1 (en) * | 1993-07-27 | 1995-02-09 | Agro-Biotech Corporation | Novel fungicidal properties of metabolites, culture broth, stilbene derivatives and indole derivatives produced by the bacteria xenorhabdus and photorhabdus spp. |
GB9618083D0 (en) * | 1996-08-29 | 1996-10-09 | Mini Agriculture & Fisheries | Pesticidal agents |
1997
Filing date | Country | Application number | Publication | Status |
---|---|---|---|---|
1997-05-05 | AU | AU28299/97A | AU2829997A | Abandoned (not active) |
1997-05-05 | CA | CA002263819A | CA2263819A1 | Abandoned (not active) |
1997-05-05 | KR | KR1019990701698A | KR20000037116A | Application Discontinuation (not active) |
1997-05-05 | SK | SK246-99A | SK24699A3 | Unknown |
1997-05-05 | IL | IL12859097A | IL128590A0 | Unknown |
1997-05-05 | TR | TR1999/01126T | TR199901126T2 | Unknown |
1997-05-05 | JP | JP10511612A | JP2000515024A | Pending (active) |
1997-05-05 | WO | PCT/US1997/007657 | WO1998008932A1 | Application Discontinuation (not active) |
1997-05-05 | EP | EP97922696A | EP0970185A4 | Withdrawn (not active) |
1997-05-14 | AR | ARP970102017A | AR007133A1 | Application Discontinuation (not active) |
1997-08-28 | TW | TW086112391A | TW509722B | IP Right Cessation (not active) |
Also Published As
Publication number | Publication date |
---|---|
IL128590A0 (en) | 2000-01-31 |
WO1998008932A1 (en) | 1998-03-05 |
CA2263819A1 (en) | 1998-03-05 |
EP0970185A1 (en) | 2000-01-12 |
JP2000515024A (en) | 2000-11-14 |
AR007133A1 (en) | 1999-10-13 |
EP0970185A4 (en) | 2003-02-26 |
SK24699A3 (en) | 2000-04-10 |
TR199901126T2 (en) | 1999-07-21 |
KR20000037116A (en) | 2000-07-05 |
TW509722B (en) | 2002-11-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2209659C (en) | | Insecticidal protein toxins from photorhabdus |
AU2829997A (en) | | Insecticidal protein toxins from (photorhabdus) |
AU755389B2 (en) | | Insecticidal protein toxins from xenorhabdus |
JP2001523944A (en) | | BACILLUS THURINGIENSIS CRYET33 and CRYET34 Compositions and Uses Thereof |
WO2008134072A2 (en) | | Hemipteran- and coleopteran- active toxin proteins from bacillus thuringiensis |
JP2001506973A (en) | | Coleopteran insects and Ctenocephalides spp. CryET29 composition of Bacillus Thuringiensis, toxic to |
JPH07503845A (en) | | Peptides effective in killing insects |
US6528484B1 (en) | | Insecticidal protein toxins from Photorhabdus |
US7569748B2 (en) | | Nucleic acid encoding an insecticidal protein toxin from photorhabdus |
US6280722B1 (en) | | Antifungal Bacillus thuringiensis strains |
EP1130970B1 (en) | | Insecticidal agents |
AU9712501A (en) | | Insecticidal protein toxins from photorhabdus |
MXPA99001288A (en) | | Insecticidal protein toxins from xenorhabdus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MK5 | Application lapsed section 142(2)(e) - patent request and compl. specification not accepted |