AU741600B2

AU741600B2 - Improved Bacillus thuringiensis toxin

Info

Publication number: AU741600B2
Application number: AU88044/98A
Authority: AU
Inventors: Sandra De Roeck; Jeroen Van Rie
Original assignee: Plant Genetic Systems NV
Current assignee: Bayer CropScience NV
Priority date: 1997-06-27
Filing date: 1998-06-25
Publication date: 2001-12-06
Anticipated expiration: 2018-06-25
Also published as: EP0989998A1; WO1999000407A3; WO1999000407A2; CA2290718A1; AU8804498A

Description

WO 99/00407 PCT/EP98/04033

IMPROVED BACILLUS THURINGIENSIS TOXIN

BACKGROUND OF THE INVENTION

(i) Field of the Invention

The present invention provides new improved proteins derived from a Bacillus thuringiensis Cry9C crystal protein. In accordance with this invention, amino acid positions in a Cry9C protein were identified as involved in insect toxicity. Further in accordance with this invention are provided modified Cry9C proteins with increased or decreased toxicity to an insect species, and DNA sequences encoding such modified Cry9C proteins. Plants can be protected from insect damage by expressing a chimeric gene encoding an improved Cry9C protein with an increased toxicity to an insect species.

(ii) Description of Related Art

Bacillus thuringiensis (Bt)-derived proteins are currently widely used to protect plants from insects by expression of such proteins in transgenic plants. Concerns about the development of insect resistance and the desire to achieve optimum toxicity and control of additional insect species resulted in efforts to modify existing Bt-derived proteins so as to increase their toxicity or alter their mode of action.

Most studies on the mode of action of Bacillus thuringiensis toxins have focused on lepidopteran-specific Cry1 insecticidal crystal proteins (ICPs). The following picture has emerged from these studies (Gill et al., 1992, Annu. Rev. Entomol. 37, 615-36; Knowles, 1993, BioEssays, 15, 469-476). Following ingestion of the crystals by a susceptible insect, they are dissolved in the alkaline reducing environment of the insect midgut lumen. The liberated proteins, the protoxins, are then proteolytically processed by insect midgut proteases to a protease-resistant fragment. This active fragment, the toxin, then passes through the peritrophic membrane and binds to specific receptors located on the brush border membrane of gut epithelial cells. Subsequent to binding, the toxin or part thereof inserts in the membrane, resulting in the formation of pores. These pores lead to colloid osmotic swelling and ultimately lysis of the midgut cells, causing death of the insect.

Binding studies have demonstrated that receptor binding is a crucial step in the mode of action of ICPs (Hofmann et al., 1988, 173, 85-91; Hofmann et al., 1988, Proc. Natl. Acad. Sci. USA, 85, 7844-7848; Van Rie et al., 1990, Appl. Environm. Microbiol. 56, 1378-85).

The three-dimensional structure of two ICPs, Cry3A and the Cry1Aa toxic fragment, has been solved (Li et al., 1991, Nature 353, 815-21; Grochulski et al., 1994, Journal of Molecular Biology 254, 1-18). The Cry proteins have been found to have three structural domains: the N-terminal domain I consists of 7 alpha helices, domain II contains three beta-sheets and the C-terminal domain III is a beta-sandwich. Based on this structure, a hypothesis has been formulated regarding the structure-function relationships of ICPs. The bundle of long, hydrophobic and amphipathic helices (domain I) is equipped for pore formation in the insect membrane, and regions of the three-sheet domain (domain II) are probably responsible for receptor binding (Li et al., 1991, supra). The function of domain III is less clear. When different ICP amino acid sequences are aligned, five conserved sequence blocks are evident (Höfte & Whiteley, 1989, Microbiol. Revs. 53, 242-255).

These conserved blocks are all located in the interior of a structural domain or at the interface between domains. The high degree of conservation of these internal residues implies that homologous proteins would adopt a similar fold (Li et al., 1991, supra).

Data from Ahmad et al. (1991, FEMS Microbiol. Lett. 68, 97-104), Wu et al. (1992, J. Biol. Chem. 267, 2311-2317) and Gazit et al. (1993, Biochemistry 32, 3429-3436) provide evidence for the function of domain I of ICPs as a pore formation unit.

Deletions and alanine substitutions in the Cry1Aa protoxin at a position predicted to be at or near the second loop of domain II significantly altered toxicity and receptor binding ability (Lu et al., 1993, XXVIth Annual meeting of the Society for Invertebrate Pathology, Asheville, USA, Conference book, page 31, Abstract 17).

Smith and Ellar (1992, XXVth Annual meeting of the Society for Invertebrate Pathology, Heidelberg, Germany, Conference book, page 111, abstract 68) observed dramatic effects on toxicity towards in vitro insect cell cultures with mutant Cry1C proteins, differing in the amino acid sequence of the predicted loop regions.

They formulated the hypothesis that it should be possible to map the putative receptor binding domain of this toxin and eventually generate toxins with increased potency. In some cases, however, a contribution to specificity and binding from domain III of the Cry toxin could not be excluded (Schnepf et al., 1990, supra; Ge et al., 1991, J. Biol. Chem. 266, 17954-17958). Furthermore, a recent study using hybrid ICPs, constructed by exchanging gene fragments between cry1C and cry1E, has indicated that domain II of Cry1C is not sufficient to confer the high activity of this protein towards Spodoptera exigua and Mamestra brassicae (Schipper et al., 1993, Seventh International Conference on Bacillus, Institut Pasteur, July 18-23, Abstracts of lectures, p. L69). Site-directed mutagenesis experiments on Cry1Ac indicated that certain amino acids in domain I are important for receptor binding (Wu et al., 1992, supra). Rajamohan et al. (1996, J. Biol. Chem. 271, 2390-2396) explored the role of loop 2 residues in domain II of the Cry1Ab protein in reversible and irreversible binding to Manduca sexta and Heliothis virescens.

Also, changes outside the 60 kD toxin region of the Bt protoxin were found to influence toxicity. It was suggested that this may be related to the activation processes by the gut juice (Nakamura et al., 1990, Agric. Biol. Chem. 54, 715-24).

Visser et al. (1993, In "Bacillus thuringiensis, an Environmental Biopesticide: Theory and Practice", pp. 71-88, eds.: Entwistle, Cory, Bailey, and Higgs, John Wiley & Sons, NY) reviewed the domain-function studies with Bt ICPs and concluded that, in general, the function of essential stretches of the toxic fragment of Bt ICPs is unknown. In studies of mutant proteins, several amino acid residues from different regions of the toxic fragment, either conserved or variable, were shown to affect toxic activity.

Lambert et al. (1996, Appl. Environm. Microbiol. 62, p. 80-86) and PCT patent publication WO 94/05771 describe a new Bt protein which is currently named Cry9Ca1 (abbreviated as Cry9C) (Peferoen et al., 1997, in Advances in Insect Control: The role of transgenic plants, pp. 21-48, Taylor & Francis Ltd., London).

This protein was found to have a broad insect target range within the group of lepidopteran pest insects making it interesting for insect control applications in agriculture.

De Roeck et al. (1995, the 28th annual meeting of the Society for Invertebrate Pathology, Cornell University, Ithaca, New York, p. 52) suggested determining the likely position of the binding epitope of the Cry1H protein by making alanine mutants, so as to determine the contribution of individual amino acid positions to the binding of the Cry1H protein to different insects. The Cry1H protein is currently named Cry9C in the new nomenclature (Crickmore et al., 1995, 28th annual meeting of the Society for Invertebrate Pathology, Cornell University, Ithaca, New York, p. 14). De Roeck et al. (1997, the 6th International Conference on Perspectives in Protein Engineering, John Innes Centre, Norwich, UK, June 28-July 1, p. 34) determined the likely position of residues in the loops at the apex of the molecule in domain II of the Cry9C protein.

SUMMARY OF THE INVENTION This invention provides a modified Cry9C protein with an improved toxicity to an insect species, comprising the amino acid sequence of SEQ ID No. 2 or an insecticidally-effective fragment thereof, wherein at least one amino acid at the following amino acid positions in SEQ ID No. 2 is replaced by another amino acid: 316, 317, 319, 321, 329, 330, 364, 369, 422, or 488.

This invention further provides preferred improved Cry9C proteins that comprise the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658, wherein at least one of the amino acids at the following positions is replaced by another amino acid: 316, 317, 319, 321, 329, 330, 364, 369, 422, and 488.

This invention also provides a modified Cry9C protein with improved toxicity to Ostrinia nubilalis, comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658, wherein at least the amino acids at position 488 or at least at positions 364 and 488 are replaced by other amino acids, preferably by alanine.

This invention also provides modified Cry9C proteins with improved toxicity to Heliothis virescens, comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658, wherein the amino acid at position 321 or position 329 is replaced by another amino acid, preferably by alanine.

This invention further provides modified Cry9C proteins with improved toxicity to Diatraea grandiosella, comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658, wherein the amino acid at any or all of positions 316, 317, 319, 321, 330, 369, or 422 is replaced by another amino acid, preferably by alanine.

Further in accordance with this invention are provided DNA sequences encoding the modified Cry9C proteins, and particularly chimeric genes designed for expression in plants comprising these DNA sequences.

In another preferred embodiment of this invention, a plant transformed with a DNA sequence encoding a modified Cry9C protein is provided, so that the plant acquires increased resistance to insects, particularly a corn plant transformed with a modified Cry9C protein yielding increased toxicity towards Heliothis virescens, Ostrinia nubilalis, or Diatraea grandiosella insects.

Other objects and advantages of this invention will become evident from the following description.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS In this invention, certain amino acid residues important for toxicity of the Cry9C protein have been identified. These amino acid residues can be replaced by other amino acids to increase the toxicity to a specific insect species.

The "Cry9C protein", as used herein, refers to an insecticidal protein characterized by the amino acid sequence of SEQ ID No. 2 or any equivalents thereof such as the insecticidally effective truncated proteins or the fusion proteins of the Cry9C protein described in PCT patent publications WO 94/05771 and WO 94/24264. Particularly preferred Cry9C proteins, in accordance with this invention, are proteins containing at least the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658. Throughout the description and the claims, the new nomenclature for Bt crystal proteins as suggested by Crickmore et al. (1995, 28th annual meeting of the Society for Invertebrate Pathology, Cornell University, Ithaca, New York, p. 14) and reported in Peferoen et al. (1997, in Advances in Insect Control: The role of transgenic plants, pp. 21-48, Taylor & Francis Ltd., London) has been used.

"Cry9C protein variants", for a particular insect species, are insecticidal proteins that differ from but are indirectly or directly derived from the Cry9C protein.

Indeed, several variants of a Bt protein in which some amino acids are changed into others without significantly changing activity and/or specificity to a particular insect species can be found in nature (Höfte & Whiteley, 1989, supra) or can be made by recombinant DNA techniques. Variants of a Cry9C protein, as used herein, also include proteins containing the specificity- or toxicity-determining domain or region of the Cry9C protein, in a hybrid with another protein, such as another Bt ICP, a membrane-permeating protein domain, a cytotoxin or an antibody fragment, provided that the Cry9C specificity- or toxicity-determining domain or region contributes to the toxicity or specificity of the hybrid protein. Particularly preferred Cry9C protein variants are those proteins comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658 wherein the arginine at position 164 has been replaced by another amino acid, preferably alanine or lysine. These variants with a replacement of the arginine at position 164 in the sequence of SEQ ID No. 2 show a significantly lower susceptibility to breakdown upon protease treatment, and are named herein the "protease-resistant Cry9C variants". As is the case here for the protease-resistant variants, whenever reference is made to a particular region or position in SEQ ID No. 2, this does not necessarily imply that the protein referred to is the full-length protein of SEQ ID No. 2; the reference merely identifies the position corresponding to that position in the reference Cry9C protein of SEQ ID No. 2. Indeed, improved Cry9C proteins of the invention can be truncated so that the actual position of an amino acid in that protein will differ, but reference will nevertheless be made throughout this invention to the positions in the full-length reference protein shown in SEQ ID No. 2.
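The numbering convention described above (positions always refer to the full-length reference sequence, even in a truncated protein) can be illustrated with a minimal Python sketch. The function name `fragment_index` is hypothetical and introduced only for illustration; the fragment boundaries 44 and 658 and position 164 are taken from the text.

```python
def fragment_index(ref_position: int, fragment_start: int, fragment_end: int) -> int:
    """Return the 0-based index, within a truncated fragment, of a residue
    identified by its 1-based position in the full-length reference sequence.

    fragment_start and fragment_end are 1-based, inclusive reference positions.
    """
    if not fragment_start <= ref_position <= fragment_end:
        raise ValueError(
            f"reference position {ref_position} lies outside "
            f"fragment {fragment_start}-{fragment_end}"
        )
    return ref_position - fragment_start


# Reference position 164 in a fragment spanning positions 44-658 is the
# 121st residue of that fragment, i.e. 0-based index 120.
print(fragment_index(164, 44, 658))  # 120
```

A fragment starting at position 1 gives the usual offset of one (index 163 for position 164), so the same helper covers both preferred fragment starts.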

Following the teachings of this invention, Cry9C proteins or variants thereof can be modified to have an increased toxicity for an insect species. "Modified Cry9C protein", as used herein, refers to a Cry9C protein or its protease-resistant variant wherein amino acids have been modified to analyse the contribution of amino acid positions in toxicity, particularly a Cry9C protein or its protease-resistant variant wherein amino acids have been modified in the regions at the following positions in SEQ ID No. 2: 313-334, 358-369, 418-425, 480-492. "Improved Cry9C protein", in accordance with this invention, refers to a Cry9C protein or its protease-resistant variant wherein at least one amino acid has been replaced, so that the toxicity of this improved protein towards an insect species is significantly increased. In a particularly preferred improved Cry9C protein or its protease-resistant variant, the at least one amino acid change is located in domain II of the Cry9C protein, particularly in the regions of the Cry9C protein characterized by the following positions in SEQ ID No. 2: 313-334, 358-369, 418-425, 480-492. A modified Cry9C protein, differing in one amino acid from the native protein or its protease-resistant variant and being significantly less toxic towards the target insect, allows the direct identification of this amino acid position as involved in toxicity (provided no gross structural changes are introduced), and thus has considerable value in improving toxicity. In accordance with this invention, the identification of these amino acid positions involved in toxicity allows the construction of modified proteins having increased toxicity to the target insect by amino acid randomization at these positions.

Preferred modified Cry9C proteins in accordance with this invention are the modified Cry9C proteins having altered toxicity to Ostrinia nubilalis, Heliothis virescens or Diatraea grandiosella as shown in Table 1, as well as combinations of those modifications in one modified protein.

An example of an improved Cry9C protein in accordance with this invention is a protein comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658 or 666 wherein an amino acid in at least one of the following amino acid positions of SEQ ID No. 2 has been replaced by another amino acid: 313, 316, 317, 318, 319, 321, 323, 325, 329, 330, 362, 364, 368, 369, 418, 420, 421, 422, 480, 481, 483, 484, 485, 487, 488, 490 and 491; or an amino acid position located in the immediate vicinity of any one of these positions in the three-dimensional structure of the protein, preferably those amino acids whose C-alpha atom is at a maximum distance of about 7 Angstrom from the C-alpha atom of the amino acid listed above. A preferred improved Cry9C protein in accordance with this invention is the protein of SEQ ID No. 2 with at least one of the following amino acid changes: P316A, A317V, V319A, L321A, P329A, Y330A, S364A, Y369A, I422A, and I488A. "V319A" or "Cry9C(V319A)", as used herein, means a change of the valine amino acid at position 319 in SEQ ID No. 2 to an alanine amino acid.
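The substitution notation and the C-alpha distance criterion described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used by the inventors: the helper names (`parse_mutation`, `apply_mutation`, `neighbours_within`) are hypothetical, the short toy fragment only reproduces the residues implied by the listed changes (P316, A317, V319, L321) with glycine placeholders at 318 and 320, and the coordinates are invented for the demonstration.

```python
import math
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_MUTATION_RE = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


def parse_mutation(label):
    """Split a label such as 'V319A' into (original, position, replacement)."""
    match = _MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a valid substitution label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_mutation(sequence, label, first_position=1):
    """Apply one substitution to a sequence whose first residue carries the
    reference number first_position (1 for full length, 44 for a 44-658 fragment)."""
    original, position, replacement = parse_mutation(label)
    index = position - first_position
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != original:
        raise ValueError(
            f"expected {original} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


def neighbours_within(ca_coords, position, cutoff=7.0):
    """Return positions whose C-alpha atom lies within `cutoff` Angstrom of the
    C-alpha atom at `position`. ca_coords maps position -> (x, y, z)."""
    centre = ca_coords[position]
    return sorted(
        other
        for other, xyz in ca_coords.items()
        if other != position and math.dist(centre, xyz) <= cutoff
    )


# Toy inputs (hypothetical; not SEQ ID No. 2 and not a solved structure).
toy_fragment = "PAGVGL"  # residues numbered 316-321; G = placeholder
toy_coords = {317: (0.0, 0.0, 0.0), 318: (3.8, 0.0, 0.0), 321: (12.0, 0.0, 0.0)}

print(parse_mutation("V319A"))                                    # ('V', 319, 'A')
print(apply_mutation(toy_fragment, "V319A", first_position=316))  # PAGAGL
print(neighbours_within(toy_coords, 317))                         # [318]
```

Checking the original residue before substituting catches numbering mistakes early, which matters when positions refer to the full-length sequence but the protein at hand is truncated.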

Preferred improved Cry9C proteins also include Cry9C proteins having also the arginine amino acid at position 164 in SEQ ID No. 2 altered into another amino acid, particularly alanine or lysine, to enhance stability upon protease, particularly trypsin, cleavage.

A preferred Cry9C protein for the control of Ostrinia nubilalis insects in accordance with this invention is a protein comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658 wherein an amino acid in at least one of the following amino acid positions in SEQ ID No. 2 has been replaced by another amino acid: 325, 364, 418, 421, 485, and 488. A particularly preferred improved Cry9C protein for the control of Ostrinia nubilalis insects is a protein comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658 wherein the amino acids in at least position 364 or at least in positions 364 and 488 of SEQ ID No. 2 are replaced by another amino acid, particularly alanine.

A preferred Cry9C protein for the control of Heliothis virescens insects in accordance with this invention is a protein comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658 wherein an amino acid in at least one of the following amino acid positions in SEQ ID No. 2 has been replaced by another amino acid: 313, 316, 317, 318, 319, 321, 323, 325, 329, 330, 368, 369, 418, 420, 421, 422, 480, 481, 483, 484, 485, 487, 488, 490 and 492, particularly at least one of the following amino acid positions: 321, 325, 329, 418, 420, and 480. A particularly preferred improved Cry9C protein for the control of Heliothis virescens insects is a protein comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658 wherein the amino acids in at least one of the amino acid positions 321 and 329 of SEQ ID No. 2 are replaced by another amino acid, particularly alanine.

A preferred Cry9C protein for the control of Diatraea grandiosella insects in accordance with this invention is a protein comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658 wherein an amino acid in at least one of the following amino acid positions in SEQ ID No. 2 has been replaced by another amino acid: 316, 317, 319, 321, 325, 330, 369, 421, 422, 480, 483, 484, 485, 487, 488, 490, and 491; particularly at least one of the following amino acid positions: 480, 484, 485, 487, and 490. A particularly preferred improved Cry9C protein for the control of Diatraea grandiosella insects is a protein comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658 wherein the amino acids in at least one of the amino acid positions 316, 317, 319, 321, 330, 369 and 422 of SEQ ID No. 2 are replaced by another amino acid, particularly alanine or valine (for 317).
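For orientation, the "particularly preferred" positions named in the three preceding paragraphs can be collected in a small lookup structure. The Python sketch below only restates positions given in the text; the dictionary and function names are hypothetical.

```python
# Positions named in the text as particularly preferred for each species.
PARTICULARLY_PREFERRED = {
    "Ostrinia nubilalis": {364, 488},
    "Heliothis virescens": {321, 329},
    "Diatraea grandiosella": {316, 317, 319, 321, 330, 369, 422},
}


def species_for_position(position):
    """Return the species (sorted by name) for which `position` is listed."""
    return sorted(
        species
        for species, positions in PARTICULARLY_PREFERRED.items()
        if position in positions
    )


def shared_positions(species_a, species_b):
    """Return the positions listed for both species."""
    return sorted(PARTICULARLY_PREFERRED[species_a] & PARTICULARLY_PREFERRED[species_b])


print(species_for_position(321))
# ['Diatraea grandiosella', 'Heliothis virescens']
print(shared_positions("Ostrinia nubilalis", "Diatraea grandiosella"))
# []
```

The union of the three sets is the list of ten positions given in the Summary (316, 317, 319, 321, 329, 330, 364, 369, 422 and 488).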

By using DNA sequences encoding improved Cry9C proteins in accordance with this invention, improved toxicity to a selected insect species can be obtained upon expression of such DNA in a transgenic plant.

A "cry9C gene", as used herein, is a DNA sequence comprising a DNA encoding a Cry9C protein (a coding region), and includes necessary regulatory sequences so that a Cry9C protein can be expressed in a cell, preferably a plant or bacterial cell. A cry9C gene does not necessarily need to be expressed everywhere at all times; expression can be periodic (e.g., at certain stages of development in a plant) and/or can be spatially restricted (e.g., in certain cells or tissues in a plant), mainly depending on the activity of regulatory elements provided in the chimeric gene or in the site of insertion in the plant genome. A cry9C gene can be naturally occurring or can be a hybrid or synthetic DNA, and the regulatory elements can be of prokaryotic or eukaryotic origin.

The "modified cry9C gene", as used herein, is a DNA sequence comprising a DNA encoding a modified Cry9C protein (a modified coding region), and includes necessary regulatory sequences so that a Cry9C protein can be expressed in a cell, preferably a plant or bacterial cell. An example of a modified cry9C coding region is the cry9C coding region of SEQ ID No. 3 wherein the valine codon at nucleotide positions 844-846 of SEQ ID No. 3 has been replaced by an alanine codon.

"Substantial sequence homology" to a DNA sequence, as used herein, refers to DNA sequences differing in some, most or all of their codons from another DNA sequence but encoding the same or substantially the same protein. Indeed, because of the degeneracy of the genetic code, the codon usage of a particular DNA coding region can be substantially modified, so as to more closely resemble the codon usage of the genes in the host cell, without changing the encoded protein. Changing the codon usage of a DNA coding region to that of the host cell has been described to be desired for gene expression in foreign hosts (Bennetzen & Hall, 1982, J. Biol. Chem. 257, 3026-3031; Itakura, 1977, Science 198, 1056-1063). Codon usage tables are available in the literature (Wada et al., 1990, Nucl. Acids Res. 18, 2367-2411; Murray et al., 1989, Nucl. Acids Res. 17(2), 477-498) and in the major DNA sequence databanks (e.g., at EMBL in Heidelberg, Germany). Accordingly, recombinant or synthetic DNA sequences can be constructed so that the same or substantially the same proteins with substantially the same insecticidal activity are produced (Koziel et al., 1993, Bio/technology 11, 194-200; Perlak et al., 1993, Plant Mol. Biol. 22, 313-321). A modified cry9C gene has all appropriate control regions so that the modified Cry9C protein can be expressed in a host cell, e.g. for expression in plants, a plant-expressible promoter and a 3' termination and polyadenylation region active in plants.
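The idea that codon usage can be changed without changing the encoded protein can be illustrated with a short Python sketch. The standard genetic code is used; the `PREFERRED` table is a hypothetical placeholder and does not represent measured codon usage for any host, and the input DNA is a toy coding region rather than any part of the cry9C gene.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical host preferences, for illustration only; a real table would
# come from published codon usage data for the chosen host.
PREFERRED = {"A": "GCC", "V": "GTG", "L": "CTG", "P": "CCC", "S": "AGC"}


def codons(dna):
    """Split a coding region into codons."""
    if len(dna) % 3:
        raise ValueError("coding region length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate a coding region into a one-letter protein string."""
    return "".join(CODON_TABLE[codon] for codon in codons(dna))


def recode(dna, preferred=PREFERRED):
    """Replace each codon by the preferred synonymous codon, if one is listed."""
    return "".join(preferred.get(CODON_TABLE[codon], codon) for codon in codons(dna))


original = "CCAGCTGTTTTATCA"  # toy coding region encoding P A V L S
recoded = recode(original)
print(recoded)                # CCCGCCGTGCTGAGC
assert translate(recoded) == translate(original) == "PAVLS"
```

Codons for amino acids missing from the preference table (and stop codons) are left unchanged, so the translation is preserved by construction.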

A "chimeric improved cry9C gene", as used herein, refers to a chimeric gene comprising a DNA sequence encoding the improved Cry9C protein inserted in between controlling elements of different origin, e.g. a DNA sequence encoding the improved Cry9C protein under the control of a promoter transcribing the DNA in the plant cell, and fused to 3' transcription termination sequences active in plant cells.

Protection of a plant, preferably a corn or cotton plant, against an insect species which is known to feed on said plant is preferably accomplished by expressing an improved Cry9C protein in the cells of the plant. This is preferably accomplished by expressing a chimeric improved cry9C gene encoding such an improved Cry9C protein in the cells of a plant, preferably a corn or cotton plant. An improved Cry9C protein of this invention preferably only has a small number, particularly less than 20, more particularly less than 15, preferably less than amino acids replaced by other amino acids as compared to the Cry9C protein, preferably as compared to the region from between amino acid positions 1 and 45 to amino acid position 658 of the Cry9C protein of SEQ ID No. 2. A significant increase in toxicity can already be obtained by replacing only 1 amino acid, but it is preferred that more than one amino acid is changed to improve toxicity.

The following steps are followed to construct the new modified Cry9C proteins: amino acids in domain II of the Cry9C protein from amino acid positions 313-334, 358-369, 418-425, and 480-492 were chosen for modification, using alanine-scanning mutagenesis (Cunningham & Wells, 1989, Science 244, 1081-85). Where the original residue is alanine, it is replaced by valine. These regions occur at positions corresponding to the solvent-exposed positions in the loop between beta-strands 1 and 2 (comprising alpha-helix 8), in loop 1 (located between beta-strands 2 and 3), in loop 2 (located between beta-strands 6 and 7) and in loop 3 (located between beta-strands 10 and 11) in the three-dimensional model of the Cry3A protein (Li et al., 1991, supra). To discount any observed lower toxicity of a modified Cry9C protein which is due to misfolding or structural distortion, the structural stability of mutant ICPs can be analysed by a variety of methods including toxicity to another target insect, crystal formation, solubilization, monoclonal antibody binding analysis, protease resistance, fluorometric monitoring of unfolding and circular dichroism spectrum analysis. In the case of structural distortion, it is impossible to determine the functional role of this position by alanine replacement. However, a more conservative amino acid substitution may yield a correctly folded mutant protein which allows the functional role of this position to be determined.
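The alanine scan over the four regions can be enumerated programmatically. The following Python sketch assumes 1-based positions in the reference sequence and uses a placeholder sequence, because SEQ ID No. 2 is not reproduced in this section; the function name `alanine_scan` is hypothetical.

```python
REGIONS = [(313, 334), (358, 369), (418, 425), (480, 492)]  # from the text


def alanine_scan(sequence, regions=REGIONS, first_position=1):
    """Return substitution labels for an alanine scan over the given regions.

    Each residue is replaced by alanine; a residue that is already alanine
    is replaced by valine instead.
    """
    labels = []
    for start, end in regions:
        for position in range(start, end + 1):
            original = sequence[position - first_position]
            replacement = "V" if original == "A" else "A"
            labels.append(f"{original}{position}{replacement}")
    return labels


# Placeholder sequence (NOT SEQ ID No. 2): a repeating pattern long enough
# to cover positions 1-658.
toy_sequence = ("MASTVLKPGY" * 66)[:658]

mutants = alanine_scan(toy_sequence)
print(len(mutants))  # 55
print(mutants[:3])   # ['S313A', 'T314A', 'V315A']
```

The four regions cover 22 + 12 + 8 + 13 = 55 positions, so a complete scan yields 55 single mutants regardless of the actual sequence.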

The amino acid positions, identified above, which yield modified proteins with significantly decreased toxicity ("down-mutants") are randomized. This means that a set of 20 different mutants, representing each type of amino acid, is generated for each position of interest (the original amino acid and the alanine substitution function as a control). This method is further referred to as "amino acid randomization".
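A minimal Python sketch of "amino acid randomization" at a single position is given below. It generates one variant per standard amino acid; the entry for the original residue is the unchanged sequence, which serves as a control. The function name and the toy fragment (residues numbered 316-321, with glycine placeholders) are assumptions made for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def randomize_position(sequence, position, first_position=1):
    """Return {amino_acid: variant_sequence} for all 20 amino acids at one position.

    The entry for the original residue equals the unmodified sequence (control).
    """
    index = position - first_position
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    return {aa: sequence[:index] + aa + sequence[index + 1:] for aa in AMINO_ACIDS}


toy = "PAGVGL"  # placeholder residues numbered 316-321
variants = randomize_position(toy, 319, first_position=316)
print(len(variants))         # 20
print(variants["V"] == toy)  # True (original residue, control)
print(variants["A"])         # PAGAGL (alanine substitution, control)
```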

Such mutants may be generated by a variety of methods, e.g. following the PCR overlap extension method (Ho et al., 1989, Gene 77, 51-59). These mutant proteins are then tested in toxicity assays on the target insect. Mutants at each position which are more toxic (i.e., yield higher mortality) than the wild-type protein are selected. Such mutants with improved toxicity are termed "up-mutants". Alternatively, it is also possible to select potential up-mutants on the basis of increased reversible binding, which can be measured following the procedures of Van Rie et al. (1990, Appl. Environm. Microbiol. 56, 1378-1385) or Liang et al. (1995, J. Biol. Chem. 270, 24719-24724), which are incorporated herein by reference.
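The selection of up-mutants by comparison with the wild-type protein can be sketched as below. This is a simplified illustration: it compares mortality values directly and ignores statistical significance, which a real bioassay analysis would need to address. The labels and mortality values are invented placeholders, not measurements from this document.

```python
def classify_mutants(mortality, wild_type="wild type"):
    """Split mutants into up- and down-mutants relative to the wild-type protein.

    `mortality` maps a protein label to the fraction of insects killed at one
    common dose. Mutants equal to the wild type fall into neither group.
    """
    reference = mortality[wild_type]
    up = sorted(k for k, v in mortality.items() if k != wild_type and v > reference)
    down = sorted(k for k, v in mortality.items() if k != wild_type and v < reference)
    return up, down


# Hypothetical placeholder values, for illustration only.
example = {"wild type": 0.50, "mutant_1": 0.80, "mutant_2": 0.20, "mutant_3": 0.50}

up_mutants, down_mutants = classify_mutants(example)
print(up_mutants)    # ['mutant_1']
print(down_mutants)  # ['mutant_2']
```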

All or some of the "up-mutant" amino acids, identified in step 2, are combined in a single modified protein. According to additivity principles, mutations in noninteracting parts of a protein should combine to give simple additive changes in the free energy of binding (Lowman and Wells, 1993, J. Mol. Biol., 234, 564-578).

Increases in toxicity are thus accumulated by combining several single mutants into one multiple mutant. Finally a modified protein with improved toxicity is designed, which comprises some or all, preferably all, of the up-mutant amino acids previously identified.
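The additivity principle invoked above can be written compactly. The notation below is an assumption introduced for illustration (it is not taken from the cited reference): $\Delta\Delta G_i$ denotes the change in binding free energy caused by single mutation $i$ relative to the wild-type protein, and $n$ is the number of non-interacting mutations combined.

```latex
\Delta\Delta G_{\text{multiple}} \;\approx\; \sum_{i=1}^{n} \Delta\Delta G_{i},
\qquad
\Delta\Delta G_{i} \;=\; \Delta G_{\text{mutant } i} - \Delta G_{\text{wild type}}
```

The approximation is expected to hold only when the mutated positions do not interact; interacting positions can give non-additive effects.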

In accordance with this invention, amino acids of domain II of a Cry9C protein, located at the protruding regions of domain II are chosen for modification. By "protruding regions of domain II", as used herein, are meant the solvent-exposed regions organized in loops, alpha helices or beta-strands which are protruding from domain II and are located at or towards the apex of the molecule.

This invention is particularly suited for improving the toxicity to an insect species for which the Cry9C protein has a rather weak toxicity. The toxicity of this improved Cry9C protein can be increased by combining amino acid mutations in the protein, each yielding an increased toxicity when compared to the amino acid present in the native Cry9C protein. Insect species for which improved Cry9C proteins can be made also include Spodoptera frugiperda, Heliothis zea, Heliothis armigera, and Agrotis ipsilon. Also, this invention is suited to increase toxicity of a Cry9C protein or its protease-resistant variant to one insect species and to decrease toxicity of the same protein to another insect species by making the proper amino acid substitutions in the protein. This may be advantageous, to limit the likelihood of insect resistance occurrence to the protein in a particular insect species.

An insecticidally effective part of the modified cry9C gene of this invention, encoding an insecticidally effective portion of the modified Cry9C protein, can be made in a conventional manner. An "insecticidally effective part" of the modified cry9C gene refers to a gene comprising a DNA coding region encoding a polypeptide with fewer amino acids than the full length modified Cry9C protein but that still retains toxicity to insects. A preferred insecticidally effective part of the Cry9C protein is the part from amino acid position 1 or 44 to amino acid position 658 in SEQ ID No. 2.

In order to express all or an insecticidally effective part of the improved cry9C gene in E. coli, in Bt strains and in plants, suitable restriction sites can be introduced, flanking each gene or gene part. This can be done by site-directed mutagenesis, using well-known procedures (Stanssens et al., 1989, Nucl. Acids Res. 12, 4441-4454; White et al., 1989, Trends in Genet. 5, 185-189).

In order to improve expression in foreign host cells such as plant cells, it may be preferred to alter the improved cry9C coding region or its insecticidally effective part to form an equivalent, artificial improved cry9C coding region. Expression is improved by selectively inactivating certain cryptic regulatory or processing elements present in the native sequence as described in PCT publications WO 91/16432 and WO 93/09218. This can be done by site-directed mutagenesis or site-directed intron insertion (WO 93/09218), or by introducing overall changes to the codon usage, e.g., adapting the codon usage to that most preferred by the host organism (publication of European patent application number 0 385 962, EP 0 359 472, publication of PCT patent application WO 93/07278, Murray et al., 1989, supra) without significantly changing, preferably without changing, the encoded amino acid sequence. Small modifications to a DNA sequence such as described above can be routinely made by PCR-mediated mutagenesis (Ho et al., 1989, supra; White et al., 1989, supra). For major changes to the DNA sequence, DNA synthesis methods are available in the art (Davies et al., 1991, Society for Applied Bacteriology, Technical Series 28, pp. 351-359). For obtaining enhanced expression in monocot plants such as corn, a monocot intron can be added to the chimeric improved cry9C gene (Callis et al., 1987, Genes & Development 1, 1183-1200; PCT publication WO 93/07278). Another preferred embodiment of this invention is the expression of the improved Cry9C proteins by the method described in PCT patent publication WO 97/49814, which is incorporated herein by reference.

The chimeric improved cry9C gene can be stably inserted in a conventional manner into the nuclear genome of a single plant cell, and the so-transformed plant cell can be used in a conventional manner to produce a transformed plant that is insect-resistant. Particularly preferred plants in accordance with this invention are corn plants. Corn cells can be stably transformed (e.g., by electroporation) using wounded or enzyme-degraded intact tissues capable of forming compact embryogenic callus (such as corn immature embryos), or the embryogenic callus (such as type I callus in corn) obtained therefrom, as described in PCT patent publication WO 92/09696 or US Patent 5,641,664. Other methods for transformation of corn include the methods by Fromm et al. (1990, Bio/Technology 8, 833-839), Gordon-Kamm et al. (1990, The Plant Cell 2, 603-618) and Ishida et al. (1996, Nature Biotechnology 14, 745-750).

Alternatively, a disarmed Ti plasmid, containing the insecticidally effective chimeric improved cry9C gene, in Agrobacterium tumefaciens can be used to transform the plant cell, preferably the corn or cotton cell, and thereafter, a transformed plant can be regenerated from the transformed plant cell using the procedures described, for example, in EP 0116718, EP 0270822, PCT publication WO 84/02913 and EP 0242246 (which are also incorporated herein by reference), and in Gould et al. (1991, Plant Physiol. 95, 426-434) or Ishida et al. (1996, supra), particularly the method described in PCT publication WO 94/00977. Preferred Ti-plasmid vectors each contain the insecticidally effective chimeric improved cry9C gene between the border sequences, or at least located to the left of the right border sequence, of the T-DNA of the Ti-plasmid. Of course, other types of vectors can be used to transform the plant cell, using procedures such as direct gene transfer (as described, for example, in EP 0233247), pollen-mediated transformation (as described, for example, in EP 0270356, PCT publication WO 85/01856, and US Patent 4,684,611), plant RNA virus-mediated transformation (as described, for example, in EP 0067553 and US Patent 4,407,956), and liposome-mediated transformation (as described, for example, in US Patent 4,536,475).

A resulting transformed plant, such as a transformed corn or cotton plant, can be used in a conventional plant breeding scheme to produce more transformed plants with the same characteristics or to introduce the improved cry9C gene, or an insecticidally effective part thereof, in other varieties of the same or related plant species. Seeds, which are obtained from the transformed plants, contain the chimeric improved cry9C gene or its insecticidally effective part as a stable genomic insert. Cells of the transformed plant can be cultured in a conventional manner to produce the improved Cry9C protein or insecticidally effective portions thereof, which can be recovered for use in conventional insecticide compositions against insects, particularly lepidopteran insects (US Patent 5,254,799). Preferred plants in accordance with this invention, besides corn and cotton, include rice, plants of the genus Brassica such as oilseed rape, cauliflower and broccoli, and also soybean, tomato, tobacco, potato, eggplant, beet, oat, pepper, gladiolus, dahlia, chrysanthemum, sorghum, and garden peas.

The improved cry9C coding region or its insecticidally effective part is inserted in a plant cell genome so that the inserted coding region is downstream of, and under the control of, a promoter which can direct the expression of the gene part in the plant cell. This is preferably accomplished by inserting the chimeric improved cry9C gene or its insecticidally effective part in the plant cell genome. Preferred promoters include: the strong constitutive 35S promoters (the "35S promoters") of the cauliflower mosaic virus of isolates CM 1841 (Gardner et al., 1981, Nucleic Acids Research 9, 2871-2887), CabbB-S (Franck et al., 1980, Cell 21, 285-294) and CabbB-JI (Hull and Howell, 1987, Virology 86, 482-493); the ubiquitin promoter (EP 0342926), and the TR1' promoter and the TR2' promoter which drive the expression of the 1' and 2' genes, respectively, of the T-DNA (Velten et al., 1984, EMBO J. 3, 2723-2730). Alternatively, a promoter can be utilized which is not constitutive but rather is specific for one or more tissues or organs of the plant, preferably leaf and stem tissue, whereby the inserted chimeric improved cry9C gene or its insecticidally effective part is expressed only in cells of the specific tissue(s) or organ(s). Another alternative is to use a promoter whose expression is inducible by insect feeding or by chemical factors. Known wound-induced promoters inducing systemic expression of their gene product throughout the plant are also of particular interest.

The improved cry9C coding region, or its insecticidally effective part, is inserted in the plant genome so that the inserted coding region is upstream of suitable 3' end transcription regulation signals (i.e., transcript termination and polyadenylation signals). Preferred polyadenylation and transcript formation signals include those of the 35S gene (Mogen et al., 1990, The Plant Cell 2, 1261-1272), the octopine synthase gene (Gielen et al., 1984, EMBO J. 3, 835-845) and the T-DNA gene 7 (Velten and Schell, 1985, Nucl. Acids Res. 13, 6981-6998), which act as 3'-untranslated DNA sequences in transformed plant cells.
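The arrangement described in the two preceding paragraphs (promoter upstream, coding region in the middle, 3' end signals downstream) can be pictured with a small Python sketch. The class name, the checks and the very short sequences are hypothetical stand-ins chosen for illustration; they are not the actual 35S promoter, cry9C coding region or 3' end sequences.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass(frozen=True)
class ChimericGene:
    """Hypothetical container for the three parts of a chimeric gene."""
    promoter: str          # 5' regulatory region
    coding_region: str     # open reading frame, ATG ... stop codon
    three_prime_end: str   # termination and polyadenylation signals

    def assemble(self) -> str:
        """Return promoter + coding region + 3' end, in 5'-to-3' order."""
        cds = self.coding_region.upper()
        if len(cds) % 3 != 0:
            raise ValueError("coding region length is not a multiple of 3")
        if not cds.startswith("ATG"):
            raise ValueError("coding region does not start with ATG")
        if cds[-3:] not in STOP_CODONS:
            raise ValueError("coding region does not end with a stop codon")
        return self.promoter.upper() + cds + self.three_prime_end.upper()


# Placeholder sequences only; real elements are hundreds of bases long.
gene = ChimericGene(promoter="ttgaca",
                    coding_region="ATGAACAGGTGA",
                    three_prime_end="aataaa")
construct = gene.assemble()
print(construct)
```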

The chimeric improved cry9C gene, or its insecticidally effective gene part, can optionally be inserted in the plant genome as a hybrid gene (EP 0 193 259; Vaeck et al., 1987, Nature 327, 33-37) under the control of the same promoter as the coding region of a selectable marker gene, such as the coding region of the neo gene (EP 0 242 236) encoding kanamycin resistance, so that the plant expresses a fusion protein.

Preferably, the improved cry9C gene is expressed in a plant in combination with another insect control protein, another Bt-derived crystal protein or an insecticidal fragment thereof, particularly a Cry1Ab- or Cry1B-type protein, to prevent or delay the occurrence of insect resistance development (EP 0 408 403).

All or part of the improved cry9C coding region can also be used to transform bacteria, such as a B. thuringiensis which produces other insecticidal toxins (Lereclus et al., 1992, Bio/Technology 10, 418-421; Gelernter & Schwab, 1993, In Bacillus thuringiensis, An Environmental Biopesticide: Theory and Practice, pp. 89-104, eds. Entwistle, Cory, Bailey, M.J. and Higgs, John Wiley & Sons Ltd.). Thereby, a transformed Bt strain is produced which is useful for combating a wide spectrum of insect pests or for combating insects in such a manner that insect resistance development is prevented or delayed (EP 0 408 403). Preferred promoter and 3' termination and polyadenylation sequences for the chimeric improved cry9C gene are derived from Bacillus thuringiensis genes, such as the native ICP genes.

Alternatively, the improved coding region of the invention can be inserted and expressed in endophytic and/or root-colonizing bacteria, such as bacteria of the genus Pseudomonas or Clavibacter, under the control of a Bt ICP gene promoter and 3' termination sequences. Successful transfer and expression of ICP genes into such bacteria has been described by Stock et al. (1990, Can. J. Microbiol. 36, 879-884), Dimock et al. (1989, In Biotechnology, Biopesticides and Novel Plant Pest Resistance Management, eds. Roberts, D.W. and Granados, pp. 88-92, Boyce Thompson Institute for Plant Research, Ithaca, New York), and Waalwijk et al. (1991, FEMS Microbiol. Lett. 77, 257-264). Transformation of bacteria with all or part of the improved cry9C coding region of the invention, incorporated in a suitable cloning vehicle, can be carried out in a conventional manner, preferably using conventional electroporation techniques as described in Mahillon et al. (1989, FEMS Microbiol. Letters 60, 205-210), in PCT patent publication WO 90/06999, Chassy et al. (1988, Trends Biotechnol. 6, 303-309) or other methods, as described by Lereclus et al. (1992, Bio/Technology 10, 418).

The improved Cry9C-producing strain can also be transformed with all or an insecticidally effective part of one or more DNA sequences encoding a Bt protein or an insecticidally effective part thereof, such as: a DNA encoding the Bt2 or Cry1Ab protein (US patent 5,254,799; EP 0 193 259) or the Bt109P or Cry3C protein (PCT publication WO 91/16433), or another DNA coding for an anti-lepidoptera or an anti-Coleoptera protein. Thereby, a transformed Bt strain can be produced which is useful for combating an even greater variety of insect pests (e.g., Coleoptera and/or additional lepidoptera) or for preventing or delaying the development of insect resistance.

For the purpose of combating insects by contacting them with the improved Cry9C protein, e.g. in the form of transformed plants or insecticidal formulations, any DNA sequence encoding any of the above described improved Cry9C proteins, can be used.

The following Examples are offered by way of illustration and not by way of limitation. The sequence listing referred to in the description and the Examples is as follows:

SEQUENCE LISTING

SEQ ID No. 1: Nucleotide sequence of the Bacillus thuringiensis cry9C gene, showing the coding region and flanking 5' and 3' regions.

SEQ ID No. 2: Amino acid sequence of the full length Bacillus thuringiensis Cry9C protein.

SEQ ID No. 3: Nucleotide sequence of a codon-optimized DNA sequence encoding a truncated Cry9C protein wherein the arginine at amino acid position 123 (corresponding to amino acid position 164 in the protein of SEQ ID No. 2) has been replaced by lysine.

SEQ ID No. 4: Amino acid sequence of the modified Cry9C protein encoded by the DNA of SEQ ID No. 3.

Unless otherwise stated in the Examples, all general materials and methods, including procedures for making and manipulating recombinant DNA are carried out by the standardized procedures as described in volumes 1 and 2 of Ausubel et al., Current Protocols in Molecular Biology, Current Protocols, USA (1994), in Plant Molecular Biology Labfax (1993, by R.D.D. Croy, jointly published by BIOS Scientific publications Ltd. UK and Blackwell Scientific Publications, UK) and Sambrook et al., Molecular Cloning A Laboratory Manual, Second Ed., Cold Spring Harbor Laboratory Press, NY (1989).

EXAMPLES:

1. CONSTRUCTION OF MODIFIED CRY9C PROTEINS

Multiple alignments between Bt crystal protein sequences including the sequences of Cry9C, Cry3A and Cry1Aa allowed identification of the amino acids located in the expected binding site of the Cry9C domain II. Using known alignment programs, 52 amino acid positions were identified for amino acid replacement. The amino acids in the Cry9C protein of SEQ ID No. 2 from amino acid positions 313-334, 358-369, 418-425, 480-492 have been identified to correspond to the solvent-accessible regions most likely involved in receptor-binding in the Cry3A protein, and these positions in the Cry9C protein were chosen for amino acid modification. Since alanine substitution does not alter the main chain of a protein and does not impose extreme electrostatic or steric effects, and since it eliminates the side chain beyond the beta carbon, each of the amino acids in these identified regions was changed into alanine, one by one, using splice overlap extension PCR (Ho et al., 1989, supra) on the protease-resistant form of the native cry9C gene wherein the arginine codon at position 164 was replaced by an alanine codon. The codon most preferred in the cry9C native gene for alanine, GCA, was used for these modifications. When the original codon encoded alanine, it was replaced by a valine codon (GTA). The obtained PCR fragments were ligated in pUC19-derived vectors. If not present, suitable unique restriction sites were created in the cry9C DNA. All plasmids containing modified DNA sequences were checked by sequencing the relevant portions and were found to be correctly constructed. The modified cry9C genes were expressed in transformed WK6 cells. Every mutant protein was expressed in these E. coli cells at least twice. Mutants causing problems in expression, probably caused by structural changes in these mutants, were discarded. No gross folding aberrations of the mutants identified to be involved in toxicity (and listed in Table 1) were found, as was evidenced by the similar SDS-PAGE patterns following trypsin cleavage or treatment with midgut juice of the insect larvae of solubilized mutant and Cry9C(R164A) proteins.
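The one-by-one substitution scheme of this Example (alanine at every position, valine where the original residue is already alanine) can be sketched in Python as follows. The function name and the five-residue toy protein are hypothetical; the sketch works on the amino acid level only and does not model the PCR step or the GCA/GTA codon choice.

```python
def alanine_scan(protein, regions, first_position=1):
    """List single-residue mutants for the given inclusive position ranges.

    Each residue is replaced by alanine (A); an original alanine is
    replaced by valine (V) instead. Returns (name, mutant_sequence) pairs,
    with names in the usual form, e.g. 'F313A'.
    """
    mutants = []
    for start, end in regions:
        for pos in range(start, end + 1):
            idx = pos - first_position
            original = protein[idx]
            replacement = "V" if original == "A" else "A"
            mutated = protein[:idx] + replacement + protein[idx + 1:]
            mutants.append((f"{original}{pos}{replacement}", mutated))
    return mutants


# Toy protein, not Cry9C: scan positions 2 to 4.
toy = "MFPAN"
for name, seq in alanine_scan(toy, [(2, 4)]):
    print(name, seq)
```

For the Cry9C protein the `regions` argument would hold the four position ranges named above, and `protein` the sequence of SEQ ID No. 2.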

2. INSECT TOXICITY OF THE MODIFIED CRY9C PROTEINS

Bio assays on the modified Cry9C proteins obtained in Example 1 were carried out with first instar larvae of the Southwestern corn borer, Diatraea grandiosella (family Pyralidae); the European corn borer, Ostrinia nubilalis (family Pyralidae); and the tobacco budworm, Heliothis virescens (family Noctuidae). A dilution series of each protein was surface-layered on the artificial diet to determine the LC50 value. The artificial diet consisted of: agar (20 g), water (1,000 ml), corn flour (96 g, ICN Biochemicals), yeast (30 g), wheat germs (64 g, ICN Biochemicals), wesson salt (7.5 g, ICN Biochemicals), casein (15 g), sorbic acid (2 g), aureomycin (0.3 g), nipagin (1 g), wheat germ oil (4 ml), sucrose (15 g), cholesterol (1 g), ascorbic acid (3.5 g), Vanderzand modified vitamin mix (12 g, ICN Biochemicals).

Larvae were placed on the diet in multi-well plates, 1 larva per well (2 for Ostrinia nubilalis). For each dilution, 24 larvae were tested, and dead and living larvae were counted after 5 days. Prior to application, the mutant proteins were digested with trypsin to release the toxin fragments. For each mutant protein, the assays were repeated at least 5 times, using two different protein preparations. As control protein, the trypsin-digested Cry9C(R164A) protein was used. The Cry9C(R164A) protein has the amino acid sequence of SEQ ID No. 2 wherein the arginine at position 164 was replaced by alanine. This protein was found to be more stable than the wild-type Cry9C toxin while retaining its toxicity to the test insects (see PCT patent publication WO 94/24264). The LC50 values were calculated with the POLO program, which is based on probit analysis (POLO-PC, LeOra Software, 1119 Shattuck Ave., Berkeley, California 94707). The results of these assays for those protein mutants which gave an LC50 value that is significantly different from that of the control protein in repeated bio assays are summarized in Table 1. It is clear that, for each of the tested insects, substitution of certain positions in the Cry9C protein by alanine increases toxicity.

Binding assays on isolated brush border membrane vesicles of Heliothis virescens and Ostrinia nubilalis performed as described in Van Rie et al. (1990, Appl. Environm. Microbiol. 56, 1378-1385) showed that for all but two of the modified Cry9C proteins with altered toxicity, receptor binding is also altered (i.e., an observed shift in KD value), thus confirming that for most amino acid residues altered toxicity is due to altered receptor binding. Hence, these residues are proper candidates for improvement of toxicity by amino acid randomization at or near the identified critical position.

3. COMPETITION BINDING EXPERIMENTS

The Cry9C(R164A) protein was tested in competition binding assays using the ECL protein biotinylation system (Amersham Life Sciences, Amersham International plc., UK) as described by Lambert et al. (1996, supra) to determine if competition occurred with other Bt toxins in selected insects. For the assays, 3 ng biotinylated Cry9C(R164A) protein was added to 30 pg brush border membrane vesicles in PBS buffer (comprising 0,1 BSA) in the presence of a 300-fold excess of non-biotinylated toxin (homologous competition assays were included in every test as control). Repeated competition tests showed that in both Ostrinia nubilalis and Heliothis virescens brush border membranes, there was no detectable competition in receptor binding between the (activated) Cry9C(R164A) protein and any one of the following (activated) Bt toxins: the Cry1Aa (Schnepf et al., 1985, J. Biol. Chem. 260, 6264-6272), Cry1Ab (Hofte et al., 1986, Eur. J. Biochem. 161, 271-280), Cry1Ac (Adang et al., 1985, Gene 36, 289-300), Cry1B (Brizzard & Whiteley, 1988, Nucl. Acids Res. 16, 4168-4169) and Cry1C (Honee et al., 1988, Nucl. Acids Res. 16, 6240) toxins. Thus, in these insects the Cry9C(R164A) protein binds to a different receptor than these other Bt toxins. In Diatraea grandiosella competition assays, it was found that the Cry9C(R164A) protein does compete for a receptor site with the Cry1B and Cry1C Bt toxins, but does not compete with any one of the Cry1Aa, Cry1Ab, and Cry1Ac toxins.

The same results are found for all three insects when testing the Cry9C protein with the amino acid sequence of SEQ ID No. 2 from amino acids 1-658.

Thus, in all three insects, Cry9C can be used simultaneously with a selected non-competitively binding Bt toxin that has good toxicity to the target insect, in order to prevent or delay insect resistance development. In transgenic corn plants, a particularly interesting combination would be the Cry9C (or its protease-resistant variant) and a Cry1B and/or any of the Cry1A-type toxins for Ostrinia nubilalis control and the Cry9C (or its protease-resistant variant) and any one of the Cry1A-type toxins, preferably a Cry1Ab-type toxin, for D. grandiosella control. For Heliothis virescens control, the Cry9C (or its protease-resistant variant) and any of the Cry1A-type toxins are preferred toxins to be co-expressed.
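The selection logic of this Example can be written down as a small Python lookup: for each insect, the toxins that were tested and did not compete with Cry9C for a receptor site are candidate partners. The competition data are those reported above; the names `TESTED_TOXINS`, `COMPETES_WITH_CRY9C` and `non_competing_partners` are hypothetical, and the sketch does not encode the separate requirement that the partner be sufficiently toxic to the target insect.

```python
# Bt toxins tested against Cry9C(R164A) in the competition binding assays.
TESTED_TOXINS = {"Cry1Aa", "Cry1Ab", "Cry1Ac", "Cry1B", "Cry1C"}

# Toxins found to compete with Cry9C for a receptor site, per insect.
COMPETES_WITH_CRY9C = {
    "Ostrinia nubilalis": set(),
    "Heliothis virescens": set(),
    "Diatraea grandiosella": {"Cry1B", "Cry1C"},
}


def non_competing_partners(insect):
    """Return the tested toxins that do not share a binding site with Cry9C."""
    return sorted(TESTED_TOXINS - COMPETES_WITH_CRY9C[insect])


for insect in COMPETES_WITH_CRY9C:
    print(insect, "->", ", ".join(non_competing_partners(insect)))
```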

4. CONSTRUCTION OF IMPROVED CRY9C PROTEINS

The modified position in every mutant protein of Example 2 giving rise to a significantly decreased or increased toxicity to an insect species is altered to all other amino acids and the toxicity is re-evaluated. The amino acids yielding the highest toxicity at a particular position are combined to form an improved Cry9C protein.

Also the alanine mutants yielding an increase in toxicity (up-mutant amino acid positions) are included in such combinations to form improved Cry9C proteins for the selected insect species. Table 1 indeed shows already two up-mutant proteins for every insect tested. Analysis of all these improved Cry9C proteins in the bio assay shows that combinations of up-mutant amino acid positions can substantially increase toxicity of the Cry9C protein towards selected insect species.
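The "up" and "down" labels used for the mutants compare the LC50 of a mutant with that of the control protein: a lower LC50 means higher toxicity. The Python sketch below shows that comparison. The fold threshold `min_fold` and the LC50 values are hypothetical; in the Examples a difference is only reported when it is statistically significant in repeated bio assays, which a fixed threshold does not capture.

```python
def classify_mutant(lc50_mutant, lc50_control, min_fold=1.5):
    """Label a mutant relative to the control protein.

    'down (Nx)': LC50 about N times higher than the control (less toxic).
    'up (Nx)':   LC50 about N times lower than the control (more toxic).
    '-':         no difference at the chosen (assumed) fold threshold.
    """
    ratio = lc50_mutant / lc50_control
    if ratio >= min_fold:
        return f"down ({ratio:.1f}x)"
    if ratio <= 1.0 / min_fold:
        return f"up ({1.0 / ratio:.1f}x)"
    return "-"


# Hypothetical LC50 values in arbitrary units; control LC50 = 10.
print(classify_mutant(20.0, 10.0))
print(classify_mutant(5.0, 10.0))
print(classify_mutant(11.0, 10.0))
```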

5. GENE CONSTRUCTION AND PLANT TRANSFORMATION

A modified DNA sequence encoding a truncated Cry9C(R164K) protein for expression in corn and cotton plants is shown in SEQ ID No. 3. This DNA sequence has an optimized codon usage for plants and encodes an N- and C-terminally truncated Cry9C protein wherein an arginine amino acid has been replaced by a lysine (at position 123 in SEQ ID No. 4). Based on this DNA sequence, DNA sequences are made encoding the above improved Cry9C proteins and comprising amino acids 1 to 666 of the Cry9C(R164K) protein. Preferred codons to encode the amino acid replacements in the improved Cry9C proteins are those which are most preferred by the plant host (see Murray, 1989, supra). A chimeric improved cry9C gene comprising the 35S promoter and 35S 3' transcription termination and polyadenylation signal is constructed by routine molecular biology techniques as described in the detailed description.

Corn cells are stably transformed by either Agrobacterium-mediated transformation (Ishida et al., 1996, supra and U.S. Patent No. 5,591,616) or by electroporation using wounded and enzyme-degraded embryogenic callus, as described in WO 92/09696 or US Patent 5,641,664 (incorporated herein by reference). The resulting transformed cells are selected by means of the incorporated selectable marker gene, grown into plants and tested for susceptibility towards insects. Corn plants expressing a truncated improved Cry9C(R164K) protein wherein the amino acids at positions 364, 488, 319 and 321 have been replaced by alanine show a significantly higher protection from Ostrinia nubilalis and Diatraea grandiosella damage in comparative tests against corn plants expressing a truncated Cry9C(R164K) protein. A positive correlation is found between the level of expression, as measured by RNA and protein analysis, and the observed insecticidal effect.

Cotton cells are stably transformed by Agrobacterium-mediated transformation (Umbeck et al., 1987, Bio/Technology 5, 263-266; US Patent 5,004,863, incorporated herein by reference). The resulting transformed cells are selected by means of the incorporated selectable marker gene, grown into plants and tested for susceptibility towards insects. Cotton plants expressing the truncated improved Cry9C(R164K, L321A, P329A) protein at levels similar to those of cotton plants expressing the truncated Cry9C(R164K) protein show a significantly higher protection from Heliothis virescens damage. A positive correlation is found between the level of expression, as measured by RNA and protein analysis, and the observed insecticidal effect.
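Designations such as Cry9C(R164K, L321A, P329A) list substitutions as original residue, position, new residue. A minimal Python sketch of applying such a list to a sequence, while checking that the stated original residue is really present at each position, could look as follows. The function name and the toy sequence are hypothetical and are not taken from the sequence listing.

```python
import re

SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_substitutions(protein, names, first_position=1):
    """Apply substitutions such as 'L321A' to a one-letter protein sequence.

    Raises ValueError if a name is malformed or if the residue found at the
    position differs from the original residue stated in the name.
    """
    residues = list(protein)
    for name in names:
        match = SUBSTITUTION.fullmatch(name)
        if match is None:
            raise ValueError(f"malformed substitution: {name!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        idx = pos - first_position
        if not 0 <= idx < len(residues) or residues[idx] != old:
            raise ValueError(f"{name}: residue {old} not found at position {pos}")
        residues[idx] = new
    return "".join(residues)


# Toy sequence, not Cry9C.
improved = apply_substitutions("MKRLP", ["R3K", "L4A", "P5A"])
print(improved)
```

The residue check guards against numbering mix-ups, for example between positions in the full-length protein of SEQ ID No. 2 and positions in a truncated protein.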

The examples and embodiments of this invention described herein are only supplied for illustrative purposes. Many variations and modifications in accordance with the present invention are known to the person skilled in the art and are included in this invention and the scope of the claims. For instance, it is possible to alter, delete or add some nucleotides or amino acids to certain regions of the DNA or protein sequences of the invention without departing from the invention.

All publications (including patent publications) referred to in this application are hereby incorporated by reference, particularly WO 94/05771, WO 94/24264, and Lambert et al. (1996, supra).

Table 1: relative toxicity of modified trypsin-digested Cry9C proteins to different insects when compared with the Cry9C(R164A) trypsin-digested protein. Mutant 'F313A': the Cry9C(R164A) trypsin-digested protein wherein also the phenylalanine at position 313 is replaced by alanine; 'down (2x)': mutant protein with a significantly lower toxicity (LC50 value about 2 times higher than that of the control protein); 'up (2x)': mutant protein with a significantly higher toxicity (LC50 value about 2 times lower than that of the control protein); '-': no difference in toxicity found.

mutant H. virescens O. nubilalis D. grandiosella
F313A down (2x)
P316A up (2x)
A317V up (2x)
N318A down (2-3x)
V319A - up (3x)
L321A up (2x) up (2x)
R323A down (3x)
W325A down (4-5x) down (2x) down (2-3x)
P329A up (2x)
Y330A - down (1.5x) up (2x)
V362A down (3-4x)
S364A - up (2x)
D368A down (2-3x)
Y369A up (2x)
R418A down (16x) down (2x)
A420V down (12x)
L421A down (2x) -
I422A up (2x)
F480A down (5x) - down
Q481A down (3x) -
N483A down (2x)
Q484A down
A485V down (3x) down (2x) down
S487A down (2x) - down
I488A down (2x) up (2-3x) down
N490A down
A491V down (3x)

SEQUENCE LISTING

GENERAL INFORMATION:

APPLICANT:
NAME: PLANT GENETIC SYSTEMS N.V.
STREET: Jozef Plateaustraat 22
CITY: Gent
COUNTRY: Belgium
POSTAL CODE (ZIP): B-9000
TELEPHONE: (32)(9)2358411
TELEFAX: (32)(9)2231923

(ii) TITLE OF INVENTION: Improved Bacillus thuringiensis toxin

(iii) NUMBER OF SEQUENCES: 4

(iv) COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30 (EPO)

(vi) PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 08/884,389
FILING DATE: 27-JUN-1997

INFORMATION FOR SEQ ID NO: 1:

SEQUENCE CHARACTERISTICS:
LENGTH: 4344 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
NAME/KEY:
LOCATION: 668..4141
OTHER INFORMATION: /note= "coding sequence"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GAATTCGAGC TCGGTACCTT TTCAGTGTAT CGTTTCCCTT CCATCAGGTT TTCAAATTGA 60

AAAGCCGAAT GATTTGAAAC TTGTTTACGA TGTAAGTCAT TTGTCTATGA CGAAAGATAC 120
GTGTAAAAAA CGTATTGAGA TTGATGAATG TGGACAAGTA GAAATTGACT TACAAGTATT 180
AAAGATTAAG GGTGTCCTTT CTTTTATCGG AAATTTCTCT ATTGAACCTA TTCTGTGTGA 240
AAACATGTAT ACAACGGTTG ATAGAGATCC GTCTATTTCC TTAAGTTTCC AAGATACGGT 300
ATATGTGGAC CATATTTTAA AATATAGCGT CCAACAACTA CCATATTATG TAATTGATGG 360
TGATCATATT CAAGTACGTG ATTTACAAAT CAAACTGATG AAAGAGAATC CGCAATCTGC 420
TCAAGTATCA GGTTTGTTTT GTTTTGTATA TGAGTAAGAA CCGAAGGTTT GTAAAAAAGA 480
AATAGGAATA AATACTATCC ATTTTTTCAA GAAATATTTT TTTATTAGAA AGGAATCTTT 540
CTTACACGGG AAAATCCTAA GATTGAGAGT AAAGATATAT ATATATAAAT ACAATAAAGA 600

[Nucleotides 601 to 4320 of SEQ ID NO: 1 are not legibly reproduced in this text.]

GAGTACCTTA TAAAGAAAGA ATTC 4344

INFORMATION FOR SEQ ID NO: 2:

SEQUENCE CHARACTERISTICS:
LENGTH: 1157 amino acids
TYPE: amino acid

STRANDEDNESS:

TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: Met Asn Arg Asn Asn Gin Asn Glu Tyr Glu Ile Ile Asp Ala Pro His 1 5 10 4344 Cys Pro Thr Ser Leu Gin Glu Glu Ser 145 Asn Asp Gin Gly Asn Asp Gly Gly Phe Ala Phe 130 Phe Asp Leu Val Cys Ala Glu Arg Ala Leu Phe 115 Ala Asn Thr Asp Pro Pro Ser Asp Ala Leu Asp Tyr Asp Ala Leu Gly Leu Asn 100 Met Arg Arg Asn Val Tyr Arg Asn 165 Phe Val 180 Leu Leu Gin Thr Val 70 Val Thr Gin Gin Gin 150 Leu Asn Ser Asp Asp Val Arg Tyr Pro Leu Ala Ser Asp 25 Asn Met Asn Tyr Lys Asp Tyr Leu Gin Met 40 Asp Ser Tyr Ile Asn Pro Ser Leu Ser Ile 55 Gin Thr Ala Leu Thr Val Val Gly Arg Ile 75 Pro Phe Ser Gly Gin Ile Val Ser Phe Tyr 90 Leu Trp Pro Val Asn Asp Thr Ala Ile Trp 105 110 Val Glu Glu Leu Val Asn Gin Gin Ile Thr 120 125 Ala Leu Ala Arg Leu Gin Gly Leu Gly Asp 135 140 Arg Ser Leu Gin Asn Trp Leu Ala Asp Arg 155 160 Ser Val Val Arg Ala Gin Phe Ile Ala Leu 170 175 Ala Ile Pro Leu Phe Ala Val Asn Gly Gin 185 190 Val Tyr Ala Gin Ala Val Asn Leu His Leu 200 205 195 Leu Leu 210 Leu Lys Asp Ala Ser 215 Leu Phe Gly Glu Trp Gly Phe Thr -29- WO 99/00407 Gin Gly 225 Lys Tyr Leu Arg Arg Glu Tyr Asp 290 Glu Val 305 Leu Cys Glu Asn Leu Thr Tyr Trp 370 Val Gin 385 Asn Pro Phe Arg Val Pro Cys Arg 450 Thr Gly 465 Gin Thr Tyr Val Asn Arg Gly Thr 530 Glu Thr Gly Met 275 Val Tyr Arg Ala Ile 355 Ser Glu Gly Ser Gly 435 Asp Ser Asn Trp Ile 515 Thr Ile Asn Thr 260 Thr Arg Thr Arg Phe 340 Ser Gly Asp Va1 Ala 420 Gly Leu Ser Gin Thr 500 Thr Val Ser Tyr 245 Asn Leu Leu Asp Trp 325 Ile Ser His Ser Asp 405 Leu Leu Tyr Thr Ala 485 Arg Gin Leu rhr 230 Cys Thr Val Tyr Pro 310 Gly Arg Asn Thr Tyr 390 Gly Ile Phe Asp His 470 Gly Arg Leu Lys Tyr Glu Glu Val Pro 295 Ile Thr Pro Arg Leu 375 Gly Thr Gly Asn Thr 455 Arg Ser Asp Pro Gly 535 Tyr rhr Ser Leu 280 Thr Vai Asn Pro Phe 360 Arg Leu Asn Ile Gly 440 Asn Leu Ile Val Leu 520 Pro Asp Trp Trp 265 Asp Gly Phe Pro His 345 Pro Arg Ile Arg Tyr 425 Thr Asp Ser Ala Asp 505 Val Gly Arg Tyr 250 
Leu Va1 Ser Asn Tyr 330 Leu Vai Ser Thr Ile 410 Gly Thr Glu His Asn 490 Leu Lys Phe Gin 235 Asn Arg Vai Asn Pro 315 Asn Phe Ser Tyr Thr 395 Glu Vai Ser Leu Val 475 Ala Asn Ala Thr Leu Thr Tyr I Ala Pro 300 Pro Thr Asp Ser Leu 380 Thr Ser Asn Pro Pro 460 Thr Gly Asn Ser Gly 540 3lu ;iy His Leu 285 ln Ala Phe Arg Asn 365 Asn Arg Thr Arg Ala 445 Pro Phe Ser Thr Ala 525 Gly Leu Leu Gin 270 Phe Leu Asn Ser Leu 350 Phe Asp Ala Ala Ala 430 Asn Asp Phe Val lE 51c Prc G1 PCT/EP98/04033 Thr Ala 240 Asp Arg 255 Phe Arg Pro Tyr Thr Arg Val Gly 320 Glu Leu 335 Asn Ser Met Asp Ser Ala Thr Ile 400 Val Asp 415 Ser Phe Gly Gly Glu Ser Ser Phe 480 Pro Thr 495 Thr Pro Val Ser Ile Leu Arg Thr Thr Asn Thr Phe Gly Thr Leu 555 Arg Vai Thr Val Asn 560 WO 99/00407 PCT/EP98/04033 Ser Pro Leu Thr Gin Gin Tyr Arg Leu Arg Val Arg Phe Ala Ser Thr 565 570 575 Gly Asn Phe Ser Ile Arg Val Leu Arg Gly Gly Val Ser Ile Gly Asp 580 585 590 Val Arg Leu Gly Ser Thr Met Asn Arg Gly Gin Glu Leu Thr Tyr Glu 595 600 605 Ser Phe Phe Thr Arg Glu Phe Thr Thr Thr Gly Pro Phe Asn Pro Pro 610 615 620 Phe Thr Phe Thr Gin Ala Gin Glu Ile Leu Thr Val Asn Ala Glu Gly 625 630 635 640 Val Ser Thr Gly Gly Glu Tyr Tyr Ile Asp Arg Ile Glu Ile Val Pro 645 650 655 Val Asn Pro Ala Arg Glu Ala Glu Glu Asp Leu Glu Ala Ala Lys Lys 660 665 670 Ala Val Ala Ser Leu Phe Thr Arg Thr Arg Asp Gly Leu Gin Val Asn 675 680 685 Val Thr Asp Tyr Gin Val Asp Gin Ala Ala Asn Leu Val Ser Cys Leu 690 695 700 Ser Asp Glu Gin Tyr Gly His Asp Lys Lys Met Leu Leu Glu Ala Val 705 710 715 720 Arg Ala Ala Lys Arg Leu Ser Arg Glu Arg Asn Leu Leu Gin Asp Pro 725 730 735 Asp Phe Asn Thr Ile Asn Ser Thr Glu Glu Asn Gly Trp Lys Ala Ser 740 745 750 Asn Gly Val Thr Ile Ser Glu Gly Gly Pro Phe Phe Lys Gly Arg Ala 755 760 765 Leu Gin Leu Ala Ser Ala Arg Glu Asn Tyr Pro Thr Tyr Ile Tyr Gin 770 775 780 Lys Val Asp Ala Ser Val Leu Lys Pro Tyr Thr Arg Tyr Arg Leu Asp 785 790 795 800 Gly Phe Val Lys Ser Ser Gin Asp Leu Glu Ile Asp Leu Ile His His 805 810 815 His 
Lys Val His Leu Val Lys Asn Val Pro Asp Asn Leu Val Ser Asp 820 825 830 Thr Tyr Ser Asp Gly Ser Cys Ser Gly Ile Asn Arg Cys Asp Glu Gln 835 840 845 His Gln Val Asp Met Gln Leu Asp Ala Glu His His Pro Met Asp Cys 850 855 860 Cys Glu Ala Ala Gln Thr His Glu Phe Ser Ser Tyr Ile Asn Thr Gly 865 870 875 880 Asp Leu Asn Ala Ser Val Asp Gln Gly Ile Trp Val Val Leu Lys Val 885 890 895 Arg Thr Thr Asp Gly Tyr Ala Thr Leu Gly Asn Leu Glu Leu Val Glu 900 905 910 Val Gly Pro Leu Ser Gly Glu Ser Leu Glu Arg Glu Gln Arg Asp Asn 915 920 925 Ala Lys Trp Asn Ala Glu Leu Gly Arg Lys Arg Ala Glu Ile Asp Arg 930 935 940 Val Tyr Leu Ala Ala Lys Gln Ala Ile Asn His Leu Phe Val Asp Tyr 945 950 955 960 Gln Asp Gln Gln Leu Asn Pro Glu Ile Gly Leu Ala Glu Ile Asn Glu 965 970 975 Ala Ser Asn Leu Val Glu Ser Ile Ser Gly Val Tyr Ser Asp Thr Leu 980 985 990 Leu Gln Ile Pro Gly Ile Asn Tyr Glu Ile Tyr Thr Glu Leu Ser Asp 995 1000 1005 Arg Leu Gln Gln Ala Ser Tyr Leu Tyr Thr Ser Arg Asn Ala Val Gln 1010 1015 1020 Asn Gly Asp Phe Asn Ser Gly Leu Asp Ser Trp Asn Thr Thr Met Asp 1025 1030 1035 1040 Ala Ser Val Gln Gln Asp Gly Asn Met His Phe Leu Val Leu Ser His 1045 1050 1055 Trp Asp Ala Gln Val Ser Gln Gln Leu Arg Val Asn Pro Asn Cys Lys 1060 1065 1070 Tyr Val Leu Arg Val Thr Ala Arg Lys Val Gly Gly Gly Asp Gly Tyr 1075 1080 1085 Val Thr Ile Arg Asp Gly Ala His His Gln Glu Thr Leu Thr Phe Asn 1090 1095 1100 Ala Cys Asp Tyr Asp Val Asn Gly Thr Tyr Val Asn Asp Asn Ser Tyr 1105 1110 1115 1120 Ile Thr Glu Glu Val Val Phe Tyr Pro Glu Thr Lys His Met Trp Val 1125 1130 1135 Glu Val Ser Glu Ser Glu Gly Ser Phe Tyr Ile Asp Ser Ile Glu Phe 1140 1145 1150 Ile Glu Thr Gln Glu 1155

INFORMATION FOR SEQ ID NO: 3: SEQUENCE CHARACTERISTICS: LENGTH: 1897 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid DESCRIPTION: /desc "synthetic DNA" (ix) FEATURE: NAME/KEY: LOCATION: 13..1890 OTHER INFORMATION: /note= "coding sequence" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GGTACCAAAA CCATGGCTGA CTACCTGCAG ATGACCGACG AGGACTACAC CGACAGCTAC 60
ATCAACCCCA GCCTGAGCAT CAGCGGTCGC GACGCCGTGC AGACCGCTCT GACCGTGGTG 120
GGTCGCATCC TGGGTGCCCT GGGCGTGCCC TTCAGCGGTC AGATCGTGAG CTTCTACCAG 180
TTCCTGCTGA ACACCCTGTG GCCAGTGAAC GACACCGCCA TCTGGGAAGC TTTCATGCGC 240
CAGGTGGAGG AGCTGGTGAA CCAGCAGATC ACCGAGTTCG CTCGCAACCA GGCCCTGGCT 300
CGCCTGCAGG GCCTGGGCGA CAGCTTCAAC GTGTACCAGC GCAGCCTGCA GAACTGGCTG 360
GCCGACCGCA ACGACACCAA GAACCTGAGC GTGGTGAGGG CCCAGTTCAT CGCCCTGGAC 420
CTGGACTTCG TGAACGCCAT CCCCCTGTTC GCCGTGAACG GCCAGCAGGT GCCCCTGCTG 480
AGCGTGTACG CCCAGGCCGT GAACCTGCAC CTGCTGCTGC TGAAGGATGC ATCCCTGTTC 540
GGCGAGGGCT GGGGCTTCAC CCAGGGCGAG ATCAGCACCT ACTACGACCG CCAGCTCGAG 600
CTGACCGCCA AGTACACCAA CTACTGCGAG ACCTGGTACA ACACCGGTCT GGACCGCCTG 660
AGGGGCACCA ACACCGAGAG CTGGCTGCGC TACCACCAGT TCCGCAGGGA GATGACCCTG 720
GTGGTGCTGG ACGTGGTGGC CCTGTTCCCC TACTACGACG TGCGCCTGTA CCCCACCGGC 780
AGCAACCCCC AGCTGACACG TGAGGTGTAC ACCGACCCCA TCGTGTTCAA CCCACCAGCC 840
AACGTGGGCC TGTGCCGCAG GTGGGGCACC AACCCCTACA ACACCTTCAG CGAGCTGGAG 900
AACGCCTTCA TCAGGCCACC CCACCTGTTC GACCGCCTGA ACAGCCTGAC CATCAGCAGC 960
AATCGATTCC CCGTGAGCAC CAACTTCATG GACTACTGGA GCGGTCACAC CCTGCGCAGG 1020
AGCTACCTGA ACGACAGCGC CGTGCAGGAG GACAGCTACG GCCTGATCAC CACCACCAGG 1080
GCCACCATCA ACCCAGGCGT GGACGGCACC AACCGCATCG AGAGCACCGC TGTGGACTTC 1140
CGCAGCGCTC TGATCGGCAT CTACGGCGTC AACAGGGCCA GCTTCGTGCC AGGTGGCCTG 1200
TTCAACGGCA CCACCAGCCC AGCCAACGGN GGCTGCCGAC ATCTGTACGA CACCAACGAC 1260
GAGCTGCCAC CCGACGAGAC CACCGGCAGC AGCACCCACC GCCTGAGCCA CGTCACCTTC 1320
TTCAGCTTCC AGACCAACCA GGCTGGCAGC ATCGCCAACC CTGGCAGCGT GCCCACCTAC 1380
GTGTGGACCT GGAGGGACGT GGACCTGAAC AACACCATCA CCCCCAACCC CATCACCCAG 1440
CTGCCCCTGC TGAAGGCCAG CGCTCCCGTC AGCGGCACCA CCGTGCTGAA GGGTCCAGGC 1500
TTCACCGGTC GCGGTATACT GCGCAGGACC ACCAACGGCA CCTTCGGCAC CCTGCGCGTG 1560
ACCGTGAATT CCCCACTGAC CCAGCAGTAC CGCCTGCGCG TGCGCTTCGC CAGCACCGGC 1620
AACTTCAGCA TCCGCGTGCT GAGGGGTGGC GTGAGCATCG GCGACGTGCG CCTGGGCAGC 1680
ACCATGAACA GGGGCCAGGA GCTGACCTAC GAGAGCTTCT TCACCCGCGA GTTCACCACC 1740
ACCGGTCCCT TCAACCCACC CTTCACCTTC ACCCAGGCCC AGGAGATCCT GACCGTGAAC 1800
GCCGAGGGCG TGAGCACCGG TGGCGAGTAC TACATCGACC GCATCGAGAT CGTGCCCGTG 1860
AACCCAGCTC GCGAGGCCGA GGAGGACTGA GGCTAGC 1897

INFORMATION FOR SEQ ID NO: 4: SEQUENCE CHARACTERISTICS: LENGTH: 625 amino acids TYPE: amino acid

STRANDEDNESS:

TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: Met Ala Asp Tyr Leu Gin Met Thr Asp Giu Asp Tyr Thr Asp Ser Tyr 1 5 10 1740 1800 1860 1897 Ile Leu Gly Vai Leu Arg Gin Arg Leu 145 Gln Giy Asn Thr Gin Asn Val1 Leu Asn Ala 130 Phe Ala Glu Pro Val1 Ile Asp Asn Gin Trp, 115 Gin Ala Val Gly Ser Val Val Thr Gin Gly 100 Leu Phe Val Asn Trp 180 Leu Gly Ser Ala Gin Leu Ala Ile Asn Leu 165 Gly Ser Arg Phe Ile 70 Ile Gly Asp Ala Gly 150 His Phe Ile Ile Tyr 55 Trp Thr Asp Arg Leu 135 Gin Leu Thr Ser Leu 40 Gin Giu Giu Ser Asn 120 Asp Gin Leu Gin Lys 200 Gly 25 Gly Phe Aia Phe Phe 105 Asp Leu Val Leu Giy 185 Tyr Arg Al a Leu Phe Al a 90 Asn Thr Asp Pro Leu 170 Giu Thr Asp Leu Leu Met 75 Arg Vai Lys Phe Leu 155 Lys Ile Asn Ala Gly Asn Arg Asn Tyr Asn Val 140 Leu Asp Ser Tyr Val Val Thr Gln Gln Gin Leu 125 Asn Ser Ala Thr Cys 205 Gln Pro Leu Val1 Al a Arg 110 Ser Ala Vai Ser Tyr 190 Glu Thr Phe Trp Giu Leu Ser Val Ile Tyr Leu 175 Tyr Thr Ala Ser Pro Glu Ala Leu Val Pro Ala 160 Phe Asp Trp Arg Gin Leu Glu Leu Thr Ala 195 -34- WO -99/00407 PCTIEP98/04033 Tyr Asn Thr Gly Leu Asp Arg Leu Arg Gly Thr Asn Thr Glu Ser Trp 210 215 220 Leu 225 Val Ser Asn Tyr Leu 305 Val Ser Thr Ile Gly 385 Thr Glu His Asn Leu 465 Lys Phe Thr krg Ja1 Asn Pro Asn 290 Phe Ser Tyr Thr Glu 370 Va1 Ser Leu Val Ala 450 Asn Ala Thi Let Tyr Ala I Pro Pro 275 Thr Asp Ser Leu Thr 355 Ser Asn Pro Pro Thr 435 Gly Asn Ser Gly 1 Arg 515 is eu 31n i 260 kla Phe Arg Asn Asn 340 Arg Thr Arg Ala Pro 420 Phe Ser Thr Ala Gly 500 Val .ln Phe 245 Leu Asn Ser Leu Phe 325 Asp Ala Ala Ala Asn 405 Asp Phe Val Ile Pro 485 Gly Thi Phe 230 Pro Thr Val Glu Asn 310 Met Ser Thr Val Ser 390 Gly Glu Ser Pro Thr 470 Val Ile Val krg Iyr krg Gly Leu 295 Ser Asp Ala Ile Asp 375 Phe Gly Ser Phe Thr 455 Pro Ser Let Asr Arg Tyr Glu Leu 280 Glu Leu Tyr Val Asn 360 Phe Va1 Cys Thr Gin 440 STyr Asr Glj Ar 1 Se 52( Glu D Asp Val 265 Cys 2 Asn 2 Thr Trp Gin 4 345 Pro Arg Pro Arg Gly 425 Thr Val Arg Thr Arg 505 Pro 
0 let Tal ,50 yr krg kia Ile Ser 330 Glu Gly Ser Gly Asp 410 Ser Asn Trp Ile Thr 490 Thy Let Thr I 235 Arg Thr 2 Arg Phe Ser 315 Gly Asp Val Ala Gly 395 Leu Ser Gin Thr Thr 475 Val Thr 1 Thr .eu eu ksp rrp Ile 300 Ser His Ser Asp Leu 380 Leu Tyr Thr Ala Arg 460 Gin Leu Asn Glr Val N Tyr I Pro Gly 285 Arg Asn Thr Tyr Gly 365 Ile Phe Asp His Gly 445 Arg Leu Lys Gly Gin 525 Tal ?ro Ile 270 rhr Pro Arg Leu Gly 350 Thr Gly Asn Thr Arg 430 Ser Asp Pro Gly Thr 510 TyT.

Leu 2 Thr 255 Val Asn Pro Phe Arg 335 Leu Asn Ile Gly Asn 415 Leu Ile Val Leu Pro 495 Phe Arg ksp 240 ly Phe Pro His Pro 320 Arg Ile Arg Tyr Thr 400 Asp Ser Ala Asp Val 480 Gly Gly Leu Arg Val Arg Phe Ala Ser Thr Gly Asn Phe Ser Ile Arg Val Leu Arg 530 535 540 Gly Gly Val Ser Ile Gly Asp Val Arg Leu Gly Ser Thr Met Asn Arg 545 550 555 560 Gly Gln Glu Leu Thr Tyr Glu Ser Phe Phe Thr Arg Glu Phe Thr Thr 565 570 575 Thr Gly Pro Phe Asn Pro Pro Phe Thr Phe Thr Gln Ala Gln Glu Ile 580 585 590 Leu Thr Val Asn Ala Glu Gly Val Ser Thr Gly Gly Glu Tyr Tyr Ile 595 600 605 Asp Arg Ile Glu Ile Val Pro Val Asn Pro Ala Arg Glu Ala Glu Glu 610 615 620 Asp 625

"Comprises/comprising" when used in this specification is taken to specify the presence of stated features, integers, steps or components but does not preclude the presence or addition of one or more other features, integers, steps, components or groups thereof.
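As an illustrative aside, the relationship between SEQ ID NO: 3 and SEQ ID NO: 4 can be checked with a short sketch. It assumes the standard genetic code and the feature annotation of SEQ ID NO: 3 (coding sequence at positions 13 to 1890). The names `translate`, `coding_residues` and `FIRST_60` are hypothetical helpers introduced here, not part of the specification, and `FIRST_60` is the first 60 nucleotides of SEQ ID NO: 3 as read from the listing.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, start, end=None):
    """Translate dna from 1-based position start through end (inclusive)."""
    region = dna[start - 1:end]
    usable = len(region) - len(region) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[region[i:i + 3]] for i in range(0, usable, 3))


def coding_residues(start, end):
    """Amino acids encoded by start..end when the final codon is a stop."""
    length = end - start + 1
    if length % 3:
        raise ValueError("coding region is not a whole number of codons")
    return length // 3 - 1


# First 60 nucleotides of SEQ ID NO: 3, as read from the listing (assumption).
FIRST_60 = (
    "GGTACCAAAA" "CCATGGCTGA" "CTACCTGCAG"
    "ATGACCGACG" "AGGACTACAC" "CGACAGCTAC"
)

print(translate(FIRST_60, 13))    # MADYLQMTDEDYTDSY
print(coding_residues(13, 1890))  # 625
```

The translated fragment corresponds to the first sixteen residues listed for SEQ ID NO: 4 (Met Ala Asp Tyr Leu Gln Met Thr Asp Glu Asp Tyr Thr Asp Ser Tyr), and the residue count agrees with the stated length of 625 amino acids.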

Claims

1. A modified Cry9C protein with an improved toxicity to an insect species, comprising the amino acid sequence of SEQ ID No. 2 or an insecticidally-effective fragment thereof, wherein at least one amino acid at the following amino acid positions in SEQ ID No. 2 is replaced by another amino acid: 316, 317, 319, 321, 329, 330, 364, 369, 422, or 488.

2. The modified Cry9C protein of claim 1 with improved toxicity to Ostrinia nubilalis, comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658, wherein at least the amino acids at positions 364 and 488 in SEQ ID No. 2 are replaced by other amino acids.

3. The modified Cry9C protein of claim 1 with improved toxicity to Heliothis virescens, comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658, wherein the amino acid at position 321 or position 329 in SEQ ID No. 2 is replaced by another amino acid.

4. The modified Cry9C protein of claim 1 with improved toxicity to Diatraea grandiosella, comprising the amino acid sequence of SEQ ID No. 2 from amino acid position 1 or 44 to amino acid position 658, wherein the amino acid at any one or all of amino acid positions 316, 317, 319, 321, 330, 369, or 422 in SEQ ID No. 2 is replaced by another amino acid.

5. The modified Cry9C protein of any one of claims 1 to 4 wherein the arginine at position 164 in SEQ ID No. 2 is replaced by another amino acid.

6. The modified Cry9C protein of any one of claims 1 to 4 wherein said at least one amino acid position is replaced by alanine.

7. A DNA sequence encoding the protein of any one of claims 1 to 4.

8. A DNA sequence encoding the protein of claim 5 or 6.

9. A plant, comprising the DNA of claim 7 or 8.

10. A seed, comprising the DNA of claim 7 or 8.

11. The plant of claim 9 which is selected from the group consisting of: corn, cotton, rice, oilseed rape, cauliflower, broccoli, soybean, tomato, tobacco, potato, eggplant, beet, oat, pepper, gladiolus, dahlia, chrysanthemum, sorghum, and garden peas.

12. A method for controlling insects feeding on a plant, comprising expressing the protein of any one of claims 1 to 4 in a plant.

13. A method for controlling insects feeding on a plant, comprising growing the plant of claim 9.

14. A method of obtaining a seed comprising the DNA of claim 7 or 8 comprising inserting said DNA into the genome of a plant and harvesting the seed from said plant.

DATED this 13th day of September 2001

PLANT GENETIC SYSTEMS N.V.

WATERMARK PATENT TRADEMARK ATTORNEYS 290 BURWOOD ROAD HAWTHORN VICTORIA 3122 AUSTRALIA

P16581AU00 KJS:AMT:SLB
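For illustration only, the substitution recited in claims 1 and 6 (replacing the residue at one of the listed positions of SEQ ID No. 2, for example by alanine) can be sketched as a string operation on a one-letter protein sequence. The helper `substitute`, the constant `CLAIM_1_POSITIONS`, the `G364A`-style label and the placeholder sequence `toy` are assumptions introduced here; `toy` is not SEQ ID No. 2.

```python
# Positions recited in claim 1, numbered as in SEQ ID No. 2 (1-based).
CLAIM_1_POSITIONS = frozenset({316, 317, 319, 321, 329, 330, 364, 369, 422, 488})


def substitute(protein, position, new_residue="A"):
    """Return (mutant, label) with the residue at a 1-based position replaced.

    The default replacement is alanine ("A"), as in claim 6.
    """
    if position not in CLAIM_1_POSITIONS:
        raise ValueError(f"position {position} is not recited in claim 1")
    if not 1 <= position <= len(protein):
        raise ValueError("position lies outside the sequence")
    old = protein[position - 1]
    if old == new_residue:
        raise ValueError("the residue must be replaced by another amino acid")
    mutant = protein[:position - 1] + new_residue + protein[position:]
    return mutant, f"{old}{position}{new_residue}"


# Placeholder sequence for demonstration; a real use would pass SEQ ID No. 2.
toy = "G" * 500
mutant, label = substitute(toy, 364)
print(label)  # G364A
```

Positions are kept 1-based to match the numbering used in the claims, so the only index arithmetic is the `position - 1` offset inside the helper.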