AU720423B2

AU720423B2 - Manipulation of cellulose and/or beta-1,4-glucan

Info

Publication number: AU720423B2
Application number: AU31603/97A
Authority: AU
Inventors: Antonio Arioli; Andreas Stefan Betzner; Liangcai Peng; Richard Edward Williamson
Original assignee: Australian National University
Current assignee: Australian National University
Priority date: 1996-06-27
Filing date: 1997-06-24
Publication date: 2000-06-01
Anticipated expiration: 2017-06-24
Also published as: AU3160397A

Description

WO 98/00549 PCT/AU97/00402

"MANIPULATION OF CELLULOSE AND/OR β-1,4-GLUCAN"

The present invention relates generally to isolated genes which encode polypeptides involved in cellulose biosynthesis and transgenic organisms expressing same in sense or antisense orientation, or as ribozymes, co-suppression or gene-targeting molecules. More particularly, the present invention is directed to nucleic acid molecules isolated from Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp. which encode an enzyme which is important in cellulose biosynthesis, in particular the cellulose synthase enzyme and homologues, analogues and derivatives thereof, and to uses of same in the production of transgenic plants expressing altered cellulose biosynthetic properties.

Bibliographic details of the publications referred to by author in this specification are collected at the end of the description. Sequence identity numbers (SEQ ID Nos.) for the nucleotide and amino acid sequences referred to in the specification are defined after the bibliography.

Throughout the specification, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising" will be understood to imply the inclusion of a stated element or integer or group of elements or integers but not the exclusion of any other element or integer or group of elements or integers.

Cellulose, the world's most abundant biopolymer, is the most characteristic component of plant cell walls in so far as it forms much of the structural framework of the cell wall.

Cellulose is comprised of crystalline β-1,4-glucan microfibrils. The crystalline microfibrils are extremely strong and resist enzymic and mechanical degradation, an important factor in determining the nutritional quality, digestibility and palatability of animal and human foodstuffs. As cellulose is also the dominant structural component of industrially-important plant fibres, such as cotton, flax, hemp, jute and the timber crops such as Eucalyptus ssp. and Pinus ssp., amongst others, there is considerable economic benefit to be derived from the manipulation of cellulose content and/or quality in plants. In particular, the production of food and fibre crops with altered cellulose content is a highly desirable objective.

The synthesis of cellulose involves the β-1,4-linkage of glucose monomers, in the form of a nucleoside diphosphoglucose such as UDP-glucose, to a pre-existing cellulose chain, catalysed by the enzyme cellulose synthase.

Several attempts to identify the components of the functional cellulose synthase in plants have failed, because levels of β-1,4-glucan or crystalline cellulose produced in such assays have hitherto been too low to permit enzyme purification for protein sequence determination.

Insufficient homology between bacterial β-1,4-glucan synthase genes and plant cellulose synthase genes has also prevented the use of hybridisation as an approach to isolating the plant homologues of bacterial β-1,4-glucan (cellulose) synthases.

Furthermore, it has not been possible to demonstrate that the cellulose synthase enzyme from plants is the same as, or functionally related to, other purified and characterised enzymes involved in polysaccharide biosynthesis. As a consequence, the cellulose synthase enzyme has not been isolated from plants and, until the present invention, no nucleic acid molecule has been characterised which functionally-encodes a plant cellulose synthase enzyme.

In work leading up to the present invention, the inventors have generated several novel mutant Arabidopsis thaliana plants which are defective in cellulose biosynthesis. The inventors have further isolated a cellulose synthase gene designated RSW1, which is involved in cellulose biosynthesis in Arabidopsis thaliana, and homologous sequences in Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp. The isolated nucleic acid molecules of the present invention provide the means by which cellulose content and structure may be modified in plants to produce a range of useful fibres suitable for specific industrial purposes, for example increased decay resistance of timber and altered digestibility of foodstuffs, amongst others.

Accordingly, one aspect of the present invention provides an isolated nucleic acid molecule comprising a sequence of nucleotides which encodes, or is complementary to a sequence which encodes a polypeptide of the cellulose biosynthetic pathway or a functional homologue, analogue or derivative thereof.

The nucleic acid molecule of the invention may be derived from a prokaryotic source or a eukaryotic source.

Those skilled in the art will be aware that cellulose production requires not only the presence of a catalytic subunit, but also its activation and organisation into arrays which favour the crystallization of glucan chains. This organisation is radically different between bacteria, which possess linear arrays, and higher plants, which possess hexameric clusters or "rosettes", of glucan chains. The correct organisation and activation of the bacterial enzyme may require many factors which are either not known, or alternatively, not known to be present in plant cells, for example specific membrane lipids to impart an active conformation on the enzyme complex or protein, or the bacterial c-di-GMP activation system.

Accordingly, the use of a plant-derived sequence in eukaryotic cells such as plants provides significant advantages compared to the use of bacterially-derived sequences.

Accordingly, the present invention does not extend to known genes encoding the catalytic subunit of Agrobacterium tumefaciens or Acetobacter xylinum or Acetobacter pasteurianus cellulose synthase, or the use of such known bacterial genes and polypeptides to manipulate cellulose.

Preferably, the subject nucleic acid molecule is derived from an eukaryotic organism.

In a more preferred embodiment of the invention, the isolated nucleic acid molecule of the invention encodes a plant cellulose synthase or a catalytic subunit thereof, or a homologue, analogue or derivative thereof.

More preferably, the isolated nucleic acid molecule encodes a plant cellulose synthase polypeptide which is associated with the primary cell wall of a plant cell. In an alternative preferred embodiment, the nucleic acid molecule of the invention encodes a plant cellulose synthase or catalytic subunit thereof which is normally associated with the secondary cell wall of a plant cell.

In a more preferred embodiment, the nucleic acid molecule of the invention is a cDNA molecule, genomic clone, mRNA molecule or a synthetic oligonucleotide molecule.

In a particularly preferred embodiment, the present invention provides an isolated nucleic acid molecule which encodes or is complementary to a nucleic acid molecule which encodes the Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., Brassica ssp., wheat, barley or maize cellulose synthase enzyme or a catalytic subunit thereof or a polypeptide component, homologue, analogue or derivative thereof.

As exemplified herein, the present inventors have identified cellulose biosynthesis genes in maize, wheat, barley, rice, cotton, Brassica ssp. and Eucalyptus ssp., in addition to the specific Arabidopsis thaliana RSW1 gene sequence which has been shown to be particularly useful for altering cellulose and/or β-1,4-glucan and/or starch levels in cells.

Hereinafter the term "polypeptide of the cellulose biosynthetic pathway" or similar term shall be taken to refer to a polypeptide or a protein or a part, homologue, analogue or derivative thereof which is involved in one or more of the biosynthetic steps leading to the production of cellulose or any related β-1,4-glucan polymer in plants. In the present context, a polypeptide of the cellulose biosynthetic pathway shall also be taken to include both an active enzyme which contributes to the biosynthesis of cellulose or any related β-1,4-glucan polymer in plants and a polypeptide component of such an enzyme. As used herein, a polypeptide of the cellulose biosynthetic pathway thus includes cellulose synthase. Those skilled in the art will be aware of other cellulose biosynthetic pathway polypeptides in plants.

The term "related β-1,4-glucan polymer" shall be taken to include any carbohydrate molecule comprised of a primary structure of β-1,4-linked glucose monomers similar to the structure of the components of the cellulose microfibril, wherein the relative arrangement or relative configuration of the glucan chains may differ from their relative configuration in microfibrils of cellulose. As used herein, a related β-1,4-glucan polymer includes those β-1,4-glucan polymers wherein individual β-1,4-glucan microfibrils are arranged in an anti-parallel or some other relative configuration not found in a cellulose molecule of plants and those non-crystalline β-1,4-glucans described as lacking the resistance to extraction and degradation that characterises cellulose microfibrils.

The term "cellulose synthase" shall be taken to refer to a polypeptide which is required to catalyse a β-1,4-glucan linkage to a cellulose microfibril.

Reference herein to "gene" is to be taken in its broadest context and includes: (i) a classical genomic gene consisting of transcriptional and/or translational regulatory sequences and/or a coding region and/or non-translated sequences (i.e. introns, 5'- and 3'-untranslated sequences); or (ii) mRNA or cDNA corresponding to the coding regions (i.e. exons) and 5'- and 3'-untranslated sequences of the gene.

The term "gene" is also used to describe synthetic or fusion molecules encoding all or part of a functional product.

In the present context, the term "cellulose gene" or "cellulose genetic sequence" or similar term shall be taken to refer to any gene as hereinbefore defined which encodes a polypeptide of the cellulose biosynthetic pathway and includes a cellulose synthase gene.

The term "cellulose synthase gene" shall be taken to refer to any cellulose gene which specifically encodes a polypeptide which is a component of a functional enzyme having cellulose synthase activity, i.e. an enzyme which catalyses a β-1,4-glucan linkage to a cellulose microfibril.

Preferred cellulose genes may be derived from a naturally-occurring cellulose gene by standard recombinant techniques. Generally, a cellulose gene may be subjected to mutagenesis to produce single or multiple nucleotide substitutions, deletions and/or additions.

Nucleotide insertional derivatives of the cellulose synthase gene of the present invention include 5' and 3' terminal fusions as well as intra-sequence insertions of single or multiple nucleotides. Insertional nucleotide sequence variants are those in which one or more nucleotides are introduced into a predetermined site in the nucleotide sequence although random insertion is also possible with suitable screening of the resulting product. Deletional variants are characterised by the removal of one or more nucleotides from the sequence.

Substitutional nucleotide variants are those in which at least one nucleotide in the sequence has been removed and a different nucleotide inserted in its place. Such a substitution may be "silent" in that the substitution does not change the amino acid defined by the codon.

Alternatively, substituents are designed to alter one amino acid for another similar acting amino acid, or amino acid of like charge, polarity, or hydrophobicity.
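By way of illustration only, the distinction drawn above between "silent" substitutions and substitutions which exchange one amino acid for a similar-acting amino acid may be sketched as follows. The codon assignments shown are an excerpt of the standard genetic code; the function name classify_substitution and the grouping of "similar" amino acids in SIMILAR_GROUPS are assumptions made for this sketch and are not taken from the specification.

```python
# Excerpt of the standard genetic code (DNA codons); not a complete table.
CODON_TABLE = {
    "CTT": "Leu", "CTC": "Leu", "CTA": "Leu", "CTG": "Leu",
    "ATT": "Ile", "ATC": "Ile", "ATA": "Ile", "ATG": "Met",
    "GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu",
    "AAA": "Lys", "AAG": "Lys",
}

# Hypothetical grouping of amino acids of like charge, polarity or
# hydrophobicity; a real analysis would choose its own scheme.
SIMILAR_GROUPS = [{"Leu", "Ile", "Met"}, {"Asp", "Glu"}, {"Lys"}]


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a single-nucleotide substitution within one codon.

    Raises KeyError if either codon is outside the excerpted table.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    if mutated == codon:
        raise ValueError("new_base must differ from the original base")
    before = CODON_TABLE[codon]
    after = CODON_TABLE[mutated]
    if before == after:
        return "silent"
    if any(before in group and after in group for group in SIMILAR_GROUPS):
        return "conservative"
    return "non-conservative"


if __name__ == "__main__":
    print(classify_substitution("CTT", 2, "C"))  # CTT -> CTC
    print(classify_substitution("CTT", 0, "A"))  # CTT -> ATT
    print(classify_substitution("GAA", 0, "A"))  # GAA -> AAA
```

The sketch treats each codon in isolation and ignores reading-frame context, which is sufficient to show the three outcomes but not to analyse a full coding sequence.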

As used herein, the term "derived from" shall be taken to indicate that a particular integer or group of integers has originated from the species specified, but has not necessarily been obtained directly from the specified source.

For the present purpose, "homologues" of a nucleotide sequence shall be taken to refer to an isolated nucleic acid molecule which is substantially the same as the nucleic acid molecule of the present invention or its complementary nucleotide sequence, notwithstanding the occurrence within said sequence, of one or more nucleotide substitutions, insertions, deletions, or rearrangements.

"Analogues" of a nucleotide sequence set forth herein shall be taken to refer to an isolated nucleic acid molecule which is substantially the same as a nucleic acid molecule of the present invention or its complementary nucleotide sequence, notwithstanding the occurrence of any non-nucleotide constituents not normally present in said isolated nucleic acid molecule, for example carbohydrates, radiochemicals including radionucleotides, reporter molecules such as, but not limited to DIG, alkaline phosphatase or horseradish peroxidase, amongst others.

"Derivatives" of a nucleotide sequence set forth herein shall be taken to refer to any isolated nucleic acid molecule which contains significant sequence similarity to said sequence or a part thereof. Generally, the nucleotide sequence of the present invention may be subjected to mutagenesis to produce single or multiple nucleotide substitutions, deletions and/or insertions.

Nucleotide insertional derivatives of the nucleotide sequence of the present invention include 5' and 3' terminal fusions as well as intra-sequence insertions of single or multiple nucleotides or nucleotide analogues. Insertional nucleotide sequence variants are those in which one or more nucleotides or nucleotide analogues are introduced into a predetermined site in the nucleotide sequence of said sequence, although random insertion is also possible with suitable screening of the resulting product being performed. Deletional variants are characterised by the removal of one or more nucleotides from the nucleotide sequence.

Substitutional nucleotide variants are those in which at least one nucleotide in the sequence has been removed and a different nucleotide or nucleotide analogue inserted in its place.

The present invention extends to the isolated nucleic acid molecule when integrated into the genome of a cell as an addition to the endogenous cellular complement of cellulose synthase genes. The said integrated nucleic acid molecule may, or may not, contain promoter sequences to regulate expression of the subject genetic sequence.

The isolated nucleic acid molecule of the present invention may be introduced into and expressed in any cell, for example a plant cell, fungal cell, insect cell, animal cell, yeast cell or bacterial cell. Those skilled in the art will be aware of any modifications which are required to the codon usage or promoter sequences or other regulatory sequences, in order for expression to occur in such cells.

Another aspect of the present invention is directed to a nucleic acid molecule which comprises a sequence of nucleotides corresponding or complementary to any one or more of the sequences set forth in SEQ ID Nos:1, 3, 4, 5, 7, 9, 11, or 13, or having at least about 40%, more preferably at least about 55%, still more preferably at least about 65%, yet still more preferably at least about 75-80% and even still more preferably at least about 85-95% nucleotide similarity to all, or a part thereof.

According to this aspect of the invention, said nucleic acid molecule encodes, or is complementary to a nucleotide sequence encoding, a polypeptide of the cellulose biosynthetic pathway in a plant or a homologue, analogue or derivative thereof.

Preferably, a nucleic acid molecule which is at least 40% related to any one or more of the sequences set forth in SEQ ID Nos:1, 3, 4, 5, 7, 9, 11, or 13 comprises a nucleotide sequence which encodes or is complementary to a sequence which encodes a plant cellulose synthase, more preferably a cellulose synthase which is associated with the primary or the secondary plant cell wall of the species from which it has been derived.

Furthermore, the nucleic acid molecule according to this aspect of the invention may be derived from a monocotyledonous or dicotyledonous plant species. In a particularly preferred embodiment, the nucleic acid molecule is derived from Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum (cotton) or Eucalyptus ssp., amongst others.

For the purposes of nomenclature, the nucleotide sequence shown in SEQ ID NO: 1 relates to a cellulose gene as hereinbefore defined which comprises a cDNA sequence designated T20782 and which is derived from Arabidopsis thaliana. The amino acid sequence set forth in SEQ ID NO:2 relates to the polypeptide encoded by T20782.

The nucleotide sequence set forth in SEQ ID NO:3 relates to the nucleotide sequence of the complete Arabidopsis thaliana genomic gene RSW1, including both intron and exon sequences. The nucleotide sequence of SEQ ID NO:3 comprises exons 1-14 of the genomic gene and includes 2295bp of 5'-untranslated sequences, of which approximately the first 1.9kb comprises RSW1 promoter sequence (there is a putative TATA box motif at positions 1843-1850 of SEQ ID NO:3). The nucleotide sequence set forth in SEQ ID NO:3 is derived from the cosmid clone 23H12. This sequence is also the genomic gene equivalent of SEQ ID Nos:1 and 5.

The nucleotide sequence set forth in SEQ ID NO:4 relates to the partial nucleotide sequence of a genomic gene variant of RSW1, derived from cosmid clone 12C4. The nucleotide sequence of SEQ ID NO:4 comprises exon sequence 1-11 and part of exon 12 of the genomic gene sequence and includes 862bp of 5'-untranslated sequences, of which approximately 700 nucleotides comprise RSW1 promoter sequences (there is a putative TATA box motif at positions 668-673 of SEQ ID NO:4). The genomic gene sequence set forth in SEQ ID NO:4 is the equivalent of the cDNA sequence set forth in SEQ ID NO:7 (i.e. cDNA clone Ath-A).

The nucleotide sequence shown in SEQ ID NO:5 relates to a cellulose gene as hereinbefore defined which comprises a cDNA equivalent of the Arabidopsis thaliana RSW1 gene set forth in SEQ ID NO:3. The amino acid sequence set forth in SEQ ID NO:6 relates to the polypeptide encoded by the wild-type RSW1 gene sequences set forth in SEQ ID Nos:3 and 5.

The nucleotide sequence shown in SEQ ID NO:7 relates to a cellulose gene as hereinbefore defined which comprises a cDNA equivalent of the Arabidopsis thaliana RSW1 gene set forth in SEQ ID NO:4. The nucleotide sequence is a variant of the nucleotide sequences set forth in SEQ ID Nos:3 and 5. The amino acid sequence set forth in SEQ ID NO:8 relates to the polypeptide encoded by the wild-type RSW1 gene sequences set forth in SEQ ID Nos:4 and 6.

The nucleotide sequence shown in SEQ ID NO:9 relates to a cellulose gene as hereinbefore defined which comprises a further wild-type variant of the Arabidopsis thaliana RSW1 gene set forth in SEQ ID Nos:3 and 5. The nucleotide sequence variant is designated Ath-B. The amino acid sequence set forth in SEQ ID NO:10 relates to the polypeptide encoded by the wild-type RSW1 gene sequence set forth in SEQ ID No:9.

The nucleotide sequence shown in SEQ ID NO:11 relates to a cellulose gene as hereinbefore defined which comprises a cDNA equivalent of the Arabidopsis thaliana rsw1 gene. The rsw1 gene is a mutant cellulose gene which produces a radial root swelling phenotype as described by Baskin et al. (1992). The present inventors have shown herein that the rsw1 gene also produces reduced inflorescence length, reduced fertility, misshapen epidermal cells, reduced cellulose content and the accumulation of non-crystalline β-1,4-glucan, amongst others, when expressed in plant cells. The rsw1 nucleotide sequence is a further variant of the nucleotide sequences set forth in SEQ ID Nos:3 and 5. The amino acid sequence set forth in SEQ ID NO:12 relates to the rsw1 polypeptide encoded by the mutant rsw1 gene sequence set forth in SEQ ID No:11.

The nucleotide sequence shown in SEQ ID NO:13 relates to a cellulose gene as hereinbefore defined which comprises a cDNA equivalent of the Oryza sativa RSW1 or RSW1-like gene. The nucleotide sequence is closely-related to the Arabidopsis thaliana RSW1 and rsw1 nucleotide sequences set forth herein (SEQ ID Nos:1, 3, 4, 5, 7, 9 and 11). The amino acid sequence set forth in SEQ ID NO:14 relates to the polypeptide encoded by the RSW1 or RSW1-like gene sequences set forth in SEQ ID No:13.

Those skilled in the art will be aware of procedures for the isolation of further cellulose genes to those specifically described herein, for example further cDNA sequences and genomic gene equivalents, when provided with one or more of the nucleotide sequences set forth in SEQ ID Nos:1, 3, 4, 5, 7, 9, 11, or 13. In particular, hybridisations may be performed using one or more nucleic acid hybridisation probes comprising at least 10 contiguous nucleotides and preferably at least 50 contiguous nucleotides derived from the nucleotide sequences set forth herein, to isolate cDNA clones, mRNA molecules, genomic clones from a genomic library (in particular genomic clones containing the entire 5' upstream region of the gene including the promoter sequence, and the entire coding region and 3'-untranslated sequences), and/or synthetic oligonucleotide molecules, amongst others. The present invention clearly extends to such related sequences.

The invention further extends to any homologues, analogues or derivatives of any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13.

A further aspect of the present invention contemplates a nucleic acid molecule which encodes or is complementary to a nucleic acid molecule which encodes, a polypeptide which is required for cellulose biosynthesis in a plant, such as cellulose synthase, and which is capable of hybridising under at least low stringency conditions to the nucleic acid molecule set forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or to a complementary strand thereof.

As an exemplification of this embodiment, the present inventors have shown that it is possible to isolate variants of the Arabidopsis thaliana RSW1 gene sequence set forth in SEQ ID NO:3, by hybridization under low stringency conditions. Such variants include related sequences derived from Gossypium hirsutum (cotton), Eucalyptus ssp. and A. thaliana. Additional variants are clearly encompassed by the present invention.

Preferably, the nucleic acid molecule further comprises a nucleotide sequence which encodes, or is complementary to a nucleotide sequence which encodes, a cellulose synthase polypeptide, more preferably a cellulose synthase which is associated with the primary or secondary plant cell wall of the plant species from which said nucleic acid molecule was derived.

More preferably, the nucleic acid molecule according to this aspect of the invention encodes or is complementary to a nucleic acid molecule which encodes, a polypeptide which is required for cellulose biosynthesis in a plant, such as cellulose synthase, and which is capable of hybridising under at least medium stringency conditions to the nucleic acid molecule set forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or to a complementary strand thereof.

Even more preferably, the nucleic acid molecule according to this aspect of the invention encodes or is complementary to a nucleic acid molecule which encodes, a polypeptide which is required for cellulose biosynthesis in a plant, such as cellulose synthase, and which is capable of hybridising under at least high stringency conditions to the nucleic acid molecule set forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or to a complementary strand thereof.

For the purposes of defining the level of stringency, a low stringency is defined herein as being a hybridisation and/or a wash carried out in 6xSSC buffer, 0.1% SDS at 28°C. Generally, the stringency is increased by reducing the concentration of SSC buffer, and/or increasing the concentration of SDS and/or increasing the temperature of the hybridisation and/or wash. A medium stringency comprises a hybridisation and/or a wash carried out in 0.2xSSC-2xSSC buffer, 0.1% SDS at 42°C to 65°C, while a high stringency comprises a hybridisation and/or a wash carried out in 0.1xSSC-0.2xSSC buffer, 0.1% SDS at a temperature of at least 55°C. Conditions for hybridisations and washes are well understood by one normally skilled in the art. For the purposes of further clarification only, reference to the parameters affecting hybridisation between nucleic acid molecules is found in pages 2.10.8 to 2.10.16 of Ausubel et al. (1987), which is herein incorporated by reference.
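For illustration only, the stringency definitions given above (low: 6xSSC, 0.1% SDS, 28°C; medium: 0.2xSSC-2xSSC, 0.1% SDS, 42°C to 65°C; high: 0.1xSSC-0.2xSSC, 0.1% SDS, at least 55°C) may be expressed as a simple lookup. The function name stringency_labels is hypothetical, and the sketch reads the definitions literally: it assumes that ranges are inclusive and that 0.1% SDS is required in every case. Because the medium and high ranges as defined overlap, a condition may receive more than one label.

```python
import math


def stringency_labels(ssc_x: float, sds_percent: float, temp_c: float) -> list:
    """Return every stringency label whose definition the condition meets.

    ssc_x       -- SSC concentration as a multiple (6.0 means 6xSSC)
    sds_percent -- SDS concentration in percent
    temp_c      -- hybridisation or wash temperature in degrees Celsius
    """
    labels = []
    # All three definitions in the text specify 0.1% SDS.
    if not math.isclose(sds_percent, 0.1):
        return labels
    if math.isclose(ssc_x, 6.0) and math.isclose(temp_c, 28.0):
        labels.append("low")
    if 0.2 <= ssc_x <= 2.0 and 42.0 <= temp_c <= 65.0:
        labels.append("medium")
    if 0.1 <= ssc_x <= 0.2 and temp_c >= 55.0:
        labels.append("high")
    return labels


if __name__ == "__main__":
    for condition in [(6.0, 0.1, 28.0), (2.0, 0.1, 42.0),
                      (0.1, 0.1, 65.0), (0.2, 0.1, 60.0)]:
        print(condition, stringency_labels(*condition))
```

Conditions falling between the defined bands (for example 1xSSC at 30°C) receive no label under this literal reading; in practice the skilled person would judge such conditions by reference to the general principles stated above.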

In an even more preferred embodiment of the invention, the isolated nucleic acid molecule further comprises a sequence of nucleotides which is at least 40% identical to at least 10 contiguous nucleotides derived from any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a complementary strand thereof.

Still more preferably, the isolated nucleic acid molecule further comprises a sequence of nucleotides which is at least 40% identical to at least 50 contiguous nucleotides derived from the sequence set forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a complementary strand thereof.
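As a minimal sketch of how a criterion of the form "at least 40% identical to at least 50 contiguous nucleotides" might be checked, the following compares every ungapped window of a query sequence against every window of the same length in a reference sequence. The function names and the ungapped, exhaustive comparison are assumptions made for illustration; the specification does not prescribe a method of calculating identity, and practical comparisons would normally use an established alignment tool which allows for gaps.

```python
def window_identity(a: str, b: str) -> float:
    """Percent identity between two equal-length, ungapped sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def best_window_identity(query: str, reference: str, window: int) -> float:
    """Highest percent identity between any window of `query` and any
    window of `reference`, both of length `window` (no gaps allowed)."""
    if window <= 0 or window > min(len(query), len(reference)):
        raise ValueError("window must fit inside both sequences")
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(query) - window + 1):
            best = max(best, window_identity(query[j:j + window], ref_win))
    return best


def meets_threshold(query: str, reference: str,
                    window: int = 50, min_percent: float = 40.0) -> bool:
    """True if some window of the query reaches the identity threshold."""
    return best_window_identity(query, reference, window) >= min_percent


if __name__ == "__main__":
    # Short toy sequences, not taken from the sequence listing.
    print(best_window_identity("TTACGTAGGG", "ACGTACGTAC", 5))
```

The exhaustive search is quadratic in sequence length and is intended only to make the criterion concrete on short inputs.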

The present invention is particularly directed to a nucleic acid molecule which is capable of functioning as a cellulose gene as hereinbefore defined, for example a cellulose synthase gene such as, but not limited to, the Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum or Eucalyptus ssp. cellulose synthase genes, amongst others. The subject invention clearly contemplates additional cellulose genes to those specifically described herein which are derived from these plant species.

The invention further contemplates other sources of cellulose genes such as, but not limited to, tissues and cultured cells of plant origin. Preferred plant species according to this embodiment include hemp, jute, flax and woody plants including, but not limited to, Pinus ssp., Populus ssp. and Picea spp., amongst others.

A genetic sequence which encodes or is complementary to a sequence which encodes a polypeptide which is involved in cellulose biosynthesis may correspond to the naturally occurring sequence or may differ by one or more nucleotide substitutions, deletions and/or additions. Accordingly, the present invention extends to cellulose genes and any functional genes, mutants, derivatives, parts, fragments, homologues or analogues thereof or nonfunctional molecules but which are at least useful as, for example, genetic probes, or primer sequences in the enzymatic or chemical synthesis of said gene, or in the generation of immunologically interactive recombinant molecules.

In a particularly preferred embodiment, the cellulose genetic sequences are employed to identify and isolate similar genes from plant cells, tissues, or organ types of the same species, or from the cells, tissues, or organs of another plant species.

According to this embodiment, there is contemplated a method for identifying a related cellulose gene or related cellulose genetic sequence, for example a cellulose synthase or cellulose synthase-like gene, said method comprising contacting genomic DNA, or mRNA, or cDNA with a hybridisation effective amount of a first cellulose genetic sequence comprising any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a complementary sequence, homologue, analogue or derivative thereof derived from at least 10 contiguous nucleotides of said first sequence, and then detecting said hybridisation.

Preferably, the first genetic sequence comprises at least 50 contiguous nucleotides, even more preferably at least 100 contiguous nucleotides and even more preferably at least 500 contiguous nucleotides, derived from any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a complementary strand, homologue, analogue or derivative thereof.

The related cellulose gene or related cellulose genetic sequence may be in a recombinant form, in a virus particle, bacteriophage particle, yeast cell, animal cell, or a plant cell.

Preferably, the related cellulose gene or related cellulose genetic sequence is derived from a plant species, such as a monocotyledonous plant or a dicotyledonous plant selected from the list comprising Arabidopsis thaliana, wheat, barley, maize, Brassica ssp., Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., hemp, jute, flax, and woody plants including, but not limited to Pinus ssp., Populus ssp., Picea spp., amongst others.

More preferably, the related cellulose gene or related cellulose genetic sequence is derived from a plant which is useful in the fibre or timber industries, for example Gossypium hirsutum (cotton), hemp, jute, flax, Eucalyptus ssp. or Pinus ssp., amongst others. Alternatively, the related cellulose gene or related cellulose genetic sequence is derived from a plant which is useful in the cereal or starch industry, for example wheat, barley, rice or maize, amongst others.

In a particularly preferred embodiment, the first cellulose genetic sequence is labelled with a reporter molecule capable of giving an identifiable signal (e.g. a radioisotope such as 32P or 35S or a biotinylated molecule).

An alternative method contemplated in the present invention involves hybridising two nucleic acid "primer molecules" to a nucleic acid "template molecule" which comprises a related cellulose gene or related cellulose genetic sequence or a functional part thereof, wherein the first of said primers comprises contiguous nucleotides derived from any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13 or a homologue, analogue or derivative thereof and the second of said primers comprises contiguous nucleotides complementary to any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13. Specific nucleic acid molecule copies of the template molecule are amplified enzymatically in a polymerase chain reaction, a technique that is well known to one skilled in the art.

In a preferred embodiment, each nucleic acid primer molecule is at least 10 nucleotides in length, more preferably at least 20 nucleotides in length, even more preferably at least 30 nucleotides in length, still more preferably at least 40 nucleotides in length and even still more preferably at least 50 nucleotides in length.
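The primer arrangement described above, in which the first primer is derived from the sequence and the second primer is complementary to it, may be illustrated by a simple in silico prediction of the amplified product. This is a sketch only: the function names are hypothetical, the template is a toy sequence rather than any of the SEQ ID Nos, and exact primer matching is assumed, whereas primers in a real polymerase chain reaction tolerate some mismatch. The default minimum primer length of 10 nucleotides follows the preferred embodiment stated above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def predict_amplicon(template: str, forward: str, reverse: str,
                     min_len: int = 10):
    """Return the region of `template` bounded by the two primers,
    or None if either primer has no exact binding site.

    forward -- primer identical to part of the template strand shown
    reverse -- primer complementary to the template strand shown,
               written 5' to 3'
    """
    template, forward, reverse = (s.upper() for s in (template, forward, reverse))
    if len(forward) < min_len or len(reverse) < min_len:
        raise ValueError("each primer must be at least %d nucleotides" % min_len)
    start = template.find(forward)
    if start == -1:
        return None
    # On the strand shown, the reverse primer's site reads as its
    # reverse complement and must lie downstream of the forward primer.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


if __name__ == "__main__":
    toy_template = "GGATGCATGCAATTTTTCCGGAATTCGAA"
    fwd = "ATGCATGCAA"
    rev = reverse_complement("CCGGAATTCG")
    print(predict_amplicon(toy_template, fwd, rev))
```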

Furthermore, the nucleic acid primer molecules consist of a combination of any of the nucleotides adenine, cytidine, guanine, thymidine, or inosine, or functional analogues or derivatives thereof which are at least capable of being incorporated into a polynucleotide molecule without having an inhibitory effect on the hybridisation of said primer to the template molecule in the environment in which it is used.

Furthermore, one or both of the nucleic acid primer molecules may be contained in an aqueous mixture of other nucleic acid primer molecules, for example a mixture of degenerate primer sequences which vary from each other by one or more nucleotide substitutions or deletions. Alternatively, one or both of the nucleic acid primer molecules may be in a substantially pure form.
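A mixture of degenerate primer sequences of the kind referred to above is commonly written as a single sequence using IUPAC ambiguity codes. The following sketch, offered for illustration only, expands such a sequence into the individual primers it represents. The function name expand_degenerate and the example sequences are assumptions; the sketch covers nucleotide substitutions only, not the deletion variants also mentioned above, and does not handle inosine.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(primer: str) -> list:
    """List every unambiguous primer represented by a degenerate sequence."""
    try:
        choices = [IUPAC[base] for base in primer.upper()]
    except KeyError as exc:
        raise ValueError("unsupported symbol in primer: %s" % exc) from exc
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Toy example; not a primer from the specification.
    for p in expand_degenerate("GAYTGG"):
        print(p)
```

The number of primers in the mixture is the product of the number of alternatives at each position, so highly degenerate designs grow quickly; a sequence with five N positions already represents 1024 primers.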

The nucleic acid template molecule may be in a recombinant form, in a virus particle, bacteriophage particle, yeast cell, animal cell, or a plant cell. Preferably, the nucleic acid template molecule is derived from a plant cell, tissue or organ, in particular a cell, tissue or organ derived from a plant selected from the list comprising Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp., hemp, jute, flax, and woody plants including, but not limited to Pinus ssp., Populus ssp., Picea spp., amongst others.

Those skilled in the art will be aware that there are many known variations of the basic polymerase chain reaction procedure, which may be employed to isolate a related cellulose gene or related cellulose genetic sequence when provided with the nucleotide sequences set forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13. Such variations are discussed, for example, in McPherson et al. (1991). The present invention extends to the use of all such variations in the isolation of related cellulose genes or related cellulose genetic sequences using the nucleotide sequences embodied by the present invention.

The isolated nucleic acid molecule according to any of the further embodiments may be cloned into a plasmid or bacteriophage molecule, for example to facilitate the preparation of primer molecules or hybridisation probes or for the production of recombinant gene products.

Methods for the production of such recombinant plasmids, cosmids, bacteriophage molecules or other recombinant molecules are well-known to those of ordinary skill in the art and can be accomplished without undue experimentation. Accordingly, the invention further extends to any recombinant plasmid, bacteriophage, cosmid or other recombinant molecule comprising the nucleotide sequence set forth in any one or more of SEQ ID Nos: 1, 3, 4, 7, 9, 11 or 13, or a complementary sequence, homologue, analogue or derivative thereof.

The nucleic acid molecule of the present invention is also useful for developing genetic constructs which express a cellulose genetic sequence, thereby providing for the increased expression of genes involved in cellulose biosynthesis in plants, selected for example from the list comprising Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp., hemp, jute, flax, and woody plants including, but not limited to Pinus ssp., Populus ssp., Picea spp., amongst others. The present invention particularly contemplates the modification of cellulose biosynthesis in cotton, hemp, jute, flax, Eucalyptus ssp. and Pinus ssp., amongst others.

The present inventors have discovered that the genetic sequences disclosed herein are capable of being used to modify the level of non-crystalline β-1,4-glucan, in addition to altering cellulose levels when expressed, particularly when expressed in plant cells. In particular, the Arabidopsis thaliana rsw1 mutant has increased levels of non-crystalline β-1,4-glucan, when grown at 31°C, compared to wild-type plants grown under identical conditions. The expression of a genetic sequence described herein in the antisense orientation in transgenic plants grown at only 21°C is shown to reproduce many aspects of the rsw1 mutant phenotype.

Accordingly, the present invention clearly extends to the modification of non-crystalline β-1,4-glucan biosynthesis in plants, selected for example from the list comprising Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp., hemp, jute, flax, and woody plants including, but not limited to Pinus ssp., Populus ssp., Picea spp., amongst others. The present invention particularly contemplates the modification of non-crystalline β-1,4-glucan biosynthesis in cotton, hemp, jute, flax, Eucalyptus ssp. and Pinus ssp., amongst others.

The present invention further extends to the production and use of non-crystalline β-1,4-glucan and to the use of the glucan to modify the properties of plant cell walls or cotton fibres or wood fibres. Such modified properties are described herein (Example 13).

The inventors have discovered that the rsw1 mutant has altered carbon partitioning compared to wild-type plants, resulting in significantly higher starch levels therein. The isolated nucleic acid molecules provided herein are further useful for altering the carbon partitioning in a cell.

In particular, the present invention contemplates increased starch production in transgenic plants expressing the nucleic acid molecule of the invention in the antisense orientation or alternatively, expressing a ribozyme or co-suppression molecule comprising the nucleic acid sequence of the invention.

The invention further contemplates reduced starch and/or non-crystalline β-1,4-glucan product in transgenic plants expressing the nucleic acid molecule of the invention in the sense orientation such that cellulose production is increased therein.

Wherein it is desired to increase cellulose production in a plant cell, the coding region of a cellulose gene is placed operably behind a promoter, in the sense orientation, such that a cellulose gene product is capable of being expressed under the control of said promoter sequence. In a preferred embodiment, the cellulose genetic sequence is a cellulose synthase genomic sequence, cDNA molecule or protein-coding sequence.

In a particularly preferred embodiment, the cellulose genetic sequence comprises a sequence of nucleotides substantially the same as the sequence set forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13 or a homologue, analogue or derivative thereof.

Wherein it is desirable to reduce the content of cellulose or to increase the content of non-crystalline β-1,4-glucan, the nucleic acid molecule of the present invention is expressed in the antisense orientation under the control of a suitable promoter. Additionally, the nucleic acid molecule of the invention is also useful for developing ribozyme molecules, or in co-suppression of a cellulose gene. The expression of an antisense, ribozyme or co-suppression molecule comprising a cellulose gene, in a cell such as a plant cell, fungal cell, insect cell, animal cell, yeast cell or bacterial cell, may also increase the solubility, digestibility or extractability of metabolites from plant tissues or alternatively, increase the availability of carbon as a precursor for any secondary metabolite other than cellulose (e.g. starch or sucrose). By targeting the endogenous cellulose gene, expression is diminished, reduced or otherwise lowered to a level that results in reduced deposition of cellulose in the primary or secondary cell walls of the plant cell, fungal cell, insect cell, animal cell, yeast cell or bacterial cell, and more particularly, a plant cell. Additionally, or alternatively, the content of non-crystalline β-1,4-glucan is increased in such cells.

Co-suppression is the reduction in expression of an endogenous gene that occurs when one or more copies of said gene, or one or more copies of a substantially similar gene are introduced into the cell. The present invention also extends to the use of co-suppression to inhibit the expression of a gene which encodes a cellulose gene product, such as but not limited to cellulose synthase. Preferably, the co-suppression molecule of the present invention targets a plant mRNA molecule which encodes a cellulose synthase enzyme, for example a plant, fungus, or bacterial cellulose synthase mRNA, and more preferably a plant mRNA derived from Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp., hemp, jute, flax, or a woody plant such as Pinus ssp., Populus ssp., or Picea spp., amongst others.

In a particularly preferred embodiment, the gene which is targeted by a co-suppression molecule, comprises a sequence of nucleotides set forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a complement, homologue, analogue or derivative thereof.

In the context of the present invention, an antisense molecule is an RNA molecule which is transcribed from the complementary strand of a nuclear gene to that which is normally transcribed to produce a "sense" mRNA molecule capable of being translated into a polypeptide component of the cellulose biosynthetic pathway. The antisense molecule is therefore complementary to the mRNA transcribed from a sense cellulose gene or a part thereof. Although not limiting the mode of action of the antisense molecules of the present invention to any specific mechanism, the antisense RNA molecule possesses the capacity to form a double-stranded mRNA by base pairing with the sense mRNA, which may prevent translation of the sense mRNA and subsequent synthesis of a polypeptide gene product.
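The complementarity relationship between a sense mRNA and its antisense RNA can be illustrated with a short sketch. It assumes strict Watson-Crick pairing only (A with U, G with C) and full-length pairing; the function names `antisense_rna` and `forms_duplex` are hypothetical and serve only to make the base-pairing rule concrete.

```python
# Illustrative sketch; strict Watson-Crick pairing is an assumption of the example.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def antisense_rna(sense_mrna: str) -> str:
    """Return the antisense RNA (5'->3') complementary to a sense mRNA (5'->3')."""
    return sense_mrna.upper().translate(RNA_COMPLEMENT)[::-1]


def forms_duplex(sense_mrna: str, antisense: str) -> bool:
    """Check that two equal-length RNAs pair base-for-base in antiparallel orientation."""
    sense = sense_mrna.upper()
    anti = antisense.upper()
    if len(sense) != len(anti):
        return False
    # The 5' end of the sense strand pairs with the 3' end of the antisense strand.
    return all((s, a) in WATSON_CRICK for s, a in zip(sense, reversed(anti)))
```

A sense fragment and the output of `antisense_rna` for that fragment always satisfy `forms_duplex`, whereas a sense fragment paired with itself generally does not.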

Preferably, the antisense molecule of the present invention targets a plant mRNA molecule which encodes a cellulose gene product, for example cellulose synthase. Preferably, the antisense molecule of the present invention targets a plant mRNA molecule which encodes a cellulose synthase enzyme, for example a plant mRNA derived from Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp., hemp, jute, flax, or a woody plant such as Pinus ssp., Populus ssp., or Picea spp., amongst others.

In a particularly preferred embodiment, the antisense molecule of the invention targets an mRNA molecule encoded by any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a homologue, analogue or derivative thereof.

Ribozymes are synthetic RNA molecules which comprise a hybridising region complementary to two regions, each of at least 5 contiguous nucleotide bases in the target sense mRNA. In addition, ribozymes possess highly specific endoribonuclease activity, which autocatalytically cleaves the target sense mRNA. A complete description of the function of ribozymes is presented by Haseloff and Gerlach (1988) and contained in International Patent Application No. WO89/05852.

The present invention extends to ribozymes which target a sense mRNA encoding a cellulose gene product, thereby hybridising to said sense mRNA and cleaving it, such that it is no longer capable of being translated to synthesise a functional polypeptide product. Preferably, the ribozyme molecule of the present invention targets a plant mRNA molecule which encodes a cellulose synthase enzyme, for example a plant mRNA derived from Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., hemp, jute, flax, or a woody plant such as Pinus ssp., Populus ssp., or Picea spp., amongst others.

In a particularly preferred embodiment, the ribozyme molecule will target an mRNA encoded by any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a homologue, analogue or derivative thereof.

According to this embodiment, the present invention provides a ribozyme or antisense molecule comprising at least 5 contiguous nucleotide bases derived from any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a complementary nucleotide sequence or a homologue, analogue or derivative thereof, wherein said antisense or ribozyme molecule is able to form a hydrogen-bonded complex with a sense mRNA encoding a cellulose gene product to reduce translation thereof.

In a preferred embodiment, the antisense or ribozyme molecule comprises at least 10 to contiguous nucleotides derived from any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a complementary nucleotide sequence or a homologue, analogue or derivative thereof.

Although the preferred antisense and/or ribozyme molecules hybridise to at least about 10 to nucleotides of the target molecule, the present invention extends to molecules capable of hybridising to at least about 50-100 nucleotide bases in length, or a molecule capable of hybridising to a full-length or substantially full-length mRNA encoded by a cellulose gene, such as a cellulose synthase gene.

Those skilled in the art will be aware of the necessary conditions, if any, for selecting or preparing the antisense or ribozyme molecules of the invention.

It is understood in the art that certain modifications, including nucleotide substitutions amongst others, may be made to the antisense and/or ribozyme molecules of the present invention, without destroying the efficacy of said molecules in inhibiting the expression of a gene encoding a cellulose gene product such as cellulose synthase. It is therefore within the scope of the present invention to include any nucleotide sequence variants, homologues, analogues, or fragments of the said gene encoding same, the only requirement being that said nucleotide sequence variant, when transcribed, produces an antisense and/or ribozyme molecule which is capable of hybridising to a sense mRNA molecule which encodes a cellulose gene product.

Gene targeting is the replacement of an endogenous gene sequence within a cell by a related DNA sequence to which it hybridises, thereby altering the form and/or function of the endogenous gene and the subsequent phenotype of the cell. According to this embodiment, at least a part of the DNA sequence defined by any one or more of SEQ ID Nos:1, 3, 4, 7, 9, 11 or 13, or a related cellulose genetic sequence, may be introduced into target cells containing an endogenous cellulose gene, thereby replacing said endogenous cellulose gene.

According to this embodiment, the polypeptide product of said cellulose genetic sequence possesses different catalytic activity and/or expression characteristics, producing in turn modified cellulose deposition in the target cell. In a particularly preferred embodiment of the invention, the endogenous cellulose gene of a plant is replaced with a gene which is merely capable of producing non-crystalline β-1,4-glucan polymers or alternatively which is capable of producing a modified cellulose having properties similar to synthetic fibres such as rayon, in which the β-1,4-glucan polymers are arranged in an antiparallel configuration relative to one another.

The present invention extends to genetic constructs designed to facilitate expression of a cellulose genetic sequence which is identical, or complementary to the sequence set forth in any one or more of SEQ ID Nos:1, 3, 4, 5, 7, 9, 11 or 13, or a functional derivative, part, homologue, or analogue thereof, or a genetic construct designed to facilitate expression of a sense molecule, an antisense molecule, ribozyme molecule, co-suppression molecule, or gene targeting molecule containing said genetic sequence.

The said genetic construct of the present invention comprises the foregoing sense, antisense, or ribozyme, or co-suppression nucleic acid molecule, or gene-targeting molecule, placed operably under the control of a promoter sequence capable of regulating the expression of the said nucleic acid molecule in a prokaryotic or eukaryotic cell, preferably a plant cell. The said genetic construct optionally comprises, in addition to a promoter and sense, or antisense, or ribozyme, or co-suppression, or gene-targeting nucleic acid molecule, a terminator sequence.
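The layout of such a construct (promoter, then the nucleic acid molecule in sense or antisense orientation, then an optional terminator) may be pictured with the following sketch. The class name `GeneticConstruct`, its field names and the `cassette` method are hypothetical, and the sequences supplied to it are placeholders; the sketch models only the linear order and orientation of the elements, not promoter function.

```python
from dataclasses import dataclass
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


@dataclass(frozen=True)
class GeneticConstruct:
    """Hypothetical model of a promoter/insert/terminator expression cassette."""

    promoter: str                     # promoter DNA, written 5'->3'
    insert: str                       # coding-strand DNA of the sequence to be expressed
    orientation: str = "sense"        # "sense" or "antisense"
    terminator: Optional[str] = None  # optional terminator DNA, written 5'->3'

    def __post_init__(self) -> None:
        if self.orientation not in ("sense", "antisense"):
            raise ValueError("orientation must be 'sense' or 'antisense'")

    def cassette(self) -> str:
        """Return the top-strand sequence of the assembled cassette."""
        body = self.insert.upper()
        if self.orientation == "antisense":
            # Reversing and complementing the insert places the non-coding strand
            # downstream of the promoter, so the transcript is antisense.
            body = body.translate(_COMPLEMENT)[::-1]
        return self.promoter.upper() + body + (self.terminator or "").upper()
```

In this model the only difference between a sense construct and an antisense construct is that the insert is reverse-complemented before being placed behind the promoter.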

The term "terminator" refers to a DNA sequence at the end of a transcriptional unit which signals termination of transcription. Terminators are 3'-non-translated DNA sequences containing a polyadenylation signal, which facilitates the addition of polyadenylate sequences to the 3'-end of a primary transcript. Terminators active in plant cells are known and described in the literature. They may be isolated from bacteria, fungi, viruses, animals and/or plants. Examples of terminators particularly suitable for use in the genetic constructs of the present invention include the nopaline synthase (NOS) gene terminator of Agrobacterium tumefaciens, the terminator of the Cauliflower mosaic virus (CaMV) gene, and the zein gene terminator from Zea mays.

Reference herein to a "promoter" is to be taken in its broadest context and includes the transcriptional regulatory sequences of a classical genomic gene, including the TATA box which is required for accurate transcription initiation, with or without a CCAAT box sequence and additional regulatory elements (i.e. upstream activating sequences, enhancers and silencers) which alter gene expression in response to developmental and/or external stimuli, or in a tissue-specific manner. A promoter is usually, but not necessarily, positioned upstream, or 5', of a structural gene, the expression of which it regulates. Furthermore, the regulatory elements comprising a promoter are usually positioned within 2 kb of the start site of transcription of the gene.

In the present context, the term "promoter" is also used to describe a synthetic or fusion molecule, or derivative which confers, activates or enhances expression of said sense, antisense, or ribozyme, or co-suppression nucleic acid molecule, in a plant cell. Preferred promoters may contain additional copies of one or more specific regulatory elements, to further enhance expression of a sense, antisense, ribozyme or co-suppression molecule and/or to alter the spatial expression and/or temporal expression of said sense or antisense, or ribozyme, or co-suppression, or gene-targeting molecule. For example, regulatory elements which confer copper inducibility may be placed adjacent to a heterologous promoter sequence driving expression of a sense, or antisense, or ribozyme, or co-suppression, or gene-targeting molecule, thereby conferring copper inducibility on the expression of said molecule.

Placing a sense or ribozyme, or antisense, or co-suppression, or gene-targeting molecule under the regulatory control of a promoter sequence means positioning the said molecule such that expression is controlled by the promoter sequence. Promoters are generally positioned 5' (upstream) to the genes that they control. In the construction of heterologous promoter/structural gene combinations it is generally preferred to position the promoter at a distance from the gene transcription start site that is approximately the same as the distance between that promoter and the gene it controls in its natural setting (i.e. the gene from which the promoter is derived). As is known in the art, some variation in this distance can be accommodated without loss of promoter function. Similarly, the preferred positioning of a regulatory sequence element with respect to a heterologous gene to be placed under its control is defined by the positioning of the element in its natural setting (i.e. the genes from which it is derived). Again, as is known in the art, some variation in this distance can also occur.

Examples of promoters suitable for use in genetic constructs of the present invention include viral, fungal, bacterial, animal and plant derived promoters capable of functioning in prokaryotic or eukaryotic cells. Preferred promoters are those capable of regulating the expression of the subject cellulose genes of the invention in plant cells, fungal cells, insect cells, yeast cells, animal cells or bacterial cells, amongst others. Particularly preferred promoters are capable of regulating expression of the subject nucleic acid molecules in plant cells. The promoter may regulate the expression of the said molecule constitutively, or differentially with respect to the tissue in which expression occurs or, with respect to the developmental stage at which expression occurs, or in response to external stimuli such as physiological stresses, or plant pathogens, or metal ions, amongst others. Preferably, the promoter is capable of regulating expression of a sense, or ribozyme, or antisense, or co-suppression, or gene-targeting molecule, in a plant cell. Examples of preferred promoters include the CaMV 35S promoter, NOS promoter, octopine synthase (OCS) promoter and the like.

In a most preferred embodiment, the promoter is capable of expression in any plant cell, such as, but not limited to a plant selected from the list comprising Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp., hemp, jute, flax, and woody plants including, but not limited to Pinus ssp., Populus ssp., Picea spp., amongst others.

In a particularly preferred embodiment, the promoter may be derived from a genomic clone encoding a cellulose gene product, in particular the promoter contained in the sequence set forth in SEQ ID NO:3 or SEQ ID NO:4. Preferably, the promoter sequence comprises nucleotides 1 to about 1900 of SEQ ID NO:3 or nucleotides 1 to about 700 of SEQ ID NO:4 or a homologue, analogue or derivative capable of hybridizing thereto under at least low stringency conditions.

Optionally, the genetic construct of the present invention further comprises a terminator sequence.

In an exemplification of this embodiment, there is provided a binary genetic construct comprising the isolated nucleotide sequence of nucleotides set forth in SEQ ID NO:3. There is also provided a genetic construct comprising the isolated nucleotide sequence of nucleotides set forth in SEQ ID NO: 1, in the antisense orientation, placed operably in connection with the CaMV 35S promoter.

In the present context, the term "in operable connection with" means that expression of the isolated nucleotide sequence is under the control of the promoter sequence with which it is connected, regardless of the relative physical distance of the sequences from each other or their relative orientation with respect to each other.

An alternative embodiment of the invention is directed to a genetic construct comprising a promoter or functional derivative, part, fragment, homologue, or analogue thereof, which is capable of directing the expression of a polypeptide early in the development of a plant cell at a stage when the cell wall is developing, such as during cell expansion or during cell division. In a particularly preferred embodiment, the promoter is contained in the sequence set forth in SEQ ID NO:3 or SEQ ID NO:4. Preferably, the promoter sequence comprises nucleotide 1 to about 1900 of SEQ ID NO:3 or nucleotides 1 to about 700 of SEQ ID NO:4 or a homologue, analogue or derivative capable of hybridizing thereto under at least low stringency conditions.

The polypeptide may be a reporter molecule which is encoded by a gene such as the bacterial β-glucuronidase gene or chloramphenicol acetyltransferase gene or alternatively, the firefly luciferase gene. Alternatively, the polypeptide may be encoded by a gene which is capable of producing a modified cellulose in the plant cell when placed in combination with the normal complement of cellulose genes which are expressible therein, for example it may be a cellulose-like gene obtained from a bacterial or fungal source or a cellulose gene obtained from a plant source.

The genetic constructs of the present invention are particularly useful in the production of crop plants with altered cellulose content or structure. In particular, the rate of cellulose deposition may be reduced leading to a reduction in the total cellulose content of plants by transferring one or more of the antisense, ribozyme or co-suppression molecules described supra into a plant or alternatively, the same or similar end-result may be achieved by replacing an endogenous cellulose gene with an inactive or modified cellulose gene using gene-targeting approaches. The benefits to be derived from reducing cellulose content in plants are especially apparent in food and fodder crops such as, but not limited to maize, wheat, barley, rye, rice, millet or sorghum, amongst others where improved digestibility of said crop is desired. The foregoing antisense, ribozyme or co-suppression molecules are also useful in producing plants with altered carbon partitioning such that increased carbon is available for growth, rather than deposited in the form of cellulose.

Alternatively, the introduction to plants of additional copies of a cellulose gene in the 'sense' orientation and under the control of a strong promoter is useful for the production of plants with increased cellulose content or more rapid rates of cellulose biosynthesis. Accordingly, such plants may exhibit a range of desired traits including, but not limited to modified strength and/or shape and/or properties of fibres, cells and plants, increased protection against chemical, physical or environmental stresses such as dehydration, heavy metals (e.g. cadmium), cold, heat or wind, increased resistance to attack by pathogens such as insects, nematodes and the like which physically penetrate the cell wall barrier during invasion/infection of the plant.

Alternatively, the production of plants with altered physical properties is made possible by the introduction thereto of altered cellulose gene(s). Such plants may produce β-1,4-glucan which is either non-crystalline or shows altered crystallinity. Such plants may also exhibit a range of desired traits including but not limited to, altered dietary fibre content, altered digestibility and degradability or producing plants with altered extractability properties.

Furthermore, genetic constructs comprising a plant cellulose gene in the 'sense' orientation may be used to complement the existing range of cellulose genes present in a plant, thereby altering the composition or timing of deposition of cellulose deposited in the cell wall of said plant. In a preferred embodiment, the cellulose gene from one plant species or a β-1,4-glucan synthase gene from a non-plant species is used to transform a plant of a different species, thereby introducing novel cellulose biosynthetic metabolism to the second-mentioned plant species.

In a related embodiment, a recombinant fusion polypeptide may be produced containing the active site from one cellulose gene product fused to another cellulose gene product, wherein said fusion polypeptide exhibits novel catalytic properties compared to either 'parent' polypeptide from which it is derived. Such fusion polypeptides may be produced by conventional recombinant DNA techniques known to those skilled in the art, either by introducing a recombinant DNA capable of expressing the entire fusion polypeptide into said plant or alternatively, by a gene-targeting approach in which recombination at the DNA level occurs in vivo and the resultant gene is capable of expressing a recombinant fusion polypeptide.

The present invention extends to all transgenic methods and products described supra, including genetic constructs.

The recombinant DNA molecule carrying the sense, antisense, ribozyme or co-suppression molecule of the present invention and/or genetic construct comprising the same, may be introduced into plant tissue, thereby producing a "transgenic plant", by various techniques known to those skilled in the art. The technique used for a given plant species or specific type of plant tissue depends on the known successful techniques. Means for introducing recombinant DNA into plant tissue include, but are not limited to, transformation (Paszkowski et al., 1984), electroporation (Fromm et al., 1985), or microinjection of the DNA (Crossway et al., 1986), or T-DNA-mediated transfer from Agrobacterium to the plant tissue. Representative T-DNA vector systems are described in the following references: An et al. (1985); Herrera-Estrella et al. (1983a,b); Herrera-Estrella et al. (1985). Once introduced into the plant tissue, the expression of the introduced gene may be assayed in a transient expression system, or it may be determined after selection for stable integration within the plant genome. Techniques are known for the in vitro culture of plant tissue, and in a number of cases, for regeneration into whole plants. Procedures for transferring the introduced gene from the originally transformed plant into commercially useful cultivars are known to those skilled in the art.

A still further aspect of the present invention extends to a transgenic plant such as a crop plant, carrying the foregoing sense, antisense, ribozyme, co-suppression, or gene-targeting molecule and/or genetic constructs comprising the same. Preferably, the transgenic plant is one or more of the following: Arabidopsis thaliana, Oryza sativa, wheat, barley, maize, Brassica ssp., Gossypium hirsutum and Eucalyptus ssp., hemp, jute, flax, Pinus ssp., Populus ssp., or Picea spp. Additional species are not excluded.

The present invention further extends to the progeny of said transgenic plant.

Yet another aspect of the present invention provides for the expression of the subject genetic sequence in a suitable host (e.g. a prokaryote or eukaryote) to produce full length or non-full length recombinant cellulose gene products.

Hereinafter the term "cellulose gene product" shall be taken to refer to a recombinant product of a cellulose gene as hereinbefore defined. Accordingly, the term "cellulose gene product" includes a polypeptide product of any gene involved in the cellulose biosynthetic pathway in plants, such as, but not limited to a cellulose synthase gene product.

Preferably, the recombinant cellulose gene product comprises an amino acid sequence having the catalytic activity of a cellulose synthase polypeptide or a functional mutant, derivative, part, fragment, or analogue thereof.

In a particularly preferred embodiment of the invention, the recombinant cellulose gene product comprises a sequence of amino acids that is at least 40% identical to any one or more of SEQ ID Nos:2, 6, 8, 10, 12 or 14, or a homologue, analogue or derivative thereof.
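A percentage identity threshold of this kind can be checked once two amino acid sequences have been aligned. The sketch below is one possible convention, not a definition taken from this specification: it assumes the two sequences are supplied already aligned (equal length, with "-" marking gaps) and counts identical residues over the columns in which neither sequence has a gap. The function names `percent_identity` and `meets_identity_threshold` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned amino acid sequences.

    Assumed convention: identical residues divided by the number of aligned
    columns in which neither sequence has a gap ('-'), times 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are excluded under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """Return True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give different values for the same alignment, so the convention used should always be stated alongside the figure.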

Single and three-letter abbreviations used for amino acid residues contained in the specification are provided in Table 1.

In the present context, "homologues" of an amino acid sequence refer to those polypeptides, enzymes or proteins which have a similar catalytic activity to the amino acid sequences described herein, notwithstanding any amino acid substitutions, additions or deletions thereto.

A homologue may be isolated or derived from the same or another plant species as the species from which the polypeptides of the invention are derived.

"Analogues" encompass polypeptides of the invention notwithstanding the occurrence of any non-naturally occurring amino acid analogues therein.

"Derivatives" include modified peptides in which ligands are attached to one or more of the amino acid residues contained therein, such as carbohydrates, enzymes, proteins, polypeptides or reporter molecules such as radionuclides or fluorescent compounds. Glycosylated, fluorescent, acylated or alkylated forms of the subject peptides are particularly contemplated by the present invention. Additionally, derivatives of an amino acid sequence described herein which comprises fragments or parts of the subject amino acid sequences are within the scope of the invention, as are homopolymers or heteropolymers comprising two or more copies of the subject polypeptides. Procedures for derivatizing peptides are well-known in the WO 98/00549 WO 9800549PCT/AU97/00402 30 art.

TABLE 1

Amino Acid        Three-letter Abbreviation    One-letter Symbol
Alanine           Ala                          A
Arginine          Arg                          R
Asparagine        Asn                          N
Aspartic acid     Asp                          D
Cysteine          Cys                          C
D-alanine         Dal                          X
Glutamine         Gln                          Q
Glutamic acid     Glu                          E
Glycine           Gly                          G
Histidine         His                          H
Isoleucine        Ile                          I
Leucine           Leu                          L
Lysine            Lys                          K
Methionine        Met                          M
Phenylalanine     Phe                          F
Proline           Pro                          P
Serine            Ser                          S
Threonine         Thr                          T
Tryptophan        Trp                          W
Tyrosine          Tyr                          Y
Valine            Val                          V
Any amino acid    Xaa                          X

Substitutions encompass amino acid alterations in which an amino acid is replaced with a different naturally-occurring or a non-conventional amino acid residue. Such substitutions may be classified as "conservative", in which an amino acid residue contained in a cellulose gene product is replaced with another naturally-occurring amino acid of similar character, for example Gly↔Ala, Val↔Ile↔Leu, Asp↔Glu, Lys↔Arg, Asn↔Gln or Phe↔Trp↔Tyr.

Substitutions encompassed by the present invention may also be "non-conservative", in which an amino acid residue which is present in a cellulose gene product described herein is substituted with an amino acid with different properties, such as a naturally-occurring amino acid from a different group (e.g. substituting a charged or hydrophobic amino acid with alanine), or alternatively, in which a naturally-occurring amino acid is substituted with a non-conventional amino acid.

Non-conventional amino acids encompassed by the invention include, but are not limited to those listed in Table 2.

Amino acid substitutions are typically of single residues, but may be of multiple residues, either clustered or dispersed.

Amino acid deletions will usually be of the order of about 1-10 amino acid residues, while insertions may be of any length. Deletions and insertions may be made to the N-terminus, the C-terminus or be internal deletions or insertions. Generally, insertions within the amino acid sequence will be smaller than amino- or carboxy-terminal fusions and of the order of 1-4 amino acid residues.

A homologue, analogue or derivative of a cellulose gene product as referred to herein may readily be made using peptide synthetic techniques well-known in the art, such as solid phase peptide synthesis and the like, or by recombinant DNA manipulations. Techniques for making substitution mutations at pre-determined sites using recombinant DNA technology, for example by M13 mutagenesis, are also well-known. The manipulation of nucleic acid molecules to produce variant peptides, polypeptides or proteins which manifest as substitutions, insertions or deletions is well-known in the art.

The cellulose gene products described herein may be derivatized further by the inclusion or attachment thereto of a protective group which prevents, inhibits or slows proteolytic or cellular degradative processes. Such derivatization may be useful where the half-life of the subject polypeptide is required to be extended, for example to increase the amount of cellulose produced in a primary or secondary cell wall of a plant cell or alternatively, to increase the amount of protein produced in a bacterial or eukaryotic expression system. Examples of chemical groups suitable for this purpose include, but are not limited to, any of the non-conventional amino acid residues listed in Table 2, in particular a D-stereoisomer or a methylated form of a naturally-occurring amino acid listed in Table 1. Additional chemical groups which are useful for this purpose are selected from the list comprising aryl or heterocyclic N-acyl substituents, polyalkylene oxide moieties, desulphatohirudin muteins, alpha-muteins, alpha-aminophosphonic acids, water-soluble polymer groups such as polyethylene glycol attached to sugar residues using hydrazone or oxime groups, benzodiazepine dione derivatives, glycosyl groups such as beta-glycosylamine or a derivative thereof, isocyanate conjugated to a polyol functional group or polyoxyethylene polyol capped with diisocyanate, amongst others. Similarly, a cellulose gene product or a homologue, analogue or derivative thereof may be cross-linked or fused to itself or to a protease inhibitor peptide, to reduce susceptibility of said molecule to proteolysis.

TABLE 2

Non-conventional amino acid                          Code
α-aminobutyric acid                                  Abu
α-amino-α-methylbutyrate                             Mgabu
aminocyclopropanecarboxylate                         Cpro
aminoisobutyric acid                                 Aib
aminonorbornylcarboxylate                            Norb
cyclohexylalanine                                    Chexa
cyclopentylalanine                                   Cpen
D-alanine                                            Dal
D-arginine                                           Darg
D-aspartic acid                                      Dasp
D-cysteine                                           Dcys
D-glutamine                                          Dgln
D-glutamic acid                                      Dglu
D-histidine                                          Dhis
D-isoleucine                                         Dile
D-leucine                                            Dleu
D-lysine                                             Dlys
D-methionine                                         Dmet
D-ornithine                                          Dorn
D-phenylalanine                                      Dphe
D-proline                                            Dpro
D-serine                                             Dser
D-threonine                                          Dthr
D-tryptophan                                         Dtrp
D-tyrosine                                           Dtyr
D-valine                                             Dval
D-α-methylalanine                                    Dmala
D-α-methylarginine                                   Dmarg
D-α-methylasparagine                                 Dmasn
D-α-methylaspartate                                  Dmasp
D-α-methylcysteine                                   Dmcys
D-α-methylglutamine                                  Dmgln
D-α-methylhistidine                                  Dmhis
D-α-methylisoleucine                                 Dmile
D-α-methylleucine                                    Dmleu
D-α-methyllysine                                     Dmlys
D-α-methylmethionine                                 Dmmet
D-α-methylornithine                                  Dmorn
D-α-methylphenylalanine                              Dmphe
D-α-methylproline                                    Dmpro
D-α-methylserine                                     Dmser
D-α-methylthreonine                                  Dmthr
D-α-methyltryptophan                                 Dmtrp
D-α-methyltyrosine                                   Dmty
D-α-methylvaline                                     Dmval
D-N-methylalanine                                    Dnmala
D-N-methylarginine                                   Dnmarg
D-N-methylasparagine                                 Dnmasn
D-N-methylaspartate                                  Dnmasp
D-N-methylcysteine                                   Dnmcys
D-N-methylglutamine                                  Dnmgln
D-N-methylglutamate                                  Dnmglu
D-N-methylhistidine                                  Dnmhis
D-N-methylisoleucine                                 Dnmile
D-N-methylleucine                                    Dnmleu
D-N-methyllysine                                     Dnmlys
N-methylcyclohexylalanine                            Nmchexa
D-N-methylornithine                                  Dnmorn
N-methylglycine                                      Nala
N-methylaminoisobutyrate                             Nmaib
N-(1-methylpropyl)glycine                            Nile
N-(2-methylpropyl)glycine                            Nleu
D-N-methyltryptophan                                 Dnmtrp
D-N-methyltyrosine                                   Dnmtyr
D-N-methylvaline                                     Dnmval
γ-aminobutyric acid                                  Gabu
L-t-butylglycine                                     Tbug
L-ethylglycine                                       Etg
L-homophenylalanine                                  Hphe
L-α-methylarginine                                   Marg
L-α-methylaspartate                                  Masp
L-α-methylcysteine                                   Mcys
L-α-methylglutamine                                  Mgln
L-α-methylhistidine                                  Mhis
L-α-methylisoleucine                                 Mile
L-α-methylleucine                                    Mleu
L-α-methylmethionine                                 Mmet
L-α-methylnorvaline                                  Mnva
L-α-methylphenylalanine                              Mphe
L-α-methylserine                                     Mser
L-α-methyltryptophan                                 Mtrp
L-α-methylvaline                                     Mval
N-(N-(2,2-diphenylethyl)carbamylmethyl)glycine       Nnbhm
1-carboxy-1-(2,2-diphenylethylamino)cyclopropane     Nmbc
L-N-methylalanine                                    Nmala
L-N-methylarginine                                   Nmarg
L-N-methylasparagine                                 Nmasn
L-N-methylaspartic acid                              Nmasp
L-N-methylcysteine                                   Nmcys
L-N-methylglutamine                                  Nmgln
L-N-methylglutamic acid                              Nmglu
L-N-methylhistidine                                  Nmhis
L-N-methylisoleucine                                 Nmile
L-N-methylleucine                                    Nmleu
L-N-methyllysine                                     Nmlys
L-N-methylmethionine                                 Nmmet
L-N-methylnorleucine                                 Nmnle
L-N-methylnorvaline                                  Nmnva
L-N-methylornithine                                  Nmorn
L-N-methylphenylalanine                              Nmphe
L-N-methylproline                                    Nmpro
L-N-methylserine                                     Nmser
L-N-methylthreonine                                  Nmthr
L-N-methyltryptophan                                 Nmtrp
L-N-methyltyrosine                                   Nmtyr
L-N-methylvaline                                     Nmval
L-N-methylethylglycine                               Nmetg
L-N-methyl-t-butylglycine                            Nmtbug
L-norleucine                                         Nle
L-norvaline                                          Nva
α-methyl-aminoisobutyrate                            Maib
α-methyl-γ-aminobutyrate                             Mgabu
α-methylcyclohexylalanine                            Mchexa
α-methylcyclopentylalanine                           Mcpen
α-methyl-α-naphthylalanine                           Manap
α-methylpenicillamine                                Mpen
N-(4-aminobutyl)glycine                              Nglu
N-(2-aminoethyl)glycine                              Naeg
N-(3-aminopropyl)glycine                             Norn
N-amino-α-methylbutyrate                             Nmaabu
α-naphthylalanine                                    Anap
N-benzylglycine                                      Nphe
N-(2-carbamylethyl)glycine                           Ngln
N-(carbamylmethyl)glycine                            Nasn
N-(2-carboxyethyl)glycine                            Nglu
N-(carboxymethyl)glycine                             Nasp
N-cyclobutylglycine                                  Ncbut
N-cycloheptylglycine                                 Nchep
N-cyclohexylglycine                                  Nchex
N-cyclodecylglycine                                  Ncdec
N-cyclododecylglycine                                Ncdod
N-cyclooctylglycine                                  Ncoct
N-cyclopropylglycine                                 Ncpro
N-cycloundecylglycine                                Ncund
N-(2,2-diphenylethyl)glycine                         Nbhm
N-(3,3-diphenylpropyl)glycine                        Nbhe
N-(3-guanidinopropyl)glycine                         Narg
N-(1-hydroxyethyl)glycine                            Nthr
N-(hydroxyethyl)glycine                              Nser
N-(imidazolylethyl)glycine                           Nhis
N-(3-indolylethyl)glycine                            Nhtrp
N-methyl-γ-aminobutyrate                             Nmgabu
D-N-methylmethionine                                 Dnmmet
N-methylcyclopentylalanine                           Nmcpen
D-N-methylphenylalanine                              Dnmphe
D-N-methylproline                                    Dnmpro
D-N-methylserine                                     Dnmser
D-N-methylthreonine                                  Dnmthr
N-(1-methylethyl)glycine                             Nval
N-methyl-α-naphthylalanine                           Nmanap
N-methylpenicillamine                                Nmpen
N-(p-hydroxyphenyl)glycine                           Nhtyr
N-(thiomethyl)glycine                                Ncys
penicillamine                                        Pen
L-α-methylalanine                                    Mala
L-α-methylasparagine                                 Masn
L-α-methyl-t-butylglycine                            Mtbug
L-methylethylglycine                                 Metg
L-α-methylglutamate                                  Mglu
L-α-methylhomophenylalanine                          Mhphe
N-(2-methylthioethyl)glycine                         Nmet
L-α-methyllysine                                     Mlys
L-α-methylnorleucine                                 Mnle
L-α-methylornithine                                  Morn
L-α-methylproline                                    Mpro
L-α-methylthreonine                                  Mthr
L-α-methyltyrosine                                   Mtyr
L-N-methylhomophenylalanine                          Nmhphe
N-(N-(3,3-diphenylpropyl)carbamylmethyl)glycine      Nnbhe

In an alternative embodiment of the invention, the recombinant cellulose gene product is characterised by at least one functional β-glycosyl transferase domain contained therein.

The term "β-glycosyl transferase domain" as used herein refers to a sequence of amino acids which is highly conserved in different processive enzymes belonging to the class of glycosyl transferase enzymes (Saxena et al., 1995), for example the bacterial β-1,4-glycosyl transferase enzymes and plant cellulose synthase enzymes amongst others, wherein said domain possesses a putative function in contributing to or maintaining the overall catalytic activity, substrate specificity or substrate binding of an enzyme in said enzyme class. The β-glycosyl transferase domain is recognisable by the occurrence of certain amino acid residues at particular locations in a polypeptide sequence; however, there is no stretch of contiguous amino acid residues comprised therein.

As a consequence of the lack of contiguity in a β-glycosyl transferase domain, it is not a straightforward matter to isolate a cellulose gene by taking advantage of the presence of a β-glycosyl transferase domain in the polypeptide encoded by said gene. For example, the β-glycosyl transferase domain would not be easily utilisable as a probe to facilitate the rapid isolation of all β-glycosyl transferase genetic sequences from a particular organism and then to isolate from those genetic sequences a cellulose gene such as cellulose synthase.
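Although a domain defined by scattered residues cannot serve as a contiguous probe, it can still be described computationally as a gapped pattern. The sketch below is illustrative only: it assumes, purely for the purpose of the example, a pattern of three spaced aspartate residues followed by a Q-x-x-R-W signature (the QVLRW signature referred to in the description of Figure 10 being one instance). The spacer limits, the function name `find_gapped_domain` and the toy sequence in the usage note are hypothetical and are not taken from the specification.

```python
import re

# Hypothetical gapped pattern: three aspartates (D) separated by spacers of
# variable length, followed by a Q-x-x-R-W signature. The spacer bounds
# (1 to 300 residues) are illustrative assumptions only.
GAPPED_DOMAIN = re.compile(r"D.{1,300}?D.{1,300}?D.{1,300}?Q..RW")


def find_gapped_domain(protein: str):
    """Return (start, end) of the first match in a one-letter sequence, or None."""
    match = GAPPED_DOMAIN.search(protein.upper())
    return (match.start(), match.end()) if match else None
```

Applied to a toy sequence such as `"MKT" + "D" + "A" * 10 + "D" + "G" * 5 + "D" + "S" * 8 + "QVLRW" + "LL"`, the function reports the span from the first aspartate to the end of the signature. A sequence lacking the aspartates returns `None`.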

In a preferred embodiment, the present invention provides an isolated polypeptide which:
(i) contains at least one structural β-glycosyl transferase domain as hereinbefore defined; and
(ii) has at least 40% amino acid sequence similarity to at least 20 contiguous amino acid residues set forth in any one or more of SEQ ID Nos:2, 6, 8, 10, 12 or 14, or a homologue, analogue or derivative thereof.
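A threshold of this kind can be illustrated with an ungapped sliding-window comparison. The following is a minimal sketch under stated assumptions: it scores percentage identity, which is a stricter stand-in for similarity, it ignores gaps, and the function names are hypothetical. In practice a sequence alignment program with a scoring matrix would be used.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions with identical residues in two equal-length strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(query: str, reference: str, window: int = 20) -> float:
    """Best ungapped identity between any two windows of contiguous residues."""
    best = 0.0
    for i in range(len(query) - window + 1):
        for j in range(len(reference) - window + 1):
            score = percent_identity(query[i:i + window], reference[j:j + window])
            best = max(best, score)
    return best


def meets_threshold(query: str, reference: str,
                    window: int = 20, threshold: float = 40.0) -> bool:
    """True if some window of `window` contiguous residues reaches the threshold."""
    return best_window_identity(query, reference, window) >= threshold
```

The defaults mirror the figures recited above (20 contiguous residues, 40%). Sequences shorter than the window yield a score of zero and therefore fail the test.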

More preferably, the polypeptide of the invention is at least 40% identical to at least contiguous amino acid residues, even more preferably at least 100 amino acid residues of any one or more of SEQ ID Nos:2, 6, 8, 10, 12 or 14, or a homologue, analogue or derivative thereof.

In a particularly preferred embodiment, the percentage similarity to any one or more of SEQ ID Nos:2, 6, 8, 10, 12 or 14 is at least 50-60%, more preferably at least 65-70%, even more preferably at least 75-80% and even more preferably at least 85-90%, including about 91% or

In a related embodiment, the present invention provides a "sequencably pure" form of the amino acid sequence described herein. "Sequencably pure" is hereinbefore described as substantially homogeneous to facilitate amino acid determination.

In a further related embodiment, the present invention provides a "substantially homogeneous" form of the subject amino acid sequence, wherein the term "substantially homogeneous" is hereinbefore defined as being in a form suitable for interaction with an immunologically interactive molecule. Preferably, the polypeptide is at least homogeneous, more preferably at least 50% homogeneous, still more preferably at least homogeneous and yet still more preferably at least about 95-100% homogenous, in terms of activity per microgram of total protein in the protein preparation.

The present invention further extends to a synthetic peptide of at least 5 amino acid residues in length derived from or comprising a part of the amino acid sequence set forth in any one or more of SEQ ID Nos:2, 6, 8, 10, 12 or 14, or having at least 40% similarity thereto.

Those skilled in the art will be aware that such synthetic peptides may be useful in the production of immunologically interactive molecules for the preparation of antibodies or as the peptide component of an immunoassay.

The invention further extends to an antibody molecule such as a polyclonal or monoclonal antibody or an immunologically interactive part or fragment thereof which is capable of binding to a cellulose gene product according to any of the foregoing embodiments.

The term "antibody" as used herein, is intended to include fragments thereof which are also specifically reactive with a polypeptide of the invention. Antibodies can be fragmented using conventional techniques and the fragments screened for utility in the same manner as for whole antibodies. For example, F(ab')2 fragments can be generated by treating antibody with pepsin.

The resulting F(ab')2 fragment can be treated to reduce disulfide bridges to produce Fab' fragments.

Those skilled in the art will be aware of how to produce antibody molecules when provided with the cellulose gene product of the present invention. For example, by using a polypeptide of the present invention, polyclonal antisera or monoclonal antibodies can be made using standard methods. A mammal (e.g. a mouse, hamster or rabbit) can be immunized with an immunogenic form of the polypeptide which elicits an antibody response in the mammal.

Techniques for conferring immunogenicity on a polypeptide include conjugation to carriers or other techniques well known in the art. For example, the polypeptide can be administered in the presence of adjuvant. The progress of immunization can be monitored by detection of antibody titers in plasma or serum. Standard ELISA or other immunoassay can be used with the immunogen as antigen to assess the levels of antibodies. Following immunization, antisera can be obtained and, if desired, IgG molecules corresponding to the polyclonal antibodies may be isolated from the sera.

To produce monoclonal antibodies, antibody producing cells (lymphocytes) can be harvested from an immunized animal and fused with myeloma cells by standard somatic cell fusion procedures, thus immortalizing these cells and yielding hybridoma cells. Such techniques are well known in the art. Examples include the hybridoma technique originally developed by Kohler and Milstein (1975), as well as other techniques such as the human B-cell hybridoma technique (Kozbor et al., 1983), the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., 1985), and screening of combinatorial antibody libraries (Huse et al., 1989).

Hybridoma cells can be screened immunochemically for production of antibodies which are specifically reactive with the polypeptide and monoclonal antibodies isolated.

As with all immunogenic compositions for eliciting antibodies, the immunogenically effective amounts of the polypeptides of the invention must be determined empirically. Factors to be considered include the immunogenicity of the native polypeptide, whether or not the polypeptide will be complexed with or covalently attached to an adjuvant or carrier protein or other carrier and route of administration for the composition, i.e. intravenous, intramuscular, subcutaneous, etc., and the number of immunizing doses to be administered. Such factors are known in the vaccine art and it is well within the skill of immunologists to make such determinations without undue experimentation.

It is within the scope of this invention to include any second antibodies (monoclonal, polyclonal or fragments of antibodies) directed to the first mentioned antibodies discussed above. Both the first and second antibodies may be used in detection assays or a first antibody may be used with a commercially available anti-immunoglobulin antibody.

The present invention is further described by reference to the following non-limiting Figures and Examples.

In the Figures:

Figure 1 is a photographic representation showing the inflorescence length of wild-type Arabidopsis thaliana Columbia plants (plants 1 and 3) and rsw1 plants (plants 2 and 4) grown at 21 °C (plants 1 and 2) or 31 °C (plants 3 and 4). Plants were grown initially at 21 °C until bolting commenced, the bolts were removed and the re-growth followed in plants grown at each temperature.

Figure 2 is a photographic representation of a cryo-scanning electron micrograph showing misshapen epidermal cells in the cotyledons and hypocotyl of the rsw1 mutant when grown at 31 °C for 10 days.

Figure 3 is a graphical representation of a gas chromatograph of alditol acetates of methylated sugars from a cellulose standard (top panel) and from the neutral glucan derived from shoots of rsw1 plants grown at 31 °C (lower panel). The co-incident peaks show that the rsw1 glucan is 1,4-linked.

Figure 4 is a schematic representation of the contiguous region of Arabidopsis thaliana chromosome 4 (stippled box) between the cosmid markers g8300 and 06455, showing the location of overlapping YAC clones (open boxes) within the contiguous region. The position of the RSW1 locus is also indicated, approximately 1.2cM from g8300 and 0.9cM from 06455. The scale indicates 100kb in length. L, left-end of YAC; R, right-end of YAC.

Above the representation of chromosome 4, the YAC fragments and cosmid clone fragments used to construct the contiguous region are indicated, using a prefix designation corresponding to the YAC or cosmid from which the fragments were obtained (e.g. yUP9E3, yUP20B12, etc.) and a suffix designation indicating whether the fragment corresponds to the right-end (RE) or left-end (LE) of the YAC clone; N, North; S, South; CAPS, cleaved amplified polymorphic sequence (Konieczny and Ausubel, 1993) version of the g8300 marker.

Figure 5 is a schematic representation of a restriction map of construct 23H12 between the left T-DNA border (LB) and right T-DNA border (RB) sequences (top solid line), showing the position of the Arabidopsis thaliana RSW1 locus (stippled box). The line at the top of the figure indicates the region of 23H12 which is contained in construct pRSW1. The structure of the RSW1 gene between the translation start (ATG) and translation stop (TAG) codons is indicated at the bottom of the figure. Exons are indicated by filled boxes; introns are indicated by the solid black line. The alignment of EST clone T20782 to the 3'-end of the RSW1 gene, from near the end of exon 7 to the end of exon 14, is also indicated at the bottom of the figure. Restriction sites within 23H12 are as follows: B, BamHI; E, EcoRI; H, HindIII; S, SalI; Sm, SmaI.

Figure 6 is a photographic representation showing complementation of the radial root swelling phenotype of the rsw1 mutant by transformation with construct 23H12. The rsw1 mutant was transformed with 23H12 as described in Example 6. Transformed rsw1 plants (centre group of three seedlings), untransformed rsw1 plants (left group of three seedlings) and untransformed A. thaliana Columbia plants (right group of three seedlings) were grown at 21 °C for 5 days and then transferred to 31 °C for a further 2 days, after which time the degree of root elongation and radial root swelling was determined.

Figure 7 is a photographic representation comparing wild-type Arabidopsis thaliana Columbia plants (right-hand side of the ruler) and A. thaliana Columbia plants transformed with the antisense RSW1 construct (EST T20782 expressed in the antisense orientation under control of the CaMV 35S promoter sequence; left-hand side of the ruler), showing inflorescence shortening at 21 °C in plants transformed with the antisense RSW1 construct compared to untransformed Columbia plants. The phenotype of the antisense plants at 21 °C is similar to the phenotype of the rsw1 mutant at 31 °C. Inflorescence height is indicated in millimetres.

Figure 8 is a schematic representation showing the first 90 amino acid residues of Arabidopsis thaliana RSW1 aligned to the amino acid sequences of homologous polypeptides from A. thaliana and other plant species. The shaded region indicates highly conserved sequences. Ath-A and Ath-B are closely related Arabidopsis thaliana cDNA clones identified by hybridisation screening using part of the RSW1 cDNA as a probe. S0542, rice EST clone (MAFF DNA bank, Japan); celA1 and celA2, cotton cDNA sequences expressed in cotton fibre (Pear et al., 1996); SOYSTF1A and SOYSTF1B, putative soybean bZIP transcription factors. Amino acid designations are as indicated in Table 1 incorporated herein. Conserved cysteine residues are indicated by the asterisk.

Figure 9 is a schematic representation showing the alignment of the complete amino acid sequence of Arabidopsis thaliana RSW1 to the amino acid sequences of homologous polypeptides from A. thaliana and other plant species. The shaded region indicates highly conserved sequences. Ath-A and Ath-B are closely related Arabidopsis thaliana cDNA clones identified by hybridisation screening using part of the RSW1 cDNA as a probe.

S0542, rice EST clone (MAFF DNA bank, Japan); celA1, cotton genetic sequence (Pear et al., 1996); D48636, a partial cDNA clone obtained from rice (Pear et al., 1996). Amino acid designations are as indicated in Table 1 incorporated herein. Numbering indicates the amino acid position in the RSW1 sequence.

Figure 10 is a schematic representation of the RSW1 polypeptide, showing the positions of putative transmembrane helices (hatched boxes), cysteine-rich region (Cys) and aspartate residues and the QVLRW signature which are conserved between RSW1 and related amino acid sequences. Regions of RSW1 which are highly-conserved between putative cellulose biosynthesis polypeptides are indicated by the dark-shaded boxes, while less-conserved regions are indicated by the light-shaded boxes.

Figure 11 is a photographic representation of a Southern blot hybridisation of the end of the Arabidopsis thaliana RSW1 cDNA to BglII-digested DNA derived from A. thaliana (lane 1) and cotton (lane 2). Hybridisations were carried out under low stringency conditions at 55 °C. Arrows indicate the positions of hybridising bands.

EXAMPLE 1
CHARACTERISATION OF THE CELLULOSE-DEFICIENT Arabidopsis thaliana MUTANT rsw1

1. Morphology

The Arabidopsis thaliana rsw1 mutant was produced in a genetic background comprising the ecotype Columbia.

The altered root cell-shape and temperature sensitivity of the root morphology of the Arabidopsis thaliana mutant rsw1 are disclosed, among other morphological mutants, by Baskin et al. (1992).

As shown in Figure 1, the present inventors have shown that the rsw1 mutant exhibits the surprising phenotype of having reduced inflorescence height when grown at 31 °C, compared to wild-type Columbia plants grown under similar conditions. In contrast, when grown at 21 °C, the inflorescence height of rsw1 is not significantly different from wild type plants grown under similar conditions, indicating that the shoot phenotype of rsw1 is conditional and temperature-dependent.

Furthermore, cryo-scanning electron microscopy of the epidermal cells of the rsw1 mutant indicates significant abnormality in cell shape, particularly in respect of those epidermal cells forming the leaves, hypocotyl and cotyledons, when the seedlings are grown at 31 °C (Figure 2).

Rosettes (terminal complexes) are the putative hexameric cellulose synthase complexes of higher plant plasma membranes (Herth, 1985). Freeze-fractured root cells of Arabidopsis thaliana rsw1 plants grown at 18 °C show cellulose microfibrils and rosettes on the PF face of the plasma membrane that resemble those of wild-type A. thaliana and other angiosperms. Transferring the rsw1 mutant to 31 °C reduces the number of rosettes in the mutant within 30 min, leading to extensive loss after 3 hours. Plasma membrane particles align in rows on prolonged exposure to the restrictive temperature. In contrast, there is no change in the appearance of cortical microtubules that align cellulose microfibrils, or of Golgi bodies that synthesise other wall polysaccharides and assemble rosettes.

2. Carbohydrate content

The effect of mutations in the RSW1 gene on the synthesis of cellulose and other carbohydrates was assessed by measuring in vivo incorporation of ¹⁴C (supplied as uniformly labelled glucose) into various cell wall fractions. Wild type (RSW1) and homozygous mutant rsw1 seed were germinated at 21 °C on agar containing Hoagland's nutrients and 1% (w/v) unlabelled glucose. After 5 d, half of the seedlings were transferred to 31 °C for 1 d while the remainder were maintained at 21 °C for the same time. Seedlings were covered with a solution containing Hoagland's nutrients and ¹⁴C-glucose and incubated for a further 3 h at the same temperature. Rinsed roots and shoots were separated and frozen in liquid nitrogen.

Tissue was homogenised in cold 0.5 M potassium phosphate buffer (0.5 M KH2PO4) and a crude cell wall fraction collected by centrifugation at 2800 rpm. The wall fraction was extracted with chloroform/methanol [1:1] at 40 °C for 1 hour, followed by a brief incubation at 150 °C, to remove lipids. The pellet was washed successively with 2 ml methanol, 2 ml acetone and twice with 2 ml of deionised water. Finally, the pellet was extracted successively with dimethyl sulphoxide under nitrogen to remove starch; ammonium oxalate to remove pectins; 0.1 M KOH and 3 mg/ml NaBH4 and then with 4 M KOH and 3 mg/ml NaBH4 to extract hemicelluloses; and boiling acetic acid/nitric acid/water [8:1:2] to extract any residual non-cellulosic carbohydrates and leave crystalline cellulose as the final insoluble pellet (Updegraph, 1969). All fractions were analysed by liquid scintillation counting and the counts in each fraction from the mutant were expressed as a percentage of the counts in the wild type under the same conditions.
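The normalisation described in the last sentence above is a simple ratio per fraction. The sketch below is illustrative only: the function name is hypothetical and the raw counts shown in the usage note are invented placeholders chosen for the example, since the specification reports only the resulting percentages.

```python
def percent_of_wild_type(mutant_counts: dict, wild_type_counts: dict) -> dict:
    """Express mutant scintillation counts as a rounded % of wild type, per fraction."""
    result = {}
    for fraction, wt in wild_type_counts.items():
        if wt <= 0:
            raise ValueError(f"wild-type counts for {fraction!r} must be positive")
        result[fraction] = round(100.0 * mutant_counts[fraction] / wt)
    return result
```

With hypothetical wild-type counts of `{"pectins": 2000, "hemicelluloses": 4000, "cellulose": 5000}` and hypothetical mutant counts of `{"pectins": 2080, "hemicelluloses": 4040, "cellulose": 1800}`, the function returns 104, 101 and 36 respectively, the same form as the percentages reported in Table 3.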

As shown in Table 3, mutant and wild type plants behave in quite similar fashion at 21 °C (the permissive temperature) whereas, at the restrictive temperature of 31 °C, the incorporation of ¹⁴C into cellulose is severely inhibited (to 36% of wild type) by the rsw1 mutation. The data in Table 3 indicate that cellulose synthesis is specifically inhibited in the rsw1 mutant. The wild type RSW1 gene is therefore involved quite directly in cellulose synthesis and changing its sequence by mutation changes the rate of synthesis.

TABLE 3
Counts in fractions from rsw1 plants expressed as a % of counts in comparable fraction from wild type plants

Pectins             Hemicelluloses      Cellulose
21°C      31°C      21°C      31°C      21°C      31°C
125       104       111       101       80        36

In homozygous mutant rsw1 plants, the pectin fraction extracted by ammonium oxalate contained abundant glucose, atypical of true uronic acid-rich pectins. The great majority of the glucose remained in the supernatant when cetyltrimethylammonium bromide precipitated the negatively charged pectins.

3. Non-crystalline β-1,4-glucan content

The quantity of cellulose and the quantity of a non-crystalline β-1,4-glucan recovered from the ammonium oxalate fraction were determined for seedlings of wild type Columbia and for backcrossed, homozygous rsw1 that were grown for either 7 days at 21 °C or, alternatively, for 2 days at 21 °C and 5 days at 31 °C, on vertical agar plates containing growth medium (Baskin et al., 1992) plus 1% glucose, and under continuous light (90 µmol m⁻² s⁻¹). Roots and shoots were separated from about 150 seedlings, freeze-dried to constant weight and ground in a mortar and pestle with 3 ml of cold 0.5 M potassium phosphate buffer (pH ). The combined homogenate after two buffer rinses (2 ml each) was centrifuged at 2800 x g for min. After washing the pellet fraction twice with 2 ml buffer and twice with 2 ml distilled water, the pellet, comprising the crude cell wall fraction, and the pooled supernatants, comprising the phosphate buffer fraction, were retained. The crude cell wall pellet fraction was stirred with two 3 ml aliquots of chloroform/methanol [1:1] for 1 hour at 40 °C, 2 ml of methanol at 40 °C for 30 min, 2 ml of acetone for 30 min, and twice with water. The whole procedure was repeated in the case of shoots. Combined supernatants were dried in a nitrogen stream. The pellet was successively extracted with:
(i) 3 ml of DMSO/water [9:1], sealed under nitrogen, overnight with shaking, followed by two 2 ml extractions using DMSO/water and three 2 ml water washes;
(ii) 3 ml of ammonium oxalate (0.5 ) at 100 °C for 1 hour, followed by two water washes;
(iii) 3 ml of 0.1 M KOH containing 1 mg/ml sodium borohydride, for 1 hour at 25 °C (repeated once for root material or twice for shoot material), with a final wash with 2 ml water;
(iv) 3 ml of 4 M KOH containing 1 mg/ml sodium borohydride, for 1 hour at 25 °C (repeated once for root material or twice for shoot material).

The final pellet was boiled with intermittent stirring in 3 ml of acetic acid-nitric acid-water [8:1:2] (Updegraph, 1969), combined with 2 water washes, and diluted with 5 ml water.

The insoluble residue of cellulose was solubilised in 67% H2SO4, shown to contain greater than 97% glucose using GC/MS (Fisons AS800/MD800) of alditol acetates (Doares et al., 1991) and quantified in three independent samples by anthrone/H2SO4 reaction.

Results of GC/MS for pooled replica samples are presented in Table 4.

The non-crystalline β-1,4-glucan was recovered as the supernatant from the ammonium oxalate fraction when anionic pectins were precipitated by overnight incubation at 37 °C with 2% (w/v) cetyltrimethylammonium bromide (CTAB) and collected by centrifugation at 2800 x g for min. The glucan (250 µg/ml) or starch (Sigma; 200 µg/ml) were digested with mixtures of endocellulase (EC 3.2.1.4; Megazyme, Australia) from Trichoderma and almond β-glucosidase (EC 3.2.1.21; Sigma), or Bacillus sp. α-amylase (EC 3.2.1.1; Sigma) and rice α-glucosidase (EC 3.2.1.20; Sigma).

The material recovered in the supernatant from the ammonium oxalate fraction was shown to contain a pure β-1,4-glucan by demonstrating that: (i) only glucose was detectable when it was hydrolysed by 2 M TFA in a sealed tube for 1 h at 120 °C in an autoclave, the supernatant (2000 g for 5 min) was dried under vacuum at 45 °C to remove TFA and glucose was determined by GC/MS; (ii) methylation (Needs and Selvendran, 1993) gave a dominant peak resolved by thin layer chromatography and by GC/MS that was identical to that from a cellulose standard and so indicative of 1,4-linked glucan (Figure 3); and (iii) the endo-cellulase and β-1,4-glucosidase mixture released 83% of the TFA-releasable glucose from the glucan produced by rsw1 at 31 °C while the α-amylase/α-glucosidase mixture released no glucose from the glucan. Conversely, the α-amylase/α-glucosidase mixture released 95% of the TFA-releasable glucose from a starch sample, while the endo-cellulase/β-1,4-glucosidase mixture released no glucose from starch.

Extractability of the glucan using ammonium oxalate, and the susceptibility of the glucan to endocellulase/β-glucosidase and TFA hydrolysis, indicate that the glucan in the rsw1 mutant is not crystalline, because it is the crystallinity of glucan which makes cellulose resistant to extraction and degradation.

Table 4 shows the quantity of glucose in cellulose determined by the anthrone/H2SO4 reaction and the quantity in the non-crystalline glucan after TFA hydrolysis, for shoots of wild type and mutant rsw1 Arabidopsis plants. The data indicate that the production of cellulose and of the non-crystalline β-1,4-glucan can be manipulated by mutational changes in the RSW1 gene.

TABLE 4
Glucose contents of cellulose and of the ammonium oxalate-extractable glucan

              wild type               rsw1
              21°C       31°C         21°C       31°C
Cellulose     273±28     363±18*      218±20     159±19*
Glucan        22         58           24         195

All values nmol glucose mg⁻¹ plant dry weight ± sd. * Differences significant at 0.001 level.
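The shifts reported in Table 4 can be summarised as simple ratios. The following is a minimal sketch using the mean values from the table; the dictionary layout and function names are hypothetical, and standard deviations are ignored.

```python
# Mean values from Table 4 (nmol glucose per mg plant dry weight), keyed by
# component, genotype and growth temperature in degrees Celsius.
TABLE_4 = {
    "cellulose": {"wild type": {21: 273, 31: 363}, "rsw1": {21: 218, 31: 159}},
    "glucan": {"wild type": {21: 22, 31: 58}, "rsw1": {21: 24, 31: 195}},
}


def fold_change(component: str, genotype: str) -> float:
    """Ratio of the 31 degree value to the 21 degree value for one genotype."""
    values = TABLE_4[component][genotype]
    return values[31] / values[21]


def mutant_vs_wild_type(component: str, temperature: int) -> float:
    """Ratio of the rsw1 value to the wild-type value at one temperature."""
    row = TABLE_4[component]
    return row["rsw1"][temperature] / row["wild type"][temperature]
```

On these figures the non-crystalline glucan in rsw1 rises about 8-fold between 21 °C and 31 °C, while cellulose in rsw1 at 31 °C is a little under half of the wild-type value.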

4. Starch content

The quantity of starch recovered in the DMSO fraction from roots in the experiment described above was also determined by the anthrone/H2SO4 extraction (Table 5).

As shown in Table 5, the level of starch deposited in the rsw1 mutant is 4-fold that detectable in the roots of wild-type plants at the restrictive temperature of 31 °C. A similar rise in starch is also seen if the data are expressed as nmol glucose per plant. There is no detectable difference in deposition of starch between rsw1 plants and wild-type plants at 21 °C.

TABLE 5
Quantity of starch (nmol glucose per mg dry weight of seedling) extracted from roots of rsw1 and wild type seedlings

                 Phenotype
Temperature      Wild-type     rsw1 mutant
21 °C            22            18
31 °C            37            126

The composition of cell walls in the rsw1 mutant plant compared to wild type plants at the restrictive temperature of 31 °C is summarised in Table 6.

TABLE 6
Mol% composition of cell walls from shoots of rsw1 and wild-type seedlings grown at 31 °C

                                 Phenotype
Cell wall component              Wild-type     rsw1 mutant
Crystalline cellulose            38.4          16.5
Non-crystalline β-1,4-glucan     8.5           27.1
Pectin                           37.1          36.3
Alkali-soluble                   15.6          19.8
Acid-soluble                     0.3           0.4

In conclusion, the rsw1 mutation disassembles cellulose synthase complexes in the plasma membrane, reduces cellulose accumulation and causes β-1,4-glucan to accumulate in a non-crystalline form.
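By way of illustration only, the Table 6 values can be checked arithmetically. The following minimal sketch (the names `TABLE_6` and `total_glucan` are hypothetical and used here for illustration) sums each column and adds the crystalline and non-crystalline glucan fractions, which shows that the combined glucan share is similar in both genotypes (46.9 versus 43.6 mol%) while its distribution between the two forms differs.

```python
# Mol% values transcribed from Table 6 (shoots of seedlings grown at 31 °C).
TABLE_6 = {
    "wild_type": {
        "crystalline_cellulose": 38.4,
        "noncrystalline_glucan": 8.5,
        "pectin": 37.1,
        "alkali_soluble": 15.6,
        "acid_soluble": 0.3,
    },
    "rsw1": {
        "crystalline_cellulose": 16.5,
        "noncrystalline_glucan": 27.1,
        "pectin": 36.3,
        "alkali_soluble": 19.8,
        "acid_soluble": 0.4,
    },
}


def total_glucan(column):
    """Sum of the crystalline and non-crystalline glucan fractions (mol%)."""
    return column["crystalline_cellulose"] + column["noncrystalline_glucan"]


for genotype, column in TABLE_6.items():
    column_sum = round(sum(column.values()), 1)
    print(genotype, "column sum:", column_sum,
          "total glucan:", round(total_glucan(column), 1))
```

Each column sums to 100 mol% within rounding, as expected for a mol% composition.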

EXAMPLE 2
MAPPING OF YAC CLONES TO THE rsw1 LOCUS

The rsw1 locus in the mutant Arabidopsis thaliana plant described in Example 1 above was mapped to chromosome 4 of A. thaliana using RFLP gene mapping techniques (Chang et al., 1988; Nam et al., 1989) to analyse the F2 or F3 progeny derived from a Columbia (Co)/Landsberg (Ler) cross. In particular, the rsw1 mutation was shown to be linked genetically to the ga5 locus, which is a chromosome 4 visual marker in A. thaliana.

Based on an analysis of map distances and chromosomal break points in 293 F2 or F3 progeny derived from a Columbia (Co)/Landsberg (Ler) cross, rsw1 was localised to an approximately 2.1 cM region between the RFLP markers g8300 and 06455, approximately 1.2 cM south of the CAPS (cleaved amplified polymorphic sequence; Konieczny and Ausubel, 1993) version of the g8300 marker (Figure 4).

The interval between g8300 and 06455 in which rsw1 resides was found to be spanned by an overlapping set of Yeast Artificial Chromosome (YAC) clones. The clones were obtained from Plant Industry, Commonwealth Scientific and Industrial Research Organisation, Canberra, Australia. The YACs were positioned in the g8300/06455 interval by hybridisation using known DNA molecular markers (from within the interval) and DNA fragments from the ends of the YACs. The length of the interval was estimated to comprise 900 kb of DNA.

Refined gene mapping of recombinants within the region spanned by YAC clones established the genetic distance between the RFLP marker g8300 and the rsw1 locus.

The combination of genetic map distance data and the mapping of YAC clones within the region further localised the rsw1 locus to the YAC clone designated yUP5C8.

EXAMPLE 3
MAPPING OF cDNA CLONES TO THE YAC CLONE YUP5C8

An Arabidopsis thaliana cDNA clone designated T20782 was obtained from the public Arabidopsis Resource Centre, Ohio State University, 1735 Neil Avenue, Columbus, OH 43210, United States of America. The T20782 cDNA clone was localised broadly to the DNA interval on Arabidopsis chromosome 4 between the two markers g8300 and 06455 shown in Figure 4. Using a polymerase chain reaction (PCR) based approach, DNA primers (5'-AGAACAGCAGATACACGGA-3' and 5'-CTGAAGAAGGCTGGACAAT-3') designed to the T20782 cDNA nucleotide sequence were used to screen Arabidopsis YAC clone libraries. The T20782 cDNA clone was found to localise to YACs (CIC1F9, CIC10E9, CIC11D9) identified on the Arabidopsis chromosome 4 g8300 and 06455 interval (Figure). The same approach was used to further localise clone T20782 to YAC clone yUP5C8, the same YAC designated to contain the rsw1 locus in the same chromosome interval (Figure 4).

Furthermore, amplification of the YAC clone yUP5C8 using primers derived from T20782 produces a 500bp fragment containing two putative exons identical to part of the T20782 nucleotide sequence, in addition to two intron sequences.

The cDNA T20782 was considered as a candidate gene involved in cellulose biosynthesis.

EXAMPLE 4
NUCLEOTIDE SEQUENCE ANALYSIS OF THE cDNA CLONE T20782

The nucleotide sequence of the cDNA clone T20782 is presented in SEQ ID NO:1. The nucleotide sequence was obtained using a Dye Terminator Cycle Sequencing kit (Perkin Elmer cat. #401384) as recommended by the manufacturer. Four template clones were used for nucleotide sequencing to generate the sequence listed. The first template was the cDNA clone T20782. This template was sequenced using the following sequencing primers:

a) 5'-CAATGCATTCATAGCTCCAGCCT-3'
b) 5'-AAAAGGCTGGAGCTATGAATGCAT-3'
c) 5'-TCACCGACAGATTCATCATACCCG-3'
d) 5'-GACATGGAATCACCTTAACTGCC-3'
e) 5'-CCATTCAGTCTTGTCTTCGTAACC-3'
f) 5'-GGTTACGAAGACAAGACTGAAATGG-3'
g) 5'-GAACCTCATAGGCATTGTGGGCTGG-3'
h) 5'-GCAGGCTCTATATGGGTATGATCC-3'
i) Standard M13 forward sequencing primer.
j) Standard T7 sequencing primer.

The second template clone (T20782 SphI deletion clone) was constructed by creating a DNA deletion within the T20782 clone. The T20782 clone was digested with the restriction enzyme SphI, the enzyme was heat-killed, and the DNA was ligated and electroporated into NM522 E. coli host cells. The T20782 SphI deletion clone was then sequenced using a standard M13 forward sequencing primer. Two other deletion clones were made for DNA sequencing in a similar fashion but the restriction enzymes EcoRI and SmaI were used. The T20782 EcoRI deletion clone and the T20782 SmaI deletion clone were sequenced using a standard T7 sequencing primer. The DNA sequence shown in SEQ ID NO:1 is for one DNA strand only; however, those skilled in the art will be able to generate the nucleotide sequence of the complementary strand from the data provided.
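Generation of the complementary strand may be sketched as follows. This is a minimal illustration only: the function name `reverse_complement` is hypothetical, and the two inputs are sequencing primers a) and b) listed above. Applied to primer a), the sketch shows that primers a) and b) share a 21-nucleotide core on opposite strands, consistent with their use for sequencing in both directions from the same site.

```python
# Watson-Crick pairing for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


primer_a = "CAATGCATTCATAGCTCCAGCCT"   # sequencing primer a)
primer_b = "AAAAGGCTGGAGCTATGAATGCAT"  # sequencing primer b)

rc_a = reverse_complement(primer_a)
print(rc_a)
# The first 21 nucleotides of the complement of a) form the 3' end of b).
print(primer_b.endswith(rc_a[:21]))
```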

The amino acid sequence encoded by clone T20782 was derived and is set forth in SEQ ID NO:2.

The T20782 clone encodes all but the first Aspartate residue of the D, D, D, QXXRW signature conserved in the general architecture of β-glycosyl transferases. In particular, T20782 encodes 5 amino acid residues of the D, D, D, QXXRW signature, between amino acid positions 109 and 370 of SEQ ID NO:2. The conserved Aspartate, Aspartate, Glutamine, Arginine and Tryptophan amino acid residues are shown below, in bold type, with the local amino acid residues also indicated:

1. Amino acid residues 105 to 113 of SEQ ID NO:2: LLNVDCDHY;
2. Amino acid residues 324 to 332 of SEQ ID NO:2: SVTEDILTG; and
3. Amino acid residues 362 to 374 of SEQ ID NO:2: DRLNQVLRWALGS.
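The residue numbering given above can be reproduced with a short sketch. It assumes only the three local fragments and their stated start positions in SEQ ID NO:2; the names `FRAGMENTS`, `motif_span` and `first_asp` are hypothetical. The QXXRW element is located with a regular expression and converted to sequence coordinates, giving the Tryptophan at position 370, while the first Aspartate of the first fragment falls at position 109.

```python
import re

# Local fragments and the position of their first residue in SEQ ID NO:2.
FRAGMENTS = [
    (105, "LLNVDCDHY"),
    (324, "SVTEDILTG"),
    (362, "DRLNQVLRWALGS"),
]

# Q, any two residues, then R and W.
QXXRW = re.compile(r"Q..RW")


def motif_span(start, fragment):
    """Return (first, last) sequence positions of QXXRW, or None if absent."""
    match = QXXRW.search(fragment)
    if match is None:
        return None
    return start + match.start(), start + match.end() - 1


def first_asp(start, fragment):
    """Sequence position of the first Aspartate (D) in a fragment."""
    return start + fragment.index("D")


for start, fragment in FRAGMENTS:
    print(fragment, motif_span(start, fragment))
print(first_asp(105, "LLNVDCDHY"))
```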

It must be noted that these invariable amino acids merely indicate that the T20782 derived amino acid sequence belongs to a very broad group of glycosyl transferases. Some of these enzymes such as cellulose synthase, chitin synthase, alginate synthase and hyaluronic acid synthase produce functionally very different compounds.

The presence of the conserved amino acid residues merely indicates that the T20782 clone may encode a β-glycosyl transferase protein such as the cellulose gene product, cellulose synthase. The fact that the clone localises in the vicinity of a gene involved in cellulose biosynthesis is the key feature which now focuses interest on the T20782 clone as a candidate for the RSW1 (cellulose synthase) gene.

The T20782 clone potentially codes for a cellulose synthase.

EXAMPLE 5
NUCLEOTIDE SEQUENCE ANALYSIS OF THE GENOMIC CLONE 23H12

Clone 23H12 contains approximately 21 kb of Arabidopsis thaliana genomic DNA in the region between the left border and right border T-DNA sequences, and localises to the RSW1 candidate YAC yUP5C8. Clone 23H12 was isolated by hybridisation using EST T20782 insert DNA, from a genomic DNA library made for plant transformation. Cosmid 12C4 was also shown to hybridize to the cDNA clone T20782; however, this cosmid appears to comprise a partial genomic sequence corresponding to the related Ath-A cDNA sequence set forth in SEQ ID NO:7, for which the corresponding amino acid sequence is set forth in SEQ ID NO:8.

A restriction enzyme map of clone 23H12 is presented in Figure. Nucleotide sequence of 8411 bp of genomic DNA in the binary cosmid clone 23H12 was obtained (SEQ ID NO:3) by primer walking along the 23H12 template, using a Dye Terminator Cycle Sequencing kit (Perkin Elmer cat. #401384) as recommended by the manufacturer. The following primers, at least, were used for DNA sequencing of the 23H12 clone DNA:

a) cs1-R    5'-CAATGCATTCATAGCTCCAGCCT-3'
b) cs1-F    5'-AAAAGGCTGGAGCTATGAATGCAT-3'
c) up       5'-AGAACAGCAGATACACGGA-3'
d) ve76-R2  5'-ATCCGTGTATCTGCTGTTCTTACC-3'
e) est1-R   5'-AATGCTCTTGTTGCCAAAGCAC-3'
f) sve76-F  5'-ATTGTCCAGCCTTCTTCAGG-3'
g) ve76-R   5'-CTGAAGAAGGCTGGACAATGC-3'
h) B12-R1   5'-AGGTAAGCATAGCTGAACCATC-3'
i) B12-R2   5'-AGTAGATTGCAGATGGTTTTCTAC-3'
j) B12-R3   5'-TTCAATGGGTCCACTGTACTAAC-3'
k) B12-R4   5'-ATTCAGATGCACCATTGTC-3'

The structure of the RSW1 gene contained in cosmid clone 23H12 is also presented in Figure. As shown therein, coding sequences in 23H12, from the last 12 bp of exon 7 to the end of exon 14, correspond to the full T20782 cDNA sequence (SEQ ID NO:1). The nucleotide sequences of the RSW1 gene comprising exons 1 to 8 were amplified from A. thaliana Columbia double-stranded cDNA, using amplification primers upstream of the RSW1 start site and a primer internal to the EST clone T20782.

The exons in the RSW1 gene range from 81bp to 585bp in length and all 5' and 3' intron/exon splice junctions conform to the conserved intron rule.

The RSW1 transcript comprises a 5'-untranslated sequence of at least 70bp in length, a 3243bp coding region and a 360bp 3'-untranslated region. Northern hybridization analyses indicate that the RSW1 transcript in wild-type A. thaliana roots, leaves and inflorescences is approximately 4.0kb in length, and that a similar transcript size occurs in mutant tissue (data not shown).
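The transcript dimensions stated above can be cross-checked against the 1081-residue polypeptide of SEQ ID NO:6. The sketch below is illustrative only, with hypothetical variable names: it confirms that a 3243 bp coding region corresponds to exactly 1081 codons, and that the stated parts sum to a minimum of 3673 nucleotides. The difference from the approximately 4.0 kb Northern estimate is not explained in the text; a poly(A) tail, a longer 5'-untranslated sequence and the limited precision of Northern sizing are possible contributors, offered here as assumptions only.

```python
CODON_LENGTH = 3

five_prime_utr = 70       # "at least 70 bp", so a lower bound
coding_region = 3243      # bp
three_prime_utr = 360     # bp
polypeptide_length = 1081 # amino acids, SEQ ID NO:6

# Number of whole codons in the stated coding region.
codons, remainder = divmod(coding_region, CODON_LENGTH)

# Lower bound on transcript length from the stated parts.
minimum_transcript = five_prime_utr + coding_region + three_prime_utr

print(codons, remainder)
print(minimum_transcript)
```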

The derived amino acid sequence of the RSW1 polypeptide encoded by the cosmid clone 23H12 (i.e. the polypeptide set forth in SEQ ID NO:6) is 1081 amino acids in length and contains the entire D, D, D, QXXRW signature characteristic of β-glycosyl transferase proteins, between amino acid position 395 and amino acid position 822. The conserved Aspartate, Glutamine, Arginine and Tryptophan residues are shown below, in bold type, with the local amino acid residues also indicated:

1. Amino acid residues 391 to 399 of SEQ ID NO:6: YVSDDGSAM;
2. Amino acid residues 557 to 565 of SEQ ID NO:6: LLNVDCDHY;
3. Amino acid residues 776 to 784 of SEQ ID NO:6: SVTEDILTG; and
4. Amino acid residues 814 to 826 of SEQ ID NO:6: DRLNQVLRWALGS.

The second and third conserved Aspartate residues listed supra, and the fourth conserved amino acid sequence motif listed supra (i.e. QVLRW), are also present in the cDNA clone T20782 (see Example 4 above).
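The correspondence between the two numbering schemes can be illustrated with a short sketch, assuming only the fragment start positions stated in Examples 4 and 5 (the names `SHARED_FRAGMENTS` and `est_to_full_length` are hypothetical). All three shared fragments differ by the same offset of 452 residues, so that position 370 of SEQ ID NO:2 maps to position 822 of SEQ ID NO:6, and residue 1 of SEQ ID NO:2 maps to residue 453.

```python
# (start in SEQ ID NO:2, start in SEQ ID NO:6, fragment)
SHARED_FRAGMENTS = [
    (105, 557, "LLNVDCDHY"),
    (324, 776, "SVTEDILTG"),
    (362, 814, "DRLNQVLRWALGS"),
]

# Every shared fragment should give the same numbering offset.
offsets = {full - est for est, full, _ in SHARED_FRAGMENTS}


def est_to_full_length(position, offset):
    """Map a SEQ ID NO:2 residue position onto SEQ ID NO:6 numbering."""
    return position + offset


offset = offsets.pop() if len(offsets) == 1 else None
print(offset)
print(est_to_full_length(370, offset))
```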

The 23H12 clone potentially encodes a cellulose synthase.

EXAMPLE 6
COMPLEMENTATION OF THE rsw1 MUTATION

The complementation of the cellulose mutant plant rsw1 is the key test to demonstrate the function of the clone 23H12 gene product. Complementation of the rsw1 phenotype was demonstrated by transforming the binary cosmid clone 23H12, or a derivative clone thereof encoding a functional gene product, into the Arabidopsis thaliana cellulose mutant rsw1.

Two DNA constructs (23H12 and pRSW1) were used to complement the rsw1 mutant plant line.

1. Construct 23H12

Clone 23H12 is described in Example 5 and Figure.

2. Construct pRSW1

The 23H12 construct has an insert of about 21 kb in length. To demonstrate that any complementation of the phenotype of the rsw1 mutation is the result of expression of the gene which corresponds to SEQ ID NO:3, a genetic construct, designated as pRSW1, comprising the putative RSW1 gene with most of the surrounding DNA deleted, was produced. A restriction enzyme (RE) map of the RSW1 gene insert in pRSW1 is provided in Figure.

To produce pRSW1, the RSW1 gene was subcloned from cosmid 23H12 and cloned into the binary plasmid pBIN19. Briefly, Escherichia coli cells containing cosmid 23H12 were grown in LB medium supplemented with tetracycline (3.5 mg/L). Plasmid DNA was prepared by alkaline lysis and digested sequentially with restriction enzymes PvuII and SalI. Two co-migrating fragments of 9 kb and 10 kb, respectively, were isolated as a single fraction from a 0.8% agarose gel. The RSW1 gene was contained on the 10 kb PvuII/SalI fragment. The 9 kb fragment appeared to be a PvuII cleavage product not comprising the RSW1 gene.

The restriction fragments were ligated into pBIN19 digested with SmaI and SalI. An aliquot of the ligation mix was introduced by electroporation into E. coli strain XLB1. Colonies resistant to kanamycin (50 mg/L) were selected and subsequently characterised by restriction enzyme analysis to identify those clones which contained only the 10 kb PvuII/SalI fragment comprising the RSW1 gene, in pBIN19.

3. Transfer of the 23H12 and pRSW1 constructs to Agrobacterium tumefaciens

Cosmid 23H12 was transferred to Agrobacterium by triparental mating, essentially as described by Ditta et al. (1980). Three bacterial strains as follows were mixed on solid LB medium without antibiotics: Strain 1 was an E. coli helper strain containing the mobilising plasmid pRK2013, grown to stationary phase; Strain 2 was E. coli containing cosmid 23H12, grown to stationary phase; and Strain 3 was an exponential-phase culture of A. tumefaciens strain AGL1 (Lazo et al., 1991). The mixture was allowed to grow overnight at 28 °C, before an aliquot was streaked out on solid LB medium containing antibiotics (ampicillin 50 mg/L, rifampicin 50 mg/L, tetracycline 3.5 mg/L) to select for transformed A. tumefaciens AGL1.

Resistant colonies appeared after 2-3 days at 28 °C and were streaked out once again on selective medium for further purification. Selected colonies were then subcultured in liquid LB medium supplemented with rifampicin (50 mg/L) and tetracycline (3.5 mg/L) and stored at -80 °C.

Plasmid pRSW1 (initially designated as p2029) was introduced into A. tumefaciens strain AGL1 by electroporation.

4. Transformation of rsw1 plants

The rsw1 plant line was transformed with constructs 23H12 and pRSW1 using vacuum infiltration essentially as described by Bechtold et al. (1993).

5. Analysis of radial swelling in transformants

Complementation of the radial swelling (rsw) phenotype, which is characteristic of the rsw1 mutant plant, was assayed by germinating transformed (T1 seed) rsw1 seeds obtained as described supra on Hoaglands plates containing 50 µg/ml kanamycin. Plates containing the transformed seeds were incubated at 21 °C for 10-12 days. Kanamycin-resistant seedlings were transferred to fresh Hoaglands plates containing 50 µg/ml kanamycin and incubated at 31 °C for 2 days. Following this incubation, the root tip was examined for a radial swelling phenotype. Under these conditions, the roots of wild-type plants do not show any radial swelling phenotype; however, the roots of rsw1 plants show clear radial swelling at the root tip and also have a short root compared to the wild-type plants. As a consequence, determination of the radial swelling phenotype of the transformed plants was indicative of successful complementation of the rsw1 phenotype.

The kanamycin-resistant seedlings were maintained by further growth of seedlings at 21 °C, following the high temperature incubation. Once plants had recovered, the seedlings were transferred to soil and grown in cabinets at 21 °C (16 hr light/8 hr dark cycle). T2 seed was then harvested from mature individual plants.

Using the 23H12 construct for rsw1 transformation, a total of 262 kanamycin-resistant seedlings were obtained. All of these transformants were tested for complementation of the root radial swelling phenotype. A total of 230 seedlings showed a wild type root phenotype, while only 32 seedlings showed the radial swelling root phenotype characteristic of rsw1 plants. By way of example, Figure 6 shows the phenotypes of transformed seedlings compared to untransformed wild-type and rsw1 seedlings, following incubation at 31 °C. As shown in Figure 6, there is clear complementation of the radial swelling phenotype in the transformed seedlings, with normal root length being exhibited by the transformed seedlings at 31 °C.

Using the pRSW1 construct for transformation, a total of 140 kanamycin-resistant seedlings were obtained. All of the 11 seedlings tested for complementation of the root radial swelling phenotype showed a wild type root phenotype and none of the seedlings showed any signs of radial swelling in the roots (data not shown).
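The seedling counts reported above for the 23H12 construct can be expressed as proportions. The following is a minimal sketch only (the function name `percent` is hypothetical); it shows that 230 of 262 kanamycin-resistant seedlings, or about 87.8%, displayed the wild type root phenotype.

```python
def percent(part, whole):
    """Express part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts stated for rsw1 plants transformed with construct 23H12.
total_seedlings = 262
wild_type_roots = 230
swollen_roots = 32

complemented_pct = percent(wild_type_roots, total_seedlings)
swollen_pct = percent(swollen_roots, total_seedlings)
print(complemented_pct, swollen_pct)
```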

6. General morphological analysis of the complemented rsw1 mutant line

Further characterisation of the complemented rsw1 plants has shown that other morphological characteristics of rsw1 have also been restored in the transgenic lines, for example the bolt (inflorescence) height, and the ability of the plants to grow wild type cotyledons, leaves, trichomes, siliques and flowers at 31 °C (data not shown).

7. Biochemical complementation of the rsw1 mutant line

T2 seed from transformations using cosmid 23H12 as described supra or, alternatively, using the binary plasmid pBin19 which lacks any RSW1 gene sequences, was sown on Hoagland's solid media containing kanamycin (50 µg/ml), incubated for 2 days at 21 °C and then transferred to 31 °C for 5 days. Wild-type A. thaliana Columbia plants were grown under similar conditions but without kanamycin in the growth medium. Kanamycin-resistant T2 seedlings which have at least one copy of the 23H12 cosmid sequence, and wild-type seedlings, were collected and frozen for cellulose analysis.

Cellulose levels were determined as acetic-nitric acid insoluble material (Updegraff, 1969) for 10 lines of kanamycin-resistant T2 plants transformed with the 23H12 cosmid sequence, and compared to the cellulose levels in rsw1 mutant plants, wild-type A. thaliana Columbia plants and A. thaliana Columbia plants transformed with the binary plasmid pBin19. The results are provided in Table 7.

As shown in Table 7, the cellulose levels have been significantly elevated in the complemented rsw1 (T2) plants, compared to the cellulose levels measured in the rsw1 mutant parent plant. In fact, cellulose levels in the 23H12-transformed plants, expressed relative to the fresh weight of plant material or on a per seedling basis, are not significantly different from the cellulose levels of either wild-type Arabidopsis thaliana Columbia plants or A. thaliana Columbia transformed with the binary plasmid pBin19. These data indicate that the 23H12 cosmid is able to fully complement the cellulose-deficient phenotype of the rsw1 mutant.

Homozygous T3 lines are generated to confirm the data presented in Table 7.

Furthermore, data presented in Table 7 indicate that there is no difference in the rate of growth of the T2 transformed rsw1 plants and wild-type plants at 31 °C, because the fresh weight of such plants does not differ significantly. In contrast, the fresh weight of mutant rsw1 seedlings grown under identical conditions is only approximately 55% of the level observed in T2 lines transformed with 23H12 (range about 30% to about ...). These data support the conclusion that cellulose levels have been manipulated in the complemented rsw1 (T2) plants.
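The fresh weight comparison can be illustrated from the Table 7 values. The sketch below is an illustration under stated assumptions: the values are transcribed from Table 7, the label "unnamed" is a placeholder for the one line whose designation is not legible, and the names `T2_LINES`, `cellulose_per_seedling` and `percent_of` are hypothetical. How the approximately 55% figure was averaged is not stated in the text; relative to the mean T2 fresh weight, the mutant value computes to about 57%. The sketch also checks that each per-seedling cellulose value equals fresh weight multiplied by cellulose content per 100 mg tissue, within rounding.

```python
# (line, fresh weight in mg per seedling, cellulose in mg per 100 mg tissue,
#  reported cellulose in mg per seedling), transcribed from Table 7.
T2_LINES = [
    ("1.2", 2.51, 1.23, 0.031),
    ("1.4", 2.25, 2.50, 0.056),
    ("2.1", 3.23, 1.29, 0.042),
    ("3.1", 3.75, 1.23, 0.046),
    ("3.10", 3.52, 1.69, 0.060),
    ("4.4", 5.14, 1.31, 0.067),
    ("unnamed", 3.18, 1.26, 0.040),  # placeholder: designation not legible
    ("5.3", 2.77, 1.17, 0.032),
    ("9.2", 2.26, 1.41, 0.032),
    ("10.8", 2.4, 1.20, 0.029),
]
RSW1_FRESH_WEIGHT = 1.77  # mg per seedling, rsw1 mutant row of Table 7


def cellulose_per_seedling(fresh_weight_mg, mg_per_100_mg):
    """Cellulose per seedling (mg) from fresh weight and cellulose content."""
    return fresh_weight_mg * mg_per_100_mg / 100.0


def percent_of(value, reference):
    """Express value as a percentage of reference."""
    return 100.0 * value / reference


fresh_weights = [fw for _, fw, _, _ in T2_LINES]
mean_t2 = sum(fresh_weights) / len(fresh_weights)
relative = [percent_of(RSW1_FRESH_WEIGHT, fw) for fw in fresh_weights]

print(round(mean_t2, 3))
print(round(percent_of(RSW1_FRESH_WEIGHT, mean_t2)))
print(round(min(relative)), round(max(relative)))
```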

Furthermore, the rate of cellulose synthesis in 23H12-transformed plants and wild-type plants at 31 °C, as measured by 14C incorporation, is also determined.

Furthermore, the β-1,4-glucan levels and starch levels in the 23H12 transformant lines are shown to be similar to the β-1,4-glucan and starch levels in wild-type plants.

TABLE 7
CELLULOSE LEVELS IN rsw1 PLANTS TRANSFORMED WITH COSMID CLONE 23H12

                       SAMPLE SIZE       SEEDLING FRESH   CELLULOSE         CELLULOSE
                       (No. of plants)   WEIGHT (mg)      (mg cellulose/    (mg cellulose/
PLANT LINE                                                100 mg tissue)    seedling)
1.2 (rsw1+23H12)       126               2.51             1.23              0.031
1.4 (rsw1+23H12)       132               2.25             2.50              0.056
2.1 (rsw1+23H12)       126               3.23             1.29              0.042
3.1 (rsw1+23H12)       127               3.75             1.23              0.046
3.10 (rsw1+23H12)      128               3.52             1.69              0.060
4.4 (rsw1+23H12)       110               5.14             1.31              0.067
    (rsw1+23H12)       125               3.18             1.26              0.040
5.3 (rsw1+23H12)       124               2.77             1.17              0.032
9.2 (rsw1+23H12)       125               2.26             1.41              0.032
10.8                   126               2.4              1.20              0.029
Columbia/pBin19        106               2.64             1.34              0.035
Columbia               178               2.73             1.18              0.032
rsw1 mutant            179               1.77             0.84              0.015

EXAMPLE 7
DETERMINATION OF THE FULL-LENGTH NUCLEOTIDE SEQUENCE ENCODING THE WILD-TYPE RSW1 POLYPEPTIDE

Arabidopsis thaliana double-stranded cDNA and cDNA libraries were prepared using the CAPFINDER cDNA kit (Clontech). RNA was isolated from wild-type Columbia grown in sterile conditions for 21 days.

Approximately 100,000 cDNA clones in an unamplified cDNA library were screened under standard hybridization conditions at 65 °C, using a probe comprising 32P-labelled DNA amplified from double stranded cDNA. To prepare the hybridization probe, the following amplification primers were used:

1. 2280-F: 5' GAATCGGCTACGAATTTCCCA 3'
2. 2370-F: 5' TTGGTTGCTGGATCCTACCGG 3'
3. cspl-R: 5' GGTTCTAAATCTTCTTCCGTC 3'

wherein the primer combinations were either 2280-F/cspl-R or 2370-F/cspl-R. The primer 2280-F corresponds to nucleotide positions 2226 to 2246 in SEQ ID NO:3, upstream of the translation start site. The primer 2370-F corresponds to nucleotide positions 2314 to 2334 in SEQ ID NO:3, encoding amino acids 7 through 13 of the RSW1 polypeptide. The primer cspl-R comprises nucleotide sequences complementary to nucleotides 588 to 608 of the T20782 clone (SEQ ID NO:1) corresponding to nucleotides 6120 to 6140 of SEQ ID NO:3.

The hybridization probes produced are approximately 1858 nucleotides in length (2280-F/cspl-R primer combination) or 1946 nucleotides in length (2370-F/cspl-R primer combination).
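The primer coordinates given above can be checked for internal consistency. The sketch below is illustrative only and uses hypothetical names (`PRIMERS`, `span`); the coordinates are those stated for SEQ ID NO:3. It confirms that each 21-nucleotide primer matches a 21-position interval, that the two forward primers lie 88 nucleotides apart (the same as the 88-nucleotide difference between the two stated probe lengths), and that the translation start inferred from primer 2370-F (which begins at the codon for amino acid 7) places primer 2280-F 50 to 70 nucleotides upstream of it.

```python
# name: (sequence 5'->3', first and last positions in SEQ ID NO:3)
PRIMERS = {
    "2280-F": ("GAATCGGCTACGAATTTCCCA", 2226, 2246),
    "2370-F": ("TTGGTTGCTGGATCCTACCGG", 2314, 2334),
    "cspl-R": ("GGTTCTAAATCTTCTTCCGTC", 6120, 6140),
}


def span(first, last):
    """Number of positions in an inclusive coordinate interval."""
    return last - first + 1


# Distance between the 5' ends of the two forward primers.
forward_offset = PRIMERS["2370-F"][1] - PRIMERS["2280-F"][1]

# Primer 2370-F starts at codon 7, so the start codon lies 6 codons upstream.
inferred_start = PRIMERS["2370-F"][1] - 3 * (7 - 1)
upstream_near = inferred_start - PRIMERS["2280-F"][2]
upstream_far = inferred_start - PRIMERS["2280-F"][1]

print(forward_offset, 1946 - 1858)
print(inferred_start, upstream_near, upstream_far)
```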

Five hybridizing bacteriophage clones were identified, which were plaque-purified to homogeneity during two successive rounds of screening. Plasmids were rescued from the positively-hybridizing bacteriophage clones, using the Stratagene excision protocol for the ZapExpress(TM) vector according to the manufacturer's instructions. Colony hybridizations confirmed the identity of the clones.

Isolated cDNA clones were sequenced by primer walking similar to the method described in Examples 4 and 5 supra.

A full-length wild-type RSW1 nucleotide sequence was compiled from the nucleotide sequences of two cDNA clones. First, the 3'-end of the cDNA, encoding amino acids 453-1081 of RSW1, corresponded to the nucleotide sequence of the EST clone T20782 (SEQ ID NO:1). The remaining cDNA sequence, encoding amino acids 1-654 of RSW1, was generated by amplification of the 5'-end from cDNA, using primer 2280-F, which comprises nucleotide sequences approximately 50-70 bp upstream of the RSW1 translation start site in cosmid 23H12, and primer cspl-R, which comprises nucleotide sequences complementary to nucleotides 588 to 608 of the T20782 clone (SEQ ID NO:1).

Several amplified clones are sequenced to show that no nucleotide errors were introduced by the amplification process. The 5' and 3' nucleotide sequences are spliced together to produce the complete RSW1 open reading frame and 3'-untranslated region provided in SEQ ID. Those skilled in the art will be aware that the 5'-end and 3'-end of the two incomplete cDNAs are spliced together to obtain a full-length cDNA clone, the nucleotide sequence of which is set forth in SEQ ID.

Of the remaining cDNA clones, no isolated cDNA clone comprised a nucleotide sequence which precisely matched the nucleotide sequence of the RSW1 gene present in cosmid 23H12. However, several clones containing closely-related sequences were obtained, as summarised in Table 8. The nucleotide sequences of the Ath-A and Ath-B cDNAs are provided herein as SEQ ID Nos: 7 and 9, respectively.

TABLE 8
CHARACTERISATION OF A. thaliana cDNA CLONES

CLONE NAME    DESCRIPTION           LENGTH         SEQ ID NO:
RSW1.1A       chimeric clone        partial        not provided
RSW1A         chimeric clone        partial        not provided
Ath-A         12C4 cDNA             full-length    SEQ ID NO:7
Ath-B         new sequence          full-length    SEQ ID NO:9
RSW4A         identical to Ath-B    full-length    not provided

The derived amino acid sequences encoded by the cDNAs listed in Table 8 are provided in Figures 8 and 9 and SEQ ID Nos: 8 and 10 herein.

Figure 10 is a schematic representation of the important features of the RSW1 polypeptide which are conserved within A. thaliana and between A. thaliana and other plant species. In addition to the species indicated in Figure 10, the present inventors have also identified maize, wheat, barley and Brassica ssp. cellulose biosynthetic genes by homology search.

Accordingly, the present invention extends to cellulose genes and cellulose biosynthetic polypeptides as hereinbefore defined, derived from any plant species, including A. thaliana, cotton, rice, wheat, barley, maize, Eucalyptus ssp., Brassica ssp., Pinus ssp., Populus ssp., Picea ssp., hemp, jute and flax, amongst others.

EXAMPLE 8
ISOLATION OF FULL-LENGTH NUCLEOTIDE SEQUENCE ENCODING THE MUTANT RSW1 POLYPEPTIDE

Arabidopsis thaliana double-stranded cDNA and cDNA libraries were prepared using the CAPFINDER cDNA kit (Clontech). RNA was isolated from Arabidopsis thaliana Columbia rsw1 mutant plants grown in sterile conditions for 21 days.

The full-length rsw1 mutant nucleotide sequence was generated by sequencing two amplified DNA fragments spanning the rsw1 mutant gene. The 5'-end sequence of the cDNA (comprising the 5'-untranslated region and exons 1-11) was amplified using the primer combination 2280-F/cspl-R (Example 7). The 3'-end sequence was amplified using the primers EST1-F and cs3-R set forth below:

1. Primer EST1-F: 5' AATGCTTCTTGTTGCCAAAGCA 3'
2. Primer cs3-R: 5' GACATGGAATCACCTTAACTGCC 3'

wherein primer EST1-F corresponds to nucleotide positions 1399-1420 of SEQ ID (within exon 8) and primer cs3-R is complementary to nucleotides 3335-3359 of SEQ ID NO:5 (within the 3'-untranslated region of the wild-type transcript).

The full-length sequence of the mutant rsw1 transcript is set forth herein as SEQ ID NO:11.

Whilst not being bound by any theory or mode of action, a single nucleotide substitution in the rsw1 mutant nucleotide sequence (nucleotide position 1716 in SEQ ID NO:11), relative to the wild-type RSW1 nucleotide sequence (nucleotide position 1646 in SEQ ID), resulting in Ala549 being substituted with Val549 in the mutant polypeptide, may contribute to the altered activity of the RSW1 polypeptide at non-permissive temperatures such as 31 °C. Additional amino acid substitutions are also contemplated by the present invention, to alter the activity of the RSW1 polypeptide, or to make the polypeptide temperature-sensitive.
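The relationship between the stated nucleotide positions and codon 549 can be sketched as follows. This is an illustration under two assumptions that are inferred rather than stated: that the wild-type numbering begins at the first base of the start codon, and that the mutant transcript sequence carries a 70-nucleotide 5' leader (the difference between positions 1716 and 1646). The function name `codon_of` is hypothetical. Under these assumptions both positions fall on the second base of codon 549, which is the base that distinguishes Alanine codons (GCN) from Valine codons (GTN) in the standard genetic code.

```python
def codon_of(nt_position, cds_start=1):
    """Return (codon number, position within codon), both 1-based."""
    offset = nt_position - cds_start
    return offset // 3 + 1, offset % 3 + 1


# Standard genetic code: Alanine is GCN and Valine is GTN.
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}
VAL_CODONS = {"GTT", "GTC", "GTA", "GTG"}

wild_type = codon_of(1646)             # assumed: numbering starts at the ATG
mutant = codon_of(1716, cds_start=71)  # assumed: 70-nucleotide 5' leader

# A C-to-T change at the second codon base converts any Ala codon to Val.
converted = {codon[0] + "T" + codon[2] for codon in ALA_CODONS}

print(wild_type, mutant)
print(converted == VAL_CODONS)
```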

EXAMPLE 9
ANTISENSE INHIBITION OF CELLULOSE PRODUCTION IN TRANSGENIC PLANTS

1. Construction of an antisense RSW1 binary vector

One example of transgenic plants in which cellulose production is inhibited is provided by the expression of an antisense genetic construct therein. Antisense technology is used to target expression of a cellulose gene(s) to reduce the amount of cellulose produced by transgenic plants.

By way of exemplification, an antisense plant transformation construct has been engineered to contain the T20782 cDNA insert (or a part thereof) in the antisense orientation and in operable connection with the CaMV 35S promoter present in the binary plasmid pRD410 (Datla et al., 1992). More particularly, the T20782 cDNA clone, which comprises the 3'-end of the wild-type RSW1 gene, was digested with XbaI and KpnI and cloned into the kanamycin-resistant derivative of pGEM3zf(-), designated as plasmid pJKKMf(-). The RSW1 sequence was sub-cloned, in the antisense orientation, into the binary vector pRD410 as an XbaI/SacI fragment, thereby replacing the β-glucuronidase (GUS or uidA) gene. This allows the RSW1 sequence to be transcribed in the antisense orientation under the control of the CaMV 35S promoter.

The antisense RSW1 binary plasmid vector was transferred to Agrobacterium tumefaciens strain AGL1, by triparental mating and selection on rifampicin and kanamycin, as described by Lazo et al. (1991). The presence of the RSW1 insert in transformed A.tumefaciens cells was confirmed by Southern hybridization analysis (Southern, 1975). The construct was shown to be free of deletion or rearrangements prior to transformation of plant tissues, by back-transformation into Escherichia coli strain JM101 and restriction digestion analysis.

2. Transformation of Arabidopsis thaliana

Eight pots, each containing approximately 16 A. thaliana ecotype Columbia plants, were grown under standard conditions. Plant tissue was transformed with the antisense RSW1 binary plasmid by vacuum infiltration as described by Bechtold et al. (1993). Infiltration media contained 2.5% sucrose and plants were infiltrated for 2 min until a vacuum of approximately 400 mm Hg was obtained. The vacuum connection was shut off and plants allowed to sit under vacuum for 5 min.

Approximately 34,000 T1 seed was screened on MS plates containing 50 µg/ml kanamycin, to select for plants containing the antisense RSW1 construct. Of the T1 seed sown, 135 kanamycin-resistant seedlings were identified, of which 91 were transferred into soil and grown at 21 °C under a long-day photoperiod (16 hr light; 8 hr dark).

Of the 91 transgenic lines, 19 lines were chosen for further analysis which had anther filaments in each flower which were too short to deposit pollen upon the stigma and, as a consequence, required hand-pollination to obtain T2 seed therefrom.

T2 seed from 14 of these 19 lines was plated out onto vertical Hoaglands plates containing kanamycin to determine segregation ratios. Between five and ten seed were plated per transgenic line. Control seeds, including A. thaliana Columbia containing the binary vector pBIN19 (Bevan, 1984) and segregating 3:1 for kanamycin resistance, and the rsw1 mutant transformed with the NPTII gene, also segregating 3:1 for kanamycin resistance, were grown under the same conditions. Kanamycin-resistant plants were transferred to soil and grown at 21 °C under long days, until flowering. Untransformed Arabidopsis thaliana Columbia plants were also grown under similar conditions, in the absence of kanamycin.
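A 3:1 segregation ratio of the kind referred to above is commonly assessed with a chi-square goodness-of-fit test. The following is a minimal sketch and is not a procedure described herein: the function name `chi_square_3_to_1` and the seedling counts are hypothetical and chosen for illustration, and the 3.841 threshold is the standard 5% critical value for one degree of freedom.

```python
CRITICAL_5_PERCENT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


# Hypothetical counts (resistant, sensitive); not data from this specification.
for counts in [(75, 25), (60, 40)]:
    statistic = chi_square_3_to_1(*counts)
    consistent = statistic < CRITICAL_5_PERCENT_DF1
    print(counts, round(statistic, 2), "consistent with 3:1" if consistent else "deviates from 3:1")
```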

3. Morphology of antisense-RSW1 plants

A comparison of the morphology of antisense RSW1 plants grown at 21 °C to mutant rsw1 plants grown at the non-permissive temperature 31 °C has identified a number of common phenotypes. For example, the antisense plants exhibit reduced fertility, inflorescence shortening and have short anthers, compared to wild-type plants, when grown at 21 °C. These phenotypes are also observed in mutant rsw1 plants grown at 31 °C. These results suggest that the antisense construct in the transgenic plants may be targeting the expression of the wild-type RSW1 gene at 21 °C.

Figure 7 shows the reduced inflorescence (bolt) height in antisense 35S-RSW1 plants compared to wild-type A. thaliana Columbia plants grown under identical conditions.

4. Cell wall carbohydrate analysis of antisense plants.

T3 plants which are homozygous for the 35S-RSW1 antisense construct are generated and the content of cellulose therein is determined as described in Example 1. Plants expressing the antisense construct are shown to have significantly less cellulose in their cell walls, compared to wild-type plants. Additionally, the levels of non-crystalline β-1,4-glucan and starch are elevated in the cells of antisense plants, compared to otherwise isogenic plant lines which have not been transformed with the antisense genetic construct.

5. Antisense 35S-RSW1 mRNA expression levels in transgenic plants

Total RNA was extracted from 0.2 g of leaf tissue derived from 33 kanamycin-resistant T1 plants containing the antisense 35S-RSW1 genetic construct, essentially according to Longemann et al. (1986). Total RNA (25 µg) was separated on a 2.2 M formaldehyde/agarose gel, blotted onto nylon filters and hybridized to a riboprobe comprising the sense strand sequence of the cDNA clone T20782. To produce the riboprobe, T7 RNA polymerase was used to transcribe sense RNA from a linearised plasmid template containing T20782, in the presence of [32P]UTP. Hybridizations and subsequent washes were performed as described by Dolferus et al. (1994). Hybridized membranes were exposed to Phosphor screens (Molecular Dynamics, USA).

The levels of expression of the RSW1 antisense transcript were determined and compared to the level of fertility observed for the plant lines. As shown in Table 9, the level of antisense gene expression is correlated with the reduced fertility phenotype of the antisense plants. In 13 lines, a very high or high level of expression of the 35S-RSW1 antisense gene was observed and, in 11 of these lines, fertility was reduced. Only lines 2W and 3E, which expressed high to very high levels of antisense mRNA, appeared to be fully fertile. In 12 lines which expressed medium levels of antisense mRNA, approximately one-half were fertile and one-half appeared to exhibit reduced fertility. In contrast, in 8 plant lines in which only a low or very low level of expression of the antisense 35S-RSW1 genetic construct was observed, a wild-type (i.e. fertile) phenotype was observed for all but one transgenic line, line 2R.
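The counts quoted above can be re-derived from Table 9. The following is a minimal Python sketch, not part of the original analysis: the row data are transcribed from Table 9, the entry "?" stands for the one line whose label is missing from the text, and the labels "5D" and "7O" are readings of damaged characters and should be treated as assumptions. The names TABLE_9, GROUPS and summarise are illustrative only.

```python
from collections import Counter

# (plant line, antisense expression level, fertility) as read from Table 9.
# "?" stands for the one line whose label is missing from the source text;
# "5D" and "7O" are readings of damaged labels and are assumptions.
TABLE_9 = [
    ("B", "very high", "sterile"), ("2B", "very high", "sterile"),
    ("3E", "very high", "fertile"),
    ("2E", "high", "sterile"), ("2K", "high", "sterile"),
    ("2M", "high", "sterile"), ("?", "high", "sterile"),
    ("2P", "high", "sterile"), ("2W", "high", "fertile"),
    ("2Z", "high", "sterile"), ("3G", "high", "sterile"),
    ("3Q", "high", "sterile"), ("7Q", "high", "sterile"),
    ("7N", "medium", "sterile"), ("7G", "medium", "fertile"),
    ("1C", "medium", "sterile"), ("2X", "medium", "sterile"),
    ("2H", "medium", "fertile"), ("C", "medium", "sterile"),
    ("F", "medium", "sterile"), ("2Q", "medium", "fertile"),
    ("3P", "medium", "sterile"), ("3T", "medium", "fertile"),
    ("5D", "medium", "sterile"), ("6A", "medium", "fertile"),
    ("8E", "low", "fertile"), ("2R", "low", "sterile"),
    ("7A", "low", "fertile"), ("7S", "low", "fertile"),
    ("7O", "low", "fertile"), ("7R", "low", "fertile"),
    ("1B", "very low", "fertile"), ("2U", "very low", "fertile"),
]

# Expression classes, grouped the same way as in the text.
GROUPS = {
    "high or very high": ("very high", "high"),
    "medium": ("medium",),
    "low or very low": ("low", "very low"),
}


def summarise(rows):
    """Return {group: (lines with reduced fertility, fertile lines)}."""
    summary = {}
    for group, levels in GROUPS.items():
        counts = Counter(fert for _, level, fert in rows if level in levels)
        summary[group] = (counts["sterile"], counts["fertile"])
    return summary


if __name__ == "__main__":
    for group, (reduced, fertile) in summarise(TABLE_9).items():
        print(f"{group}: {reduced} reduced fertility, {fertile} fertile")
```

Under these assumptions the tally gives 11 of 13 high-expressing lines with reduced fertility, a 7:5 split among the 12 medium lines, and 1 of 8 low-expressing lines with reduced fertility, in agreement with the figures stated in the text.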

Data presented in Table 9 and Figure 7 indicate that the phenotype of the cellulose-deficient mutant rsw1 may be reproduced by expressing antisense RSW1 genetic constructs in transgenic plants.

To confirm reduced cellulose synthesis and/or deposition in transgenic plants expressing the antisense RSW1 gene, the level of cellulose is measured by the ¹⁴C incorporation assay or as acetic/nitric acid insoluble material as described in Example 1 and compared to cellulose production in otherwise isogenic wild-type plants. Cellulose production in the transgenic plants is shown to be significantly reduced compared to wild-type plants. The severity of phenotype of the transgenic plants thus produced varies considerably, depending to some extent upon the level of inhibition of cellulose biosynthesis.

TABLE 9

LEVELS OF ANTISENSE GENE EXPRESSION AND FERTILITY IN T1 LINES OF ANTISENSE 35S-RSW1 PLANTS

T1 PLANT LINE    ANTISENSE 35S-RSW1 EXPRESSION    FERTILITY
B                very high                        sterile*
2B               very high                        sterile*
3E               very high                        fertile
2E               high                             sterile*
2K               high                             sterile*
2M               high                             sterile*
[illegible]      high                             sterile*
2P               high                             sterile*
2W               high                             fertile
2Z               high                             sterile*
3G               high                             sterile*
3Q               high                             sterile*
7Q               high                             sterile*
7N               medium                           sterile*
7G               medium                           fertile
1C               medium                           sterile*
2X               medium                           sterile*
2H               medium                           fertile
C                medium                           sterile*
F                medium                           sterile*
2Q               medium                           fertile
3P               medium                           sterile*
3T               medium                           fertile
5D               medium                           sterile*
6A               medium                           fertile
8E               low                              fertile
2R               low                              sterile*
7A               low                              fertile
7S               low                              fertile
7O               low                              fertile
7R               low                              fertile
1B               very low                         fertile
2U               very low                         fertile

* sterile phenotype not indicative of complete sterility, but that hand pollination, at least, is required to obtain seed from such plants.

EXAMPLE 10 RSW1 RELATED SEQUENCES IN RICE PLANTS

To identify RSW1 related nucleotide sequences in rice, a genetic sequence database was searched for nucleotide sequences which were closely related to one or more of the Arabidopsis thaliana RSW1 nucleotide sequences described in the preceding Examples. Rice EST S0542 (MAFF DNA bank, Japan) was identified, for which only a partial nucleotide sequence was available. Additionally, before the instant invention, there was no probable function attached to the rice EST S0542 sequence.

The present inventors have obtained the complete nucleotide sequence of clone S0542 and derived the amino acid sequence encoded thereby. The S0542 cDNA is only 1741 bp in length and appears to be a partial cDNA clone because, although it comprises 100 bp of untranslated sequence and contains the ATG start codon, it is truncated at the 3'-end and, as a consequence, encodes only the first 547 amino acid residues of the rice RSW1 or RSW1-like polypeptide. Based upon the length of the corresponding Arabidopsis thaliana RSW1 polypeptide (1081 amino acids), the rice RSW1 sequence set forth in SEQ ID NO:14 appears to contain approximately one-half of the complete amino acid sequence.
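The figures in this paragraph are mutually consistent, as the short Python check below illustrates. It is a sketch only: it assumes that the 100 bp of untranslated sequence lies entirely upstream of the ATG and that the open reading frame runs to the truncated 3'-end of the clone with no stop codon, which the text implies but does not state explicitly. The function names are illustrative.

```python
# Values reported in the text for rice clone S0542 and Arabidopsis RSW1.
UTR_5_BP = 100                    # untranslated sequence in the clone
ENCODED_RESIDUES = 547            # residues encoded by the partial clone
CLONE_LENGTH_BP = 1741            # reported clone length
ARABIDOPSIS_RSW1_RESIDUES = 1081  # full-length Arabidopsis polypeptide


def coding_length_bp(residues):
    """Number of base pairs needed to encode the given residues."""
    return 3 * residues


def fraction_of_full_length(residues, full_length):
    """Share of the full-length polypeptide covered by the partial clone."""
    return residues / full_length


if __name__ == "__main__":
    coding_bp = coding_length_bp(ENCODED_RESIDUES)
    total_bp = UTR_5_BP + coding_bp
    share = fraction_of_full_length(ENCODED_RESIDUES, ARABIDOPSIS_RSW1_RESIDUES)
    print(f"coding region: {coding_bp} bp; with leader: {total_bp} bp")
    print(f"fraction of Arabidopsis RSW1 length: {share:.1%}")
```

On these assumptions the leader plus coding region accounts for exactly the reported clone length, and the encoded residues correspond to about half of the Arabidopsis polypeptide.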

The N-terminal half of the rice RSW1 amino acid sequence is approximately 70% identical to the Arabidopsis thaliana RSW1 polypeptide set forth in SEQ ID NO:6, with higher homology (approximately 90%) occurring between amino acid residues 271-547 of the rice sequence.

These data strongly suggest that S0542 is the rice homologue of the A. thaliana RSW1 gene.

Alignments of rice, A. thaliana and cotton RSW1 amino acid sequences are presented in Figures 9 and 10.

To isolate full-length cDNA clones and genomic clone equivalents of S0542 (this study and MAFF DNA bank, Japan) or D48636 (Pear et al., 1996), cDNA and genomic clone libraries are produced using rice mRNA and genomic DNA respectively, and screened by hybridisation using the S0542 or D48636 cDNAs as a probe, essentially as described herein. Positive-hybridising plaques are identified and plaque-purified to single plaques during further rounds of screening by hybridisation.

The rice clones are sequenced as described in the preceding Examples to determine the complete nucleotide sequences of the rice RSW1 genes and derived amino acid sequences therefor. Those skilled in the art will be aware that such gene sequences are useful for the production of transgenic plants, in particular transgenic cereal plants having altered cellulose content and/or quality, using standard techniques. The present invention extends to all such genetic sequences and applications therefor.

EXAMPLE 11 RSW1 RELATED SEQUENCES IN COTTON PLANTS

A ³²P-labelled RSW1 PCR fragment was used to screen approximately 200,000 cDNA clones in a cotton fibre cDNA library. The RSW1 PCR probe was initially amplified from Arabidopsis thaliana wild type cDNA using the primers 2280-F and cspl-R described in the preceding Examples, and then re-amplified using the primer combination 2370-F/cspl-R, also described in the preceding Examples.

Hybridisations were carried out under low stringency conditions at 55°C.

Six putative positive-hybridising plaques were identified in the first screening round. Using two further rounds of screening by hybridisation, four of these plaques were purified to single plaques. Three plaques hybridise very strongly to the RSW1 probe while the fourth plaque hybridises less intensely.

We conclude that the positive-hybridising plaques which have been purified are strong candidates for comprising cotton RSW1 gene sequences or RSW1-like gene sequences.

Furthermore, the cotton cDNAs may encode the catalytic subunit of cellulose synthase, because the subunit protein architecture of cellulose synthase appears to be highly conserved among plants as highlighted in the preceding Example.

Furthermore, a Southern blot of cotton genomic DNA digested with BglII was hybridised with the 5' end of the RSW1 cDNA, under low stringency hybridisation conditions at 55°C. Results are presented in Figure 11. These data demonstrate that RSW1-related sequences exist in the cotton genome.

The cotton cDNA clones described herein are sequenced as described in the preceding Examples and used to produce transgenic cotton plants having altered fibre characteristics. The cDNAs are also used to genetically alter the cellulose content and/or quality of other plants, using standard techniques.

EXAMPLE 12 RSW1 RELATED SEQUENCES IN EUCALYPTUS SSP.

Putative Eucalyptus ssp. cellulose synthase catalytic subunit gene fragments were obtained by amplification using PCR. DNA primers were designed to conserved amino acid residues found in the Arabidopsis thaliana RSW1 and 12C4 amino acid sequences. Three primers were used for PCR. The primers are listed below:

pcsF-I    5'-A(A/G)AAGATIGA(C/T)TA(C/T)(C/T)TIA(A/G)GA(C/T)AA-3'
pcsR-II   5'-ATIGTIGGIGTIC(G/T)(A/G)TT(C/T)TG(A/T/G/C)C(T/G)(A/T/C/G)CC-3'
pcsF-II   5'-GCIATGAA(A/G)(A/C)GIGAITA(C/T)GA(A/G)GA-3'

Using standard PCR conditions (50°C annealing temperature) and solutions, the primer sets pcsF-I/pcsR-II and pcsF-II/pcsR-II were used to amplify genetic sequences from pooled Eucalyptus ssp. cDNA. In the first reaction, primers pcsF-I and pcsR-II were used to generate a fragment approximately 700 bp in length. In the second PCR reaction, which used primers pcsF-II and pcsR-II, a fragment estimated at 700 bp was obtained. The sizes of the PCR fragments are within the size range estimated for the corresponding Arabidopsis thaliana sequences.
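Because the primers are degenerate, each one is in practice a pool of related oligonucleotides. The Python sketch below illustrates how such a pool can be enumerated. It is not taken from the specification: the compact string used for pcsF-II is a rewriting of the primer as listed above, alternatives are written in parentheses, and the symbol I is assumed to denote inosine and is treated as a single fixed position. The function names are hypothetical.

```python
import re
from itertools import product

# pcsF-II rewritten compactly; "(A/G)" marks a degenerate position and
# "I" is assumed to be inosine, kept as one fixed symbol.
PCSF_II = "GCIATGAA(A/G)(A/C)GIGAITA(C/T)GA(A/G)GA"


def parse_primer(text):
    """Split 'AA(A/G)T' into one tuple of allowed symbols per position."""
    positions = []
    for group, single in re.findall(r"\(([^)]+)\)|([A-Z])", text):
        positions.append(tuple(group.split("/")) if group else (single,))
    return positions


def degeneracy(positions):
    """Number of distinct oligonucleotides in the primer pool."""
    count = 1
    for allowed in positions:
        count *= len(allowed)
    return count


def expand(positions):
    """List every oligonucleotide in the pool."""
    return ["".join(choice) for choice in product(*positions)]


if __name__ == "__main__":
    parsed = parse_primer(PCSF_II)
    print(len(parsed), "positions,", degeneracy(parsed), "variants")
    for oligo in expand(parsed):
        print(oligo)
```

Read this way, pcsF-II has four two-fold degenerate positions. Its codons correspond to the residues Ala-Met-Lys-Arg-Glu-Tyr-Glu that open the SEQ ID NO:1 translation, which is consistent with the primers having been designed to conserved residues.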

We conclude that the amplified Eucalyptus ssp. PCR fragments are likely to be related to the Arabidopsis thaliana RSW1 gene and may encode at least a part of the Eucalyptus ssp. cellulose synthase catalytic subunit.

The Eucalyptus ssp. PCR clones described herein are sequenced as described in the preceding Examples and used to isolate the corresponding full-length Eucalyptus ssp cDNAs and genomic gene equivalents. Those skilled in the art will be aware that such gene sequences are useful for the production of transgenic plants, in particular transgenic Eucalyptus ssp plants having altered cellulose content and/or quality, using standard techniques. The present invention extends to all such genetic sequences and applications therefor.

EXAMPLE 13 NON-CRYSTALLINE β-1,4-GLUCAN AS A MODIFIER OF CELL WALL PROPERTIES

The properties of plant cell walls depend on the carbohydrates, proteins and other polymers of which they are composed and the complex ways in which they interact. Increasing the quantities of non-crystalline β-1,4-glucan in cell walls affects those wall properties which influence mechanical, nutritional and many other qualities, as well as having secondary consequences resulting from the diversion of carbon into non-crystalline glucan at the expense of other uses. To illustrate one of these effects, we investigated the ability of the non-crystalline glucan to hydrogen bond to other wall components, particularly cellulose, in the way that has been shown to be important for wall mechanics.

Hemicelluloses such as xyloglucans cross-link cellulose microfibrils by hydrogen bonding to the microfibril surface (Levy et al., 1991). Since the β-1,4-glucan backbone of xyloglucan is thought to be responsible for hydrogen bonding (with the xylose, galactose and fucose substitutions limiting the capacity to form further hydrogen bonds), we can expect the non-crystalline β-1,4-glucan also to have a capacity to hydrogen bond and cross-link cellulose. The effectiveness of strong alkalis in extracting xyloglucans is thought to relate to their disruption of the hydrogen bonds with cellulose (Hayashi and MacLachlan, 1984).

To demonstrate that the non-crystalline β-1,4-glucan forms similar associations with the cellulose microfibrils, we examined whether the 4 M KOH fraction, extracted from shoots of the rsw1 mutant and from wild type RSW1 plants, contained non-crystalline glucan in addition to xyloglucan. The non-crystalline glucan was separated from xyloglucan in the 4 M KOH extract by dialysing the neutralised extract against distilled water and centrifuging at 14000 g for 1 hour. The pellet was shown to be a pure β-1,4-glucan by using the methods for monosaccharide analysis, methylation analysis and enzyme digestion used to characterise the glucan in the ammonium oxalate fraction (see Example 1).

Table 10 shows the presence of substantial quantities of glucan recovered in pure form in the pellet from 4 M KOH fractions extracted from the overproducing rsw1 mutant of Arabidopsis thaliana. These data also demonstrate the presence of smaller quantities of non-crystalline β-1,4-glucan in the 4 M KOH fraction from wild type plants, compared to rsw1, particularly when grown at 31°C.

TABLE 10

Glucose contents* of 4 M KOH fractions from shoots of wild-type and rsw1 mutant Arabidopsis thaliana plants

Glucose fraction                                         wild-type          rsw1 mutant
                                                         21°C     31°C      21°C     31°C
xyloglucan and non-crystalline glucan in whole extract   36.4     56.9      27.1     93.1
non-crystalline glucan in pellet                         7.8      20.5      7.6      56.0

* nmol glucose/mg plant dry weight after TFA hydrolysis

The monosaccharide composition of the supernatant remaining after centrifugation was determined after TFA hydrolysis. These data, and data from methylation analysis, are consistent with the supernatant being a relatively pure xyloglucan. The supernatant was free of glucan, because no glucose could be released by the endocellulase/β-glucosidase mixture that released glucose from β-1,4-glucan.
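The Table 10 values can be compared more directly by expressing the pellet glucan as a share of the glucose in the whole extract. The Python sketch below is illustrative only and is not an analysis performed in the specification. It assumes that the four numeric columns of the table are, in order, wild-type at 21°C, wild-type at 31°C, rsw1 at 21°C and rsw1 at 31°C; the names TABLE_10 and pellet_share are hypothetical.

```python
# Glucose contents (nmol glucose/mg plant dry weight) as read from Table 10.
# The assignment of columns to genotype and temperature is an assumption
# based on the order of the flattened table header.
TABLE_10 = {
    ("wild-type", 21): {"whole_extract": 36.4, "pellet": 7.8},
    ("wild-type", 31): {"whole_extract": 56.9, "pellet": 20.5},
    ("rsw1", 21): {"whole_extract": 27.1, "pellet": 7.6},
    ("rsw1", 31): {"whole_extract": 93.1, "pellet": 56.0},
}


def pellet_share(entry):
    """Fraction of whole-extract glucose recovered as pellet glucan."""
    return entry["pellet"] / entry["whole_extract"]


if __name__ == "__main__":
    for (genotype, temp_c), entry in TABLE_10.items():
        share = pellet_share(entry)
        print(f"{genotype} at {temp_c} C: {share:.0%} of extract glucose in pellet")
```

On this reading, the pellet glucan accounts for roughly a fifth of the extract glucose in wild-type shoots at 21°C and for well over half in rsw1 shoots at 31°C.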

The presence of both non-crystalline β-1,4-glucan and xyloglucan in the 4 M KOH fraction, when taken together with the implications from structural predictions (Levy et al., 1991), is consistent with some of the non-crystalline β-1,4-glucan in the wall hydrogen bonding to cellulose microfibrils in similar fashion to the β-1,4-glucan backbone of xyloglucan.

The cross-linking provided when xyloglucans and other hemicelluloses bind to two or more microfibrils is an important determinant of the mechanical properties of cellulosic walls (Hayashi, 1989). The effects of increasing the amounts of non-crystalline β-1,4-glucan in walls are likely to be greatest in walls which otherwise possess relatively low levels of cross-linking as a result of high ratios of cellulose:hemicellulose. Such conditions are common in secondary walls including those of various fibres, and the cellulose:hemicellulose ratio is particularly high in cotton fibres.

The effects on wall mechanical properties of overproducing non-crystalline glucan are shown by transforming plants with the mutant allele of rsw1 (SEQ ID NO:1) operably under the control of either the RSW1 promoter derived from SEQ ID NO:3 or SEQ ID NO:4 or, alternatively, an appropriate constitutive promoter such as the CaMV 35S promoter.

Production of non-crystalline glucan is quantified by fractionating the cell walls using the methods described above to show in particular that non-crystalline glucan is recovered in the 4 M KOH fraction. Mechanical properties of the cell walls are measured using standard methods for fibre analysis to study parameters such as stress-strain curves, and breaking strain, amongst other properties.

EXAMPLE 14 OVER-EXPRESSION OF CELLULOSE SYNTHASE IN TRANSGENIC PLANTS

Three strategies are employed to over-express cellulose synthase in Arabidopsis thaliana plants.

In the first strategy, the CaMV 35S promoter sequence is operably connected to the full-length cellulose synthase cDNA which is obtainable by primer extension of SEQ ID NO:1. This is achievable by cloning the full-length cDNA encoding cellulose synthase, in the sense orientation, between the CaMV 35S promoter or other suitable promoter operable in plants and the nopaline synthase terminator sequences of the binary plasmid pBI121.

In the second strategy, the coding part of the genomic gene is cloned, in the sense orientation, between the CaMV 35S promoter and the nopaline synthase terminator sequences of the binary plasmid pBI121.

In the third strategy, the 23H12 binary cosmid clone or the derivative pRSW1, containing the cellulose synthase gene sequence operably under the control of the cellulose synthase gene promoter and terminator sequences is prepared in a form suitable for transformation of plant tissue.

For Agrobacterium-mediated tissue transformation, binary plasmid constructs discussed supra are transformed into Agrobacterium tumefaciens strain AGL1 or other suitable strain. The recombinant DNA constructs are then introduced into wild type Arabidopsis thaliana plants (Columbia ecotype), as described in the preceding Examples.

Alternatively, plant tissue is directly transformed using the vacuum infiltration method described by Bechtold et al. (1993).

The transgenic plants thus produced exhibit a range of phenotypes, partly because of position effects and variable levels of expression of the cellulose synthase transgene.

Cellulose content in the transgenic plants and isogenic untransformed control plants is determined by the ¹⁴C incorporation assay or as acetic/nitric acid insoluble material as described in Example 1. In general, the level of cellulose deposition and rates of cellulose biosynthesis in the transgenic plants are significantly greater than for untransformed control plants.

Furthermore, in some cases, co-suppression leads to mimicry of the rsw1 mutant phenotype.

EXAMPLE 15 SITE-DIRECTED MUTAGENESIS OF THE RSW1 GENE

The nucleotide sequence of the RSW1 gene contained in 23H12 is mutated at several positions, using site-directed mutagenesis, to alter its catalytic activity or substrate affinity or glucan properties. In one example, the RSW1 gene is mutated to comprise one or more mutations present in the mutant rsw1 allele.

The mutated genetic sequences are cloned into the binary plasmids described in the preceding Examples, in place of the wild-type sequences. Plant tissue obtained from both wild-type Arabidopsis thaliana (Columbia) plants and A. thaliana rsw1 plants is transformed as described herein and whole plants are regenerated.

Control transformations are performed using the wild-type cellulose synthase gene sequence.

EXAMPLE 16 PHENOTYPES OF PLANTS EXPRESSING MUTATED RSW1 GENES

Plants transformed with genetic constructs described in Example 15 (and elsewhere) are categorised initially on the basis of number of transgene copies, to eliminate variability arising therefrom. Plants expressing single copies of different transgenes are analysed further for cell wall components, including cellulose, non-crystalline β-1,4-glucan polymer, starch and carbohydrate content.

1. Cellulose content.

Cellulose content in the transgenic plants is determined by the ¹⁴C incorporation assay as described in Example 1. Cell walls are prepared, fractionated and the monosaccharide composition of individual fractions determined as in Example 1.

2. Non-crystalline β-1,4-glucan content.

Transgenic plants expressing the rsw1 mutant allele exhibit a higher level of non-crystalline, and therefore extractable, β-1,4-glucan in cell walls compared to plants expressing an additional copy of the wild-type RSW1 allele. Thus, it is possible to change the crystallinity of the β-1,4-glucan chains present in the cell wall by mutation of the wild-type RSW1 allele.

3. Starch content.

Transgenic plants are also analysed to determine the effect of mutagenesis of the RSW1 gene on the level of starch deposited in their roots. The quantity of starch present in material prepared from the crude wall fraction is determined using the anthrone/H₂SO₄ method described in Example 1. The data show that mutating the RSW1 gene to the mutant rsw1 allele increases starch deposition. This demonstrates that the gene can be used to alter the partitioning of carbon into carbohydrates other than cellulose.

4. Cell wall composition.

The cell wall composition of transgenic plant material is also analysed. Wild type, rsw1 and transgenic seedlings are grown for 2 d at 21°C and then kept for a further 5 d at either 21°C or 31°C. With transfer to 31°C when the seed has scarcely germinated, the wall composition at final harvest largely reflects the operation of the mutated rsw1 gene product at its restrictive temperature. Cell wall fractionation is carried out in similar fashion to that described for the ¹⁴C experiment (Example 1) and the monosaccharide composition of each fraction is quantified by GC/MS after hydrolysis with trifluoroacetic acid or, in the case of crystalline cellulose, H₂SO₄. In some transgenic plants in which the RSW1 gene is mutated, the monosaccharide composition is comparable to that observed for homozygous rsw1 plants, confirming that there is a major reduction in the quantity of crystalline cellulose in the final, acid insoluble fraction. Thus, mutation of the RSW1 gene can be performed to produce changes in the composition of plant cell walls.

EXAMPLE 17 CHEMICAL MODIFICATION OF THE RSW1 GENE TO MANIPULATE CELLULOSE PRODUCTION AND PLANT CELL WALL CONTENT.

As demonstrated in the preceding Examples, the RSW1 gene is involved in cellulose production and the manipulation of cell wall content.

In the present Example, to identify novel phenotypes and gene sequences important for the normal functioning of the cellulose synthase gene, the RSW1 gene is modified in planta, using the chemical mutagen EMS. The mutant plants are identified following germination and the modified RSW1 genes are isolated and characterised at the nucleotide sequence level. A sequence comparison between the mutant gene sequences and the wild type sequence reveals nucleotides which encode amino acids important to the normal catalytic activity of the cellulose synthase enzyme, at least in Arabidopsis thaliana plants.

This approach thus generates further gene sequences of utility in the modification of cellulose content and properties in plants.

EXAMPLE 18

DISCUSSION

Five pieces of evidence make a compelling case that the RSW1 gene encodes the catalytic subunit of cellulose synthase:

1. The rsw1 mutation selectively inhibits cellulose synthesis and promotes accumulation of a non-crystalline β-1,4-glucan;

2. The rsw1 mutation removes cellulose synthase complexes from the plasma membrane, providing a plausible mechanism for reduced cellulose accumulation and placing the RSW1 product either in the complexes or interacting with them;

3. The D,D,D,QXXRW signature identifies the RSW1 gene product as a processive glycosyl transferase enzyme (Saxena, 1995);

4. The wild type allele corrects the temperature sensitive phenotype of the rsw1 mutant; and

5. Antisense expression of RSW1 in transgenic plants grown at 21°C reproduces some of the phenotype of rsw1 which is observed following growth at 31°C.
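The QXXRW element of the signature mentioned in the third point can be located in a polypeptide with a simple pattern search. The Python sketch below is illustrative only and is not a method described in the specification. It searches for the QXXRW element alone and does not test for the three conserved aspartate residues or their spacing. The test fragment is an assumption: it is read from the Gln-Val-Leu and Arg-Trp-Ala-Leu-Gly-Ser residues of the SEQ ID NO:1 translation, whose order in the extracted listing is partly scrambled.

```python
import re

# Q, any two residues, then R and W (one-letter amino acid codes).
QXXRW = re.compile(r"Q..RW")


def find_qxxrw(protein):
    """Return (1-based start position, matched residues) for each QXXRW hit."""
    return [(m.start() + 1, m.group()) for m in QXXRW.finditer(protein)]


if __name__ == "__main__":
    # Assumed fragment of the RSW1 polypeptide, used here for illustration.
    fragment = "QVLRWALGS"
    print(find_qxxrw(fragment))
    # A fragment without the motif returns an empty list.
    print(find_qxxrw("AMKREYEEFK"))
```

A full screen for the D,D,D,QXXRW signature would also need the positions of the aspartate residues, which the text does not specify.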

Consistent with the plasma membrane location expected for a catalytic subunit, the putative 122 kDa RSW1 product contains 8 predicted membrane-spanning regions. Six of these regions cluster near the C-terminus (Figure 10), separated from the other two by a domain that is probably cytoplasmic and has weak sequence similarities to prokaryotic glycosyl transferases (Wong, 1990; Saxena, 1990; Matthyse, 1995; Sofia, 1994; Kutish, 1996).

RSW1 therefore qualifies as a member of the large family of Arabidopsis thaliana genes whose members show weak similarities to bacterial cellulose synthase. RSW1 is the first member of that family to be rigorously identified as an authentic cellulose synthase. Among the diverse genes in A. thaliana, at least two genes show very strong sequence similarities to the RSW1 gene and are most likely members of a highly conserved sub-family involved in cellulose synthesis. The closely related sequences come from cosmid 12C4, a partial genomic clone cross-hybridising with EST T20782 designated Ath-A, and from a full length cDNA designated Ath-B.

Ath-A resembles RSW1 (SEQ ID NO:5) at its N-terminus whereas Ath-B starts 22 amino acid residues downstream [Figure 8 and Figure (ii) and Closely related sequences in other angiosperms are the rice EST S0542 [Figure (ii) and which resembles the polypeptides encoded by RSW1 and Ath-A and the cotton celAl gene (Pear, 1996) at the N-terminus.

The Arabidopsis thaliana, rice and cotton genes have regions of very high sequence similarity interspersed with variable regions (Figures 9 and 10). Most of the highest conservation among those gene products occurs in their central cytoplasmic domain where the weak similarities to the bacterial cellulose synthase occur. The N-terminal region that precedes the first membrane spanning region is probably also cytoplasmic but shows many amino acid substitutions as well as sequences in RSW1 that have no counterpart in some of the other genes as already noted for celA. An exception to this is a region comprising 7 cysteine residues with highly conserved spacings (Figure 10). This is reminiscent of regions suggested to mediate protein-protein and protein-lipid interactions in diverse proteins including transcriptional regulators and may account for the striking sequence similarity between this region of RSW1 and two putative soybean bZIP transcription factors (Genbank SOYSTF1A and 1B).

In conclusion, the chemical and ultrastructural changes seen in the cellulose-deficient mutant combine with gene cloning and complementation of the mutant to provide strong evidence that the RSW1 locus encodes the catalytic subunit of cellulose synthase. Accumulation of non-crystalline β-1,4-glucan in the shoot of the rsw1 mutant suggests that properties affected by the mutation are required for glucan chains to assemble into microfibrils. Whilst not being bound by any theory or mode of action, a key property may be the aggregation of catalytic subunits into plasma membrane rosettes. At the restrictive temperature, mutant synthase complexes disassemble to monomers (or smaller oligomers) that are undetectable by freeze etching. At least in the shoot, the monomers seem to remain biosynthetically active but their β-1,4-glucan products fail to crystallise into microfibrils, probably because the chains are growing from dispersed sites. Crystallisation into microfibrils, with all its consequences for wall mechanics and morphogenesis, therefore may depend upon catalytic subunits remaining aggregated as plasma membrane rosettes.

Those skilled in the art will appreciate that the invention described herein is susceptible to variations and modifications other than those specifically described. It is to be understood that the invention includes all such variations and modifications. The invention also includes all of the steps, features, compositions and compounds referred to or indicated in this specification, individually or collectively, and any and all combinations of any two or more of said steps or features.

REFERENCES

1. An et al. (1985) EMBO J. 4:277-284.

2. Ausubel, Brent, Kingston, Moore, Seidman, Smith, J.A. and Struhl, K. (1987) Current Protocols in Molecular Biology, Wiley Interscience (ISBN 047140338).

3. Baskin et al. (1992) Aust. J. Plant Physiol. 19:427-437.

4. Bechtold et al. (1993) Planta 316:1194-1199.

5. Bevan, M. (1984) Nucl. Acids Res. 12, 8711-8721.

6. Chang, C. et al. (1988) Proc. Natl. Acad. Sci. (USA) 85, 6856-6860.

7. Cole et al. (1985) In: Monoclonal antibodies in cancer therapy, Alan R. Bliss Inc., pp 77-96.

8. Crossway et al. (1986) Mol. Gen. Genet. 202:179-185.

9. Datla, Hammerlindl, Panchuk, Pelcher, L.E. and Keller, W. (1992) Gene, 211, 383-384.

10. Ditta et al. (1980) Proc. Natl. Acad. Sci. (USA) 77, 7347-7351.

11. Doares, Albersheim, P. and Darvill, A.G. (1991) Carb. Res. 210, 311-317.

12. Dolferus, Jacobs, Peacock, W.J. and Dennis, E.S. (1994) Plant Physiol. 105, 1075-1087.

13. Fromm et al. (1985) Proc. Natl. Acad. Sci. (USA) 82:5824-5828.

14. Haseloff, and Gerlach, W.L. (1988) Nature 334:586-594.

15. Hayashi (1989) Ann Rev Plant Physiol. Plant Molecular Biol. 40, 139-168.

16. Hayashi and MacLachlan (1984) Plant Physiol. 75, 596-604.

17. Herrera-Estrella et al. (1983a) Nature 303:209-213.

18. Herrera-Estrella et al. (1983b) EMBO J. 2:987-995.

19. Herrera-Estrella et al. (1985) In: Plant Genetic Engineering, Cambridge University Press, NY, pp 63-93.

20. Herth, W. (1985) Planta 164, 12-21.

21. Huse et al. (1989) Science 246:1275-1281.

22. Kohler and Milstein (1975) Nature, 256:495-499.

23. Konieczny, A. and Ausubel, F. (1993) Plant J. 4, 403-410.

24. Kozbor et al. (1983) Immunol. Today 4:72.

25. Lazo, Stein, P.A. and Ludwig, R.A. (1991) Bio/technology 9, 963-967.

26. Levy et al. (1991) Plant Journal 1, 195-215.

27. Longemann, J., Schell, J. and Willmitzer, L. (1987) Anal. Biochem. 163, 16-20.

28. Matthyse, White, and Lightfoot, R. (1995) J. Bacteriol. 177, 1069-1075.

29. McPherson et al. (1991) In: PCR: A Practical Approach. IRL Press, Oxford.

30. Nam et al. (1989) Plant Cell 1, 699-705.

31. Needs, P.W. and Selvendran, R.R. (1993) Phytochem. Anal. 4, 210-216.

32. Paszkowski et al. (1984) EMBO J. 3:2717-2722.

33. Pear, J.R. et al. (1996) Proc. Natl. Acad. Sci. (USA) 93, 12637-12642.

34. Saxena et al. (1990) Plant Mol. Biol. 15, 673-683.

35. Saxena et al. (1995) J. Bacteriol. 177:1419-1424.

36. Sofia et al. (1994) Nucl. Acids Res. 22, 2576-2586.

37. Southern, E.M. (1975) J. Mol. Biol. 98, 503-517.

38. Updegraph (1969) Analyt. Bioch. 32: 429-424.

39. Wong, H.C. et al. (1990) Proc. Natl. Acad. Sci. (USA) 87:8130-8134.

SEQUENCE LISTING

GENERAL INFORMATION:

(i) APPLICANT: Australian National University and the Commonwealth Scientific and Industrial Research Organisation

(ii) TITLE OF INVENTION: Manipulation of plant cellulose

(iii) NUMBER OF SEQUENCES: 14

(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: Davies Collison Cave Patent Attorneys
STREET: 1, Little Collins Street
CITY: Melbourne
STATE: Victoria
COUNTRY: Australia
ZIP: 3000

(v) COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.25

(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER: PCT INTERNATIONAL
FILING DATE:

(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: AU PO0699
FILING DATE: 27-JUN-1996

(viii) ATTORNEY/AGENT INFORMATION:
NAME: SLATTERY, JOHN M

(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: 61-3-9254-2777
TELEFAX: 61-3-9254-2770
TELEX: AA31787

INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
LENGTH: 2248 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
ORGANISM: Arabidopsis thaliana

(vii) IMMEDIATE SOURCE:
CLONE: EST T20782

(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..1887

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CGA

Arg 1 GCT ATG Ala Met AAG AGA GAG Lys Arg Glu TAT GAA GAG Tyr Glu Glu TTT AAA Phe Lys GTG AGG ATA AAT GCT Val Arg Ile Asn Ala CTT GTT GCC Leu Val Ala AAA GCA Lys Ala CCC TGG Pro Trp CAG AAA ATC CCT GGA GAA GGC TGG ACA ATG CAG Gin Lys Ile Pro Gly Glu Gly Trp Thr Met Gin GAT GGT ACT Asp Gly Thr CCT GGT AAC AAC ACT AGA GAT Pro Gly Asn Asn Thr Arg Asp CAT CCT GGA ATG His Pro Gly Met WO 98/00549 WO 9800549PCT/AU97/00402 86 ATA CAG GTG TTC TTA GGC CAT AGT GGG GGT CTG GAT ACC GAT GGA AAT Leu Asp Thr Asp Giy Asn 192 Ile Gin Val Phe Leu Giy His Ser Gly Gly GAG CTG CCT AGA CTC ATC TAT GTT TCT CGT GAA AAG CGG CCT Leu Ile Tyr Val Ser Arg Giu Lys Arg pro Leu Pro Arg GGA TTT Giy Phe GTA TCT Vai Ser CAA CAC Gin His CAC AAA AAG GCT His Lys Lys Aia GGA GCT ATG AAT Giy Aia Met Asn OCA TCG ATC CGT Ala Ser Ile Arg GCT GTT CTT Ala Vai Leu AAT GGA GCA TAT Asn Gly Aia Tyr CTT TTC AAC GTG GAT TOT CAT CAT Leu Leu Asn Val Asp Cys Asp His 105 110 AAA GAA OCT ATG TGT TTC ATG ATG Lys Giu Aia Met Cys Phe Met Met 125 336 TAC TTT AAT AAC AGT AAG OCT Tyr Phe Asn Asn Ser Lys Aia 115 GAC CCG GCT ATT GGA AAG AAG TGC TGC TAT GTC CAG TTC CCT CPA CGT Asp Pro Ala Ile Gly Lys Lys Cys Cys Tyr Val Gin Phe Pro Gin Arg 130 135 140 TTT GAC GGT ATT GAT TTG CAC GAT CGA TAT CCC AAC AGG AAT Phe Asp Gly Ile Asp Leu His Asp Arg Tyr Ala Asn Arg Asn ATA GTC Ile Val 160 480 TTT TTC OAT Phe Phe Asp ATT AAC Ile Asn 165 ATO PAG GGG TTG GAT Met Lys Cly Leu Asp 170 TOT TGT TTT PAT AGG Cys Cys Phe Asn Arg GGT ATC CAC GGT CCA GTA Cly Ile His Gly Pro Val 175 TAT GTG GGT ACT GGT Tyr Val Gly Thr Gly 180 CAG GCT CTA Gin Ala Leu

TAT

Tyr 190 GGG TAT Cly Tyr GAT CCT Asp Pro OTT TTG ACG Val Leu Thr GPA GPA CAT TTA Olu Glu Asp Leu GPA CCA AAT ATT ATT GTC PAC Clu Pro Asn liVaLy Ile Val Lys

I,

WO 98/00549 WO 9800549PCT/AU97/00402 -87- AGC TGT TGC GGG TCA AGG AAG AAA GGT AAA AGT Lys Gly Lys Ser AAG AAG TAT AAC Lys Lys Tyr Asn Ser Cys Cys Gly 210 Ser Arg Lys 215

TAC

Tyr 225 GAA AAG AGG AGA GGC ATC AAC AGA AGT GAC TCC AAT GCT CCA Giu Lys Arg Arg Gly Ile Asn Arg Ser Asp Ser Asn Ala Pro TTC AAT ATG GAG GAC ATC GAT GAG GGT TTT GAA GGT TAT GAT Phe Asn Met Giu Asp Ile Asp Glu Gly Phe Giu Gly Tyr Asp GAT GAG Asp Giu 255 AGG TCT ATT CTA ATG TCC CAG AGG Arg Ser Ile Leu Met Ser Gin Arg 260 GTA GAG AAG CGT Val Giu Lys Arg TTT GGT CAG Phe Giy Gin 270 TCG CCG GTA Ser Pro Val 275 TTT ATT GCG GCA ACC TTC ATG GAA CAA Phe Ile Ala Ala Thr Phe Met Giu Gin 280 GGC GGC ATT CCA Giy Gly Ile Pro 285 ATT CAT GTT ATA Ile His Val Ile CCA AC-A Pro Thr 290 ACC AAT CCC GCT ACT CTT CTG AAG GAG Thr Asn Pro Ala Thr Leu Leu Lys Giu 295

AGC

Ser 305 TGT GGT TAC GAA GAC AAG Cys Giy Tyr Giu Asp Lys ACT GAA TGG Thr Glu Trp, GGC AAA GAG ATT GGT Gly Lys Giu Ile Gly TGG 960 Trp 320 ATC TAT GGT TOC GTG ACG GAA GAT ATT CTT ACT GGG TTC AAG, Ile Tyr Gly Ser Val Thr Glu Asp Ile Leu Thr Gly Phe Lys ATG CAT Met His 335 1008 GCC CGG GGT Ala Arg Gly ATA TOG ATO Ile Ser Ile TAO TGC Tyr Cys 345 AAT CCT CCA CGC Asn Pro Pro Arg OCT GCG TTC Pro Ala Phe 350 CAA GTT CTT Gin Val Leu 1056 1104 AAG GGA TOT Lys Gly Ser 355 GCA CCA ATC Aia Pro Ile AAT CTT TOT GAT CT TTG Asn Leu Ser Asp Arg Leu 360 WO 98/00549 PCT/AU97/00402 88 CGA TGG GCT TTG GGA TCT ATC GAG ATT CTT CTT Glu Ile Leu Leu Arg Trp 370 Ala Leu Gly Ser AGC AGA CAT TGT CCT Ser Arg His Cys Pro 380 1152 1200 ATC Ile 385 TGG TAT GGT TAC CAT Trp Tyr Gly Tyr His 390 GGA AGG TTG AGA Gly Arg Leu Arg TTG GAG AGG ATC Leu Giu Arg Ile TAT ATC AAC ACC Tyr Ile Asn Thr GTC TAT CCT ATT ACA TCC ATC CCT CTT Val Tyr Pro Ile Thr Ser Ile Pro Leu 410 ATT GCG Ile Ala 415 1248 TAT TOT ATT CTT Tyr Cys Ile Leu 420 CCC GCT TTT TOT Pro Ala Phe Cys ATC ACC GAC AGA Ile Thr Asp Arg TTC ATC ATA Phe Ile Ile 430 CTC TTC ATC Leu Phe Ile 1296 CCC GAG ATA AGC AAC TAC GCG Pro Glu Ile Ser Asn Tyr Ala 435 ATT TGG TTC ATT Ile Trp Phe Ile 1344 TCA ATT Ser Ile 450 GCT GTO ACT OGA Ala Val Thr Gly CTG AAA CTG AAA Leu Lys Leu Lys AAC GGT OTO AGC Asn Gly Val Ser 1392 ATT GAG GAT TOG TGO AGO Ile Giu Asp Trp Trp Arg 465 470 AAC AAC CAG TTC TGG Asn Asn Gin Phe Trp 475 GTC ATT GGT GGC Val Ile Gly Gly TCC ACC CAT CTT Ser Thr His Leu GOT ATC AAC ACC Gly Ile Asn Thr 500 TTT GCT GTC TTC CAA GOT CTA CTT AAG Phe Ala Val Phe Gin Gly Leu Leu Lys 485 490 AAC TTC ACC GTT ACA TCT AAA GCC ACA Asn Phe Thr Val Thr Ser Lys Ala Thr 505 GTT CTT GCT Val Leu Ala 49S AAC AAA AAT Asn Lys Asn 510 1440 1488 1536 GGG GAT Gly Asp OCA AAA CTC TAC Ala Lys Leu Tyr TTC AAA TOG ACA Phe Lys Trp Thr OCT CTT CTC ATT Ala Leu Leu Ile 525 1584 4 .1' WO 98/00549 PCU/AU97/00402 89 CCA CCA Pro Pro 530 ACC ACC GTC CTA Thr Thr Val Leu

GTC

Val 545

GGG

Gly TCT TAT GCT GTA Ser Tyr Ala Val AAG CTC TTC TTC Lys Leu Phe Phe 565 AAA GGT CTG TTG Lys Gly Leu Leu GTG AAC CTC ATA Val Asn Leu Ile GGC TAC CAG TCG Gly Tyr Gln Ser 555 TGG GTT ATT GCC Trp Val Ile Ala ATT GTG GCT GGT Ile Val Ala Gly GGT CCG CTT Gly Pro Leu

TTA

Leu 570 CAT CTC TAC CCT TTC His Leu Tyr Pro Phe 575 CCA ACC ATC GTC ATT Pro Thr Ile Val Ile 1632 1680 1728 1776 1824 1872

TTG

Leu GGA AGA CAA AAC Gly Arg Gln Asn 585 CTC GCC TCC ATC Leu Ala Ser Ile

CGA

Arg

ACA

Thr 590

TGG

Trp GTC TGG TCT Val Trp Ser 595 GTT CTT Val Leu TTC TCG TTG Phe Ser Leu GTC AGG Val Arg ATC AAT CCC TTT GTG Ile Asn Pro Phe Val 610 AAA GGA GGT GTC TTT Lys Gly Gly Val Phe GAC GCC AAT CCC AAT GCC AAC Asp Ala Asn Pro Asn Ala Asn TTC AAT GGC Phe Asn Gly TAGACCCTAT TTATATACTT GTGTGTGCAT ATATCAAAAA 1927

CGCGCAATGG

GGTGATTCCA

AACACTATTG

3 5 TACAAAAAGA

CACTTATGTA

AGAATCTGAA

GAATTCCAAA TCATCTAAAC CCATCAAACC TGTCCAAGAT TAGCTTTCTC CGAGTAGCCA TAATGATTTT CCAGTGGGGA AGAAGATGTG ATTAGTTATA ACTTTCTTAT ATTTATTTTA ATGTTGGAAC TTGTTGTCCT AAAAAGGGAT GTTTATATGC T CCAGTGAACC GGGCAGTTAA GAGAAGGTGA AATTGTTCGT GACCCAAATG ATACATAGTC TTTAAAGCTT GTTAGACTCA TGGAGTTTTC TTTTTATCTA 1987 2047 2107 2167 2227 2248 WO 98/00549 WO 9800549PCTIAU97/00402 90 INFORMATION FOR SEQ ID NO:2: SEQUENCE CHARACTERISTICS: LENGTH: 629 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein Arg 1 Leu (xi) SEQUENCE Ala Met Lys Arg DESCRIPTION: SEQ ID Glu Tyr Giu Giu Phe NO: 2: Lys Val Arg Val Ala Lys Ala Gly Thr Pro Trp Asp Gin Lys Ile Pro Gly Asn 40 Gly His Ser Pro Gly 25 Asn Thr Glu Gly Trp, Arg Asp His Leu Asp Thr Ile Asn Ala Thr Met Gin Pro Gly Met Asp Gly Asn Ile Gin Val Phe Leu Gly Gly Glu Gin Leu Pro Arg Leu Val Ser Arg Arg Pro Gly His His Lys Gly Ala Met Ser Ile Arg Val Ser Ala Vai Leu Tyr Phe Asn 115 Asp Pro Aia 130 Gly Ala Tyr Leu 105 Lys Asn Val Asp Cys Asp His 110 Ser Lys Ala Giu Ala Met Cys Phe Met Met 125 Phe Pro Gln Arg Ile Giy Lys Lys 13S Cys Tyr Val Gin Phe Asp 145 Gly Ile Asp Leu His 150 Asp Arg Tyr Asn Arg Asn Ile WO 98/00549 WO 9800549PCT/AU97/00402 91 Phe Phe Asp Ile Asn Met Lys Giy Leu 165 Giy Ile His Gly Pro Vai 175 Tyr Val Gly Giy Cys Cys Phe Arg Gin Aia Leu Tyr Gly Tyr 190 Ile Val Lys Asp Pro VaIl 195 Leu Thr Giu Glu Leu Giu Pro Asn Ser Cys Cys Giy Ser Arg Lys Giy Lys Ser Lys Lys Tyr Asn Giu Lys Arg Arg Ile Asn Arg Ser Ser Asn Ala Pro Phe Asn Met Glu Ilie Asp Glu Gly Giu Giy Tyr Asp Asp Giu 255 Arg Ser Ile Met Ser Gin Arg Val Giu Lys Arg Phe Giy Gin 270 Giy Ile Pro Ser Pro VaIl 275 Phe Ile Ala Ala Phe Met Giu Gin Pro Ser 305 Thr Asn Pro Ala Leu Leu Lys Giu Ile His ValIle Cys Gly Tyr Glu Lys Thr Glu Trp Lys Gu Ile Gly Ile Tyr Giy Ser Thr Giu Asp Ile Thr Gly Phe Lys Met His 335 Ala Arg Gly Ile Ser Ile Tyr Asn Pro Pro Arg Pro Ala Phe 350 Gin Val Leu Lys Giy Ser 355 Ala Pro Ile Asn Ser Asp Arg Leu WO 98/00549 PTA9/00 PCT/AU97/00402 92 Arg Trp 
370 Ala Leu Gly Ser Giu Ile Leu Leu Arg His Cys Pro Ile 385 Trp Tyr Gly Tyr Gly Arg Leu Arg Leu Giu Arg Ile Tyr Ile Asn Thr Val Tyr Pro Ile Thr Ser Ile Pro Leu 410 Ile Ala 415 Tyr Cys Ile Pro Ala Phe Cys Ile Thr Asp Arg Phe Ile Ile 430 Leu Phe Ile Pro Giu Ile Ser Asn Tyr Ala Ser Ile Trp Phe Ile 43S 440 Ser Ile 450 Ala Val Thr Giy Leu Lys Leu Lys Asn Giy Val Ser Ile 465 Giu Asp Trp Trp Arg Asn Asn Gin Phe 470 Val Ile Gly Gly Ser Thr His Leu Ala Val Phe Gin Leu Leu Lys Vai Leu Ala 495 Giy Ile Asn Giy Asp Phe 515 Asn Phe Thr Val Ser Lys Ala Thr Asn Lys Asn 510 Leu Leu Ile Ala Lys Leu Tyr Phe Lys Trp, Thr Pro Pro Thr Thr Vai Leu 530 Val Asn Leu Ile Gly Ile Val Ala Gly 540 Val 545 Ser Tyr Ala Vai Ser Gly Tyr Gin Trp Gly Pro Leu Gly Lys Leu Phe Ala Leu Trp Vai Ala His Leu Tyr Pro Phe 575 WO 98/00549 WO 9800549PCT/AU97/00402 93- Leu Lys Gly Val Trp Ser Leu Leu Gly Arg Gin Asn Arg Thr Pro Thr Ile Val Ile 580 585 590 Val Leu Leu Ala Ser Ile Phe Ser Leu Leu Trp Val Arg 600 605 Ile Asn 610 Pro Phe Val Asp Ala Asn Pro Asn Ala Asn Asn Phe Asn Gly 615 620 Lys Gly Gly Val Phe 625


INFORMATION FOR SEQ ID NO:3:
SEQUENCE CHARACTERISTICS:
LENGTH: 8411 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
ORGANISM: Arabidopsis thaliana
STRAIN: Columbia (wild-type)
(vii) IMMEDIATE SOURCE:
CLONE: 23H12 RSW1 GENE
(ix) FEATURE: NAME/KEY: exon LOCATION: 2296..2376
(ix) FEATURE: NAME/KEY: exon LOCATION: 2904..3099
(ix) FEATURE: NAME/KEY: exon LOCATION: 3198..3370
(ix) FEATURE: NAME/KEY: exon LOCATION: 3594..3708
(ix) FEATURE: NAME/KEY: exon LOCATION: 3824..4013
(ix) FEATURE: NAME/KEY: exon LOCATION: 4181..4447
(ix) FEATURE: NAME/KEY: exon LOCATION: 4783..5128
(ix) FEATURE: NAME/KEY: exon LOCATION: 5207..5344
(ix) FEATURE: NAME/KEY: exon LOCATION: 5426..5551
(ix) FEATURE: NAME/KEY: exon LOCATION: 5703..5915
(ix) FEATURE: NAME/KEY: exon LOCATION: 6022..6286
(ix) FEATURE: NAME/KEY: exon LOCATION: 6374..6570
(ix) FEATURE: NAME/KEY: exon LOCATION: 6655..7005
(ix) FEATURE: NAME/KEY: exon LOCATION: 7088..8032
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
TTAGAAGAAG CCTGAGCCGG AGTCCTATTC AATTATCTAG AAGAAGTCTG AGCCGGAGTC CCACTCGATT GTCTAGGAGA AGCCTAAGCC TGAGCAGGAG TCCAGTCCGA TCATCTAGGA CCAGGAGACG TATCAGCAGG AGTCCAGTCC CTATTCGATT GTCCAGAAGA AGTATCAGCA TCAGCAGGAG TCCTGTTAGA GAAGGAGTGT GCGGCCTAGA CTTCTCCTAA TGGGCGCATA GTCGATACAG AACTAGTCCA ACCGTGACAG GTGAATAGCC TACTGAACCG TCTTTTTTAT GCCAAGTCGG TTCAGAAGCC AGTACATTGT TTCAGCTGAT GGTGATCCAA AAGAATTATT AGAAGCCGCT CAGTATCGCC ATCCGTAGCC GATCTCCACC AGTCCATCGA GGTCAAGATC AAAGCAAGAT CAGTGTCAAG ACATCCTCGG ATAATAGCCC

GGAAGAAGAA

TCTCCTCCTC

AGGAGAGGGA

TCTCCTGATC

CACACATAAT

AACGTCTTCT

CACTAAGAGG

TCTTTACATC

CTTGATTTCA

TGGTCTCTGT

TTACAGAAAG

GGAGTCCCAT

AGAGTGTGAG

GATCATCTAG

GGAGTCCTAT

GAATTAGCAG

CTGACCGCAG

GAGGATTTAG

GATCTCCTTA

ATAACTCCCC

CTGTAGATTT

AAGAACACCT

TAAAAGTTTC

CAACTCGAAA

TATCGCAACC

AGAAGGTCAC

TCGATCACCT

CAGAAGTCCG

GAAGAGTGTG

TCGATTGTCC

AAGTCCAGTT

AAGAAGTTTG

CCAAAGATTC

TCGCTTTAGT

CTTTCTGTTA

AGAAGTCGCA

CCAAGGTACT

ATGAATATGG

GTATGCTCAG

GGCGGTACAG

CATCCGCTAG

AGGAAGAGTG

GTTCGTTCAT

AGCAGGAGTC

AGGAGAAGTA

CCGGCAAGGA

TCAAGAAGTG

TCATACGCCC

GATAGGAGTG

CACACTCTCG

GAAGGTTCTC

TATCCTCTTT

AACTAAAATT

GTATAGAAGA

CCGCAGCCCT

CCACAGCCTG

TGGGACGGGG

GTCGGATTCG

TTAATGAATA

120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 AAGATCAAAG TCATATTCAA AATCTCCCAT

ATCACCATCC

AGGTGGGAAA

AAGGCAAGGT

AAGGGATTAG

CTCCATCGAA

TAGCCTATGA

WO 98/00549 PTA9/00 PCT/AU97/00402 97 ATGATTACCC TTAAGTTAAG TGTTTGTTCT TTTTACTGAG AAGAGATGGT AAAGAGAGTA

AGTAGTTTAC

CCAAGATTGT

CGACAAATGT

TCATAAAATT

GTAATTTATT

TCACTATACC

ATGGAATTTT

GTATACTGAA

TCCATTTAAA

ATCAACACTA

GTACCAGAGC

ATACTTTGTC

GCGCGTCTAG

GCCGAATAGA

CCATCTCCGG

TTCTGTAAAA

TAAAAATTTC

TAACTTCCAT

AAGCATAACT

TATTTGGATA

ACCGTCATTA

TTTTGTTAAA

TTAAAACTTG

AATTATTTGC

GTTATTTTGA

CAACAACAAC

CTATCTCTTT

TGGGGAAGCC

GCCGAGCTGP

TAAAATAATC

GGCAAAAGCZ

CATAAGCATT

TGTTGATGTT

TATTCGTTGC

AAATGTGACG

ATCAATATAA

TCACTATCAC

AGTTTATTGA

TAAATATAAC

GAATTCGCCC

GTCCACCGAC

GTGGCTTCTT

CTTGCAATAA

AGAACGGCTC

GCTAAAACGG

STACTTGTCAI

SGATTTGAAG.

GTCTTTTGCG TATGTTTGTT

TGCCGACATT

GGAGTTGGTT

TTTGTCACCA

TTTACGATTT

TAAATATAAA

CACAAAAAAT

AACAAAATGG

TTTTCTTTGT

TTGGTCCAAT

ALACTTTAGAA

CTTCCTACAT

AATGTTAAAA

GAATTAAAAC

GATTAAAAAA

GTAACTTCTT AAGCTAACAA

AGGTGATAGC

CTTTTTTTTT

GATTTTGCCA

ACTTTAAAAA

TGGGAGAGGA

TTAAAAATTA

AGAAGCAGCT

AAATAAAAAA

TTTAATATAA

CGTCACATAC

GTAGAGAGAT

AGAGGCTACT

AGAAAAAACA

TGAGATATCA

L'GATTATGCT

TGCCATTTGC

AATTAAACTT

CAACGACATC

A!,TATTCATA

TGATTTCTTA

TCAGAAATCT

AGAAGTGGCA

TTAGAACCTA

GAACAGGCTG

TCAAACAATC

TAAGAAGCTG

GATAACTTGA

ACTACCGTCA

CATCACTCTG

AATAGAGAGA

TCGAGATGCC

CTTTCTCTGT

CCGGAGAAAC

1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 18G0 1920 1980 2040 2100 2160 2220 2280 2340

CGATAAAATA

GAGAGTGACA GAGGAGTGTG TGAACATCCT TTTTTAGTAG ATTTGGGTTT

GTATTGAATC

GTGTCGGTGG

GGCTACGAAT

CTGCGATGGA

TTCCCAATTT

GGCCAGTGCC

TGAATTTTGT

GGCTTGGTTG

GAATCTCTCT

CTGGATCCTA

WO 98/00549 PCT/AU97/00402 98

GAGCTCGTTC

ATTTTTTTGT

CTCTGTTTCT

GCTTCCTTTG

AACTTTTCTG

TTTCTTAAAA

AAAATCTTTG

TTAGTTCCCG

TTTGATTTAG

ATGCAATGTT

AGATCTGTGG

AATGTGCCTT

GGATCCGACA

TATTGTTTTT

AAATCTCGTC

ACTGTGTACT

CTCCAATTCT

TCTCTTATGG

ACTCTTAAAG

CCATGGGATA

ATCCTTGCTC

GGTCAAAGTG

TGATGATGTT

CCCTGTGTGT

TGAATCTGAT

CGTTCTTACA

TCTTTTGGAT

TAGTTACTGT

TCAAATTGTT

AAAGTGTCGG

TTTAGCTTTA

CAGACTGTGA

ATCTCTTTCT

CAGACCAAAC

GGACTCGCTG

CGGCCTTGCT

GGCGGGGTCT

ATTTTTGATG

CCATAATTGG

CAGTTGATCA

GTGATCTATA

TGGATTTCAG

CTTATTGAGA

CTCGCCTTAA

GTAGTTTCTA

CTTTGAAGAA

AAACTGGAGA

GTTCATCTTC

TGTAGATCTC

ATCATTGAAA

AGTAAGTGTC

TCAATTAATG

TTCGTTAACT

TTTAGCTCAA

TTCAGATCTG

ATACTCAATG

TATGAATGGC

TGTCTTTGTC

CCTTTTTCCC

ATCTAGATTT

CTCAGATTTC

TGAAAATGGA

CCGCATCTGT

TTTTTAAGCT

CTAGATCTCG

CATTGATTGT

ACTAACAATG

CAGATATGTC

GCGTGTAATG

2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 ATGAGTACGA GAGGAAAGAT GGAACTCAGT GTTGCCCTCA ATG3CAAGACT AGATTCAGAC

GTTGGCAATT

TTTTTGGAAT

TGATATCGAG

TGGCGAAGAG

TGGCCATACG

ATAAATCCCT

ATCTTAACAG

GCTATATATG

TTTCCAGGGA

AATGAGTTCA

TTTTCTTCTT

GTAGGGACCT

AGAGTGGTCA

AATTTTTTTC

GATTTTCTCT

GTCCTCGTGT

ATTACGCCCA

CCTCTAGACA

ACATTTTCCC

TTTATTACTT

TGATAGTACA

GACACAGGGG

TTTTGTTTCT

TGAAGGAGAT

GGGAGCTAAC

TGAATCTCAA

TTTAGACTCT

ACTATTCACG

TGGTCATCCA

TCAGTTGTCT TTTTCTTTTT

TTGCTGTTGT

GAAGATGAGG

AAGGCGAGAC

CCAATTCCTC

AGAGTGATTT

TTAATGTTAT

AATTAAGAAA

GTTGAACAAT

ATGATGTTGA

ACCAACGCCA

TTCTCACCCA

GTATTACTCA

ATGTGAACAA

TAATAATAGA

WO 98/00549 PTA9/00 PCT/AU97100402 99

TGTTGTTAGT

GAGAGATTCG

TGTGTCTGTT TTCAATAGAT CACGCCTGAT ACACAATCTG CTGACAGGAA TGCTATTTCA TCTCCATATA TTTTCCCTTG TGCACGTGGT TATTTTGTGT ACCGTAACTG GACCCGTCAA AAGACTTGAA GAAGGCTGGA AGCTGAAGCA GGGAAAGGAG GAGAAATTGA GAAATACTAG ACCAATATCT AACCACTCTT GGTTTCTTAT TTTGTTTCCT TCCACTCAAT CTATGAGTCG TGTGGTGCCT TTCTCCGGCT TATCATCTTG ATGCATATCC TTTGTGGTTG TTCTTGATCA GTTTCCCAAA CTATAAGGTT GGTCTTTAAG CTTGATATCT TCTATCACAG

CTTTGTTAAA

ATAGCTCCCG

CTCTTATGGG

GGAGAAAAAT

AGGGACTGGT

TTATTGTCCA

TATGTATTGA

TATGGTTCTC

ATCCCATCTT

TGTTTCTTCT

ACCTCGGTTA

TGGTACCCCA

TTTATACATC

TTTTCGATAG

TCATGACCTT

TGCGAACTAC

TTGATCCACG

TGTGATTCCT

CTAAAAATTG

CTTGGTAATG

ATGTTACAGA

TCCAATGGCG

ACTCAAACAG

TAGACATAAT

GTACTTACAG

CTCGCCTAAC

TGCAATATCG

TCTGTGAGAT

TTAACAGGGA

CCCTACTCTC

TTGACTTTTT

TTATGATATA

GCAATCTACT

TGTTTATTAA

TTTCTATACA

ATCAGGTCCT

GCAACCTGGT

ATTCATTTTT

CAGTCCCTGT

TTGACTGGAA

TGACTGGTAA

AAGAACTCCA

CTCTTGGCCG

TAAGTATCTG

GGCTGATGAT

CCCTTATCGG

TACAACTCAC

CTGGTTTGCA

GACTTATCTT

ATCTCTCTTT

CCCCCTGTAA

TCTTAAGAAG

CAGGTTTCTG

TTGGGTCCTT

ATTCATATGT

ACAACATATA

AAGAATCGTG

AGAAAGAGTT

ATACCATGAA

AATGTAAGTG

TGATGCTAAT

CTTTGTTACA

ACACGTCTTC

GTTGTGATTA

CCTGTGAAAA

TTTTCTTGGC

GACCGTCTCG

TATGTATTAA

ATTTAATTTA

ATTATGATTG

3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 AATTTAGACA ATGGTGCATC TGAATTTTGA

TAAATCTTGA

TAGACTATGA

AATTTAGTAG AAAACCATCT TAGAAGCATG CCAAGTGGAG GACCATGTGA AGTTTCCGAC GATAGAGCTT AGCTATTATA 1* WO 98/00549 PCT/AU97/00402 100 CTGATTTTAT ATGTGTTTTG ATTTTTTGGT ACCATCACAG CTCGTTCCTG TTGATGTGTT TCCCCTTGTT ACAGCAAACA CAGTTCTCTC AGTAGCCTGT TATGTTTCAG ATGATGGTTC AACCGCTGAG TTTGCAAAGA AATGGGTACC

TTCTTATTGT

TGTTAGTACA

AGATATGATC

GTGGACCCAT

GATTCTTTCT GTGGACTACC GGCCCCTGAA TTCTATTTTG CCCAGAAGAT TTTTGTTAAA GAGCGACGAG CTATGAAGGT TACGGCAAAG AGATTGACTG ACTTTTTCTT GTTTAAAGTG AGGATAAATG CTCTTGTTGC GACAATGCAG GATGGTACTC CCTGGCCTGG ACAGGTACAG TGTGGCAATC CCTTGATTGT TTACATCGTT TTGTTTCAAT TTCAGGTGTT TGGAAATGAG CTGcC-TAGAC TCATCTATGT CCACAAAAAG GCTGGAGCTA TGAATGCATT CCTCTATTTT ATTCTCTTGT TCACTGCCTA CACATTCTTT TTTTTCTAGG CTATGTGTTC AGATCCGTGT ATCTGCTGTT CTTACCAATG 3 5 ATTACTTTAA TAACAGTAAG GCTATTAAAG TTGGAAAGAA GTGCTGCTAT GTCCAGTTCC

AGCTATGCTT

DTTTTGCAAG

AGATTACTTG

CATTTGAAAA

TGGTTTGTAT

CAAAGCACAG

TAACAACACT

GACAGAGAGG

CTTAGGCCAT

TTCTCGTGAA

GGTTTGTTAA

AGAAACGTTC

TCTCCTAATT

GAGCATATCT

AAGCTATGTC

ACCTTTGAAT

AAATTCAACA

AAGGACAAGA

GTCCACCTGC

TGACAGAGAG

AAAATCCCTG

AGAGATCATC

GAGACGGTGA

TGAAAGAGCC

CGGTAGATAA

CCCTTTCTGA

TTGAACCTAG

TCCAACCGTC

TTCTCATCCA

AGTATGAAGA

AAGAAGGCTG

CTGGAATGAT

4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 ATAACGTAAA GGAAACATGT AGTGGGGGTC TGGATACCGA

AAGCGGCCTG

CTTTCAGAAT

TTCTTGTGTA

TAGTATCTCT

TTTGAACGTG

TTTCATGATG

GATTTCAACA

CCTATTGTGT

GCCGTTGCTT

TTACTTTGAC

GATTGTGATC

GACCCGGCTA

CTCAACGTTT TGACGGTATT GATTTGCACG ATCGATATGC CAACAGGAAT ATAGTCTTTT TCGATGTGAG TATCACTTCC CCATTGTCTT 1~ WO 98/00549 PCT/AU97/00402 101 TTGTTTCTCT TTTGTTCATA GATATTTGTT CTCTTGGGCA TATGTGGGTA CTGGTTGTTG ACGGAAGAAG ATTTAGAACC GGTAAAAGTA GCAAGAAGTA AATGCTCCAC TTTTCAATAT ATTGTGTAAT AACATCACTT CTTGTTTATG CAGGTTATGA AAGCGTTTTG GTCAGTCGCC CCACCAACAA CCAATCCCGC TACGAAGACA AGACTGAATG CTTATGTTCT CTTTCTTACC TGGATCTATG GTTCCGTGAC TGGATATCGA TCTACTGCAA CTTTCTGATC GTTTGAACCA AGCAGACATT GTCCTATCTG GCTTATATCA ACACCATCGT CTTCCCGCTT TTTGTCTCAT CACACTGCTA TTTACTATTT GTTGCAGATA AGCAACTACG

TTTTGGTTGG

GATTAACATG

TTTTAATAGG

AAATATTATT

ATTTACTCGT

AAGGGGTTGG

CAGGCTCTAT

GTCAAGAGCT

TAACTACGAA AAGAGGAGAG

GGAGGACATC

CTTTATGTAA

TGATGAGAGG

GGTATTTATT

TACTCTTCTG

GGGCAAAGAG

TGTTTGATGA

GGAAGATATT

TCCTCCACGC

AGTTCTTCGA

GTATGGTTAC

CTATCCTATT

CACCGACAGA

GAATCCCATT

CGAGTATTTG

GATGAGGGTT

TGATTTATGT

TCTATTCTAA

GCGGCAACCT

AAGGAGGCTA

GTCAGTTTTC

CATCTTATTT

CTTACTGGGT

CCTGCGTTCA

TGGGCTTTGG

CATGGAAGGT

ACATCCATCC

TTCATCATAC

TTGTGAATGC

GTTCATTCTA

TTCTGCTATG

ATGGTATCCA

ATGGGTATGA

GTTGCGGGTC

GCATCAACAG

TTGAAGGTTT

GATGGTGAAA

TGTCCCAGAG

TCATGGAACA

TTCATGTTAT

AAATGCAGCT

GGCACTTTTG

GCCTGACTTG

GGGTCCAGTA

TCCTGTTTTG

AAGGAAGAAA

AAGTGACTCC

GATTGAGCTG

TCTTACAATC

GAGTGTAGAG

AGGCGGCATT

AAGCTGTGGT

ACAGAATCTT

TTAGATTGGT

6000 6060 6120 6180 6240 6300 6360 6420 6480 6540 6600 6660 6720 6780 6840 6900 6960 7020 7080 7140 TCAAGATGCA TGCCCGGGGT AGGGATCTGC ACCAATCAAT GATCTATCGA GATTCTTCTT TGAGACTTTT GGAGAGGATC CTCTTATTGC GTATTGTATT CCGAGGTTTG TAAAACTGAC ATTTTTTTGT CATCATCATT CTCTTCATCT CAATTGCTGT WO 98/00549 WO 9800549PCT/AU97/00402 102 GACTGGAATC C GCAGTTCTGG C TAAGGTTCTT C

TGGGGATTTT

CGTCCTACTT

CTACCAGTCG

TCTCTACCCT

TGTCTGGTCT

TGTGGACGCC

ATTTATATAC

ACCCATCAAA

TCCGAGTAGC

GAAGAAGATG

TATTTATTTT

TAAAAAGGGA

CACTTTACTA

GATGTCTTCG

CATTTATTTT

TACATATTTG

TGGAGCTGA

;TCATTGGTG

3CTGGTATCG 3CAGAACTCT 3TGAACCTCA

TGGGGTCCGC

TTCTTGAAAG

GTTCTTCTCG

AATCCCAATG

TTGTGTGTGC

CCCCAGTGAA

CAGAGAAGGT

TGGACCCAAA

ATTTAAAGCT

TTGGAGTTTT

CAAAAAGTTT

GGTGAACTCG

TTTGAACTTT

GATGGAGCGG

GCACATCCGC

ACACCAACTT

ACATCTTCAA

TAGGCATTGT

TTTTCGGGAA

GTCTGTTGGG

CCTCCATCTT

CCAACAACTT

ATATATCAAA

CCGGGCAGTT

GAAATTGTTC

TGATACATAG

TGTTAGACTC

CTTTTTATCT

ATGGATATGA

TGTGAGCATT GAGGATTGGT CCATCTTTTT GCTGTCTTCC CACCGTTACA TCTAAAGCCA

ATGGACAGCT

GGCTGGTGTC

GCTCTTCTTC

AAGACAAAAC

CTCGTTGCTT

CAATGGCAAA

AACGCGCAAT

AAGGTGATTC

GTAACACTAT

TCTACAAAAA

ACACTTATGT

AAGAATCTGA

TGGTGTACGT

CTTCTCATTC

TCTTATGCTG

GCCTTATGGG

CGAACACCAA

TGGGTCAGGA

GGAGGTGTCT

GGGAATTCCA

CATGTCCAAG

TGTAATGATT

GAATTTGTTA

AATGTTGGAA

AGTTTATATG

CAATTGTTGG

CATCAGTACA

TGGGACTTGA

GGAGGAACGA

AAGGTCTACT

CAGACGAAGA

CACCAACCAC

TAAACAGTGG

TTATTGCCCA

CCATCGTCAT

TCAATCCCTT

TTTAGACCCT

AATCATCTAA

ATTAGCTTTC

TTCCAGTGGG

TTCTTTCTTA

CTTGTTGTCC

CTAAGCTTTT

TGCAAGTGTT

AATAGAATGA

TCAGTAAAGT

7200 7260 7320 7380 7440 7500 7560 7620 7680 7740 7800 7860 7920 7980 8040 8100 8160 8220 8280 8340 CCCTCTTGTT TTGTCTCACC TAACGAAATC TTTGTCATTA AAGAGATATT GTGTAAACTC TTATTTGAAT CAGAATCAGA TCAATCAAAA ATTGAAAACG TAAAGTTCAA ACAAAAAGGT AGAGTGAATC TTTTAATCCC CCCTCAATAC WO 98/00549 PTA9100 PCT/AU97/00402 103- TAATTTGTGA AATCTCAAGT GGTGTAAAAT GAACCCAATT AGTATCCACA ATGTGTTTCT 8400 8411 CTGATCAATC C INFORMATION FOR SEQ ID NO:4: SEQUENCE CHARACTERISTICS: LENGTH: 5009 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (vi) ORIGINAL SOURCE: ORGANISM: Arabidopsis thaliana STRAIN: Columbia (vii) IMMEDIATE SOURCE: CLONE: 12C4 (ix) FEATURE: NAME/KEY: exon LOCATION: 863. .943 (ix) FEATURE:

NAME/KEY: exon LOCATION: 1454..1840
(ix) FEATURE: NAME/KEY: exon LOCATION: 1923..2025
(ix) FEATURE: NAME/KEY: exon LOCATION: 2122..2311
(ix) FEATURE: NAME/KEY: exon LOCATION: 2421..2687
(ix) FEATURE: NAME/KEY: exon LOCATION: 2776..3121
(ix) FEATURE: NAME/KEY: exon LOCATION: 3220..3357
(ix) FEATURE: NAME/KEY: exon LOCATION: 3507..3623
(ix) FEATURE: NAME/KEY: exon LOCATION: 3723..3935
(ix) FEATURE: NAME/KEY: exon LOCATION: 4027..4297
(ix) FEATURE: NAME/KEY: exon LOCATION: 4380..4576
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
AAGGAATAAT AAGATAGGGG TTTAATGGGA GACAATCAAT CTTCAGGGGT TTTCTGGAAN AACGGCGGGG TAAAAAACAA GACATCAATC GGACCCGATC ACGAGGACCC GGATCCGNAT CGATAAACAG NGTAGCTTTC AATACCCCAT TTTCCCAGAA ACACCTCTCA AAAATTTTTT CAAGAACTNG TATAAATATC TCAGTTTCGT TCACGCAGGT TNTNTTCATN GTTCACCAAC TCCCTCTTGA AGGTGGGACA CATAGCCATC GCGTCGTTTT CTCCGGGACC CACTTATTTC TATACATACA ATTGTTTTCA GTCTCAATTT GCTGTCCACA AGGGGTGGTG TCTGAATCTC GTCTCTCTCA TTCCTATTTA AACCCTTCCA CATTGCTTTT GTCAGTCTGT AAAATTCTCT CTTAAATCCA AAACAGTTTT TTTTTCTTTC TTTCTTTATT GCTGTCTCCG GGAAAATTCG TTTTTTTTCT CCTTCGGGAT TTTTATTTAA TAATTATCCC CGAGCCAACA TTTATTGTCG TTCGTCTTCC ACTCTTACTA GTGCATGCTC TGAATCTGTA CTGGATCCAT TATCCTAGCC GGGTCGGGTC AAGGTCTTTG TTGATTCGGT GTAGAAGACA TCATGAATAC TGGTGGTCGG CAGAAACGAA TTCGTTCTCA TTAACGCCGA TGAGAGTGCC ANGAATTTGT GACGGAAAAA AGTTTAATTT TTTCTCTTTC AATCTAGATG GAATATTTTG ATCTGAAATT GGAAGTTTCT ACATGTTCTG TTTTTTCTTT TTTCTTTTCT TCAAGTAGTG GGCAGAGATG TCCTGAGAAC CGAATTCAAT GTTGTAGCAG GTCCATTTTT TTATATTACT AATTCTGTTC TTGGTTTATT

CTTTNTTATT

GAGTCCAGCT

GTGACGTTTC

TTTTAACACA

TCCCAATCTA

TTGAATCAGT

TGCTTGTTGT

CTCTTTTTTT

ATTCGGTTTA

TGTAATGGGA

AGTAAGAGAG

CTCATTGCTG

AGAGTAAGAA

TTGGGGATCT

AGGGAGTAAT

TTGCATGATT

TAGCAATAAG

TGAGCTGGTC

TTGGNAPANTC

CCACCACCAC

TCTCTTTGTA

ACTCTATCTC

ATCTATCACA

GAATCACTCA

GGAATCAATA

TTTTTTTTGG

TTTCGTCTCC

GTTCAACAGT

ACAATTCGTT

GCTCTCACAA

TAACTTTTGT

AGATTATGAG

GCCGCAACCC

CATACGTGTC

TTCAAAGAAA

TTTATTGCAT

ACTTTATGAA

240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 TTCACCTGGA TTCAGATACT AATAACTGTC TCAATTATGT AAAAATc3ACA ATTCAGTTTC ACAATTATGT AATTCATAAT CGATGAATGT TTTTCTTGAG TCTTTATCAT WO 98/00549

CTTTAGGATT TGATTAAGAT GCAATTTGAT GAAAATACTA AAAAGACTCA TGTGTTCTCA TTTCTCTATG TAGATACGAT CAGTACAAGA AGATGAAATC

CCCGGTTTGT

GTGCAAAACT

AGAAGAAGAC

CGCTGAAGCC

ACCTGGCTCT

GTTTTCTCTG

AGGATGCTGA

GGAATCGCGT

TATGATTCCT

GGTTTTGTTT

AATATGGTTA

GCGAAAAGCT

ACGACGAACT

GATGAAATGA

CATTTTTCGT

ATTCGTTCAA

GGTCTTTTCT

GAATTAACGG

AGACCATGCT

CGATACAAAA

ATTGATGATC

GCACTCTCTT

CAGATTCCTC

ACAATGTTGT

TATGTATTCT

CTATCCTGCA

ACAATTTTTC

GTTGTTTTCA

TGGAAGTGTT

TCAAGTCATT

AGATGATCCT

TGCTCTGAAA

GCTAATTCAG

GCAGAATAAP

TTCATTATAC

TTAGCAGTGA

ATGAGTATGA

GGATTAAAGG

TTGAGTATGA

CACGCCTTAA

TTTTGACTTA

TGCTTAGATG

GATCGTCATG

CCGTTTACAG

TTCTTATATG

GCACAGGCGA

GCTTGGAAGG

ACTGAGTGGG

GCTCTTTGTT

ACGTAGAGAA

TAGTCCACGG

GTTTGATCAT

CACCGGTCGT

TTGTGATGAA

ATTCTTTTTC

CTCTTATCGT

ATTCTTCTGC

ATTTGGTCAC

GATCAATGGT

ACCGTATGGA

CAAACATGTC

.CTTGCAACG

GGAAATCAAG

.TTGATGGAG

GGGATGGACC

GGTGGATTGG

GTGAGGAATC

TTATTAGTCT

GCCTCCTTCA

ACCTCGTATG

CTTCTAATGA

TCCTCAGAAA

AGTTTGGAAG

TGGTCGAGGT

TAAAATCTAA

TTTTCTTATI

CTCTCTCAAC

TGTGTCGCCU

ATGCATATGC

AAATCTGTGG

AATGCGCATT

CTTOTCCTCA

ATGATGAAGA

CTGAACATGC

ATTCAGCTCC

CAAATTGTTT

ATGTGTTTTC

ACGGGATATG

TGTTTACTTT

GTTATGAAAT

GATATTGCGG

AGACGACAAG

TCCAATGATG

CAAAAGTTCA

GTTGTTTAAA

AAAGCTACCT

CGCGATTCTT

;ATTATGGTTA

1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 AAGCATGAAG GAGGAAACAA

GACATGCCTA

TTTTGTGTTC

GATGGATGAA

TCCTTACAGG

AATTCTCCAT

TGTAAGTTGT

AATGGNTTTG

GGAAGACAAC

ATGTTAATTC

CCAGTCAALTG

WO 98/00549 PCT/AT397/00402 -107- ACGTCAGTTA TATGCGAGAT ATGGTTTGCA GTGTCTTGGA TTCTTGATCA ATTCCCCAAA TGGTATCCTA TAGAACGTGA AACATACCTC GATAGACTCT CTCTCAGGTA ACATAAACCC TGAAAAGTTC TTGTCTGCAA ATATTCATTT TTTTTCTTAC ATAAGGTACG AGAAGGAAGG

TTTTGTTAGT

TTCCATTCTA

TGCAGCTATG

TCCTTTTTGT

GATGGATTAC

GGTTTTCTTT

GAATCTGATT

GTGAAGATAA

CAAGATGGAA

ATGATGAGTT

CCTGGATCTG

TTTTTAAGAG

ATGGTAATGA

3 5 ACCACAAGAA

ACAGTGGATC

GCAGTTGATT

CTTACATTTG

AAGAAGTTTA

CTGAAGAACA

GCTGCTTTTT

TACGTTTTTT

ATGCACTGGT

CTCCTTGGCC

TGATTGAATA

TTAAATTGGA

ATTTTGThAA

GTTACCACGT

AGCTGGAGCT

CGTTGAAAGA

ATCCTGTGGA

AAGCTCTCTC

ATATCGAGCC

AAGTTCATCC

CTCTTTCTGA

GCTTGTTTGT

TGCTACTGCA

TGGAAACAAC

GGCAAAAA\A

ATGAGCACTC

TGACAGGTG7

CTAGTGTATC

ATGAATTCCI

TTTACATTCC

AAAACCGTCA

GCCACCCTTG

TAAGGTTGCG

TGATACAGCT

ACGAGCTCCT

TGCTTTTGTC

GTATATCCTA

TTGTTGCAGA

CAGAAAGTGC

GTCCGTGACC

AAGCGGTTTT

TACTTCTCAA

TCTTGGGTCA

ITTTCTCGTGP.

TGGTAAGTAI

CAAAAATTTT

GGATTAGCAC

ATTACAGCAA

TGTTATGTAT

GAGTTTGCTA

GAGTGGTATT

AGGGAACGTC

TCATAAAAGT

GAGATTATGA

CTGAGGAAGG

ATCCTGGAAT

TGTCCTCTTC

TATATCTTCA

*TAGTGGAGTT

GAAGCGGCC'I

*AATGTGTTTC

TGAAACTCTA

CTGTTGATGT

ACACAGTTCT

CAAACAATGG

GAAAATGGGT

TTTCTCAGAA

GTGCTATGAA

GTTGTTTCAA

GGAGTTTAAA

TTGGACTATG

GATTCAGGTA

ACTTTGTTTC

GACCGAAGCC

CGTGATACGG

GGATTTGATC

TTTATTTATG

2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 37B0 AATCTCTCTT TTCGGAGCCC TGACTTCTCA TAAACTAAAA CTCATCTTAC TTCTTCTTGA AGATCCGAGT CTCTGCTGTT CTATCAAACG CTCCTTACCT TCTTAATGTC GATTGTGATC WO 98/00549 WO 9800549PCT/AU97/00402 108- ACTACATCAA CAACAGCAAA CGGGAAAGAA AGTTTGTTAT ATAGATACTC AAACCGTAAC TGTTTCTGTT TATGTTTTAG AAACAGATTA ACATGAAAGG TGTGTGTTTA GAAAACAGGC GGCAAAACCT GTAACTGTTG AGTAAAACGA AAGCCAAAGA GCGCTAGAGA ATGTCGACGA ACATTTCTTA TTTGGTTTCT GTCAAATGTT GAGAAGAGAT ATCTCCGGTT TTCGTTGCCT CCCCGCATGT TTGTTAAGAG CGAATGc3GGA AAAGAGGTAG GTTCTTGTGA TCTCATTCTT TGACTGAAGA TATCCTGACG GTATGCCTAA GCGTGCAGCT ATCAAGTTCT ACGTTGGGCT TATGGTATGG TTATGGTGGT

GCAATTAGAG

GTTCAGTTTC

GTTGTGTTCT

TGCTTTTCCT

TCTTGATGGG

TCTTTATGGT

GCCTAAATGG

TAAGAAAACT

AGGTGTTATC

GTCTTGTTGA

CTGAAGCAAC

CTGCTGTTCT

AATCTATGTG

CGCAGAGATT

TTGATGTATG

CTTTTCTCAT

ATACAAGGAC

TTTGATGCAC

TGTTGTTTGT

AACACTAAAG

GTCCCAGGTA

AAGTCTAAGT

ACAATTGAAA

ACAGAACGGT

TTTCATGATG

TGATGGGATT

TGTCCTTATC

TTGATATTGT

CGATATATGT

CAAAGAAGAA

GTTGTGGGTT

AGACTTCAAA

AAAAAAGAAG

AGATCCTTTT

TTGGAGAAGA

GGAGTTCCCC

TGCGGGTACG

AACTTCTGAA

CGGGTGGATT

TGGATGGAGA

TAACTTGTCA

CTTGAGCAGA

ATTCTCTTAC

GACCCGCAAT

GATAGACATG

TCTTTTGCTT

TTTGGTGTGG

CGGGACAGGT

GAAACCACCA

GAGAAAGAAG

GCAGATTCAT

GAAAAAAAAA

GATTGTTAGT

AGTTTGGACA

GTAACGCAAG

AAGATAAAAC

AACTAGAAAA

TATGGATCGG

TCTGTGTACT

GATCGTCTTC

CATTGTCCGA

ATCAACTCTG

3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 AAGCCATTCA AGTTATTAGC AAAACATTAC AAAGTTTTTC GCTGATAATC ACACGCAGAT GGTTTCAAGA TGCATTGCCA TTTAAAGGAT CTGCTCCTAT CTTGGCTCTG TAGAGATTTT GGTTTAAAAT GGTTGGAGAG TCGTCTATCC TTGGACTTCA CTTCCATTGA TCGTCTATTG TTCTCTCCCC GCGGTTTGTT

I~

L II !rr rC WO 98/00549 PCT/AU97/00402 -109- TACTCACAGG AAAATTCATC GTCCCTGAG INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 3603 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (vi) ORIGINAL SOURCE: ORGANISM: Arabidopsis thaliana STRAIN: Columbia 5009 (vii) IMMEDIATE SOURCE: CLONE: RSW1 cDNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..3243 (xi) SEQUENCE DESCRIPTION: SEQ ID ATG GAG GCC AGT GCC GGC TTG GTT GCT GGA TCC TAC CGG AGA AAC GAG Met Glu Ala Ser Ala Gly Leu Val Ala Gly Ser Tyr Arg Arg Asn Glu CTC GTT CGG ATC CGA CAT GAA TCT GAT GGC GGG Leu Val Arg Ile Arg His Glu Ser Asp Gly Gly 25 ACC AAA CCT TTG AAG Thr Lys Pro Leu Lys WO 98/00549 PCT/AU97/00402 110- AAT ATG AAT GGC CAG ATA TGT CAG ATC TGT GGT GAT GAT GTT GGA CTC Asp Asp Val Giy Leu Asn Met Asn Gly Gin Ile Cys Ile Cys Gly GCT GAA Ala Glu ACT GGA GAT GTC Thr Gly Asp Vai GTC GCG TGT AAT Val Ala Cys Asn GAA TGT GCC TTC CCT Glu Cys Aia Phe Pro

GTG

val TGT CGG CCT TGC TAT Cys Arg Pro Cys Tyr 70 GAG TAC GAG AGG Glu Tyr Giu Arg GAT GGA ACT Asp Giy Thr CAG TGT Gin Cys CCT CGT Pro Arg TGC CCT CAA TGC AAG ACT AGA TTC AGA Cys Pro Gin Cys Lys Thr Arg Phe Arg CAC AGG GGG AGT His Arg Giy Ser GTT GAA GGA Vai Giu Gly TTC AAT TAC Phe Asn Tyr 115 GAA GAT GAG GAT Glu Asp Giu Asp GTT GAT GAT ATC GAG AAT GAG Val Asp Asp Ile Giu Asn Glu 110 GCC CAG GGA GCT Ala Gin Gly Ala AAG GCG AGA CAC Lys Ala Arg His CAA CGC Gin Arg 125 CAT GGC His Gly GAA GAG Glu Glu 130 TTT TCT TCT TCC TCT AGA CAT GAA TCT Phe Ser Ser Ser Ser Arg His Giu Ser 135 CCA ATT CCT CTT Pro Ile Pro Leu CTC ACC CAT GGC CAT Leu Thr His Giy His 145 GTT TCT GGA GAG Val Ser Gly Glu CGC ACG CCT GAT Arg Thr Pro Asp CAA TCT GTG CGA ACT ACA TCA GGT CCT TTG GGT CCT TCT Gin Ser Val Arg Thr Thr Ser Gly Pro Leu Gly Pro Ser GAC AGG AAT Asp Arg Asn 175 528 GCT ATT TCA Ala Ile Ser TCT CCA TAT ATT GAT CCA CGG CAA CCT GTC CCT GTA AGA Ser Pro Tyr Ile Asp Pro Arg Gin Pro Val Pro Val Arg 190 WO 98/00549 WO 9800549PCT/AU97/00402 -1III- ATC GTG GAC Ile Vai Asp 195 CCG TCA AAA GAC Pro Ser Lys Asp AAC TCT TAT GG Asn Ser Tyr Gly CTT GGT AAT GTT Leu Gly Asn Val 205 CAG GAG'AMA AAT Gin Giu Lys Asn GAC TCG AAA GAA AGA GTT Asp Trp Lys Ciu Arg Val 210 GAA GCC TGG AAG Giu Ciy Trp Lys 215 CTG AAG Leu Lys 220

ATG

Met 225 TTA CAG ATG ACT Leu Gin Met Thr AAA TAC CAT GAA CCC Lys Tyr His Giu Gly 235 AAA GCA CCA Lys Gly Cly CMA ATT Giu Ile 240 CAA CCC ACT CGT TCC AAT CCC GAA GAA CTC CAA Giu Cly Thr Ciy Ser Asn Ciy Ciu Giu Leu Gin 245 250 ATG OCT GAT GAT 768 Met Ala Asp CGT CTT CCT ATG Arg Leu Pro Met 260 ACT CGT GTC GTG Ser Arg Vai Val ATC CCA TCT TCT CCC CTA ACC Ile Pro Ser Ser Arg Leu Thr 270 CCT TAT Pro Tyr CTT CTC ATT ATT Val Val Ile Ile CCC CTT ATC ATC Arg Leu Ile Ile TTC TGT TTC TTC Leu Cys Phe Phe 285 TTC CAA Leu Gin 290 TAT CGT ACA ACT Tyr Arg Thr Thr CAC CCT GTG AAA AAT His Pro Val Lys Asn 295 GAG ATC TGC TTT GCA Ciu Ile Trp Phe Ala 315 GCA TAT Ala Tyr 300 CCT TTG TCC Pro Leu Trp

TTG

Leu 305 ACC TCC GTT ATC Thr Ser Val Ile TTT TCT TGG CTT Phe Ser Trp Leu CAT CAC TTT CCC Asp Gin Phe Pro TOG TAC CCC ATT AAC Trp Tyr Pro Ile Asn 330 AGO GAG ACT TAT CTT GAC Arg Ciu Thr Tyr Leu Asp 335 CAA CCA TCA CAC CTC OTT Ciu Pro Ser Gin Leu Vai 1008 1056 CCT CTC GCT ATA AGA Arg Leu Ala Ile Arg 340 TAT CAT CCA Tyr Asp Arg q WO 98/00549 PCT/AU97/00402 -112- CCT GTT GAT Pro Val Asp 355 GTG TTT GTT AGT Val Phe Val. Ser GTG GAC Val Asp CCA TTG AAA GAG CCT CCC Pro Leu Lys Glu Pro Pro 365 1104 CTT GTT Leu Val 370 ACA GCA AAC ACA Thr Ala Asn Thr CTC TCG ATT CTT Leu Ser Ile Leu GTG GAC TAC CCG Val Asp Tyr Pro

GTA

Val 385 GAT AAA GTA GCC TOT Asp Lys Val Ala Cys 390 TAT OTT TCA GAT Tyr Val Ser Asp GGT TCA GCT ATG Gly Ser Ala Met 1152 1200 1248 1296 ACC TTT GAA TCC Thr Phe Glu Ser TCT OAA ACC OCT Ser Olu Thr Ala TTT OCA AAG AAA Phe Ala Lys Lys TOG OTA Trp Val 415 TTC TAT Phe Tyr CCA TTT Pro Phe TOC AAO AAA Cys Lys Lys 420 TTC AAC ATT Phe Asn Ile CCT AGO GCC CCT Pro Arg Ala Pro

GAA

Glu 430 TTT 0CC CAG Phe Ala Oln 435 AAO ATA OAT Lys Ile Asp TAC TTG Tyr Leu 440 AAO GAC AAG ATC Lys Asp Lys Ile CAA CCO TCT TTT Gin Pro Ser Phe 445 GAG TTT AAA OTO Giu Phe Lys Val 1344 1392 OTT AAA Val Lys 450 GAG CGA CGA Glu Arg Arg OCT ATO Ala Met 455 AAG, AGA GAG TAT OAA Lys Arg Olu Tyr Glu 460

AGO

Arg 465 ATA AAT OCT CTT Ile Asn Ala Leu 0CC AAA GCA CAG Ala Lys Ala Gin ATC CCT GAA GAA Ile Pro Giu Glu TOG ACA ATO Trp Thr Met CAG OAT Gin Asp 485 GOT ACT CCC TOG CCT Oly Thr Pro Trp Pro 490 GOT AAC AAC ACT Giy Asn Asn Thr AGA OAT Arg Asp 495 1440 1488 1536 CAT CCT OGA His Pro Gly ATA CAG GTG TTC Ile Gln Val Phe GOC CAT AGT 000 GOT CTO GAT Oly His Ser Gly Gly Leu Asp 510 .WO 98/00549 PCT/AU97/00402 -113- ACC GAT Thr Asp CGG CCT Arg Pro 530 AAT GAG CTG CCT Asn Giu Leu Pro CTC ATC TAT GTT Leu Ile Tyr Val CGT GAA AAG Arg Giu Lys GGA TTT CAA CAC CAC Gly Phe Gin His His 535 AAA AAG GCT GGA GCT Lys Lys Ala Giy Ala 540 ATG AAT GCA TTG Met Asn Ala Leu 1584 1632 1680 ATC lie 545 CGT GTA TCT GCT Arg Val Ser Ala CTT ACC AAT GGA GCA Leu Thr Asn Gly Ala 555 CTT TTG AAC Leu Leu Asn GAT TGT GAT CAT TAC TTT AAT AAC AGT Asp Cys Asp His Tyr Phe Asn Asn Ser 565 GCT ATT AAA GAA Ala Ile Lys Glu GCT ATG Ala Met 575 1728 TGT TTC ATG ATG Cys Phe Met Met 580 GAC CCG GCT ATT GGA AAG AAG TGC TGC Asp Pro Aa Ile Gly Lys Lys Cys Cys 585 TAT GTC CAG Tyr Val Gin 590 TAT GCC AAC Tyr A-a Asn 1776 TTC CCT CAA Phe Pro Gin 595 CGT TTT GAC GGT Arg Phe Asp Gly GAT TTG CAC GAT Asp Leu His Asp 1824 AGG AAT Arg Asn 610 ATA GTC TTT TTC Ile Val Phe Phe ATT AAC ATG AAG Ile Asn Met Lys TTG GAT GGT ATC Leu Asp Gly Ile 1872

CAG

Gin 625 GGT CCA GTA TAT Gly Pro Val Tyr GGT ACT GGT TGT TGT TTT AAT AGG Gly Thr Gly Cys Cys Phe Asn Arg 635 CAG GCT Gin Ala 640 CCA AAT Pro Asn 655 1920 1968 CTA TAT GGG TAT GAT CCT GTT TTG ACG GAA GAA GAT TTA GAA Leu Tyr Giy Tyr Asp Pro Val Leu Thr Giu Glu Asp Leu Glu ATT ATT GTC Ile Ile Val AGC TGT TGC GGG Ser Cys Cys Gly TCA AGG AAG Ser Arg Lys AAA GGT AAA AGT AGC Lys Gly Lys Ser Ser 670 2016 WO 98/00549 PTA9/00 PCT/AU97/00402 -114- AAG AAG TAT AAC TAC GAA AAG AGG AGA GGC ATC AAC AGA AGT Asn Arg Ser GAC TCC Asp Ser 2064 Lys Lys Tyr Asn 675 Tyr Giu Lys Arg Gly Ile AAT GCT Asn Ala 690 CCA CTT TTC Pro Leu Phe AAT ATG Asn Met 695 GAG GAG ATC GAT Glu Asp Ile Asp GGT TTT Gly Phe GAA GGT Giu Gly 2112

TAT

Tyr 705 GAT GAT GAG AGG TCT Asp Asp Giu Arg Ser 710 ATT CTA ATG TCC GAG AGG AGT GTA GAG AAG Ile Leu Met Ser Gin Arg Ser Val Glu Lys 715 720 2160 CGT TTT GGT GAG TCG Arg Phe Gly Gin Ser 725 CCG GTA TTT ATT Pro Val Phe Ile GCA ACC TTC ATG Ala Thr Phe Met GAA CAA Giu Gin 735 2208 GGC GGC ATT CGA Giy Gly Ile Pro 740 CGA AGA ACC AAT Pro Thr Thr Asn GGT ACT CTT CTG Ala Thr Leu Leu AAG GAG GCT Lys Giu Aia 750 22S6 ATT CAT GTT Ile His Val 755 ATA AGC TGT GGT Ile Ser Cys Gly GAA GAG Giu Asp AAG ACT GAA TGG GGC AAA Lys Thr Giu Trp Gly Lys 765 2304 GAG ATT Giu Ile 770 GGT TGG ATC TAT GGT Giy Trp Ile Tyr Giy 775 TCC GTG ACG GAA Ser Val Thr Glu GAT ATT CTT ACT GGG Asp Ile Leu Thr Giy 780 TAG TGC AAT CCT CGA Tyr Cys Asn Pro Pro 800 2352

TTC

Phe 785 AAG ATG CAT GCC CGG GGT TGG ATA TCG Lys Met His Ala Arg Gly Trp Ile Ser 790 2400 CGC CCT GCG Arg Pro Ala TTC AAG GGA TCT GGA Phe Lys Gly Ser Aia 805 CTT CGA TGG GGT TTG Leu Arg Trp Aia Leu 820 AAT CTT TCT Asn Leu Ser GAT CGT TTG Asp Arg Leu 815 CTT CTT AGC Leu Leu Ser 830 2448 2496 AAC CAA GTT Asn Gin Val GGA TCT ATC GAG ATT Gly Ser Ile Giu Ile 825 WO 98/00549 WO 9800549PCT/AU97/00402 -115- AGA CAT Arg His CCT ATC TGG TAT Pro Ile Trp Tyr TAC CAT GGA Tyr His Gly AGG TTG AGA CTT TTG Arg Leu Arg Leu Leu 845 GAG AGG Glu Arg 850 ATC GCT TAT ATC Ile Ala Tyr Ile ACC ATC GTC TAT Thr Ile Val Tyr ATT ACA TCC ATC Ile Thr Ser Ile 2544 2592 2640 2688

CCT

Pro 865 CTT ATT GCG TAT Leu Ile Ala Tyr ATT CTT CCC GCT Ile Leu Pro Ala TTT TGT Phe Cys 875 CTC ATO ACC Leu Ile Thr AGA TTC ATC ATA Arg Phe Ile Ile GAG ATA AGC AAC Glu Ile Ser Asn GCG AGT ATT TGG Ala Ser Ile Trp TTC ATT Phe Ile 895 CTA CTC TTC Leu Leu Phe TCA ATT GCT GTG Ser Ile Ala Val GGA ATC CTG GAG Gly Ile Leu Glu CTG AGA TGG Leu Arg Trp 910 2736 AGC GGT GTG AGC ATT GAG GAT Ser Gly Vai Ser Ile Giu Asp 915 TGG TGG AGG Trp Trp Arg 920 AAC GAG Asn Glu CAG TTC TGG GTC Gin Phe Trp Val 925 2784 ATT GGT Ile Giy 930 GGC ACA TCC GCC CAT CTT TTT GCT GTC TTC CAA GGT CTA CTT Giy Thr Ser Ala His Leu Phe Ala Val Phe Gin Gly Leu Leu 935 940 2832 AAG GTT CTT GCT GGT Lys Vai Leu Ala Gly 945 GAC ACC AAC TTC Asp Thr Asn Phe GTT ACA TCT AAA Val Thr Ser Lys 2880 2928 ACA GAC GAA GAT Thr Asp Giu Asp GAT TTT GCA GAA CTC Asp Phe Ala Giu Leu 970 TAC ATC TTC AAA Tyr Ile Phe Lys TGG ACA Trp Thr 975 GCT CTT CTC ATT CCA CCA ACC ACC GTC CTA CTT GTG AAC CTC ATA GGC Ala Leu Leu Ile Pro Pro Thr Thr Val Leu Leu Val Asn Leu Ile Gly 2976 P WO 98/00549 PCT/AU97/00402 -116- ATT GTG Ile Val GGT CCG Gly Pro 1010 CTC TAC 0 Leu Tyr GCT GGT GTC TCT TAT GCT GTA AAC AGT Ala Gly Val Ser Tyr Ala Val Asn Ser 995 1000 CTT TTC GGG AAG CTC TTC TTC GCC TTA Leu Phe Gly Lys Leu Phe Phe Ala Leu 1015 CCT TTC TTG AAA GGT CTG TTG GGA AGA Pro Phe Leu Lys Gly Leu Leu Gly Arg 1030 103' GTC ATT GTC TGG TCT GTT CTT CTC GCC Val Ile Val Trp Ser Val Leu Leu Ala GGC TAC CAG TCG TGG Gly Tyr Gin Ser Trp 1005 TGG GTT ATT GCC CAT Trp Val Ile Ala His 1020 1 1025

ACC

Thr CAA AAC CGA AC Gin Asn Arg T1 TCC ATC TTC T( Ser Ile Phe SE

ATC

Ile :A CCA ir Pro 1040 G TTG r Leu 055 CC AAC la Asn 3024 3072 3120 3168 3216 1045 CTT TGG GTC Leu Trp Vai AGG ATC AAT CCC TTT Arg Ile Asn Pro Phe 1060 10501 GTG GAC GCC AAT CCC AAT G( Val Asp Ala Asn Pro Asn A: 1065 1070 TTT TAGACCCTAT TTATATACTT Phe AAC TTC AAT GGC AAA GGA GGT GTC Asn Phe Asn Giy Lys Gly Gly Val 3263 1075 1080

GTGTGTGCAT

CCAGTGAACC

GAGAAGGTGA

GACCCAAATG

TTAAAGCTTG

GGAGTTTTCT ATATCAAAAA CGCGCAATGG GGGCAGTTAA GGTGATTCCA AATTGTTCGT AACACTATTG ATACATAGTC TACAAAAAGA TTAGACTCAC ACTTATGTAA TTTTATCTAA GAATCTGAAG

GAATTCCAAA

TGTCCAAGAT

TAATGATTTT

ATTTGTTATT

TGTTGGAACT

TTTATATGCT

TCATCTAAAC CCATCAAACC TAGCTTTCTC CGAGTAGCC.A CCAGTGGGGA AGAAGATGTG CTTTCTTATA TTTATTTTAT TGTTGTCCTA AAAAGGGATT 3323 3383 3443 3503 3563 3603 INFORMATION FOR SEQ ID NO:6: SEQUENCE CHARACTERISTICS: WO 98/00549 WO 9800549PCT/AU97/00402 -117- LENGTH: 1081 amino acids TYPE: amino acid TOPOLOGY: linear Met 1 Leu (ii) MOLECULE TYPE: protein (xi) SEQUENCE Glu Ala Ser Ala 5 Val Arg Ile Arg Met Asn Gly Gin DESCRIPTION: SEQ ID NO:6: Gly Leu Val Ala Gly Ser Tyr 10 His Glu Ser Asp Gly Gly Thr 25 Arg Arg Asn Glu Lys Pro Leu Lys Asp Val Gly Leu Cys Ala Phe Pro Asn Ile Cys Ile Cys Gly Asp Ala Cys Asn Giu Ala Glu Val Cys Gly Asp Val Arg Pro Cys Tyr Glu Arg Asp Gly Thr Gin Pro Gln Cys Arg Phe Arg Arg His Arg Gly Ser Pro Arg Val Glu Gly Asp Glu Asp Val Ala Phe Asn Glu Glu 130 Leu Thr 145 Gln Gly Ala Asp Asp Ile Glu Asn Glu 110 Arg His Gln Arg His Gly 125 Ser Gin Pro Ile Pro Leu Ser Ser Ser His Glu His Gly His Ser Gly Glu Thr Pro Asp Gin Ser Val Arg Thr 165 Thr Ser Gly Pro Gly Pro Ser Asp Arg Asn 175 WO 98/00549 PTA9/00 PCT/AU97/00402 118- Ala Ile Ser Ser Pro Tyr Ile Asp Pro Arg Gin Pro Val Pro Val Arg 190 Gly Asn Val Ile Val Asp 195 Pro Ser Lys Asp Asn Ser Tyr Gly Asp Trp Lys Giu Arg Vai Giu Gly Trp Lys Leu Gin Glu Lys Asn 210 215 Met Leu Gin Met Thr Lys Tyr His Giu Lys Gly Gly Giu Glu Gly Thr Gly Asn Gly Giu Glu Gin Met Ala Asp Asp Thr 255 Arg Leu Pro Ser Arg Val Val Pro Ile Pro Ser Ser 265 Arg Leu Thr 270 Cys Phe Phe Pro Tyr Arg 275 Val Val Ile Ile Arg Leu Ile Ile Leu Gin 290 Tyr Arg Thr Thr Pro Vai Lys Asn Tyr Pro Leu Trp Thr Ser Val Ile Giu Ile Trp Phe Phe Ser Trp Leu Asp Gin Phe Pro Trp Tyr Pro Ile Arg Giu Thr-Tyr Leu Asp 335 Arg Leu Ala Pro Vai Asp 355 Arg Tyr Asp Arg Giy Glu Pro Val Phe Val Ser Val Asp Pro Leu Ser Gin Leu Val 350 Lys Glu Pro Pro 365 Val Asp Tyr Pro Leu Vai 370 Thr Ala Asn Thr Leu Ser Ile Leu 1* WO 98/00549 PCT/AU97/00402 -119- Val Asp Lys Val Ala Cys Tyr Val Ser Asp Gly Ser Ala Met Thr Phe Glu Ser Ser Glu Thr Ala Phe Ala Lys Lys Trp Val 415 Pro Phe Cys Lys Phe 
Asn Ile Giu Pro Arg Ala Pro 425 Glu Phe Tyr 430 Phe Ala Lys Ile Asp Tyr Lys Asp Lys Ile Gin Pro Ser Phe 445 Val Lys 450 Glu Arg Arg Ala Lys Arg Glu Tyr Glu Phe Lys Val Ile Asn Ala Leu Ala Lys Ala Gin Ile Pro Giu Glu Trp Thr Met Gin Gly Thr Pro Trp Gly Asn Asn Thr Arg Asp 495 His Pro Gly Ile Gin Val Phe Gly His Ser Gly Gly Leu Asp 510 Arg Glu Lys Thr Asp Asn Glu Leu Pro Leu Ile Tyr Val Arg Pro Gly Phe Gin His 530 Lys Lys Ala Gly Met Asn Ala Leu Arg Val Ser Ala Leu Thr Asn Gly Tyr Leu Leu Asn Asp Cys Asp His Tyr Phe Asn Asn Ser 565 Ala Ile Lys Giu Ala Met 575 Cys Phe Met Asp Pro Ala Ile Gly Lys Lys Cys Cys Tyr Val Gin 590 WO 98/00549 WO 9800549PCT/AU97/00402 -120- Phe Pro Gin 595 Arg Phe Asp Gly Asp Leu His Asp Arg Tyr Ala Asn 605 beu Asp Gly Ile Arg Asn 610 Ile Val Phe Phe Ile Asn Met Lys Gly Pro Val Tyr Gly Thr Gly Cys Phe Asn Arg Gin Leu Tyr Gly Tyr Pro Vai Leu Thr Giu Giu Asp Leu Giu 650 Pro Asn 655 Ile Ile Val Lys Lys Tyr 675 Ser Cys Cys Giy Arg Lys Lys Gly Lys Ser Ser 670 Ser Asp Ser Asn Tyr Giu Lys Arg Gly Ile Asn Asn Ala 690 Pro Leu Phe Asn Giu Asp Ile Asp Gly Phe Giu Gly Asp Asp Giu Arg Ile Leu Met Ser Arg Ser Val Giu Arg Phe Gly Gin Pro Val Phe Ile Ala Thr Phe Met Giu Gin 735 Gly Gly Ile Ile His Val 755 Pro Thr Thr Asn Ala Thr Leu Ile Ser Cys Gly Giu Asp Lys Thr Leu Lys Giu Ala 750 Glu Trp Gly Lys 765 Ile Leu Thr Gly Giu Ile 77 0 Gly Trp Ile Tyr Ser Val Thr Giu Lys Met His Ala Lys Mt Hi Ala Gly Trp Ile Ser TyCsAnPr Tyr Cys Asn Pro 1' WO 98100549 PCT/AU97/004O2 121 Arg Pro Ala Phe Gly Ser Ala Pro Asn Leu Ser Asp Arg Leu 815 Asn Gin Val Arg Trp Ala Leu Ser Ile Glu Ile Leu Leu Ser 830 Arg Leu Leu Arg His Cys 835 Pro Ile Trp Tyr Tyr His Gly Arg Giu Pro 865 Ile Ala Tyr Ile Thr Ile Val Tyr Ile Thr Ser Ile Leu Ile Ala Tyr Ile Leu Pro Ala Cys Leu Ile Thr Arg Phe Ile Ile Giu Ile Ser Asn Ala Ser Ile Trp Phe Ile 895 Leu Leu Phe Ser Ile Ala Val Gly Ile Leu Giu Leu Arg Trp 910 Phe Trp Val Ser Giy Val 915 Ser Ile Giu Asp Trp Arg Asn Glu Ile Gly 930 Gly Thr Ser Ala Leu 
Phe Ala Val Phe Gin Gly Leu Leu 940 Lys Val Leu Ala Gly 945 Asp Thr Asn Phe Val Thr Ser Lys Thr Asp Giu Asp Asp Phe Ala Giu Tyr Ile Phe Lys Trp Thr 975 Ala Leu Leu Pro Pro Thr Thr Leu Leu Val Asn Leu Ile Giy 990 Ile Val Ala 99 R Gly Val Ser Tyr Ala Val Asn Ser Gly Tyr Gin Ser Trp 1000 1005 1' WO 98/00549 PCT/AU97/00402 122 Gly Pro Leu 1010 Phe Gly Lys Leu Phe 1015 Phe Ala Leu Trp Val 1020 Leu Gly Arg Gin Asn 1035 Ile Ala His Leu Tyr 1025 Pro Phe Leu Lys Gly Leu 1030 Arg Thr Pro 1040 Thr Ile Val Ile Val Trp Ser Val 1045 Leu Leu Ala Ser Ile Phe Ser Leu 1050 1055 Val Asp Ala Asn Pro Asn Ala Asn 1065 1070 Leu Trp Val Arg Ile Asn Pro Phe 1060 Asn Phe Asn Gly Lys Gly Gly Val Phe 1075 1080 INFORMATION FOR SEQ ID NO:7: Wi SEQUENCE CHARACTERISTICS: LENGTH: 3828 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: CDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (vi) ORIGINAL SOURCE: ORGANISM: Arabidopsis thaliana STRAIN: Columbia (vii) IMMEDIATE SOURCE: CLONE: Ath-A (ix) FEATURE: NAME/KEY: CDS WO 98/00549 WO 9800549PCTAU97OO402 123 LOCATION: 239. .3490 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: GTCGACACTA AGTGGATCCA AAGAATTCOC GGCCOCGTCG ATACGGCTGC GAGAAGACOA CAGAAGGGGA TTGTCGATTC GOTTTATTTC GTCTCCTTCG TCTTCCACTC TTACTAGTGC ATGCTCTGAA TCTGTATGTA ATGGGAGTTC AACAGTCTGG ATCCATTATC CTAGCCGGGT CGOGTCAAGG TCTTTGAATA AOAGAGACAA TTCGTTTTGA TTCGGTGTAG AAGACATC ATG AAT ACT Met Asn Thr

1

TTC GTT CTC Phe Val Leu GOT GGT CGG CTC ATT Gly Gly Arg Leu Ile 5 GCT GGC Ala Gly 10 TCT CAC AAC AGA AAC GAA Ser His Asn Arg Asn Giu AAC GCC GAT Asn Ala Asp GAG AGT Olu Ser GCC AGA ATA CGA Ala Arg Ilie Arg GTA CAA Val Gin GAA CTG AOT GGG CAA ACA TGT CAA ATC TOT OGA GAT Olu Leu Ser Giy Gin Thr Cys Gin Ile Cys Giy Asp ATC OAA TTA Ile Glu Leu ACO GTT Thr Val AGC AGT GAG CTC Ser Ser Glu Leu GTT GCT TGC AAC Vai Ala Cys Asn TGC GCA TTC CCG Cys Ala Phe Pro TGT AGA CCA TGC Cys Arg Pro Cys GAG TAT GAA COT AGA Glu Tyr Giu Arg Arg OAA GGA AAT Giu Gly Asn CAA OCT Gin Ala CCA CGO Pro Arg TOT CCT CAG TOC AAA Cys Pro Gin Cys Lys OTT OAT OGA OAT GAT Val Asp Oly Asp Asp 100 ACT CGA TAC AAA Thr Arg Tyr Lys ATT AAA Ile Lys GOT AOT Gly Ser 526 GMA GAA GAA GM GAC ATT OAT Giu Glu Giu Giu Asp Ile Asp OAT CTT GAG TAT Asp Leu Giu Tyr 110 1~ WO 98/00549 PCTAU97OO4O2 124 GAG TTT GAT Glu Phe Asp 115 CAT GGG ATG GAC CCT GAA CAT GCC GCT His Gly Met Asp Pro Glu His Ala Ala 120 0CC GCA CTC Ala Ala Leu TCT TCA CGC CTT AAC ACC GGT CGT OGT GGA TTG Ser Ser Arg Leu Asn Thr Gly Arg Gly Gly Leu TCA OCT CCA CCT Ser Ala Pro Pro 670 GGC TCT CAG ATT OCT Giy Ser Gin Ile Pro 145 TTG ACT TAT TGT Leu Thr Tyr Cys GAA GAT GCT GAT ATG Glu Asp Ala Asp Met 160 TAT TCT GAT CGT Tyr Ser Asp Arg GCT OTT ATC GTG Ala Leu Ile Val COT TCA ACG GGA Pro Ser Thr Gly TAT GGG Tyr Gly 175 AAT CGC GTC Asn Arg Val OCT GCA COO TTT ACA Pro Ala Pro Phe Thr 185 GAT TCT TCT GCA CCT OCA CAG Asp Ser Ser Ala Pro Pro Gin 190 GCG AGA TCA ATG GTT Ala Arg Ser Met Val 195 CCT CAG AAA Pro Gin Lys 200 GAT ATT GCG G?.A TAT Asp Ile Ala Glu Tyr 205 GGT TAT GGA Gly Tyr Gly AGT GTT GCT TOG AAG GAC Ser Val Ala Trp Lys Asp 210 ATG GAA GTT TGG AAG AGA CGA CAA GGC Met Olu Val Trp Lys Arg Arg Gin Gly 220

GAA

Giu 225 AAG CTT CAA GTC Lys Leu Gin Val AAG CAT GAA GGA Lys His Glu Oly GGA AAC Oly Asn 235 GAO ATO Asp Met AAT GOT OGA Asn Gly Arg 958 TCC AAT OAT GAC GAC Ser Asn Asp Asp Asp 245 OAA OGA AGA CAA OCT Glu Gly Arg Gin Pro 260 GAA OTA OAT Glu LeU Asp OAT CCT Asp Pro 250 OCT ATG ATO OAT Pro Met Met Asp 255 COT TCA AOC AGA Arg Ser Ser Arg 270 1006 1054 CTC TCA AGA AAO CTA OCT ATT Leu Ser Arg Lys Leu Pro Ilie 265 WO 98/00549 PTA9/00 PCT/AU97/00402 125 ATA AAT CCT TAC AGG ATG Ile Asn Pro Tyr Arg Met 275 TTA ATT Leu Ile 280 CTG TGT CGC Leu Cys Arg CTC GCG ATT Leu Ala Ile 28S CTT GGT Leu Giy 1102 CTT TTC TTT Leu Phe Phe 290 CAT TAT AGA His Tyr Arg CTC CAT CCA GTC Leu His Pro Val GAT GCA TAT GGA Asp Ala Tyr Gly 1150

TTA

Leu 305 TGG TTA ACG TCA Trp Leu Thr Ser ATA TGC GAA ATA Ile Cys Glu Ile TTT GCA GTG Phe Ala Val ATT CTT GAT CAA Ile Leu Asp Gln CCC AAA TGG TAT Pro Lys Trp Tyr ATA GAA CGT GAA Ile Glu Arg Glu TCT TGG Ser Trp 320 ACA TAC Thr Tyr 335 TCA GGA Ser Gly 1198 1246 1294 CTC GAT AGA CTC Leu Asp Arg Leu 340 TCT CTC AGG TAC Ser Leu Arg Tyr AAG GAA GGA AAA Lys Glu Gly Lys

CCG

Pro 350 TTA GCA CCT Leu Ala Pro 355 GTT GAT GTT TTT GTT AGT ACA GTG GAT Val Asp Val Phe Val Ser Thr Val Asp 360 CCG TTG AAA GAG Pro Leu Lys Glu 365 CTA GCA GTT GAT Leu Ala Val Asp 1342 CCC CCC Pro Pro 370 TTG ATT ACA GCA Leu Ile Thr Ala ACA GTT CTT TCC ATT Thr Val Leu Ser Ile 380 1390

TAT

3 0 Tyr 385 CCT GTG GAT AAG GTT Pro Val Asp Lys Val 390 GCG TGT TAT GTA Ala Cys Tyr Val AAC AAT GGT Asn Asn Gly GCA GCT Ala Ala 400 1438 ATG CTT ACA TTT Met Leu Thr Phe GAA GCT CTC Glu Ala Leu 405 TCT GAT ACA GCT GAT Ser Asp Thr Ala Asp 410 TTT GCT ACA AAA Phe Ala Thr Lys 415 1486 TGG GTT CCT TTT TGT AAG AAG TTT AAT ATC GAG CCA CGA GCT CCT GAG Trp Val Pro Phe Cys Lys Lys Phe Asn Ile Giu Pro Arg Ala Pro Glu 420 425 430 1534 4 WO 98/00549 PCT/AU97/00402 126- TGG TAT TTT TCT CAG AAG ATG GAT TAC CTG AAG AAC Trp Tyr Phe Ser Gin Lys Met Asp Tyr Leu Lys Asn AAA GTT CAT CCT Lys Val His Pro 445 TAT GAA GAG TTT Tyr Giu Giu Phe 1582 OCT TTT Ala Phe 450 GTC AGO GAA CGT COT GCT ATG AAG AGA Val. Arg Giu Arg Arg Ala Met Lys Arg 455 1630 AAA OTG AAG, ATA AAT Lys Val Lys Ile Asn 465 CTO GTT GCT ACT Leu Val Ala Thr CAG AAA OTO CCT Gin Lys Val Pro 1678 OAA CGT TOG ACT ATO CAA OAT GGA ACT Oiu Arg Trp Thr Met Gin Asp Giy Thr 485 TOG CCT OGA AAC Trp Pro Gly Asn AAC OTC Asn Val 495 1726 COT GAC CAT CCT OGA ATO ATT CAG Arg Asp His Pro Oly Met Ile Gin 500 COT OAT ACO OAT GOT AAT GAO TTA Arg Asp Thr Asp Gly Asn Giu Leu 515 520 TTC TTO OOT CAT Phe Leu Oly His AOT OGA OTT Ser Oly Val 510 OTT TCT COT Val Ser.Arg 1774 CCA COT CTA OTO Pro Arg Leu Val 1822 GAO AAO Oiu Lys 530 COO CCT OGA TTT OAT CAC Arg Pro Oly Phe Asp His 535 CAC AAO AAA OCT His Lys Lys Ala 540 OGA OCT ATO AAT Gly Ala Met Asn 1870

TCC

Ser 545 TTG ATC CGA GTC Leu Ile Arg Val

GCT GTT CTA TCA Ala Val Leu Ser AAC GCT Asn Ala 555 CCT TAC CTT Pro Tyr Leu 1918 AAT GTC GAT TGT Asn Val Asp Cys CAC TAC ATC AAC His Tyr Ile Asn AGC AAA GCA Ser Lys Ala ATT AGA GAA Ile Arg Glu 575 1966 TCT ATG TGT TTC ATG Ser Met Cys Phe Met 580 ATG GAC CCO CAA TCO GGA AAG AAA GTT TGT TAT Met Asp Pro Gln Ser Gly Lys Lys Val Cys Tyr 585 590 2014 GTT CAG TTT CCG CAG AGA TTT GAT Val Gln Phe 595 Pro Gln Arg Phe Asp 600 GGG ATT GAT AGA CAT Gly Ile Asp Arg His 605 GAT AGA TAC Asp Arg Tyr 2062 TCA AAC Ser Asn 610 CGT AAC GTT GTG Arg Asn Val Val TTT GAT ATT AAC Phe Asp Ile Asn AAA GGT Lys Gly CTT GAT Leu Asp 2110

GGG

Gly 625 ATA CAA GGA CCG Ile Gln Gly Pro ATA TAT Ile Tyr 630 GTC GGG ACA Val Gly Thr TGT GTG TTT AGA Cys Val Phe Arg 2158 2206 CAG GCT CTT TAT Gln Ala Leu Tyr TTT GAT GCA CCA Phe Asp Ala Pro AAG AAG AAA CCA CCA GGC Lys Lys Lys Pro Pro Gly 655 AAA ACC TGT Lys Thr Cys AGA AAG AAG Arg Lys Lys 675 TGT TGG CCT AAA Cys Trp Pro Lys TGT TGT TTG TGT Cys Cys Leu Cys TGT GGG TTG Cys Gly Leu 670 AAC ACT AAA Asn Thr Lys 2254 AGT AAA ACG AAA Ser Lys Thr Lys ACA GAT Thr Asp AAG AAA ACT Lys Lys Thr 685 2302 GAG ACT Glu Thr 690 TCA AAG CAG ATT Ser Lys Gln Ile GCG CTA GAG AAT Ala Leu Glu Asn GAC GAA GGT GTT Asp Glu Gly Val

ATC

Ile 705 GTC CCA GTG TCA Val Pro Val Ser GTT GAG AAG AGA TCT Val Giu Lys Arg Ser 715 GAA GCA ACA CAA Glu Ala Thr Gin 2350 2398 2446 AAA TTG GAG AAG Lye Leu Glu Lys TTT GGA CAA TCT Phe Giy Gin Ser GTT TTC GTT GCC Val Phe Val Ala TCT GCT Ser Ala 735 GTT CTA CAG Val Leu Gin GGT GGA Gly Gly GTT CCC CGT AAC Val Pro Arg Asn 745 GCA AGC CCC Ala Ser Pro GCA TGT TTG Ala Cys Leu 750 2494 WO 98/00549 WO 9800549PCT/AU97/00402 128 TTA AGA GAA GCC ATT CAA GTT ATT AGC TGC GGG TAC Ser Cys Gly Tyr GAT AAA ACC Asp Lys Thr 2542 Leu Arg Oiu 755 Ala Ile Gin Val GAA TGG Oiu Trp 770 GGA AAA GAG ATC G TGG ATT TAT GGA Giy Lys Giu Ile Gly Trp Ile Tyr Oly 775 GTO ACT GAA OAT Vai Thr Oiu Asp 2590 ATC CTG ACO GOT TTC AAG Ile Leu Thr Giy Phe Lys 78S 790 ATG CAT TGC CAT GGA TOG AGA TCT GTG Met His Cys His Giy Trp Arg Ser Vai 795

TAC

Tyr 2638 2686 TGT ATG CCT AAG CGT GCA GCT TTT AAA GGA TCT GCT CCT ATT Cys Met Pro Lys Arg Ala Ala Phe Lys Gly Ser Ala Pro Ile AAC TTG Asn Leu 815 TCA GAT CGT Ser Asp Arg ATT TTC TTG Ile Phe Leu 835 CAT CAA GTT His Gln Val CTA CGT Leu Arg 825 TGG GCT CTT GGC TCT GTA GAG Trp Ala Leu Gly Ser Val Glu 830 TGG TAT GGT TAT GGT GGT GGT Trp Tyr Gly Tyr Gly Gly Gly 845 2734 AGC AGA CAT TGT CCG ATA Ser Arg His Cys Pro Ile 840 2782 TTA AAA TGG TTG GAG AGA Leu Lys Trp Leu Glu Arg 850 TCT TAC ATC AAC TCT Ser Tyr Ile Asn Ser 860 GTC GTC TAT CCT Val Val Tyr Pro 2830 2878

TOO

Trp 865 ACT TCA CTT CCA Thr Ser Leu Pro ATC OTC TAT TOT Ile Vai Tyr Cys CTC CCC OCO OTT Leu Pro Ala Val TTA CTC Leu Leu AC-A GGA AAA TTC Thr Giy Lys Phe 885 ATC GTC CCT Ile Val Pro ATA AGC AAC Ile Ser Asn TAC OCA GOT Tyr Ala Giy 895 OGA ATC CTC Gly Ile Leu 910 2926 ATA CTC TTC Ile Leu Phe CTC ATO Leu Met TTC ATA TCC Phe Ile Ser 905 ATA OCA Ile Ala OTA ACT Vai Thr 2974 WO 98/00549 PTA9/00 PCT/AU97/00402 -129- GAA ATG CAA TGG GGA GGT GTC GGA ATC GAT GAT TGG Giu Met Gin Trp Giy Gly Val Giy Ile Asp Asp Trp TGG AGA AAC GAG Trp Arg Asn Giu 925 3022 CAG TTT Gin Phe 930 TGG GTA ATC GGA Trp Val Ile Gly GCC TCC TCG CAT CTA TTT GCT CTG TTT Ala Ser Ser His Leu Phe Ala Leu Phe 940 3070 CAA GGT TTG CTC AAA Gin Giy Leu Leu Lys 945 CTA GCC GGA Leu Ala Giy GTT AAC Vai Asn 955 ACG AAT TTC ACA Thr Asn Phe Thr 3118 3166 ACT TCA AAA GCA Thr Ser Lys Ala GAC GAT GGA GCT TTC Asp Asp Gly Ala Phe 970 TCT GAG CTT TAC ATC TTC Ser Giu Leu Tyr Ile Phe 975 AAG TGG ACA ACT Lys Trp, Thr Thr 980 TTG TTG ATT Leu Leu Ile CCT CCG Pro Pro 985 ACA ACA CTT CTG Thr Thr Leu Leu ATC ATT AAC Ile Ile Asn 990 3214 ATC ATT GGA Ile Ile Giy 995 GTT ATT GTC Val Ile Val GGC GTT TCT Gly Val Ser 1000 GAT GCC ATT AGC AAT GGC TAT Asp Ala Ilie Ser Asn Gly Tyr 1005 3262 GAC TCA TGG Asp Ser Trp 1010 ATT GTT CAT Ile Val His 1025 AAA ATG CCT Lys Met Pro GGA CCT CTC TTT GGG AGA CTT TTC Gly Pro Leu Phe Gly Arg Leu Phe 1015 TTC GCT CTT TGG GTC Phe Ala Leu Trp Val 1020 3310 TTA TAC CCA TTC CTC Leu Tyr Pro Phe Leu 1030 AAG GGA ATG CTT GGG Lys Gly Met Leu Gly 1035 TGG TCT ATT CTT CTA Trp Ser Ilie Leu Leu 1050 AAG CAA GAC Lys Gin Asp 1040 GCT TCG ATC Ala Ser Ile 3358 3406 ATT Ilie 1045 ATT GTG GTC Ile Val Vai TTG ACA CTC TTG Leu Thr Leu Leu 1060 TGG GTC AGA ATT AAC CCG TTT GTG GCT AAA GGG GGA Trp Val Arg Ile Asn Pro Phe Val Ala Lys Gly Gly 3454 1065 1070 WO 98/00549 WO 9800549PCTIAU97/00402 130- CCA GTG TTG GAG ATC TGT GGT CTG AAT TOT GGA Pro Val Leu Glu Ile Cys Gly Leu Asn Cys Gly 1075 1080 AGTGAAAGAA GAGCAAAGGA GTTTGTGTTG 
GAGCTTTGGA TGCAAGTGTG TTTGTAGACA AAGATGTGCA GTTTTTACTT TTTTGTTACC CCTAAATTAA TTCTTTTGTT ATCATGGTTA TTTCTTTTTT ACATGTACTT TTAGTTATTC CGTAGTTATT ATATATACAC ACTTTGTTAA CAAAAAAAAA AAAAAAAAAA CTCGAATTGT CGACGCGGCC GCGAATTC AAC TAAGATCCTC Asn AGCAAATGTG TTGATGATGA TTTACGACTT GTTAAACCTT TACTAATAGA ATTGTTTGTT GTATAATACT GATAACGATC AAAAAAAAAA AAAGCGGCCG 3500 3560 3620 3680 3740 3800 3828 INFORMATION FOR SEQ ID NO:8: SEQUENCE CHARACTERISTICS: LENGTH: 1084 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8; Met Asn Thr Gly Gly Arg Leu Ile Ala Gly Ser His Asn Arg Asn Glu 1 5 10 Phe Val Leu Ile Asn Ala Asp Glu Ser Ala Arg Ile Arg Ser Val Gin 20 25 Giu Leu Ser Gly Gin Thr Cys Gin Ile Cys Gly Asp Giu Ile Giu Leu 40 WO 98/00549 WO 9800549PCT/AU97/00402 131 Thr Val Ser Ser Giu Leu Phe Val Ala Cys Asn Cys Ala Phe Pro Val 65 Cys Arg Pro Cys Glu Tyr Glu Arg Giu Gly Asn Gin Ala Cys Pro Gln Cys Thr Arg Tyr Lys Arg Glu Asp 105 Ile Lys Gly Ser Pro Arg Val Asp Gly Asp Giu Glu Giu Ile Asp Asp Leu Giu Tyr 110 Ala Ala Leu Giu Phe Asp His Gly Met Asp 115 Glu His Ala Ala Ser Ser 130 Arg Leu Asn Thr Arg Gly Gly Leu Ser Ala Pro Pro Gly 145 Ser Gin Ile Pro Leu Thr Tyr Cys Glu Asp Ala Asp Tyr Ser Asp Arg Ala Leu Ile Val Pro Ser Thr Gly Tyr Gly 175 Asn Arg Val Pro Ala Pro Phe Thr Asp Ser Ser Ala 185 Pro Pro Gin 190 Gly Tyr Giy Ala Arg Ser 195 Met Val Pro Gin Asp Ile Ala Giu Ser Val 210 Ala Trp Lys Asp Met Glu Val Trp Arg Arg Gin Gly Glu 225 Lys Leu Gin Val Lys His Giu Gly Asn Asn Giy Arg Ser Asn Asp Asp Glu Leu Asp Asp Pro Asp Met Pro Met Met Asp 255 WO 98/00549 PTA9/00 PCT/AU97/00402 132 Giu Giy Arg Gin Pro Leu Ser Arg 260 Lys Leu 26S Pro Ile Arg Ser Ser Arg 270 Ile Asn Pro 275 Tyr Arg Met Leu Leu Cys Arg Leu Ala Ile Leu Gly 285 Leu Phe 290 Phe His Tyr Arg Leu His Pro Val Asp Ala Tyr Gly Leu Trp Leu Thr Ser 305 Ile Cys Giu Ile Phe Ala Val Ser Ile Leu Asp Gin Pro Lys Trp Tyr Pro Ile Glu Arg Glu 330 Thr Tyr 335 Leu Asp Arg Leu Ser Leu Arg 
Tyr 340 Leu Ala Pro Val Asp Val Phe Vai 355 360 Lys Glu Gly Lys Pro Ser Giy 350 Leu Lys Glu Ser Thr Val Asp Pro Pro 370 Leu Ile Thr Aia Thr Vai Leu Ser Leu Ala Val Asp Pro Val Asp Lys Ala Cys Tyr Val Asn Asn Gly Ala Met Leu Thr Phe Ala Leu Ser Asp Thr 410 Ala Asp Phe Ala Thr Lys 41S Trp Vai Pro Phe Cys Lys Lys Phe Asn Ile Glu Pro Arg Ala Pro Giu 430 Trp Tyr Phe 435 Ser Gin Lys Met Asp Tyr Leu Lys Asn Lys Val His Pro Ala Phe 450 Val Arg Glu Arg Arg Ala Met Lys Arg ASP Tyr Glu Giu Phe 455 460 WO 98/00549 WO 9800549PCT/AU97/00402 133 Val Lys Ile Asn Leu Val Ala Thr Gin Lys Val Pro Glu Arg Trp Thr Gin Asp Gly Thr Trp, Pro Gly Asn Asn Val 495 Arg Asp His Gly Met Ile Gin Phe Leu Gly His Ser Gly Val 53-0 Val Ser Arg Arg Asp Asp Gly Asn Giu Pro Arg Leu Val Giu Lys 530 Arg Pro Gly Phe Asp His His Lys Lys 535 Gly Ala Met Asn Leu Ile Arg Val Ala Val Leu Ser Ala Pro Tyr Leu Asn Val Asp Cys His Tyr Ile Asn Ser Lys Ala Ile Arg Giu 575 Ser Met Cys Met Met Asp Pro Ser Giy Lys Lys Val Cys Tyr 590 Asp Arg Tyr Val Gin Pro Gin Arg Phe Gly Ile Asp Arg Ser Asn 610 Arg Asn Vai Val Phe Asp Ile Aen Lys Gly Leu Asp Ile Gin Gly Pro Tyr Vai Gly Thr Gly 635 Cys Val Phe Arg Gin Ala Leu Tyr Phe Asp Ala Pro Lys Lys Lys Pro Pro Gly 655 Lys Thr Cys Cys Trp Pro Lys Cys Cys Leu Cys Cys Gly Leu 670 WO 98/00549 PTA9/00 PCT/AU97/00402 134 Arg Lys Lys 675 Ser Lys Thr Lys Ala Thr Asp Lys Lys 680 Asn Thr Lys Giu Thr 690 Ser Lys Gin Ile Ala Leu Glu Asn Asp Giu Gly Val Vai Pro Val Ser Val Giu Lys Arg Giu Ala Thr Gin Lys Leu Giu Lys Phe Gly Gin Ser Val Phe Val Ala Ser Ala 735 Val Leu Gin Asn Gly Giy Vai Pro '740 Asn Ala Ser Pro Ala Cys Leu 750 Asp Lys Thr Leu Arg Giu 755 Ala Ile Gin Val Ser Cys Gly Tyr Giu Trp 770 Gly Lys Giu Ile Trp Ile Tyr Giy Val Thr Giu Asp Leu Thr Gly Phe Met His Cys His Trp, Arg Ser Val Cys Met Pro Lys Ala Ala Phe Lys Ser Ala Pro Ile Asn Leu 815 Ser Asp Arg His Gin Val Leu Trp Ala Leu Giy Ser Val Giu 830 Ile Phe Leu Ser Arg His Cys 835 Leu Lys Trp Leu Glu Arg Phe 850 855 Pro Ile Trp Tyr 
Gly Tyr Giy Giy Gly 840 845 Ser Tyr Ile Asn Ser Val Val Tyr Pro 860 Thr Ser Leu Pro ThrSerLeu Pro le Vai Tyr Cys ePrAaVl Leu Pro Ala Val WO 98/00549 WO 9800549PCT/AU97/00402 135 Leu Leu Thr Giy Phe Ile Val Pro Ile Ser Asn Tyr Ala Gly 895 Ile Leu Phe Met Leu Met Phe Ile 900 Ile Ala Val Thr Gly Ile Leu 910 Arg Asn Giu Giu Met Gin Trp Gly Gly Vai Gly Ile Asp Asp Trp 915 920 Phe 930 Trp Vai Ile Gly Ala Ser Ser His Phe Ala Leu Phe Giy Leu Leu Lys Leu Ala Gly Val Thr Asn Phe Thr Thr Ser Lys Ala Asp Asp Gly Ala Ser Giu Leu Tyr Ile Phe 975 Lys Trp Thr Leu Leu Ile Pro Thr Thr Leu Leu Ile Ile Asn 990 Ser Asn Gly Tyr 1005 Ile Ile Gly 995 Val Ile Val Gly Val Ser Asp Ala Ile 1000 Asp Ser 1010 Trp, Gly Pro Leu Phe Gly Arg Leu Phe 1015 Phe Ala Leu Trp Val 1020 Ile Val His Leu Tyr 1025 Lys Met Pro Thr Ile 104 Pro Phe Leu Lys Gly 1030 Met Leu Gly Lys Gin 1035 Asp 1040 Ile Val Val Trp Ser Ile Leu Leu Ala 1050 Ser Ile 1055 Leu Thr Leu Leu Trp Val Arg Ile 1060 Asn Pro Phe Val Ala 1065 Lys Gly Gly 1070 Pro Val Leu Giu Ile Cys Gly Leu Asn Cys Gly Asn 1075 1080 WO 98/00549 -136- INFORMATION FOR SEQ ID NO:9: SEQUENCE CHARACTERISTICS: LENGTH: 3614 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (vi) ORIGINAL SOURCE: ORGANISM: Arabidopsis thaliana STRAIN: Columbia (vii) IMMEDIATE SOURCE: CLONE: Ath-B (ix) FEATURE: NAME/KEY: CDS LOCATION: 217..3411 PCT/AU97/00402 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: GAATTCGCGG CCGCGTCGAC TACGGCTGCG AGAAGACGAC TCCTCTTCGT CTTCCTTATA AACTATCTCT CTGTAGAGAA AGAGAGATTC AGAGAGCCAC ATCACCACAC TCCATCTTCA TCCGACGTTT CGGTGTTGGA AGCAACTAAG TGACAA ATG Met 1 ACC GCG GGA AAG CCG ATG AAG AAC ATT GTT CCG Thr Ala Gly Lys Pro Met Lys Asn Ile Val Pro AGAAGGGGAT CCCAAGATTC GAAAGCTTGG ATCCAGATTG GATCTCATGA TTTGAACTAT GAA TCC GAA GGA GAA Glu Ser Glu Gly Glu CAG ACT TGC CAG ATC Gln Thr Cys Gln Ile WO 98/00549 WO 9800549PCT/AU97/00402 137 TOT AOT GAC AAT OTT GOC AAG ACT Cys Ser Asp Asn Val Gly Lys Thr OTT GAT GOA 
GAT CGT TTT GTG OCT Val Asp Gly Asp Arg Phe Val Ala TOT GAT ATT TOT TCA Cys Asp Ile Cys Ser AGG AAA GAT GGG AAT Arg Lys Asp Oly Asn TTC CCA OTT Phe Pro Val 4S TOT COO CCT TOC TAO GAG TAT GAG Cys Arg Pro Cys Tyr Olu Tyr Olu so CAA TCT TOT COT OAO TOO AAA ACC AGA Oln Ser Cys Pro Oin Cys Lys Thr Arg AOO CTC AAA OOT AGT CCT OCT ATT CCT GOT OAT AAA GAC GAG Arg Leu Lys Oly Ser Pro Ala Ile Pro Oly Asp L~ys Asp Oiu TAC AAO Tyr Lys OAT 000 Asp Gly GAG AAA Oiu Lys TTA OCT OAT OAA OOT ACT OTT GAG Leu Ala Asp Oiu Oly Thr Val Olu AAC TAO OCT CG Asn Tyr Pro Gin ATT TCA GAG Ile Ser Olu 105 COO ATO OTT OGT TOG OAT OTT ACT COT Arg Met Leu Oly Trp His Leu Thr Arg 110 000 AAG OOA GAO Gly Lys Oly Oiu 115 OAA ATO 000 GAA COO OAO 0Th Met Oly Giu Pro Gin 120 TAT OAT Tyr Asp 125 AAA GAO OTO TOT GZAO AAT OAT OTT Lys Giu Val Ser His Asn His Leu OCT COT OTO ACO AGO AGA CAA OAT ACT TOA OGA GAO TTT TOT Pro Arg Leu Thr Ser Arg Gin Asp Thr Ser Oly Oiu Phe Ser 135 140 145 TOA COT OAA COO Ser Pro Giu Arg OTO TOT Leu Ser 155 OTA TOT Val Ser TOT ACT ATO Ser Thr Ile OCT 000 OGA Ala Gly Gly GOT 000 Ala Ala 150 AAG COO Lys Arg 165 ATT OTO Ile Val OTT 000 TAT TOA TOA OAT GTO AAT CAA TOA OCA Leu Pro Tyr Ser Ser Asp Val Asn Gin Ser Pro AAT AGA AGO Asn Arg Arg 180 WO 98/00549 WO 9800549PCT/AU97/00402 138 GAT CCT GTT GGA Asp Pro Val Gly 185 CTC GGG AAT GTA GCT TGG AAG GAG AGA GTT GAT GGC Leu Gay Asn Val Ala Trp Lys Glu Arg Val. Asp Gly 190 195 TGG AAA ATG Trp Lys Met 200 AAG CAA GAG AAG Lys Gin Giu Lys 205 AAT ACT GGT CCT Asn Thr Giy Pro AGC ACG CAG GCT Ser Thr Gin Ala

GCT

Ala 215 TCT GAA AGA Ser Glu Arg GGT GGA Gly Gly 220 GTA GAT ATT Val Asp Ile GAT GCC Asp Ala 225 AGC ACA GAT ATC Ser Thr Asp Ile GCA GAT GAG GCT Ala Asp Glu Ala CTG AAT GAC GAA Leu Asn Asp Glu AGG CAG CTT CTG TCA AGG Arg Gln Leu Leu Ser Arg 245 AAA GTT Lys Val TCA ATT Ser Ile 250 CCT TCA TCA CGG Pro Ser Ser Arg AAT CCT TAC AGA Asn Pro Tyr Arg ATG GTT ATT Met Val Ile 260 1002 ATG CTG CGG Met Leu Arg 265 CTT GTT ATC CTT Leu Val Ile Leu CTC TTC TTG CAT Leu Phe Leu His TAC CGT ATA ACA Tyr Arg Ile Thr 275 TCT GTG ATA TGT Ser Val Ile Cys 1050 AAC CCA Asn Pro 280 GTG CCA AAT GCC Val Pro Asn Ala GCT CTA TGG CTG Ala Leu Trp Leu

GAG

Giu 295 ATC TGG TTT GCC Ile Trp, Phe Ala TCC TGG ATT TTG Ser Trp Ile Leu CAG TTT CCC AAG Gin Phe Pro Lys 1098 1146 11 94 TTT CCT Phe Pro GTG AAC CGT Val Asn Arg 315 GAA ACC TAG CTC Giu Thr Tyr Leu AGG CTT GCT TTA Arg Leu Ala Leu AGA TAT Arg Tyr 325 GAT CGT GAA GGT Asp Arg Giu Gly 330 GAG CCA TCA GAG TTA Giu Pro Ser Gin Leu 335 GCT GCT GTT Ala Aia Val GAC ATT TTC GTG Asp Ile Phe Val 340 1242 WO 98/00549 PTA9/00 PCT/AU97/00402 139- AGT ACT GTT Ser Thr Val 345 GAC CCC TTG AAO GAG CCA CCC CTT GTG Asp Pro Leu Lys Glu Pro Pro Leu Val 350 GCC AAC ACA Ala Asn Thr 1290 GTG CTC Val Leu 360 TCT ATT CTG GCT GTT GAC TAC CCA GTT Ser Ile Leu Ala Val Asp Tyr Pro Val 365 AAG GTG TCC TGT Lys Val Ser Cys 1338 1386 TAT Tyr 375 GTT TCT GAT GAT GGT GCT GCT ATG TTA Val Ser Asp Asp Gly Ala Ala Met Leu .380 TCA TTT GAA TCA CTT GCA Ser Phe Glu Ser Leu Ala 385 390 GAA ACA TCA GAG Glu Thr Ser Glu GCT COT AAA TG Ala Arg Lys Trp CCA TTT TGC AAG Pro Phe Cys Lys AAA TAT Lys Tyr 405 ATA GAT Ile Asp 1434 AGC ATA GAG CCT COT GCA CCA GAA Ser Ile Glu Pro Arg Ala Pro Glu 410 TOO TAC TTT GCT GCG AAA Trp Tyr Phe Ala Ala Lys 415 420 1482 TTG AAG Leu Lys 425 GAT AAA OTT CAG ACA TCA TTT OTC AAA OAT Asp Lys Val Gin Thr Ser Phe Val Lys Asp 430 435 COT AGA OCT Arg Arg Ala 1530 22 ATO AAG AGO Met Lys Arg 440 TCC AAA 0CC Ser Lys Ala 455 OAA TAT GAG OAA TTT AAA ATC CGA ATC AAT GCA CTT OTT Glu Tyr Olu Glu Phe Lye Ile Arg Ile Asn Ala Leu Val 1578 CTA AAA TOT CCT GAA OAA 000 TGO OTT ATO CAA OAT Leu Lys Cys Pro Olu Glu Gly Trp Val Met Gin Asp 1626 ACA CCO TOO Thr Pro Trp CCT OGA AAT AAT Pro Oly Asn Asn 475 ACA GOG GAC CAT Thr (fly Asp H-is 480 CCA OGA ATO Pro Gly Met ATC CAG Ile Gin 485 1674 GTC TTC TTA 000 CAA AAT GOT OGA CTT OAT GCA GAG Val Phe Leu Gly Gin Asn Gly Oly Leu Asp Ala Olu GGC AAT GAO CTC Gly Asn Glu Leu 500 1722 WO 98100549 PTA9IOO PCT/AU97/00402 -140- CCG COT TTC GTA Pro Arg Leu Val 505 TAT GTT TCT OGA GAA AAG Tyr Val Ser Arg Glu Lys 510 CGA CCA GGA Arg Pro Oly TTC CAG CAC Phe Gin His 1770 cAC AAA His 
Lys 520 AAG GCT GGT GCT Lys Ala Cly Ala ATG PAT GCA CTG Asn Ala Leu GTG AGA Val Arg 530 GTT TCA GCA GTT Val Ser Ala Val 1818

CTT

Leu 535 ACC PAT GGA CCT TTC Thr Asn Cly Pro Phe 540 ATC TTG AAT CTT GAT TGT GAT CAT TAC Ile Leu Asn Leu Asp Cys Asp His Tyr 545 1866 PAT PAC ACC AAA GCC Asn Asn Ser Lys Ala 555 TTA AGA Leu Arg GPA GCA ATG Olu Ala Met 560 TGC TTC CTG ATG Cys Phe Leu Met GAC CCA Asp Pro 565 TTT GAT Phe Asp 1914 1962 PAC CTC GGG Asn Leu Gly CAA GTT TGT TAT Gin Val Cys Tyr CAG TTC CCA CAA Gin Phe Pro Gin OCT ATC GAT AAG AAC GAT AGA Gly Ile Asp Lys Asn Asp Arg 585 GAT ATT AAC TTG AGA GGT TTA Asp Ile Asn Leu Arg Gly Leu 600 605 TAT GCT Tyr Ala 590 PAT COT AAT ACC GTG TTC TTT Asn Arg Asn Thr Val Phe Phe 595 2010 OAT GO ATT CAA OGA Asp Gly Ile Gin Gly 610 CCT GTA TAT GTC Pro Val Tyr Val 2058 OGA ACT Gly Thr 615 OGA TOT OTT TTC Oly Cys Val Phe 620 PAC AGA ACA OCA Asn Arg Thr Ala TAC OGT TAT GAA CCT Tyr Gly Tyr Giu Pro 630 2106 CCA ATA AAA GTA AAA Pro Ile Lys Val Lys 635 OCT OCA TCA AGA AC Oly Oly Ser Arg Lys 650 CAC PAG His Lys PAG CCA AGT CTT TTA TCT AAG Lys Pro Ser Leu Leu Ser Lys 640 CTC TOT Leu Cys 645 GAC AAA Asp Lys 2154 2202 PAG PAT TCC AAA GCT AAG AAA GAG Lys Asn Ser Lys Ala Lys Lys Giu 655 WO 98/00549 WO 9800549PCT/AU97/00402 141 AAG AAA TCA GGC AGG CAT ACT GAC Lys Lys Ser Gly Arg His Thr Asp 665 670 TCA ACT Ser Thr GTT CCT Val. Pro TTC AAC CTC Phe Asn Leu 2250 GAT GAC Asp Asp 680 ATA GAA GAG GGA Ile Glu Glu Gly GAA GGT GCT Giu Gly Ala

GCG

Ala 695 CTC TTA ATG TCG Leu Leu Met Ser ATG AGC CTG GAG Met Ser Leu Glu GGT TTT Gly Phe 690 AAG CGA Lys Arg 705 AAT GGT Asn Gly GAT GAT GAA AAG Asp Asp Glu Lys TTT GGA CAG Phe Gly Gln 2298 2346 2394 GCT GTT TTT GTT Ala Val Phe Val TCT ACC CTA Ser Thr Leu ATG GAA Met Glu 720 GGT GTT CCT CCT Gly Val Pro Pro 725 TCA GCA ACT Ser Ala Thr GAA AAC TTT CTC Glu Asn Phe Leu GAG GCT ATC CAT Glu Ala Ile His GTC ATT AGT Val Ile Ser 740 2442 TGT GGT TAT Cys Gly Tyr 745 GAG GAT AAG TCA Glu Asp Lys Ser TGG GGA ATG GAG ATT GGA TGG ATC Trp Gly Met Glu Ile Gly Trp Ile 755 2490 TAT GGT Tyr Gly 760 TCT GTG ACA GAA GAT Ser Val Thr Glu Asp 765 ATT CTG ACT GGG TTC Ile Leu Thr Gly Phe 770 AAA ATG CAT GCC Lys Met His Ala 2538

CGT

Arg 775 GGA TGG CGA TCC Gly Trp Arg Ser TAC TGC ATG Tyr Cys Met CCT AAG Pro Lys 785 CTT CCA GCT TTC AAG Leu Pro Ala Phe Lys 790 GGT TCT GCT CCT Gly Ser Ala Pro AAT CTT TCA GAT Asn Leu Ser Asp CTG AAC CAA GTG Leu Asn Gln Val CTG AGG Leu Arg 805 2586 2634 2682 TGG GCT TTA GGT Trp Ala Leu Gly 810 TCA GTT GAG ATT CTC TTC AGT CGG CAT Ser Val Glu Ile Leu Phe Ser Arg His 815 TGT CCT ATA Cys Pro Ile 820 TGG TAT GGT TAC AAT Trp Tyr Gly Tyr Asn 825 GGG AGG CTA AAA Gly Arg Leu Lys 830 TTT CTT GAG AGG TTT GCG TAT Phe Leu Glu Arg Phe Ala Tyr 2730 GTG AAC ACC Val Asn Thr 840 ACC ATC TAC Thr Ile Tyr ATC ACC TCC ATT Ile Thr Ser Ile CTT CTC ATG TAT Leu Leu Met Tyr 2778

TGT

Cys 855 ACA TTG CTA GCC Thr Leu Leu Ala TOT CTC TTC ACC Cys Leu Phe Thr TTT ATT ATT CCT Phe Ile Ile Pro 870 2826 CAG ATT AGT AAC Gin Ilie Ser Asn GCA AGT ATA TOG TTT CTG TCT CTC TTT Ala Ser Ile Trp Phe Leu Ser Leu Phe 880 CTC TCC Leu Ser 885 2874 ATT TTC 0CC Ile Phe Ala GOT ATA CTA GAA Gly Ile Leu Giu AGG TOG AGT GGC Arg Trp Ser Giy GTA GOC ATA Val Gly Ile 900 GOA GTA TCC Giy Val Ser 2922 GAC GAA TGG TGO AGA AAC GAG Asp Giu Trp Trp Arg Asn Oiu 905 CAG TTT TOG OTC ATT GOT Gin Phe Trp Val Ilie Oly 910 915 CAA GOT ATC CTC AAA GTC Gin Gly Ile Leu Lys Val 930 2970 OCT CAT TTA TTC Aia His Leu Phe 920 OCT OTO TTT Ala Val Phe 925 CTT 0CC GOT Leu Ala Oly 3018 ATT GAC ACA AAC TTC Ile Asp Thr Asn Phe 935 OTT ACC TCA AAA Val Thr Ser Lys TCA OAT GAA GAC Ser Asp Giu Asp 3066 3114 GAC TTT OCT GAO Asp Phe Ala Giu TAC TTG TTC AAA Tyr Leu Phe Lys ACA ACA CTT CTG Thr Thr Leu Leu ATT CCO Ile Pro 965 CCA ACO ACO Pro Thr Thr CTC ATT OTA AAC Leu Ile Val Asn TTA OTO Leu Val 975 OGA OTT OTT Oly Val Val OCA OGA GTC Ala Oly Val 980 3162 WO 98/00549 WO 9800549PCT/AU97/00402 143 TOT TAT GOT Ser Tyr Ala 985 AAG TTG TTC Lys Leu Phe 1000 AAG GGT TTG Lys Gly Leu 1015 ATC AAC AGT GGA TAC Ile Asn Ser Gly Tyr 990 TTT GCC TTC TGG GTG Phe Ala Phe Trp, Val 1005 CAA TCA TGG Gin Ser Trp GGA CCA CTC Gly Pro Leu 995 TTT GGT Phe .Gly 3210 ATT GTT CAC TTG TAO CCT TTO CTC Ile Val His Leu Tyr Pro Phe Leu 1010 3258 ATG GGT Met Gly OGA CAG AAC Arg Gin Asn 1020 CGG ACT COT ACC Arg Thr Pro Thr 1025 TTC TCG TTG TTG Phe Ser Leu Leu 1040 ATT GTT GTG Ile Val Val

GTC

Val 1030 3306 3354 TGG TOT Trp Ser GTT CTC TTG GCT TOT ATC Val Leu Leu Ala Ser Ile 1035 TGG GTT AGG ATT Trp Val Arg Ile 1045 GAT COO Asp Pro TTO ACT AGC CGA GTC ACT GGO CCG GAO ATT CTG Phe Thr Ser Arg Val Thr Giy Pro Asp Ile Leu 1050 1055 GAA TGT GGA Glu Cys Gly 1060 3402 ATO AAO TGT TGAGAAGOGA GCAAATATTT ACOTGTTTTG AGGGTTAAAA Ile Asn Cys 1065 AAAAOACAGA ATTTAAATTA TTTTTOATTG TTTTATTTGT TCAOTTTTTT ACTTTTGTTG TGTGTATOTG TOTGTTOGTT OTTOTGTOTT GGTGTOATAA ATTTATGTGT AGAATATATO TTAOTOTAGT TAOTTTGGAA AGTTATAATT AAAGTGAAAG OCA 3451 3511 3571 3614 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1065 amino acids TYPE: amino acid TOPOLOGY: linear WO 98/00549 WO 9800549PCT/AU97/00402 -144- (ii) MOLECULE (xi) SEQUENCE Giu Ser Giu Gly TYPE: protein DESCRIPTION: SEQ ID Met 1 Pro Giu Thr Ala Gly Pro Met Lys Asn Ile Val Gin Thr Cys Gin Ile Cys Ser Asp 25 Val Ala Cvs Asp Ile Val Gly Lys Giy Asp Arg Pro Cys Tyr Phe Cys Ser Phe Thr Val Asp Val Cys Arg Cys Pro Gin Tyr Giu Arg Lys Glu Asp Gly Asn Cys Lys Thr Arg Tyr Leu Lys Giy Ala Ile Pro Asp Lys Asp Giu Leu Ala Asp Thr Val Giu Phe Asn Tyr Pro Gin Lys Ile Ser Met Leu Gly Thr Arg Gly 115 Val Ser His 130 Gly Giu Glu Glu Pro Gin Trp His Leu 110 Asp Lys Giu Asp Thr Ser Asn His Leu Leu Thr Ser Gly 145 le Giu Phe Ser Ala Pro Giu Arg Val Ser Ser Ala Gly Gly Leu Pro Tyr Asp Val Asn Pro Asn Arg Arg Ile Val Asp Pro Val Gly Leu Gly Asn Val Ala Trp, WO 9$/00549 WO 9800549PCT/AU97/00402 145 Lys Giu Arg Vai Asp Gly Trp Lys Met Lys Gin Giu Asn Thr Gly Pro Val 210 Ser Thr Gin Ala Ser Giu Arg Gly Val Asp Ile Asp Ser Thr Asp Ile Ala Asp Giu Ala Leu Asn Asp Giu Arg Gin Leu Leu Arg Lys Val Ser Pro Ser Ser Arg Ile Asn 255 Pro Tyr Arg Val Ile Met Leu Arg Leu Val Ilie Leu 265 Cys Leu Phe 270 Ala Leu Trp Leu His Tyr 275 Arg Ile Thr Asn Val Pro Asn Ala Leu Val 290 Ser Val Ile Cys Ile Trp Phe Ala Ser Trp Ile Leu Asp Gin Phe Pro Lys 305 Arg Leu Ala Leu Arg Phe Pro Val Asn Arg Giu Thr Tyr Leu 315 Tyr Asp Arg Giu Giu Pro Ser Gin Leu Ala 335 Ala Val Asp 
Leu Val Thr 355 Ile Phe Val Ser Thr Vai Asp Pro Leu Lys 340 345 Ala Asn Thr Vai Leu Ser Ile Leu Aia Vai 360 365 Giu Pro Pro 350 Asp Tyr Pro Val Asp 370 Lys Val Ser Cys Tyr Val Ser 375 Asp Asp Gly 380 Ala Ala Met Leu Ser Phe 385 Giu Ser Leu Ala Glu Thr Ser Glu Phe Ala 390 395 Arg Lys Trp WO 98100549 WO 9800549PCT/AU97/00402 -146- Pro Phe Cys Lys Tyr Ser Ile Glu Pro Arg Ala Pro Glu Trp, Tyr 410 415 Phe Ala Ala Ile Asp Tyr Leu Lys Asp Lys Val Gin 425 Thr Ser Phe 430 Phe Lys Ile Val Lys Asp 435 Arg Arg Ala Met Arg Glu Tyr Glu Arg Ile 450 Asn Ala Leu Val Lys Ala Leu Lys Pro Giu Giu Gly Val Met Gin Asp Thr Pro Trp Pro Asn Asn Thr Gly His Pro Gly Met Gin Val Phe Leu Gin Asn Gly Gly Leu Asp 495 Ala Giu Gly Giu Leu Pro Arg Val Tyr Val Ser Arg Giu Lys Asn Ala Leu Arg Pro Gly 515 Phe Gin His His Lys Ala Gly Ala Val Asp 545 Val Ser Ala Val Thr Asn Giy Pro Phe 540 Ile Leu Asn Leu Cys Asp His Tyr Asn Asn Ser Lys Leu Arg Giu Ala Cys Phe Leu Met Pro Asn Leu Gly Gin Val Cys Tyr Val Gin 575 Tyr Ala Asn 590 Phe Pro Gin Phe Asp Gly Ile Asp Lys Aso Asp Arg 585 Arg Asn Thr 595 Val Phe Phe Asp Ile f;t)0 Asn Leu Arg Gly Leu Asp Gly Ile 605 WO 98/00549 WO 9800549PCT/AU97/00402 147- Gin Gly Pro Val Tyr Val Gly Thr Gly Cys Val Phe 620 Asn Arg Thr Ala Leu 625 Tyr Gly Tyr Glu Leu Ser Lys Leu 645 Pro Ile Lys Val His Lys Lys Pro Leu Cys Gly Gly Ser Lys Lys Asn Ser Lys Ala 655 Lys Lys Giu Val Pro Val 675 Asp Lys Lys Lys Gly Arg His Thr Asp Ser Thr 670 Giu Gly Ala Phe Asn Leu Asp Ile Glu Glu Gly Gly Phe 690 Asp Asp Giu Lys Leu Leu Met Ser Gin 700 Met Ser Leu Glu Lys -705 Arg Phe Giy Gin Ala Val Phe Val Ser Thr Leu Met Asn Gly Gly Val Pro Ser Ala Thr Giu Asn Phe Leu Lys Giu 735 Ala Ile His Ile Ser Cys Gly Tyr Glu Asp Lys Ser 745 Asp Trp Gly 750 Ile Leu Thr met Giu Ile Gly Trp Ilie Tyr Ser Vai Thr Glu Gly Phe 770 Lys Met His Ala Giy Trp Arg Ser Tyr Cys Met Pro Lys Leu 78s Pro Ala Phe Lys Gly Ser Ala 790 Pro Ile Asn 795 Leu Ser Asp Arg 800 Leu Asn Gin Val Leu Arg 805 Trp Ala Leu Gly Ser Val Giu Ile 
Leu Phe 815 WO 98/00549 148 Ser Arg His Cys Pro Ile Trp Tyr Gly Tyr Asn Gly Arg Leu Lys Phe PCT/AU97/00402 Leu Giu Arg 835 Phe Ala Tyr Val Thr Thr Ile Tyr Ile Thr Ser Ile Pro 850 Leu Leu Met Tyr Thr Leu Leu Ala Val1 860 Cys Leu Phe Thr Asn 865 Gin Phe Ile Ile Gin Ilie Ser Asn Ala Ser Ile Trp Leu Ser Lau Phe Ser Ile Phe Ala Gly Ilie Leu Giu Met Arg 895 Trp Ser Giy Gly Ile Asp Giu Trp, Arg Asn Glu Gin Phe Trp 910 Gin Gly Ile Vai Ile Giy Gly Val Ser Ala 915 Leu Phe Ala Val Leu Lys 930 Val Lau Ala Giy Asp Thr Asn Phe Val Thr Ser Lys Ala 945 Ser Asp Giu Asp Giy Asp Phe Ala Giu 950 Tyr Leu Pha Lys Thr Thr Leu Leu Pro Pro Thr Thr Leu Ile Val Asn Leu Vai 975 Gly Val Val Gly Val Ser Tyr Ile Asn Ser Giy Tyr Gin Sar 990 Trp Gly Pro Leu 995 His Leu Tyr Pro 1010 Phe Gly Lys Leu Phe Phe Ala Phe 1000 Trp Val Ile Val 1005 Phe Leu Lys Giy Leu Met 1015 Giy Arg Gin Asn Arg Thr 1020 WO 98/00549 PCT/AU97/00402 149- Pro Thr 1025 Ile Val Val Val 1030 Trp Ser Val Leu Leu Ala Ser Ile Phe Ser 1035 1040 Asp Pro Phe Thr Ser Arg Val Thr Gly Pro 1050 1055 Leu Leu Trp Val Arg Ile 1045 Asp Ile Leu Glu Cys Gly 1060 Ile Asn Cys 1065 INFORMATION FOR SEO ID NO:11: SEQUENCE CHARACTERISTICS: LENGTH: 3673 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (vi) ORIGINAL SOURCE: ORGANISM: Arabidopsis thaliana STRAIN: Columbia INDIVIDUAL ISOLATE: rswl mutant (ix) FEATURE: NAME/KEY: CDS LOCATION: 71..3313 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: GAATCGGCTA CGAATTTCCC AATTTTGAAT TTTGTGAATC TCTCTCTTTC TCTGTGTGTC WO 98/00549 WO 9800549PCT/AU97/00402 150 GGTGGCTOCG ATG GAG GCC AGT GCC GGC TTG GTT GCT GGA TCC TAC CGG Met Giu Ala Ser Ala Gly Leu Val Ala Gly Ser.Tyr Arg AGA AAC GAG CTC Arg Asn Giu Leu GTT CGG Val Arg ATC CGA Ile Arg 20 CAT GAA TCT GAT GOC GGG ACC AAA His Olu Ser Asp Oly Gly Thr Lys

CCT

Pro TTO AAG AAT Leu Lys Asn ATG AAT GGC CAG ATA TGT CAG ATC TOT GOT Met Asn Oly Gin Ile Cys Gin Ile Cys Gly GAT GAT Asp Asp GTT GGA CTC GCT GAA Val Giy Leu Ala Giu ACT GGA OAT GTC TTT Thr Oly Asp Val Phe GTC GCO TOT AAT Val Ala Cys Asn 0CC TTC CCT Ala Phe Pro TOT COG CCT TOC Cys Arg Pro Cys GAO TAC GAG AGO Glu Tyr Giu Arg AAA OAT OGA Lys Asp Gly ACT CAG TOT Thr Gin Cys TGC CCT CAA TOC Cys Pro Gin Cys ACT AGA TTC AGA COA CAC AOO GG Thr Arg Phe Arg Arg His Arg Oly AGT CCT Ser Pro CGT OTT GAA OGA OAT GAA OAT GAO OAT GAT OTT OAT OAT ATC Arg Val Oiu Oly Asp Oiu Asp Oiu Asp Asp Val Asp Asp Ile 100 105

GAG

Giu 110 AAT GAO TTC AAT TAC GCC CAG OGA Asn Giu Phe Asn Tyr Ala Gin Oly 115 OCT MAC Ala Asn 120 AAG GCG AGA Lys Ala Arg CAC CAA His Gin 125 COC CAT GOC Arg His Gly OAA GAG TTT TCT TCT TCC TCT AGA CAT GAA TCT CAA CCA Oiu Olu Phe Ser Ser Ser Ser Arg His Olu Ser Gin Pro ATT CCT CTT CTC Ile Pro Leu Leu 145 ACC CAT GOC CAT ACO OTT TCT GGA GAG ATT COC ACO Thr His Oly His Thr Val Ser Gly Glu Ile Arg Thr 150 155 WO 98/00549 WO 9800549PCT/AU97/00402 151 CCT GAT ACA CAA TCT GTG CGA ACT ACA TCA GGT Pro Asp Thr 160 Gin Ser Val Arg Thr Thr Ser Gly 165 CCT TTG GGT CCT TCT Pro Leu Gly Pro Ser 170 GAC AGG AAT GCT ATT TC-A TCT CCA Asp Arg Asn Ala Ile Ser Ser Pro 175 180 TAT ATT GAT Tyr Ile Asp CGG CAA CCT GTC Arg Gin Pro Val

CCT

Pro 190 GTA AGA ATC GTG GAC Val Arg Ile Val Asp 195 CCG TCA AAA GAC Pro Ser Lys Asp AAC TCT TAT GGG Asn Ser Tyr Gly GGT A.AT GTT GAC Gly Asn Val Asp AAA GAA AGA GTT Lys Giu Arg Val GAA GGC TGG AAG CTG AAG CAG Giu Gly Trp Lys Leu Lys Gin 215 220 AAA TAC CAT GAA GGG AAA GGA Lys Tyr His Giu Gly Lys Gly 235 GAG AAA AAT Giu Lys Asn TTA CAG ATG ACT Leu Gin Met Thr GGA GAA ATT Gly Giu Ile 240 GAA GGG ACT GGT TCC AAT GGC GAA Glu Gly Thr Gly Ser Asn Gly Giu 24S GAA CTC Giu Leu 250 CCT ATC Pro Ile 265 CAA ATG GCT Gin Met Ala GAT GAT Asp Asp 255 ACA CGT CTT CCT Thr Arg Leu Pro AGT COT GTG GTG Ser Arg Val Val CCA TCT TCT Pro Ser Ser

CGC

3 0 Arg 270 CTA ACC CCT TAT Leu Thr Pro Tyr GTT GTG ATT ATT Val Val Ilie Ile CGG CTT ATC ATC Arg Leu Ile Ile TGT TTC TTC TTG CAA Cys Phe Phe Leu Gin 290 TAT COT ACA ACT CAC Tyr Arg Thr Thr His 295 CCT GTG AAA AAT Pro Val Lys Asn GCA TAT Al a Tyr 300 CCT TTG TG Pro Leu Trp ACC TOG GTT ATC Thr Ser Val Ile GAG ATC TOG TTT Oiu Ile Trp Phe GCA TTT TOT Ala Phe Ser 315 1021 WO 98100549 WO 9800549PCT/AU97/00402 152 TGG GTT CTT GAT GAG TTT CCC AAA TGG TAG GGG ATT AAC AGG GAG ACT 1069 Trp Leu Leu 320 Asp Gin Phe Pro Lys 325 Trp Tyr Pro Ilie Asn Arg Glu Thr 330 Tyr GTT GAG GGT CTG GCT ATA AGA TAT GAT CGA GAC GGT GAA GGA TGA Leu Asp Arg Leu Ala Ile Arg Tyr Asp Arg Asp Gly Giu Pro Ser 335 340 345 1117 GAG CTC Gin Leu 350 GTT GGT GTT GAT Vai Pro Val Asp 355 GTG TTT GTT AGT Val Phe Vai Ser GTG GAG GGA TTG Vai Asp Pro Leu 1165 GAG GGT GGG GTT Giu Pro Pro Leu AGA GCA AAG ACA Thr Ala Asn Thr CTC TCG ATT CTT TGT GTG Leu Ser Ile Leu Ser Vai 380 1213 GAC TAG GGG GTA Asp Tyr Pro Val 385 GAT AAA GTA GGG TGT TAT GTT TCA GAT Asp Lys Vai Ala Gys Tyr Val Ser Asp 390 GGT TGA Gly Ser 1261 OCT ATG GTT Ala Met Leu 400 ACG TTT GAA TCC CTT TGT GAA AGC GCT Thr Phe Giu Ser Leu Ser Glu Thr Ala 405 TTT GGA AAG Phe Ala Lys 1309 AAA TGG GTA CGA TTT TGG Lys Trp Vai Pro Phe Gys 415 AAA TTC AAC ATT GAA CCT AGG GCC CCT Lys Phe Asn Ile Glu Pro Arg Ala Pro 425 1357

GAA

Giu 430 TTG TAT TTT GC Phe Tyr Phe Ala AAG ATA GAT TAG Lys Ile Asp Tyr AAG GAC AAG ATG GAA Lys Asp Lys Ile Gin 445 1405 GGG TGT TTT OTT Pro Ser Phe Val AAA GAG OGA GGA Lys Giu Arg Arg 450 GGT ATG AAG Ala Met Lys 455 AGA GAG Arg Giu TAT GAA GAG Tyr Giu Oiu 460 AAA ATG CGT Lys Ile Pro 475 1453 1501 TTT AAA GTG Phe Lys Val AGG ATA AAT GGT GTT GTT GGG Arg Ilie Asn Ala Leu Val Ala 465 470 AAA OCA GAG Lys Ala Gin WO 98/00549 WO 9800549PCT/AU97/00402 153 GAA GAA GGC Glu Glu Gly 480 TGG ACA ATG CAG GAT Trp Thr Met Gin Asp GGT ACT Gly Thr CCC TGG Pro Trp CCT GGT AAC AAC Pro Gly Asn Asn 1549 ACT AGA Thr Arg 49S GAT CAT CCT GGA ATG Asp His Pro Gly Met S00 ATA CAG GTG TTC Ile Gin Val Phe TTA GGC Leu Gly 505 CAT AGT GGG His Ser Gly 1597

GGT

Gly 510 CTG GAT ACC GAT Leu Asp Thr Asp AAT GAG CTG CCT Asn Glu Leu Pro AGA CTC ATC TAT GTT TCT Arg Leu Ile Tyr Val Ser S20 525 AAA AAG GCT GGA GCT ATG Lys Lys Ala Gly Ala Met 540 CGT GAA AAG CGG CCT Arg Glu Lys Arg Pro 530 GGA TTT CAA CAC Gly Phe Gin His 1645 1693 1741 AAT GCA TTG Asn Ala Leu COT GTA TCT OTT OTT CTT ACC AAT GGA Arg Val Ser Val Val Leu Thr Asn Gly 550 GCA TAT CTT Ala Tyr Leu 555 OCT ATT AAA Ala Ile Lys TTG A.AC OTG Leu Asn Val 560 GAT TGT GAT CAT TAC Asp Cys Asp His Tyr 565 TTT AAT AAC Phe Asn Asn AGT AAG Ser Lys 570 1789 GAA GCT Giu Ala 575 ATG TGT TTC ATO Met Cys Phe Met ATG GAC Met Asp 580 CCG GCT ATT Pro Ala Ile GGA AAG AAG TGC TOC Oly Lys Lys Cys Cys 585 1837 1885

TAT

Tyr 590 OTC CAG TTC CCT Val Gin Phe Pro CAA COT Gin Arg 595 TTT GAC GOT Phe Asp Gly OAT TTG CAC GAT Asp Leu His Asp TAT 0CC AAC AGG AAT Tyr Ala Asn Arg Asn 610 ATA GTC TTT TTC GAT ATT AAC ATO AAG Ile Val Phe Phe Asp Ile Asn Met Lys 615 GGG TTO Oly Leu 620 1933 GAT GOT ATC CAG GGT CCA OTA TAT GTO GOT ACT GGT TOT TOT TTT AAT 1981 Asp Gly Ilie Gly Pro Vai Tyr Val Gly Thr Gly Cys Cys Phe Asn 630 635 WO 98/00549 WO 9800549PCT/AU97/00402 154- AGG CAG GCT Arg Gin Ala 640 CTA TAT GGG TAT Leu Tyr Gly Tyr CCT GTT TTG Pro Val Leu ACG GAA Thr Giu 650 GAA GAT TTA Giu Asp Leu 2029 GAA CCA Glu Pro 655 AAT ATT ATT Asn Ile Ile GTC AAG AGC TGT TGC GGG Val Lys Ser Cys Cys Gly 660 TCA AGG AAG AAA GGT Ser Arg Lys Lys Gly 665 AGA GGC ATC AAC AGA Arg Giy Ile Asn Arg 685 2077 2125

AAA

Lys 670 AGT AGC AAG AAG Ser Ser Lys Lys AAC TAC GAA AAG Asn Tyr Giu Lys AGT GAC TCC AAT GCT CCA CTT TTC AAT Ser Asp Ser Asn Ala Pro Leu Phe Asn 690 GAG GAC ATC GAT Giu Asp Ilie Asp GAG GGT Glu Gly 700 2173 TTT GAA GGT Phe Glu Gly GAT GAT GAG AGG Asp Asp Giu Arg TCT ATT Ser Ile 710 CTA ATG TCC CAG AGG AGT Leu Met Ser Gin Arg Ser 715 2221 GTA GAG Val Giu CGT TTT GGT CAG Arg Phe Giy Gin CCG GTA TTT ATT Pro Val. Phe Ile GCA ACC TTC Ala Thr Phe 2269 2317 ATG GAA CAA GGC GGC ATT Met Giu Gin Giy Gly Ile CCA ACA ACC AAT Pro Thr Thr Asn GCT ACT CTT CTG Ala Thr Leu Leu

AAG

Lys GAG GCT ATT CAT Giu Ala Ile His ATA AGC TGT GGT Ile Ser Cys Gly GAA GAG Glu Asp ACT GAA Thr Giu 765 2365 TGG GGC AAA GAG Trp Gly Lys Giu GGT TGG ATC TAT Gly Trp Ile Tyr TCC GTG ACG Ser Val Thr GAA GAT ATT Giu Asp Ilie 780 2413 CTT ACT GGG TTC AAG Leu Thr Giy Phe Lys 785 ATO CAT GCC CGG GGT TGG ATA TCG ATC TAC TGC 2461 Met His Aia Arg 790 Gly Trp Ile Ser Ile Tyr Cys 795 WO 98/00549 PTA9/00 PCT/AU97/00402 155 AAT OCT CCA CGC Asn Pro Pro Arg 800 OCT GCG TTC AAG GGA TOT GCA CCA ATC AAT OTT TOT Pro Ala Phe Lys Giy Ser Ala Pro Ile Asn Leu Ser 2509 GAT CGT Asp Arg 815 TTG AAC CAA GTT Leu Asn Gin Val CGA TGG GCT TTG GGA Arg Trp Ala Leu Gly 825 TOT ATC GAG ATT Ser Ile Glu Ile 2557 2605

OTT

Leu 830 CTT AGO AGA CAT Leu Ser Arg His OCT ATO TGG TAT Pro Ile Trp Tyr TAO OAT GGA AGG Tyr His Giy Arg AGA CTT TTG Arg Leu Leu GAG AGG ATC Giu Arg Ile 850 GCT TAT ATO AAC Aia Tyr Ilie Asn 855 ACC ATC GTO TAT Thr Ilie Val Tyr COT ATT Pro Ile 860 2653 ACA TCO ATC Thr Ser Ile ATC ACC GAC Ile Thr Asp 880 OCT OTT Pro Leu 865 ATT GOG TAT TGT Ile Ala Tyr Oys 870 ATT OTT COO GOT Ile Leu Pro Ala TTT TGT OTO Phe Cys Leu 875 000 AGT ATT Ala Ser Ile 2701 AGA TTO ATO ATA Arg Phe Ile Ile GAG ATA AGO AAO Glu Ile Ser Asn 2749 TGG TTC Trp Phe 895 ATT OTA OTO TTC Ile Leu Leu Phe TOA ATT GOT GTG Ser Ilie Ala Val ACT OGA ATO OTG GAG Thr Giy Ile Leu Oiu 905 TGG AGO AAO GAG CAG Trp, Arg Asn Giu Gin 925 2797 2845

OTG

Leu 910 AGA TOG AGO GGT Arg Trp Ser Gly AGO ATT GAG GAT TG Ser Ile Giu Asp Trp 920 TTO TGG GTO ATT Phe Trp Val Ile GOT 000 ACA Oiy Oly Thr 930 TOO 000 OAT Ser Ala His 935 OTT TTT GOT GTO TTC CAA Leu Phe Ala Vai Phe Gin 940 2893 GGT OTA OTT AAG GTT OTT GOT Oly Leu Leu Lys Val Leu Ala 945 GOT ATO GAO ACC AAO TTC ACC GTT ACA Gly Ile Asp Thr Asn Phe Thr Val Thr 950 955 2941 WO 98/00549 PTA9/00 PCT/AU97/00402 -156- TCT AAA GCC ACA GAC GAA GAT Ser Lys Ala Thr Asp Glu Asp 960 GAT TTT GCA Asp Phe Ala GAA CTC TAC ATC TTC Giu Leu Tyr Ile Phe 2989 AAA TGG Lys Trp 975 ACA GCT CTT CTC Thr Ala Leu Leu CCA CCA ACC ACC GTC CTA CTT GTG AAC Pro Pro Thr Thr Val Leu Leu Val Asn 985 3037

CTC

Leu 990 ATA CCC ATT GTG Ile Gly Ile Val CGT GTC TCT Gly Val Ser TAT GCT GTA Tyr Ala Val 1000 AAC ACT GGC TAC Asn Ser Gly Tyr 1005 308S CAG TCC TGG GGT Gin Ser Trp Gly CCG CTT Pro Leu 1010 TTC GGG AAG Phe Gly Lys CTC TTC Leu Phe 1015 TTC GCC TTA TGG GTT Phe Ala Leu Trp Val 1020 3133 ATT GCC CAT Ile Ala His CTC TAC Leu Tyr 1025 CCT TTC TTG Pro Phe Leu AAA GGT CTG Lys Gly Leu 1030 TTG GGA Leu Gly AGA CAA AAC Arg Gin Asn 1035 3181 CGA ACA CCA ACC Arg Thr Pro Thr 1040 ATC GTC ATT Ile Val Ile GTC TGG Val Trp 1045 TCT GTT Ser Val CTT CTC GCC TCC ATC Leu Leu Ala Ser Ile 1050 GTG GAC GCC AAT CCC Val Asp Ala Asn Pro 1065 3229 3277 TTC TCG TTG CTT TGC Phe Ser Leu Leu Trp l055 GTC AGO ATC AAT CCC TTT Vai Arg Ilie Asn Pro Phe 1060 AAT GCC AAC AAC TTC AAT GGC AAA OCA GOT GTC Asn Ala Asn Asn Phe Asn Gly Lys Gly Gly Val TTT TAGACCCTAT Phe 3323 1070 1075 1080 TTATATACTT GTGTGTGCAT ATATCAAAAA CGCGCAATGG GAATTCCAAA TCATCTAAAC CCATCAAACC CCAOTGAACC GGGCAGTTAA GOTGATTCCA TGTCCAAGAT TAGCTTTCTC cGAGTAGcr-A GAGAAGGTGA AATTGTTCGT AACACTATTG TAATGATTTT CCAGTGGGGA AGAAGATGTO GACCCAAATG ATACATAGTC TACAAAAAGA ATTTGTTATT CTTTCTTATA 3383 3443 3503 3563 WO 98/00549


-157- TTTATTTTAT TTAAAGCTTG TTAGACTCAC ACTTATGTAA TGTTGGAACT TGTTGTCCTA AAAAGGGATT GGAGTTTTCT TTTTATCTAA GAATCTGAAG TTTATATGCT 'CT/AU97100402 3623 3673 INFORMATION FOR SEQ ID NO:12: Wi SEQUENCE CHARACTERISTICS: LENGTH: 1081 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: Met Giu Ala Ser Ala Gly Leu Val Ala Gly Ser Tyr Arg Arg Asn Giu Leu Val Arg 5 Ile Arg Asp Gly 25 His Glu Ser Gly Thr Lys Asn Met Ala Glu Val Cys Gly Gin Ile Cys Ile Cys Gly Asp Asp Cys Pro Leu Lys Val Gly Leu Ala Phe Pro Thr Gly Asp Val Val Ala Cys Asn Arg Pro Cys Tyr Giu Arg Lys Asp Gly Thr Gln Cys Pro Gin Cys Lys Arg Phe Arg Arg His Arg Gly Ser Pro Arg Val Glu Gly Asp Giu Asp Glu Asp Asp Val Asp Asp 100 105 Ile Giu Asn Glu 110 WO 98/00549PC/19/40 PCT/AU97/00402 158 Phe Asn Tyr 115 Ala Gin Gly Ala Lys Ala Arg His Gin Arg His Gly 125 Glu Giu 130 Phe Ser Ser Ser Arg His Giu Ser Gin 140 Pro Ile Pro Leu Thr His Gly His Vai Ser Gly Glu Arg Thr Pro Asp Gin Ser Vai Arg Thr Thr Ser Gly Pro 165 Gly Pro Ser Asp Arg Asn 175 Ala Ile Ser Pro Tyr Ile Asp Arg Gin Pro Val Pro Val Arg 190 Ile Vai Asp 195 Pro Ser Lys Asp Asn Ser Tyr Gly Leu Gly Asn Val 205 Asp Trp 210 Lys Giu Arg Val Gly Trp Lys Leu Gln Giu Lys Asn Leu Gin Met Thr Lys Tyr His Giu Lys Gly Giy Giu Giu Gly Thr Gly Asn Gly Giu Giu Gin Met Ala Asp Asp Thr 255 Arg Leu Pro Pro Tyr Arg 275 Ser Arg Val Val Ile Pro Ser Ser Arg Leu Thr 270 Leu Cys Phe Phe 285 Val Val Ile Ile Arg Leu Ilie Ile Leu Gin 290 Tyr Arg Thr Thr Pro Vai Lys Asn Ala Tyr Pro Leu Trp 300 Thr Ser Val Ile Cys Giu Ile Trp Phe 310 Phe Ser Trp Leu Leu WO 98/00549 WO 9800549PCT/AU97/00402 159 Asp Gin Phe Pro Lys Trp Tyr ProIle Asn Arg Giu Thr 325 330 Arg Leu Ala Tyr Asp Arg Glu Pro Ser Tyr Leu Asp 335 Gin Leu Val 350 Giu Pro Pro Asp Tyr Pro Pro Val Asp 355 Leu Val Thr Phe Val Ser Asp Pro Leu Ala Asn Thr Ser Ile-Leu 370 Vai Asp Lys Val Ala Vai Ser Asp Ser Ala Met 385 Thr Phe Glu Ser Glu Thr Ala Ala Lys Lys Trp Vai 415 Pro Phe Cys 
Phe Asn Ile Arg Ala Pro Phe Ala Gin 435 Ile Asp Tyr Asp Lys Ile Glu Phe Tyr 430 Pro Ser Phe Phe Lys Val Val Lys 450 Ara Ilie Glu Arg Arg Ala Arg Giu Tyr Asn Ala Leu Trp Thr Met Gin Val Ala Lys Ala Gin 470 Gly Thr Pro Trp Pro 490 Gin Val Phe Leu Gly Pro Giu Giu Asn Asn Thr Arg Asp 495 Leu Asp His Pro Gly His Ser Gly Thr Asp Gly 515 Asn Glu Leu Pro Arg Leu Ile 520 Tyr Val Ser 525 Arg Giu Lys WO 98/00549 PTA9/00 PCT/AU97/00402 160- Arg Pro Gly Phe Gin His His Lys Lys Ala Gly Met Asn Ala Leu Ile 545 Arg Val Ser Val Leu Thr Asn Gly Ala 555 Tyr Leu Leu Asn Asp Cys Asp His Phe Asn Asn Ser Ala Ile Lys Giu Ala Met 575 Cys Phe Met Phe Pro Gin 595 Asp Pro Ala Ile Lys Lys Cys Cys Tyr Val Gin 590 Tyr Ala Asn Arg Phe Asp Gly Asp Leu His Asp Arg Asn 610 Ilie Val Phe Phe Ile Asn Met Lys Leu Asp Gly Ile Gin 625 Gly Pro Val Tyr Giy Thr Gly Cys Phe Asn Arg Gin Leu Tyr Gly Tyr Pro Val Leu Thr Giu Asp Leu Giu Pro Asn 655 Ile Ile Val Ser Cys Cys Gly Arg Lys Lys Gly Lys Ser Ser 670 Ser Asp Ser Lys Lys Tyr 675 Asn Tyr Giu Lys Arg Arg Gly Ile Asn 680 Asn Ala 690 Pro Leu Phe Asn Giu Asp Ile Asp Gly Phe Giu Gly Tyr 705 Asp Asp Giu Arg Ser Ile Leu Met 710 Ser Gin 715 Ala Ala 730 Arg Ser Val Giu Arg Phe Gly Gin Ser Pro Val Phe Ilie Thr Phe Met Giu Gin 735 WO 98/00549 PTA9100 PCT/AU97/00402 161 Gly Gly Ilie Pro Pro Thr Thr Asn 740 Ala Thr Leu Leu Lys Giu Ala 750 Trp Gly Lys Ile His Val 755 Ile Ser Cys Giy Giu Asp Lys Thr Glu Ile 770 Gly Trp Ile Tyr Ser Val Thr Giu Ilie Leu Thr Gly Phe 785 Lys Met His Ala Gly Trp Ile Ser Tyr Cys Asn Pro Arg Pro Ala Phe Giy Ser Ala Pro Asn Leu Ser Asp Arg Leu 815 Asn Gin Val Arg His Cys 835 Arg Trp Ala Leu Ser Ile Glu Ile Leu Leu Ser 830 Arg Leu Leu Pro Ile Trp Tyr Tyr His Gly Arg Glu Arg 850 Ile Ala Tyr Ile Thr Ile Val Tyr Pro 860 Ile Thr Ser Ile Pro 865 Leu Ilie Ala Tyr Ilie Leu Pro Ala Cys Leu Ile Thr Arg Phe Ilie Ile Giu Ile Ser Asn Ala Ser Ile Trp Phe Ile 895 Leu Leu Phe Ser Ile Ala Val Thr Gly 905 Ile Leu Glu Leu Arg Trp 910 Phe Trp Val Ser Gly Val 915 Ser 
Ile Glu Asp Trp, Trp, Arg Asn Glu 920 Ile Gly 930 Gly Thr Ser Ala Leu Phe Ala Val PheGin Gly Leu Leu 940 WO 98/00549 WO 9800549PCTAU97OO402 162 Lys Val Leu Ala Gly Ile Asp Thr Asn Phe Vai Thr Ser Lys Thr Asp Giu Asp Asp Phe Ala Glu Tyr Ile Phe Lys Trp Thr 975 Ala Leu Leu Pro Pro Thr Thr Leu Leu Val Asn Leu Ile Giy 990 Ile Val Gly Val Ser Tyr Ala Val 1000 Asn Ser Gly Tyr Gin Ser Trp 100S Gly Pro Leu 1010 Leu Tyr Pro 1025 Thr Ile Val Phe Gly Lys Leu Phe Phe Ala Leu Trp Val 1015 1020 Ile Ala His Phe Leu Lys Gly Leu Leu Gly 1030 Ile Val Trp Ser Val Leu Leu 1045 105' Arg Gin Asn Arg Thr Pro 1035 1040 Ala Ser Ilie Phe Ser Leu 0 1055 Leu Trp Vai Arg Ilie Asn 1060 Pro Phe Val Asp Ala Asn Pro Asn Ala Asn 1065 1070 Asn Phe Asn Gly Lys Gly Gly Val Phe 1075 1080 INFORMATION FOR SEQ ID NO:13: Wi SEQUENCE CHARACTERISTICS: LENGTH: 174i base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: CDNA (iii) HYPOTHETICAL: NO WO 98/00549 WO 9800549PCTAU97/OO402 163 (iv) ANTI-SENSE: NO (vi) ORIGINAL SOURCE: ORGANISM: Oryza sativa (vii) IMMEDIATE SOURCE: CLONE: S0542 (ix) FEATURE: NAME/KEY: CDS LOCATION: 101. 
.1741 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: GTGCGGCCGC CGCGCATCTA GGCTTGCCGC GCGCGCGCGG ATCTGCGAGC TGCGTAGCCG TTTCTCGCTG TGAGTGGAGG AGGAGGAGGA AGGGAGGAGG ATG GCG GCG AAC GCG Met Ala Ala Asn Ala 1 GOG ATG GTG GCG GGA TCC CGC AAC CGG AAC GAG TTC GTC ATO ATC CGC Gly Met Val Ala Gly Ser Arg Asn Arg Asn Olu Phe Val Met Ile Arg CCC GAC Pro Asp GGC GAC GCG Gly Asp Ala CCA CCG CCG OCT AAG Pro Pro Pro Ala Lys 30 CCA GOG AAG, AGT GTG AAT Pro Gly Lys Ser Val Asn GOT CAG Gly Gln TGC CAG ATT TGT Cys Gln Ile Cys GAC ACT OTT GGC Asp Thr Val Gly TCG GCC ACC Ser Ala Thr GGC GAC Gly Asp CCT TOC Pro Cys 70 GTC TTT OTT GCC TGC AAT GAG TGC Val Phe Val Ala Cys Asn Glu Cys 60 GCC TTC Ala Ph-e CCG GTC TGC CC Pro Val Cys Arg TGC TOC CCC CAG, Cys Cys Pro Gln TAC GAG TAC OAA CGC AAG GAA 000 AAC CAG Tyr Glu Tyr Glu Arg Lys Glu Oly Asri Gin WO 98/00549 WO 9800549PCT/AU97/00402 164- TGC AAG ACT AGA TAC AAG AGG CAC AAA GGT TGC Cys Lys Thr Arg Tyr Lys Arg His GAT GAG GAA GAA GAA GAT GTT GAT Asp Giu Giu Giu Giu Asp Val Asp 105 Lys Gly Cys 95 CCT AGA GTT CAG GGC Pro Arg Val Gin Gly 100 CTG GAC AAT GAA TTC Leu Asp Asn Giu Phe 115 CAT TAT His Tyr AAG CAT GGC AAT Lys His Gly Asn 120 GGC AAA GGT CCA GAG TGG CAG ATA Gly Lys Gly Pro Giu Trp Gin Ile 1.25 CAG AGA Gin Arg 130 CAG GGG Gin Giy GAA GAT GTT GAC CTG TCT TCA TCT TCT CGC CAC GAA CAA CAT CGG ATT Giu Asp Val Asp Leu Ser Ser Ser Ser Arg His Giu Gin His Arg Ile 2.35 140 145 CCC CGT Pro Arg 150 CTG ACA AGT GGG Leu Thr Ser Giy 1.55 CAA CAG ATC TCA GGA Gin Gin Ile Ser Gly 160 GAG ATC CCT GAT Giu Ile Pro Asp 595 TCC CCC GAT CGC Ser Pro Asp Arg TCT ATC CGC AGC Ser Ile Arg Ser ACA TCA AGC TAT Thr Ser Ser Tyr GTT GAT Val Asp 180 CCA AGT GTT Pro Ser Val AAT TCC TAT Asn Ser Tyr 200 TGG AGG AAC Trp Arg Asn 215 CCA GTT Pro Val 185 CCT GTG AGG Pro Val Arg GTG GAC CCC TCC Val Asp Pro Ser AAG GAC TTG Lys Asp Leu 195 GGG ATT AAC AGT Gly Ilie Asn Ser GTT GAC TGG CAA Val Asp Trp Gin 205 GAA AGA GTT GCC AGC Giu Arg Val Ala Ser 210 AAG CAG GAC AAA AAT ATG ATG CAG Lys 
Gin Asp Lys Asn Met Met Gin 220 OCT AAT AAA TAT Ala Asn Lys Tyr CCA GAG GCA AGA COG OCA GAC ATG GAA GGG ACT GGT TCA AAT OGT GAA~ Pro Glu Ala Arg Cly Oly Asp Met Oiu Giy Thr Gly Ser Asn Gly Giu 230 235 240 245 WO 98/00549 PTA9/00 PCT/AU97/00402 165 GAT ATC Asp Ilie CAA ATG GTT GAT GAT GCA Gin Met Val Asp Asp Ala 250 CGT CTA CCT Arg Leu Pro 255 CTG AGC CGC ATA Leu Ser Arg Ile 260 CCT ATC CCT TCA AAC CAG CTC AAC Pro Ilie Pro Ser Asn Gin Leu Asn 26S TAC CGG ATT GTT ATC ATT CTC Tyr Arg Ilie Vai Ile Ile Leu 275 CGT CTT ATC Arg beu Ile 280 GTG CGG GAT Vai Arg Asp 295 ATC CTG ATG TTC Ile Leu Met Phe TTC CAA TAT CGT Phe Gin Tyr Arg GTC ACT CAT CCA Val Thr His Pro 290 979 1027 GCT TAT GGA Aia Tyr Giy TGG CTA GTA Trp Leu Val TCT GTT ATC TGT GAA ATT Ser Val Ile Cys Giu Ile 305 TTG CCC TTA TCC TGG CTC CTA GAT CAA Leu Pro Leu Ser Trp Leu Leu Asp Gin 315 CCA AAG TGG TAC Pro Lys Trp Tyr 1075 ATA AAC CGT GAA Ile Asn Arg Giu ACA TAC Thr Tyr 330 CTT GAC AGG CTT Leu Asp Arg Leu GCA TTG AGA TAT GAT AGG Aia Leu Arg Tyr Asp Arg 340 1123 GAG GGA GAG Giu Gly Glu TCA CAG CTT GCT Ser Gin Leu Ala ATT GAT GTC TTT Ile Asp Vai Phe GTC AGT ACG Val Ser Thr 355 1171 GTG GAT Val Asp CTA AAG GAA CCT CCT CTG ATC ACA GCA Leu Lys Giu Pro Pro Leu Ile Thr Ala 365 AAC ACT GTT TTG, Asn Thr Vai Leu 370 TCA TGC TAT GTT Ser Cys Tyr Val 1219

ATT

Ser Ile 375 CTG GCT GTG GAT TAC CCT GTT GAC Leu Ala Val Asp Tyr Pro Val Asp 380 AAA GTG Lys Vai 385 1267 TCT GAC GAT GGT TCA GCT ATG Ser Asp Asp Gly Ser Ala Met 390 395 TTA ACT TTT GAG GCT CTG Leu Thr Phe Giu Ala Leu 400 TCA GAA ACT Ser Giu Thr 405 1315 WO 98/00549 WO 9800549PCT/AU97/00402 166 GCA GAA TTT GCT AGG AAG TGG GTT Ala Glu Phe Ala Arg Lys Trp Val 410 CCG TTT TGC AAG AAG Pro Phe Cys Lys Lys 415 CAC AAT ATT His Asn Ile 420 1363 GAA CCA CGA GCT CCA GAG TTT TAC TTT GCT CAA AAA ATA Glu Pro Arg Ala Pro Giu Phe Tyr Phe Ala Gin Lys Ile

GAT

Asp 435 TAC CTG Tyr Leu 1411 AAG GAC AAA ATC CAA CCT TCC TTT GTT AAA GAA AGG Lys Asp Lys Ile Gin Pro Ser Phe Val Lys Giu Arg GCA ATG AAG Aia Met Lys 1459 440 445 AGA GAG Arg Giu 455 TAT' GAA GAA TTC Tyr Glu Giu Phe GTA CGG ATC AAT GCT CTT GTT GCG AAG Val Arg Ile Asn Ala Leu Val Ala Lys 465 1507 CAA AAA GTA CCT GAA GAG Gin Lys Val Pro Giu Glu 475 GGG TGG ACC ATG Gly Trp, Thr Met 480 GCT GAT GGC ACT Ala Asp Gly Thr 1555 TGG CCT GGG AAT Trp Pro Gly Asn CCA AGG GAT CAC Pro Arg Asp His GGC ATG ATT CAG Gly Met Ile Gin GTG TTC Val Phe 500 1603 1651 TTG GGG CAC Leu Gly His GGT GGG CTT GAC Giy Gly Leu Asp GAT GGT AAC GAG TTG CCA CG Asp Gly Asn Giu Leu Pro Arg 515 CTT GTC TAC Leu Val Tyr 520 GTC TCT CGT GAA AAG Val Ser Arg Glu Lys 525 AGG CCA GGA Arg Pro Giy TTC CAG CAT CAC AAG Phe Gin His His Lys 530 1699 AAG GCT GGT GCA ATG AAT GCA TTG ATT CGT GTA TCT Lys Ala Gly Ala Met Asn Ala Leu Ile Arg Val Ser 535 540 545 GCT GTG Ala Val 1741 WO 98/00549 WO 9800549PCT/AU97/00402 167 INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 547 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: Met Ala Ala Asn Ala Gly Met Val Ala Gly Ser Arg Asn Arg Asn Glu Phe Val Met Arg Pro Asp Gly Ala Pro Pro Pro Ala Lys Pro Asp Thr Val Gly Lys Ser Val Asn Gly Gin Cys Gin Ilie Cys Gly Val so Ser Ala Thr Gly Val Phe Val Ala Asn Glu Cys Ala Phe 6S Pro Val Cys Arg Cys Tyr Glu Tyr Arg Lys Giu Gly Gin Cys Cys Pro Cys Lys Thr Arg Lys Arg His Lys Gly Cys Pro Arg Val Gly Asp Glu Giu Giu Asp Vai Asp Asp Leu Asp 110 Asn Giu Phe 115 His Tyr Lys His Asn Giy Lys Gly Giu Trp Gin Ile Gin 130 Arg Gin Gly Giu Val Asp Leu Ser Ser Ser Ser Arg His 140 Glu Gln His Arg Ile Pro 145 150 Arg Leu Thr Ser Gly Gin Gin Ile Ser Gly 160 WO 98/00549 WO 9800549PCT/AU97/00402 168 Giu Ile Pro Asp Ser Pro Asp Arg His Ser 170 Ile Arg Ser Gly Thr 175 Ser Ser Tyr Asp Pro Ser Vai Val Pro Vai Arg Ile Val Asp 190 Asp Trp Gin Pro Ser Lys 195 Asp Leu Asn Ser Gly Ile Asn 
Ser Giu Arg Vai Aia Ser Trp Asn Lys Gin Asp Asn Met Met Gin Aia Asn Lys Tyr Giu Ala Arg Giy Asp Met Giu Giy Gly Ser Asn Giy Asp Ile Gin Met Asp Asp Ala Arg Leu Pro 255 Leu Ser Arg Val Pro Ile Pro Asn Gin Leu Asn Leu Tyr Arg 270 Phe Gin Tyr Ile Val Ile 275 Ile Leu Arg Leu Ile Leu Met Phe Arg Val 305 Thr His Pro Val Asp Ala Tyr Giy Trp Leu Vai Ser Ile Cys Giu Ile Leu Pro Leu Ser Leu Leu Asp Gin Pro Lys Trp, Tyr Pro Ile Asn Arg Giu Thr Tyr Leu Asp Arg Leu Ala 335 Leu Arg Tyr Asp Arg Giu Giy Glu Pro Ser Gin Leu Ala Pro Ile Asp 350 Pro Pro Leu Ile Thr


Val Phe Val Ser Thr Val Asp Pro Leu Lys Giu WO 98100549 WO 9800549PCT/AU97/00402 -169- Ala Asn 370 Thr Val Leu Ser Ile 375 Ser Asp 390 Leu Ala Val Asp Pro Val Asp Lys Val 385 Ser Cys Tyr Val Asp Gly Ser Met Leu Thr Phe Ala Leu Ser Glu Ala Glu Phe Ala Lys Trp Val Pro Phe Cys 415 Lys Lys His Lys Ile Asp 435 Ile Glu Pro Arg Pro Glu Phe Tyr Phe Ala Gin 430 Val Lys Glu Tyr Leu Lys Asp Ile Gin Pro Ser Arg Arg 450 Ala Met Lys Arg Tyr Giu Giu Phe Val Arg Ile Asn Leu Val Ala Lys Gin Lys Val Pro Glu Gly Trp Thr Ala Asp Gly Thr Trp Pro Gly Asn Pro Arg Asp His Pro Giy 495 Met Ile Gin Phe Leu Gly His Gly Gly Leu Asp Thr Asp Gly 510 Asn Glu Leu 515 Pro Arg Leu Val Vai Ser Arg Giu Lys Arg Pro Gly 525 Phe Gin His His Lys Lys 530 Giy Ala Met Asn Leu Ile Arg Val Ser Ala Vai 545
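
The CDS coordinates in the listing above can be checked against the stated protein lengths: SEQ ID NO:11 gives a CDS location of 71..3313 and SEQ ID NO:12 a length of 1081 amino acids, while SEQ ID NO:13 gives 101..1741 and SEQ ID NO:14 a length of 547 amino acids. The following is a minimal sketch of that arithmetic only, assuming 1-based inclusive coordinates that exclude any stop codon; the helper name `codons_in_cds` is hypothetical and is not part of the specification.

```python
# Illustrative consistency check; `codons_in_cds` is a hypothetical helper,
# not something defined in the specification.

def codons_in_cds(start: int, end: int) -> int:
    """Return the number of complete codons in a 1-based, inclusive CDS range."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"CDS {start}..{end} is not a whole number of codons")
    return length // 3

# (CDS start, CDS end) and stated protein length, as given in the listing.
listing = {
    "SEQ ID NO:11 / SEQ ID NO:12": ((71, 3313), 1081),
    "SEQ ID NO:13 / SEQ ID NO:14": ((101, 1741), 547),
}

for label, ((start, end), stated_length) in listing.items():
    computed = codons_in_cds(start, end)
    status = "consistent" if computed == stated_length else "MISMATCH"
    print(f"{label}: {computed} codons vs {stated_length} amino acids ({status})")
```

The same check can be applied to any other CDS range in the listing, provided its coordinates and the paired protein length are legible.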

Claims

1. An isolated nucleic acid molecule which encodes a polypeptide of the cellulose biosynthetic pathway or a homologue, analogue or derivative thereof or a complementary sequence thereto, wherein said polypeptide is capable of producing cellulose and/or β-1,4-glucan and/or an intermediate between cellulose and a β-1,4-glucan polymer, wherein said nucleic acid molecule is derived from an organism other than Acetobacter.

2. The isolated nucleic acid molecule according to claim 1, wherein the polypeptide is cellulose synthase or a catalytic subunit thereof.

3. The isolated nucleic acid molecule according to claim 1 or 2, derived from a prokaryote.

4. The isolated nucleic acid molecule according to claim 3, wherein the prokaryote is a bacterium other than Agrobacterium tumefaciens.

5. The isolated nucleic acid molecule according to claim 1 or 2, derived from a eukaryote.

6. The isolated nucleic acid molecule according to claim 5, wherein the eukaryote is a plant or fungus.

7. The isolated nucleic acid molecule according to claim 6, wherein the plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), wheat, barley, maize, Brassica ssp., Eucalyptus ssp., hemp, jute, flax, Pinus ssp., Populus ssp., and Picea spp., amongst others.

8. The isolated nucleic acid molecule according to claim 2, wherein the cellulose synthase or catalytic subunit thereof is the Arabidopsis thaliana RSW1 polypeptide.

9. The isolated nucleic acid molecule according to any one of claims 1 to 8, comprising a sequence of nucleotides which is at least 40% identical to any one of SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence thereof.

10. The isolated nucleic acid molecule according to claim 9, wherein the percentage identity to any one of SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence thereof is at least

11. The isolated nucleic acid molecule according to claim 9, wherein the percentage identity to any one of SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence thereof is at least

12. An isolated nucleic acid molecule which comprises a sequence of nucleotides substantially as set forth in any one of SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or a homologue, analogue or derivative thereof derived from an organism other than Acetobacter, or a complementary sequence thereto.

13. The isolated nucleic acid molecule according to any one of claims 1 to 12, wherein said nucleic acid molecule hybridizes under at least low stringency conditions to at least 20 contiguous nucleotides of any one of SEQ ID Nos: 1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence thereto.

14. An isolated nucleic acid molecule which encodes a polypeptide which is capable of cellulose and/or β-1,4-glucan biosynthesis in a plant cell, fungal cell, insect cell, animal cell, yeast cell or bacterial cell when expressed therein, wherein said nucleic acid is derived from an organism other than Acetobacter.

15. The isolated nucleic acid molecule according to claim 14, wherein the polypeptide is cellulose synthase or a catalytic subunit thereof.

16. The isolated nucleic acid molecule according to claim 14 or 15, derived from a prokaryote.

17. The isolated nucleic acid molecule according to claim 16, wherein the prokaryote is a bacterium other than Agrobacterium tumefaciens, Acetobacter pasteurianus or Acetobacter xylinum.

18. The isolated nucleic acid molecule according to claim 14 or 15, derived from a eukaryote.

19. The isolated nucleic acid molecule according to claim 18, wherein the eukaryote is a plant or fungus.

20. The isolated nucleic acid molecule according to claim 19, wherein the plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), wheat, barley, maize, Brassica ssp., Eucalyptus ssp., hemp, jute, flax, Pinus ssp., Populus ssp., and Picea spp., amongst others.

21. The isolated nucleic acid molecule according to claim 20, wherein the cellulose synthase or catalytic subunit thereof is the Arabidopsis thaliana RSW1 polypeptide.

22. The isolated nucleic acid molecule according to any one of claims 14 to 21, comprising a sequence of nucleotides which is at least 40% identical to any one of SEQ ID NOs: 1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence thereto.

23. The isolated nucleic acid molecule according to claim 22, wherein the percentage identity to any one of SEQ ID NOs: 1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence thereof is at least

24. The isolated nucleic acid molecule according to claim 22, wherein the percentage identity to any one of SEQ ID NOs: 1, 3, 4, 5, 7, 9, 11 or 13 or a complementary sequence thereof is at least

25. The isolated nucleic acid molecule according to claim 22, comprising the sequence of nucleotides substantially as set forth in any one of SEQ ID Nos: 3, 4, 5, 7, 9 or 11 or a homologue, analogue or derivative thereof or a complementary sequence thereto.

26. An isolated nucleic acid molecule which encodes or is complementary to a nucleic acid molecule which encodes a polypeptide capable of cellulose and/or β-1,4-glucan biosynthesis wherein said polypeptide comprises a sequence of amino acids which is at least 40% identical to any one of SEQ ID Nos: 2, 6, 8, 10, 12 or 14, subject to the proviso that said polypeptide is not an Acetobacter cellulose synthase polypeptide.

27. The isolated nucleic acid molecule according to claim 26, wherein the percentage identity to any one of SEQ ID Nos: 2, 6, 8, 10, 12 or 14 is at least

28. The isolated nucleic acid molecule according to claim 27, wherein the percentage identity to any one of SEQ ID Nos: 2, 6, 8, 10, 12 or 14 is at least

29. The isolated nucleic acid molecule according to claim 26, wherein the polypeptide comprises a sequence of amino acids substantially as set forth in any one of SEQ ID Nos: 2, 6, 8, 10, 12 or 14.

30. A genetic construct which comprises the isolated nucleic acid molecule according to any one of claims 1 to 29.

31. A genetic construct which comprises the isolated nucleic acid molecule according to any one of claims 1 to 29 operably connected to a promoter sequence.

32. The genetic construct according to claim 31, wherein the nucleic acid molecule is operably connected to the promoter sequence in the sense orientation such that RNA which encodes a polypeptide capable of cellulose and/or β-1,4-glucan biosynthesis or a homologue, analogue or derivative thereof is produced when said nucleic acid molecule is expressed.

33. The genetic construct according to claim 31, wherein the nucleic acid molecule is operably connected to the promoter sequence in the antisense orientation such that RNA which is complementary to RNA which encodes a polypeptide capable of cellulose and/or β-1,4-glucan biosynthesis or a homologue, analogue or derivative thereof, is produced when said nucleic acid molecule is expressed.

34. The genetic construct according to claim 33, wherein the nucleic acid molecule encodes an antisense or ribozyme molecule.

35. The genetic construct according to any one of claims 31 to 34, wherein the promoter is the CaMV 35S promoter.

36. The genetic construct according to any one of claims 31 to 34, wherein the promoter is the Arabidopsis thaliana RSW1 gene promoter.

37. A method of increasing the level of cellulose in a cell, tissue, organ or organism, said method comprising expressing the isolated nucleic acid molecule according to any one of claims 1 to 29 therein, in the sense orientation, for a time and under conditions at least sufficient to produce or increase expression of the polypeptide encoded therefor.

38. The method according to claim 37, comprising the additional first step of transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule.

39. The method according to claim 38, wherein the cell is a prokaryotic cell.

40. The method according to claim 38, wherein the cell, tissue, organ or organism is a eukaryotic cell, tissue, organ or organism.

41. The method according to claim 40, wherein the cell, tissue, organ or organism is a plant, fungal, insect, animal or yeast cell, tissue, organ or organism.

42. The method according to claim 41, wherein the cell, tissue, organ or organism is a plant cell, tissue, organ or organism.

43. The method according to claim 42 wherein the plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants such as Pinus ssp., Populus ssp., Picea spp., amongst others.

44.
A method of reducing the level of non-crystalline P-1,4-glucan in a cell, tissue, organ or organism, said method comprising expressing the isolated nucleic acid molecule according to any one of claims 1 to 29 therein, in the sense orientation, for a time and under conditions at least sufficient to produce or increase expression of the polypeptide encoded therefor. The method according to claim 44, comprising the additional first step of transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule. 46. The method according to claim 44, wherein the cell is a prokaryotic cell. 47. The method according to claim 44, wherein the cell, tissue, organ or organism is a eukaryotic cell, tissue, organ or organism. 48. The method according to claim 47, wherein the cell, tissue, organ or organism is a plant, fungal, insect, animal or yeast cell, tissue, organ or organism. 49. The method according to claim 48, wherein the cell, tissue, organ or organism is a plant cell, tissue, organ or organism. °CT/AU 97/O 0 P:\OPER\MRO\CELLULOS.PCT- 13/3/98 ECEIVE -176- (amended) The method according to claim 49 wherein the plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants such as Pinus ssp., Populus ssp., Picea spp., amongst others. 51. A method of reducing the level of starch in a cell, tissue, organ or organism, said method comprising expressing the isolated nucleic acid molecule according to any one of claims 1 to 29 therein, in the sense orientation, for a time and under conditions at least sufficient to produce or increase expression of the polypeptide encoded therefor. 52. (amended) The method according to claim 51, comprising the additional first step of transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule. 53. 
(amended) The method according to claims 51 or 52, wherein the cell is a prokaryotic cell. 54. (amended) The method according to claims 51 or 52, wherein the cell, tissue, organ or organism is a eukaryotic cell, tissue, organ or organism. 55. The method according to claim 54, wherein the eukaryote is a plant, fungus, insect, animal or yeast. 56. The method according to claim 55, wherein the eukaryote is a plant. 57. The method according to claim 56 wherein the plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants such as Pinus ssp., Populus ssp., Picea spp., amongst others. 8. A method of reducing the level of cellulose in a cell, tissue, organ or organism, said imethod comprising expressing the isolated nucleic acid molecule according to any one of AMENDED SHEET IPEA/AU WO 98/00549 PCT/AU97/00402 -177 claims 1 to 29 therein, in the antisense orientation, for a time and under conditions at least sufficient to prevent or reduce the expression of the polypeptide encoded therefor. 59. The method according to claim 58, comprising the additional first step of transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule. The method according to claims 58 or 59, wherein the cell, tissue, organ or organism is a eukaryotic cell, tissue, organ or organism. 61. The method according to claim 60, wherein the eukaryote is a plant, fungus, insect, animal or yeast. 62. The method according to claim 61, wherein the eukaryote is a plant. 63. The method according to claim 62 wherein the plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants such as Pinus ssp., Populus ssp., Picea spp., amongst others. 64. 
A method of increasing the level of non-crystalline p-1,4-glucan in a cell, tissue, organ or organism, said method comprising expressing the isolated nucleic acid molecule according to any one of claims 1 to 29 therein, in the antisense orientation, for a time and under conditions at least sufficient to prevent or reduce the expression of the polypeptide encoded therefor. The method according to claim 64, comprising the additional first step of transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule. 66. The method according to claims 64 or 65, wherein the cell, tissue, organ or organism is a eukaryotic cell, tissue, organ or organism. WO 98/00549 PCT/AU97/00402

67. The method according to claim 66, wherein the eukaryote is a plant, fungus, insect, animal or yeast.
68. The method according to claim 67, wherein the eukaryote is a plant.
69. The method according to claim 68 wherein the plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants such as Pinus ssp., Populus ssp., Picea spp., amongst others.
70. A method of increasing the level of starch in a cell, tissue, organ or organism, said method comprising expressing the isolated nucleic acid molecule according to any one of claims 1 to 29 therein, in the antisense orientation, for a time and under conditions at least sufficient to prevent or reduce the expression of the polypeptide encoded therefor.
71. The method according to claim 70, comprising the additional first step of transforming the cell, tissue, organ or organism with the isolated nucleic acid molecule.
72. The method according to claims 70 or 71, wherein the cell, tissue, organ or organism is a eukaryotic cell, tissue, organ or organism.
73. The method according to claim 72, wherein the eukaryote is a plant, fungus, insect, animal or yeast.
74. The method according to claim 73, wherein the eukaryote is a plant.
75. The method according to claim 74 wherein the plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants such as Pinus ssp., Populus ssp., Picea spp., amongst others.

76. A method of producing a recombinant enzymatically active polypeptide which is capable of synthesizing cellulose and/or β-1,4-glucan and/or an intermediate between cellulose and β-1,4-glucan in a cell, said method comprising expressing the isolated nucleic acid molecule according to any one of claims 1 to 29 or a homologue, analogue or derivative thereof in said cell for a time and under conditions sufficient for the polypeptide encoded therefor to be produced.
77. The method according to claim 76, comprising the additional first step of transforming the cell with the isolated nucleic acid molecule according to any one of claims 1 to 29 or the genetic construct according to any one of claims 11 to [...].
78. A recombinant polypeptide produced according to the method defined by claim 76 or 77.
79. The recombinant cellulose biosynthetic polypeptide according to claim 78, further defined as a recombinant cellulose synthase or catalytically active subunit thereof.
80. A recombinant cellulose biosynthetic polypeptide other than an Acetobacter cellulose synthase polypeptide, which is capable of cellulose and/or β-1,4-glucan production and comprising a sequence of amino acids set forth in any one of SEQ ID Nos: 2, 6, 8, 10, 12 or 14 or a homologue, analogue or derivative thereof which is at least 40% identical thereto.
81. The recombinant cellulose biosynthetic polypeptide according to claim 80, wherein the percentage identity to any one of SEQ ID Nos: 2, 6, 8, 10, 12 or 14 is at least [...].
82. The recombinant cellulose biosynthetic polypeptide according to claim 81, wherein the percentage identity to any one of SEQ ID Nos: 2, 6, 8, 10, 12 or 14 is at least [...].
83. The recombinant cellulose biosynthetic polypeptide according to claim 82, comprising a sequence of amino acids substantially as set forth in any one of SEQ ID Nos: 2, 6, 8, 10, 12 or 14.

84. A method of altering the mechanical properties of a cell wall, said method comprising expressing the isolated nucleic acid molecule according to any one of claims 1 to 29 in the antisense orientation in said cell for a time and under conditions sufficient for the level of non-crystalline β-1,4-glucan to increase in said cell.
85. The method according to claim 84, wherein the non-crystalline β-1,4-glucan is cross-linked to cellulose microfibrils.
86. The method according to claim 84 or 85, wherein the cell wall normally has a high ratio of cellulose to hemicelluloses.
87. The method according to any one of claims 84 to 86, wherein the nucleic acid molecule expressed in the antisense orientation is contained within an antisense molecule or ribozyme molecule.
88. The method according to any one of claims 84 to 87, wherein the cell wall is a plant cell wall.
89. The method according to claim 88, wherein the plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants such as Pinus ssp., Populus ssp., Picea spp., amongst others.
90. An antibody molecule which binds to the recombinant polypeptide according to any one of claims 78 to 83 or a homologue, analogue or derivative thereof, wherein said antibody is not prepared against an Acetobacter cellulose synthase polypeptide.
91. A transgenic plant transformed with the isolated nucleic acid molecule according to any one of claims 1 to 29 or a genetic construct according to any one of claims [...] to 36.
92. The transgenic plant according to claim 91, wherein said plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants such as Pinus ssp., Populus ssp., Picea spp., amongst others.
93. Use of an isolated nucleic acid molecule according to any one of claims 1 to 29 to modify the cellulose content of a cell.
94. Use according to claim 93, wherein if the nucleic acid molecule according to any one of claims 1 to 29 is expressed in the sense orientation in said cell, the level of cellulose therein is increased.
95. Use according to claim 93, wherein if the nucleic acid molecule according to any one of claims 1 to 29 is expressed in the antisense orientation in said cell, the level of cellulose therein is decreased.
96. Use according to claim 95, wherein said cell is further characterised by increased non-crystalline β-1,4-glucan content and/or starch content.
97. Use according to claim 95 or 96, wherein said cell is further characterised by increased cross-linking of non-crystalline β-1,4-glucan to cellulose.
98. Use according to any one of claims 93 to 97, wherein the cell is a plant cell.
99. Use according to claim 98 wherein the plant is selected from the list comprising Arabidopsis thaliana, Gossypium hirsutum (cotton), Oryza sativa (rice), Eucalyptus ssp., Brassica ssp., wheat, barley, maize, hemp, jute, flax, and woody plants such as Pinus ssp., Populus ssp., Picea spp., amongst others.
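
The percentage-identity thresholds and the sense/antisense distinction recited in the claims can be illustrated with a minimal Python sketch. It is an illustration only and is not the method of the specification. It assumes that the two sequences have already been aligned to equal length (with `-` marking gaps), that percentage identity is the number of matching columns divided by the alignment length, and that the antisense strand is the reverse complement of a DNA sequence. The function names and the short example sequences are hypothetical and are not taken from any SEQ ID No.

```python
# Illustrative sketch only: function names and sequences are hypothetical.

# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue.

    Assumption: inputs are pre-aligned and of equal length; a gap ('-')
    never counts as a match, and the denominator is the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 40.0) -> bool:
    """True if the identity is at least the threshold (default 40%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


def reverse_complement(dna: str) -> str:
    """Return the reverse complement (antisense strand) of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    query = "ATGGCC"      # hypothetical aligned query
    reference = "ATGACC"  # hypothetical aligned reference
    print(f"{percent_identity(query, reference):.1f}% identical")
    print("meets 40% threshold:", meets_identity_threshold(query, reference))
    print("antisense of query:", reverse_complement(query))
```

Other definitions of identity exist, for example excluding gapped columns from the denominator or normalising by the shorter sequence, and they can give different percentages for the same pair, so the definition in use should be stated whenever a threshold such as 40% is applied.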