AU727294B2

AU727294B2 - Regulation of gene expression in plants

Info

Publication number: AU727294B2
Application number: AU89670/98A
Authority: AU
Inventors: Zhongyi Li; Matthew Morell; Sadequr Rahman
Original assignee: Commonwealth Scientific and Industrial Research Organization CSIRO; Goodman Fielder Pty Ltd; Groupe Limagrain Pacific Pty Ltd
Current assignee: Commonwealth Scientific and Industrial Research Organization CSIRO; Biogemma SAS
Priority date: 1997-09-12
Filing date: 1998-09-11
Publication date: 2000-12-07
Anticipated expiration: 2018-09-11
Also published as: AU8967098A

Description

WO 99/14314 PCT/AU98/00743

REGULATION OF GENE EXPRESSION IN PLANTS

This invention relates to methods of modulating the expression of desired genes in plants, and to DNA sequences and genetic constructs for use in these methods.

In particular, the invention relates to methods and constructs for targeting of expression specifically to the endosperm of the seeds of cereal plants such as wheat, and for modulating the time of expression in the target tissue.

This is achieved by the use of promoter sequences from enzymes of the starch biosynthetic pathway. In a preferred embodiment of the invention, the sequences and/or promoters are those of starch branching enzyme I, starch branching enzyme II, soluble starch synthase I, and starch debranching enzyme, all derived from Triticum tauschii, the D genome donor of hexaploid bread wheat.

A further preferred embodiment relates to a method of identifying variations in the characteristics of plants.

BACKGROUND OF THE INVENTION

Starch is an important constituent of cereal grains and of flours, accounting for about 65-67% of the weight of the grain at maturity. It is produced in the amyloplast of the grain endosperm by the concerted action of a number of enzymes, including ADP-Glucose pyrophosphorylase (EC 2.7.7.27), starch synthases (EC 2.4.1.21), branching enzymes (EC 2.4.1.18) and debranching enzymes (EC 3.2.1.41 and EC 3.2.1.68) (Ball et al, 1996; Martin and Smith, 1995; Morell et al, 1995). Some of the proteins involved in the synthesis of starch can be recovered from the starch granule (Denyer et al, 1995; Rahman et al, 1995).

Most wheat cultivars normally produce starch containing 25% amylose and 75% amylopectin. Amylose is composed of large linear chains of α(1-4) linked α-D-glucopyranosyl residues, whereas amylopectin is a branching form of α-glucan linked by α(1-6) linkages. The ratio of amylose and amylopectin, the branch chain length and the number of branch chains of amylopectin are the major factors which determine the properties of wheat starch.

Starch with various properties has been widely used in industry, food science and medical science. High amylose wheat can be used for plastic substitutes and in paper manufacture to protect the environment; in health foods to reduce bowel cancer and heart disease; and in sports foods to improve the athletes' performance. High amylopectin wheat may be suitable for Japanese noodles, and is used as a thickener in the food industry.

Wheat contains three sets of chromosomes (A, B and D) in its very large genome of about 10^10 base pairs (bp).

The donor of the D genome to wheat is Triticum tauschii, and by using a suitable accession of this species the genes from the D genome can be studied separately (Lagudah et al, 1991).

There is comparatively little variation in starch structure found in wheat varieties, because the hexaploid nature of wheat prevents mutations from being readily identified. Dramatic alterations in starch structure are expected to require the combination of homozygous recessive alleles from each of the 3 wheat genomes, A, B and D. This requirement renders the probability of finding such mutants in natural or mutagenised populations of wheat very low.
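The reasoning above can be illustrated with a short calculation. The following is a minimal sketch only, assuming that null alleles arise independently at the three homoeologous loci; the per-genome frequencies used are hypothetical placeholders and are not taken from this specification.

```python
def triple_null_frequency(p_a: float, p_b: float, p_d: float) -> float:
    """Expected frequency of lines null at all three homoeologous loci.

    Assumes (for illustration only) that the A, B and D genome loci
    mutate independently, so the joint frequency is a simple product.
    """
    for p in (p_a, p_b, p_d):
        if not 0.0 <= p <= 1.0:
            raise ValueError("frequencies must lie between 0 and 1")
    return p_a * p_b * p_d


# Hypothetical per-genome null frequency of 1 in 1000 lines.
freq = triple_null_frequency(0.001, 0.001, 0.001)
# Approximate number of lines to screen to expect one triple null.
lines_to_screen = 1.0 / freq
print(f"{freq:.1e} per line, about {lines_to_screen:.0e} lines to screen")
```

Under these assumed figures a single screen would need on the order of a billion lines, which is why combining nulls identified separately in each genome is the more practical route.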

Variation in wheat starch is desirable in order to enable better tailoring of wheat starches for processing and end-user requirements.

Key commercial targets for the manipulation of starch biosynthesis are:

1. "Waxy" wheats in which amylose content is decreased to insignificant levels. This outcome is expected to be obtained by eliminating granule-bound starch synthase activity.

2. High amylose wheats, expected to be obtained by suppressing starch branching enzyme-II activity.

3. Wheats which continue to synthesise starch at elevated temperatures, expected to be obtained by identifying or introducing a gene encoding a heat-stable soluble starch synthase.

4. "Sugary types" of wheat which contain increased amylose content and free sugars, expected to be obtained by manipulating an isoamylase-type debranching enzyme.

There are two general strategies which may be used to obtain wheats with altered starch structure:

(a) using genetic engineering strategies to suppress the activity of a specific gene, or to introduce a novel gene into a wheat line; and

(b) selecting among existing variation in wheat for missing ("null") or altered alleles of a gene in each of the genomes of wheat, and combining these by plant breeding.

However, in view of the complexity of the gene families, particularly starch branching enzyme I (SBE I), without the ability to target regions which are unique to genes expressed in endosperm, modification of wheat by combination of null alleles of several enzymes in general represents an almost impossible task.

Branching enzymes are involved in the production of glucose α-1,6 branches. Of the two main constituents of starch, amylose is essentially linear, but amylopectin is highly branched; thus branching enzymes are thought to be directly involved in the synthesis of amylopectin but not amylose. There are two types of branching enzymes in plants, starch branching enzyme I (SBE I) and starch branching enzyme II (SBE II), and both are about 85 kDa in size. At the nucleic acid level there is about 65% sequence identity between types I and II in the central portion of the molecules; the sequence identity between SBE I from different cereals is about 85% overall (Burton et al, 1995; Morell et al, 1995).
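Sequence identity figures such as those quoted above are computed from aligned sequences. The following is a minimal sketch of one common way of doing so, assuming two sequences that have already been aligned to equal length with "-" as the gap character; the function name and the treatment of gaps are illustrative assumptions, not the method used in the cited references.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of compared alignment columns holding identical residues.

    Columns where both sequences have a gap are skipped; a gap opposite a
    residue counts as a mismatch (an assumption made for this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


print(percent_identity("ACGTACGT", "ACGTTCGA"))
```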

In cereals, SBE I genes have so far been reported only for rice (Kawasaki et al, 1991; Rahman et al, 1997). A cDNA sequence for wheat SBE I is available on the GenBank database (Accession No. Y12320; Repellin, Nair, Baga and Chibbar, Plant Gene Register PGR97-094, 1997).

As far as we are aware, no promoter sequence for wheat SBE I has been reported.

We have characterised an SBE I gene, designated wSBE I-D2, from Triticum tauschii, the donor of the D genome to wheat (Rahman et al, 1997). This gene encoded a protein sequence which had a deletion of approximately 65 amino acids at the C-terminal end, and appeared not to contain some of the conserved amino acid motifs characteristic of this class of enzyme (Svensson, 1994). Although wSBE I-D2 was expressed as mRNA, no corresponding protein has yet been found in our analysis of SBE I isoforms from the endosperm, and thus it is possible that this gene is a transcribed pseudogene.

Genes for SBE II are less well characterised; no genomic sequences are available, although SBE II cDNAs from rice (Mizuno et al, 1993; Accession No. D16201) and maize (Fisher et al, 1993; Accession No. L08065) have been reported. In addition, a cDNA sequence for SBE II from wheat is available on the GenBank database (Nair et al, 1997; Accession No. Y11282); although the sequences are very similar to those reported herein, there are differences near the N-terminal of the protein, which specifies its intracellular location. No promoter sequences have been reported, as far as we are aware.

Wheat granule-bound starch synthase (GBSS) is responsible for amylose synthesis, while wheat branching enzymes together with soluble starch synthases are considered to be directly involved in amylopectin biosynthesis. A number of isoforms of soluble and granule-bound starch synthases have been identified in developing wheat endosperm (Denyer et al, 1995). There are three distinct isoforms of starch synthases, 60 kDa, 75-77 kDa and 100-105 kDa, which exist in the starch granules (Denyer et al, 1995; Rahman et al, 1995). The 60 kDa GBSS is the product of the wx gene. The 75-77 kDa protein is a wheat soluble starch synthase I (SSSI) which is present in both the soluble fraction and the starch granule-bound fraction of the endosperm. However, the 100-105 kDa proteins, which are another type of soluble starch synthase, are located only in starch granules (Denyer et al, 1995; Rahman et al, 1995). To our knowledge there has been no report of any complete wheat SSS I sequence, either at the protein or the nucleotide level.

Both cDNA and genomic DNA encoding a soluble starch synthase I of rice have been cloned and analysed (Baba et al, 1993; Tanaka et al, 1995). The cDNAs encoding potato soluble starch synthase SSSII and SSSIII and pea soluble starch synthase SSSII have also been reported (Edwards et al, 1995; Marshall et al, 1996; Dry et al, 1992). However, corresponding full length cDNA sequences for wheat have hitherto not been available, although a partial cDNA sequence (Accession No. U48227) has been released to the GenBank database.

Approach (b) referred to above has been demonstrated for the gene for granule-bound starch synthase.

Null alleles on chromosomes 7A, 7D and 4A were identified by the analysis of GBSS protein bands by electrophoresis, and combined by plant breeding to produce a wheat line containing no GBSS, and no amylose (Nakamura et al, 1995).

Subsequently, PCR-based DNA markers have been identified, which also identify null alleles for the GBSS loci on each of the three wheat genomes.

Despite the availability of a considerable amount of information in the prior art, major problems remain. Firstly, the presence of three separate sets of chromosomes in wheat makes genetic analysis in this species extraordinarily complex. This is further complicated by the fact that a number of enzymes are involved in starch synthesis, and each of these enzymes is itself present in a number of forms, and in a number of locations within the plant cell. Little, if any, information has been available as to which specific form of each enzyme is expressed in endosperm. For wheat, a limited amount of nucleic acid sequence information is available, but this is only cDNA sequence; no genomic sequence, and consequently no information regarding promoters and other control sequences, is available. Without being able to demonstrate that the endosperm-specific gene within a family has been isolated, such sequence information is of limited practical usefulness.

SUMMARY OF THE INVENTION

In this application we report the isolation and identification of novel genes from T. tauschii, the D-genome donor of wheat, that encode SBE I, SBE II, a 75 kDa SSS I, and an isoamylase-type debranching enzyme (DBE). Because of the very close relationship between T. tauschii and wheat, as discussed above, results obtained with T. tauschii can be directly applied to wheat with little if any modification.

Such modification as may be required represents routine trial and error experimentation. Sequences from these genes can be used as probes to identify null or altered alleles in wheat, which can then be used in plant breeding programmes to provide modifications of starch characteristics. The novel sequences of the invention can be used in genetic engineering strategies, either to introduce a desired gene into a host plant or to provide antisense sequences for suppression of one or more specific genes in a host plant, in order to modify the characteristics of starch produced by the plant.

By using T. tauschii, we have been able to examine a single genome, rather than three as in wheat, and to identify and isolate the forms of the starch synthesis genes which are expressed in endosperm. By addressing genomic sequences we have been able to isolate tissue-specific promoters for the relevant genes, which provides a mechanism for simultaneous manipulation of a number of genes in the endosperm. Because T. tauschii is so closely related to wheat, results obtained with this model system are directly applicable to wheat, and we have confirmed this experimentally. The genomic sequences which we have determined can also be used as probes for the identification and isolation of corresponding sequences, including promoter sequences, from other cereal plant species.

In its most general aspect, the invention provides a nucleic acid sequence encoding an enzyme of the starch biosynthetic pathway in a cereal plant, said enzyme being selected from the group consisting of starch branching enzyme I, starch branching enzyme II, soluble starch synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize, and that starch branching enzyme II does not have the N-terminal amino acid sequence:

AASPGKVLVPDGEDDLASPA.

Preferably the nucleic acid sequence is a DNA sequence, and may be genomic DNA or cDNA. Preferably the sequence is one which is functional in wheat. More preferably the sequence is derived from a Triticum species, most preferably Triticum tauschii.

Where the sequence encodes soluble starch synthase, preferably the sequence encodes the 75 kDa soluble starch synthase of wheat.

Biologically-active untranslated control sequences of genomic DNA are also within the scope of the invention.

Thus the invention also provides the promoter of an enzyme as defined above.

In a preferred embodiment of this aspect of the invention, there is provided a nucleic acid construct comprising a nucleic acid sequence of the invention, a biologically-active fragment thereof, or a fragment thereof encoding a biologically-active fragment of an enzyme as defined above, operably linked to one or more nucleic acid sequences facilitating expression of said enzyme in a plant, preferably a cereal plant. The construct may be a plasmid or a vector, preferably one suitable for use in the transformation of a plant. A particularly suitable vector is a bacterium of the genus Agrobacterium, preferably Agrobacterium tumefaciens. Methods of transforming cereal plants using Agrobacterium tumefaciens are known; see for example Australian Patent No. 667939 by Japan Tobacco Inc., International Patent Application Number PCT/US97/10621 by Monsanto Company and Tingay et al (1997).

In a second aspect, the invention provides a nucleic acid construct for targeting of a desired gene to endosperm of a cereal plant, and/or for modulating the time of expression of a desired gene in endosperm of a cereal plant, comprising one or more promoter sequences selected from SBE I promoter, SBE II promoter, SSS I promoter, and DBE promoter, operatively linked to a nucleic acid sequence encoding a desired protein, and optionally also operatively linked to one or more additional targeting sequences and/or one or more 3' untranslated sequences.

The nucleic acid encoding the desired protein may be in either the sense orientation or in the antisense orientation. Preferably the desired protein is an enzyme of the starch biosynthetic pathway. For example, the antisense sequences of GBSS, starch debranching enzyme, SBE II, low molecular weight glutenin, or grain softness protein I, may be used. Preferred sequences for use in sense orientation include those of bacterial isoamylase, bacterial glycogen synthase, or wheat high molecular weight glutenin Bxl7. It is contemplated that any desired protein which is encoded by a gene which is capable of being expressed in the endosperm of a cereal plant is suitable for use in the invention.

In a third aspect, the invention provides a method of modifying the characteristics of starch produced by a plant, comprising the step of: (a) introducing a gene encoding a desired enzyme of the starch biosynthetic pathway into a host plant, and/or (b) introducing an anti-sense nucleic acid sequence directed to a gene encoding an enzyme of the starch biosynthetic pathway into a host plant, wherein said enzymes are as defined above.

Where both steps (a) and (b) are used, the enzymes in the two steps are different.

Preferably the plant is a cereal plant, more preferably wheat or barley.

As is well known in the art, anti-sense sequences can be used to suppress expression of the protein to which the anti-sense sequence is complementary. It will be evident to the person skilled in the art that different combinations of sense and anti-sense sequences may be chosen so as to effect a variety of different modifications of the characteristics of the starch produced by the plant.
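The complementarity relied on above can be made concrete. The following is a minimal sketch, assuming a DNA sequence written 5' to 3' using only the letters A, C, G and T; the function name and the example sequence are hypothetical and are not sequences of the invention.

```python
# Translation table for Watson-Crick base pairing (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense: str) -> str:
    """Return the reverse complement of a sense-strand DNA sequence.

    The result, read 5' to 3', is complementary to the input and so
    corresponds to the anti-sense orientation of that sequence.
    """
    return sense.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base example.
print(antisense("ATGGCC"))
```

Applying the function twice returns the original sequence, which is a convenient check that an anti-sense insert has been derived correctly.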

In a fourth aspect, the invention provides a method of targeting expression of a desired gene to the endosperm of a cereal plant, comprising the step of transforming the plant with a construct according to the invention.

According to a fifth aspect, the invention provides a method of modulating the time of expression of a desired gene in endosperm of a cereal plant, comprising the step of transforming the plant with a construct according to the second aspect of the invention.

Where expression at an early stage following anthesis is desired, the construct preferably comprises the SBE II, SSS I or DBE promoters. Where expression at a later stage following anthesis is desired, the construct preferably comprises the SBE I promoter.
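The choice of promoter by desired timing, as set out above, can be summarised as a simple lookup. This is an illustrative sketch only; the stage labels "early" and "late" and the function name are assumptions introduced here, and the specification does not define numerical boundaries between the stages at this point.

```python
# Preferred promoters by stage of expression following anthesis,
# as stated in the preceding paragraph.
PROMOTERS_BY_STAGE = {
    "early": ("SBE II", "SSS I", "DBE"),
    "late": ("SBE I",),
}


def candidate_promoters(stage: str) -> tuple:
    """Return the preferred promoters for an 'early' or 'late' stage."""
    try:
        return PROMOTERS_BY_STAGE[stage.lower()]
    except KeyError:
        raise ValueError("stage must be 'early' or 'late'") from None


print(candidate_promoters("early"))
```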

While the invention is described in detail in relation to wheat, it will be clearly understood that it is also applicable to other cereal plants of the family Gramineae, such as maize, barley and rice.

Methods for transformation of monocotyledonous plants such as wheat, maize, barley and rice and for regeneration of plants from protoplasts or immature plant embryos are well known in the art. See for example Lazzeri et al, 1991; Jahne et al, 1991 and Wan and Lemaux, 1994 for barley; Wirtzens et al, 1997; Tingay et al, 1997; Canadian Patent Application No. 2092588 by Nehra; Australian Patent Application No. 61781/94 by National Research Council of Canada, Australian Patent No. 667939 by Japan Tobacco Co, and International Patent Application Number PCT/US97/10621 by Monsanto Company.

The sequences of ADP glucose pyrophosphorylase from barley (Australian Patent Application No. 65392/94), starch debranching enzyme and its promoter from rice (Japanese Patent Publication No. Kokai 6261787 and Japanese Patent Publication No. Kokai 5317057), and starch debranching enzyme from spinach and potato (Australian Patent Application No. 44333/96) are all known.

Detailed Description of the Drawings

The invention will be described in detail by way of reference only to the following non-limiting examples and to the figures.

Figure 1 shows the hybridisation of genomic clones isolated from T. tauschii.

DNA was extracted from the different clones, digested with BamHI and hybridised with the 5' end of the maize SBE I cDNA. Lanes 1, 2, 3 and 4 correspond to DNA from clones λE1, λE2, λE6 and λE7 respectively. Note that clones λE1 and λE2 give identical patterns, the SBE I gene in λE6 is a truncated form of that in λE1, and λE7 gives a clearly different pattern.

Figure 2 shows the hybridisation of DNA from T. tauschii.

DNA from T. tauschii was digested with BamHI and the hybridisation pattern compared with DNA from λE1 and λE7 digested with the same enzyme. Fragment E1.1 (see Figure 3) from λE1 was used as the probe; it contains some sequences that are over 80% identical to sequences in E7.8.

Approximately 25 µg of T. tauschii DNA was electrophoresed in lane 1, and 200 pg each of λE1 and λE7 in lanes 2 and 3, respectively.

Figure 3 shows the restriction maps of clones λE1 and λE7. The fragments obtained with EcoRI and BamHI are indicated. The fragments sequenced from λE1 are E1.1, E1.2, a part of E1.7 and a part of

Figure 4 shows the comparison of the deduced amino acid sequence of wSBE I-D4 cDNA with the deduced amino acid sequences of rice SBE I (RSBE I; Nakamura et al, 1992), maize SBE I (MSBE I; Baba et al, 1991), wSBE I-D2 type cDNA (D2 cDNA; Rahman et al, 1997), pea SBE II (PESBE II, homologous to maize SBE I; Burton et al, 1995), and potato SBE I (POSBE; Cangiano et al, 1993). The deduced amino acid sequence of the wSBE I-D4 cDNA is denoted by "D4cDNA".

Residues present in at least three of the sequences are identified in the consensus sequence in capitals.
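The consensus rule described for Figure 4 can be sketched as follows. This is a minimal illustration, assuming the sequences have already been aligned to equal length with "-" as the gap character; the function name, the use of "." for positions without a consensus, and the tie-breaking behaviour are assumptions of this sketch and are not taken from the figure.

```python
from collections import Counter


def consensus(aligned: list, min_count: int = 3, gap: str = "-") -> str:
    """Build a consensus from aligned sequences.

    A residue is written in capitals when it occurs in at least
    `min_count` of the sequences at that column; otherwise "." is
    written. If two residues both reach the threshold, the one seen
    first in the column is reported (an arbitrary choice).
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    out = []
    for column in zip(*aligned):
        counts = Counter(r.upper() for r in column if r != gap)
        if counts:
            residue, n = counts.most_common(1)[0]
        else:
            residue, n = gap, 0
        out.append(residue if n >= min_count else ".")
    return "".join(out)


# Hypothetical four-sequence alignment of three residues.
print(consensus(["MKV", "MRV", "MKA", "LKV"]))
```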

Figure 5 shows the intron-exon structure of wSBE I-D4 compared to the corresponding structures of rice SBE I (Kawasaki et al, 1993) and wSBE I-D2 (Rahman et al, 1997). The intron-exon structure of wSBE I-D4 is deduced by comparison with the SBE I cDNA reported by Repellin et al (1997).

The dark rectangles correspond to exons and the light rectangles correspond to introns. The bars above the structures indicate the percentage identity in sequence between the indicated exons and introns of the relevant genes. Note that intron 2 shares no significant sequence identity and is not indicated.

Figure 6 shows the nucleotide sequence of part of wSBE I-D4, the amino acid sequence deduced from this nucleotide sequence, and the N-terminal amino acid sequence of the SBE I purified from the wheat endosperm (Morell et al, 1997).

Figure 7 shows the hybridisation of SBE I genomic clones with the following probes: A. wSBE I-D45 (derived from the 5' end of the gene and including sequence from fragments E1.1 and E1.7), and B. wSBE I-D43 (derived from the 3' end of the gene and containing sequences from fragment E1.5). For panel A, tracks 1-13 correspond to clones λE1, λE2, λE6, λE7, λE9, λE14, λE22, λE27, molecular weight markers, λE29, λE30, λE31 and λE52. For panel B, tracks 1-12 correspond to clones λE1, λE2, λE6, λE7, λE9, λE14, λE22, λE27, λE29, λE31 and λE52. Note that clones λE7 and λE22 do not hybridise to either of the probes and are wSBE I-D2 type genes. Also note that clone λE30 contains a sequence unrelated to SBE I. The size of the molecular weight markers in kb is indicated. Clones λE7 and λE22 do hybridise with a probe from E1.1, which is highly conserved between wSBE I-D2 and wSBE I-D4.

Figure 8 shows the alignment of cDNA clones to obtain the sequence represented by wSBE I-D4 cDNA. BED4 and were obtained from screening the cDNA library with maize BEI (Baba et al, 1991). BED1, 2 and 3 were obtained by RT-PCR using defined primers.

Figure 9a shows the expression of Soluble Starch Synthase I (SSS), Starch Branching Enzyme I (BE I) and Starch Branching Enzyme II (BE II) mRNAs during endosperm development.

RNA was purified from leaves, florets prior to anthesis, and endosperm of wheat cultivar Rosella grown in a glasshouse, collected 5 to 8 days after anthesis, 10 to days after anthesis and 18 to 22 days after anthesis, and from the endosperm of wheat cultivar Rosella grown in the field and collected 12, 15 and 18 days after anthesis respectively. Equivalent amounts of RNA were electrophoresed in each lane. The probes were from the coding region of the SM2 SSS I cDNA (from nucleotide 1615 to 1919 of the SM2 cDNA sequence); wSBE I-D43C (see Table I), which corresponds to the untranslated 3' end of wSBE I-D4 cDNA (El and the 5' region of SBE9 (SBE9 corresponding to the region between nucleotides 743 to 1004 of Genbank sequence Y11282. No hybridisation to RNA extracted from leaves or preanthesis florets was detected.

Figure 9b shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Gabo" with the starch branching enzyme I gene. The probe, wSBEI-D43, is defined in Table 1.

Figure 9c shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Wyuna" with the starch branching enzyme II gene. The probe, wSBE II-D13, is defined in Table 2.

Figure 9d shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Gabo" with the SSS I gene. The probe spanned the region from nucleotides 2025 to 2497 of the SM2 cDNA sequence shown in SEQ ID No:11.

Figure 9e shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Gabo" with the DBE I gene. The probe, a DBE3' 3'PCR fragment, extends from nucleotide position 281 to 1072 of the cDNA sequence in SEQ ID No:16.

Figure 9f shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Gabo" with the wheat actin gene. The probe was a wheat actin DNA sequence generated by PCR from wheat endosperm cDNA using primers to conserved plant actin sequences.

Figure 9g shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Gabo" with a probe containing wheat ribosomal RNA 26S and 18S fragments (plasmid pta250.2 from Dr Bryan Clarke, CSIRO Plant Industry).

Figure 9h shows the hybridisation of RNA from the hexaploid wheat cultivar "Gabo" with the DBE I probe described in Figure 9e. Lane 1; leaf RNA; lane 2, preanthesis floret RNA; lane 3, RNA from endosperm harvested 12 days after anthesis.

Figure 10 shows the comparison of wSBE I-D4 (sr 427.res ck: 6,362,1 to 11,099) and rice SBE I genomic sequence (dl0838.em_pl ck: 3,071,1 to 11,700)(Kawasaki et al, 1993; Accession Number D10838) using the programs Compares and DotPlot (Devereaux et al, 1984). The programs used a window of 21 bases with a stringency of 14 to register a dot.
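The window and stringency parameters mentioned for Figure 10 can be illustrated with a simple dot-matrix comparison. The following is a minimal sketch of the general technique, assuming that a dot is registered when at least `stringency` of the `window` aligned bases are identical; it is not the implementation of the programs cited above, and the function name is an assumption of this sketch.

```python
def dot_plot(seq_a: str, seq_b: str, window: int = 21, stringency: int = 14) -> list:
    """Return (i, j) start positions where two windows match well enough.

    For every pair of windows of length `window`, one from each sequence,
    identical positions are counted; a dot is recorded when the count
    reaches `stringency`. The running time grows with the product of the
    sequence lengths, so this is suitable only for short examples.
    """
    if window < 1 or not 0 < stringency <= window:
        raise ValueError("require window >= 1 and 0 < stringency <= window")
    a, b = seq_a.upper(), seq_b.upper()
    dots = []
    for i in range(len(a) - window + 1):
        win_a = a[i:i + window]
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in zip(win_a, b[j:j + window]))
            if matches >= stringency:
                dots.append((i, j))
    return dots


# Short hypothetical sequences with a reduced window for readability.
print(dot_plot("ACGTA", "ACGTA", window=4, stringency=4))
```

Conserved regions appear as runs of dots along a diagonal, while unrelated regions, such as an intron with no significant identity, leave a gap in the diagonal.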

Figure 11 shows the hybridisation of wheat DNA from chromosome-engineered lines using the following probes: A. wSBE I-D45 (from the 5' end of the gene), B. wSBE I-D43 (from the 3' end of the gene), and C. wSBE I-D4R (repetitive sequence approximately 600 bp 3' to the end of wSBE I-D4 sequence).

N7AT7B, no 7A chromosome, four copies of 7B chromosome; N7BT7D, no 7B chromosome, four copies of 7D chromosome; N7DT7A, no 7D chromosome, four copies of 7A chromosome. The chromosomal origin of hybridising bands is indicated.

Figure 12 shows the hybridisation of genomic clones F1, F2, F3 and F4 with the entire SBE-9 sequence.

The DNA from the clones was purified and digested with either BamHI or EcoRI, separated on agarose, blotted onto nitrocellulose and hybridised with labelled SBE-9 (a SBE II type cDNA). The pattern of hybridising bands is different in the four isolates.

Figure 13a shows the N-terminal sequence of purified SBE II from wheat endosperm as in Morell et al, (1997).

Figure 13b shows the deduced amino acid sequence from part of wSBE II-D1 that encodes the N-terminal sequence as described in Morell et al, (1997).

Figure 14 shows the deduced exon-intron structure for a part of wSBE II-D1. The scale is marked in bases.

The dark rectangles are exons.

Figure 15 shows the hybridisation of DNA from chromosome engineered lines of wheat (cultivar Chinese Spring) with a probe from nucleotides 550-850 from SBE-9.

The band of approximately 2.2 kb is missing in the line in which chromosome 2D is absent.

T2BN2A: four copies of chromosome 2B, no copies of chromosome 2A; T2AN2B: four copies of chromosome 2A, no copies of chromosome 2B; T2AN2D: four copies of chromosome 2A, no copies of chromosome 2D.
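The line designations used above encode which chromosome is absent and which is present in four copies, and a band that disappears in a given line can be assigned to the absent chromosome. The following is a minimal sketch of that deduction, assuming designations written either as "N7AT7B" or as "T2AN2D"; the function names and the example observations are hypothetical.

```python
import re

# Matches either nullisomic-first (N7AT7B) or tetrasomic-first (T2AN2D).
_LINE_CODE = re.compile(r"^(?:N(\d[ABD])T(\d[ABD])|T(\d[ABD])N(\d[ABD]))$")


def parse_line(code: str) -> dict:
    """Split a nullisomic/tetrasomic designation into its two chromosomes."""
    m = _LINE_CODE.match(code)
    if m is None:
        raise ValueError(f"unrecognised line designation: {code!r}")
    if m.group(1):
        return {"nullisomic": m.group(1), "tetrasomic": m.group(2)}
    return {"nullisomic": m.group(4), "tetrasomic": m.group(3)}


def chromosomes_for_missing_band(band_present: dict) -> list:
    """List the absent chromosomes of every line in which the band is missing."""
    return sorted(
        parse_line(code)["nullisomic"]
        for code, present in band_present.items()
        if not present
    )


# Hypothetical scoring of one band across three lines.
observations = {"T2BN2A": True, "T2AN2B": True, "T2AN2D": False}
print(chromosomes_for_missing_band(observations))
```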

Figure 16 shows the N-terminal sequence of SSS I protein isolated from starch granules (Rahman et al, 1995) and deduced amino acid sequence of part of Sm2.

Figure 17 shows the hybridisation of genomic clones sg1, 3, 4, 6 and 11 with the cDNA clone (sm2) for SSS I. DNA was purified from indicated genomic clones, digested with BamHI or SacI and hybridised to sm2. Note that the hybridisation patterns for sg1, 3 and 4 are clearly different from each other.

Figure 18 shows a comparison of the intron/exon structures of the wheat and rice soluble starch synthase genomic sequences. The dark rectangles indicate exons and the light rectangles represent introns.

Figure 19 shows the hybridisation of DNA from chromosome engineered lines of wheat (cultivar Chinese Spring) digested with PvuII, with the sm2 probe.

N7AT7B: no 7A chromosome, four copies of 7B chromosome; N7BT7D: no 7B chromosome, four copies of 7D chromosome; N7DT7A: no 7D chromosome, four copies of 7A chromosome.

A band is missing in the N7BT7A line.

Figure 20a shows the DNA sequence of a portion of the wheat debranching enzyme (WDBE-1) PCR product. The PCR product was generated from wheat genomic DNA (cultivar Rosella) using primers based on sequences conserved in debranching enzymes from maize and rice.

Figure 20b shows a comparison of the nucleotide sequence of wheat debranching enzyme I (WDBE-I) PCR fragment (WHEAT.DNA) with the maize Sugary-1 sequence (SUGARY.DNA).

Figure 20c shows a comparison between the intron/exon structures of wheat debranching enzyme gene and the maize sugary-1 debranching enzyme gene.

Figure 21a shows the results of Southern blotting of T. tauschii DNA with wheat DBE-I PCR product. DNA from T. tauschii was digested with BamHI, electrophoresed, blotted and hybridised to the wheat DBE-I PCR product described in Figure 20a. A band of approximately 2 kb hybridised.

Figure 21b shows Chinese Spring nullisomic/tetrasomic lines probed with probes from the DBE gene. Panel (I) shows hybridisation with a fragment spanning the region from nucleotide 270 to 465 of the cDNA sequence shown in SEQ ID No:16 from the central region of the DBE gene. Panel (II) shows hybridisation with a probe from the 3' region of the gene, from nucleotide 281 to 1072 of the cDNA sequence given in SEQ ID No:16.

Figures 22a to 22e show diagrammatic representations of the DNA vectors used for transient expression analysis. In each of the sequences the N-terminal methionine encoding ATG codon is shown in bold.

Figure 22a shows a DNA construct pwsssIpro1gfpNOT containing a 1042 base pair region of the wheat soluble starch synthase I promoter (wSSSIpro1, from -1042 to SEQ ID No:18) fused to the green fluorescent protein (GFP) reporter gene.

Figure 22b shows a DNA construct pwsssIpro2gfpNOT containing a 3914 base pair region of the wheat soluble starch synthase I promoter (wSSSIpro2, from -3914 to SEQ ID No:18) fused to the green fluorescent protein (GFP) reporter gene.

Figure 22c shows a DNA construct psbeIIpro1gfpNOT containing a 1203 base pair region of the wheat starch branching enzyme II promoter (sbeIIpro1, from 1 to 1023 SEQ ID No:10) fused to the green fluorescent protein (GFP) reporter gene.

Figure 22d shows a DNA construct psbeIIpro2gfpNOT containing a 1353 base pair region of the wheat starch branching enzyme II promoter and transit peptide coding region (sbeIIpro2, regions 1-1203, 1204 to 1336 and 1664 to 1680 of SEQ ID No:10) fused to the green fluorescent protein (GFP) reporter gene.
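The stated size of the insert in Figure 22d can be checked against the three regions listed. The following is a minimal sketch of that arithmetic, assuming that the coordinates are 1-based and inclusive at both ends; the helper name is introduced here for illustration.

```python
def region_length(start: int, end: int) -> int:
    """Length of a region given 1-based, inclusive coordinates (assumed)."""
    if end < start:
        raise ValueError("end must not be less than start")
    return end - start + 1


# Regions of SEQ ID No:10 listed for the construct in Figure 22d.
regions = [(1, 1203), (1204, 1336), (1664, 1680)]
lengths = [region_length(s, e) for s, e in regions]
total = sum(lengths)
print(lengths, total)
```

The three regions contribute 1203, 133 and 17 bases respectively, which together give the 1353 base pairs stated for the construct.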

Figure 22e shows a DNA construct pact_jsgfp_nos containing the plasmid backbone of pSP72 (Promega), the rice ActI actin promoter (McElroy et al. 1991), the GFP gene (Sheen et al. 1995) and the Agrobacterium tumefaciens nopaline synthase (nos) terminator (Bevan et al. 1983).

Figure 23 shows T-DNA constructs for stable transformation of rice by Agrobacterium. The backbone for each plasmid is p35SH-iC (Wang et al 1997). The various promoter-GFP-Nos regions inserted are shown in (c) and respectively, and are described in detail in Example 24. Each of these constructs was inserted into the NotI site of p35SH-iC using the NotI flanking sites at each end of the promoter-GFP-Nos regions. The constructs were named p35SH-iC-BEIIpro1_GFP_Nos, p35SH-iC-BEIIpro2_GFP_Nos, p35SH-iC-SSIpro1_GFP_Nos and SSIpro2_GFP_Nos.

Figure 24 illustrates the design of 15 intron-spanning BE II primer sets. Primers were based on wSBE II-D1 sequence (SEQ ID No:10), and were designed such that intron sequences in the wSBE II-D1 sequence (deduced from Figure 13b and Nair et al, 1997; Accession No. Y11282) were amplified by PCR.

Figure 25 shows the results of amplification using the SBE II-Intron 5 primer set (primer set 6: sr913F and WBE2E6 R) on various diploid, tetraploid and hexaploid wheats.

(i) T. boeoticum (A genome diploid)

(ii) T. tauschii (D genome diploid)

(iii) T. aestivum cv. Chinese Spring ditelosomic line 2AS (lacking chromosome arm 2AL)

(iv) Crete 10 (AABB tetraploid)

(v) T. aestivum cv Rosella (hexaploid)

The horizontal axis indicates the size of the product in base pairs, the vertical axis shows arbitrary fluorescence units. The various arrows indicate the products of different genomes: A, A genome, B, B genome, D, D genome, U, unassigned additional product.

Figure 26 shows the results obtained by amplification using the SBE II-Intron 10 primer set (primer set 11: da5.seq and WBE2E11R) on the wheat lines:

(i) T. aestivum cv. Chinese Spring ditelosomic line 2AS.

(ii) T. aestivum Chinese Spring nullisomic/tetrasomic line N2BT2A.

(iii) T. aestivum Chinese Spring nullisomic/tetrasomic line N2DT2B.

The horizontal axis indicates the size of the product in base pairs, the vertical axis shows arbitrary fluorescence units. The various arrows indicate the products of different genomes: A, A genome, B, B genome, D, D genome.

Figure 27 shows the results of transient expression assays typical of each promoter and target tissue. The photographs (40 x magnification) show representative tissue viewed under a Leica microscope with blue light illumination. Photographs were taken 48 to 72 hours after tissue bombardment. The promoter constructs are listed as follows (with the panels showing endosperm, embryo and leaf expression listed in respective order): pact_jsgfp_nos (panels a, g and m); pwsssIpro1gfpNOT (panels b, h and n); pwsssIpro2gfpNOT (panels c, i and o); psbeIIpro1gfpNOT (panels d, j and p); psbeIIpro2gfpNOT (panels e, k and q); pZLgfpNOT (panels f, l and r).

Example 1 Identification of Gene Encoding SBE I Construction of Genomic Library and Isolation of Clones The genomic library used in this study was constructed from Triticum tauschii, var. strangulata, accession number CPI 100799. Of all the accessions of T. tauschii surveyed, the genome of CPI 100799 is the most closely related to the D genome of hexaploid wheat.

Triticum tauschii, var strangulata (CPI accession number 110799) was kindly provided by Dr E Lagudah. Leaves were isolated from plants grown in the glasshouse.

DNA was extracted from leaves of Triticum tauschii using published methods (Lagudah et al, 1991), partially digested with Sau3A, size fractionated and ligated to the arms of lambda GEM 12 (Promega). The ligated products were used to transfect the methylation-tolerant strain PMC 103 (Doherty et al. 1992). A total of 2 x 10^6 primary plaques were obtained with an average insert size of about 15 kb.

Thus the library contains approximately 6 genomes worth of T. tauschii DNA. The library was amplified and stored at 4°C until required.
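The coverage estimate above follows from simple arithmetic, which can be sketched as below. The plaque count and insert size are taken from the text; the T. tauschii genome size of about 5 x 10^9 bp is an assumption that is not stated in this specification, and the function names are illustrative only. The second function uses the standard library-representation estimate P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one insert.

```python
def genome_equivalents(n_clones, insert_bp, genome_bp):
    """Total cloned DNA expressed as multiples of the genome size."""
    return n_clones * insert_bp / genome_bp


def probability_represented(n_clones, insert_bp, genome_bp):
    """Chance that a given single-copy sequence is in at least one clone."""
    f = insert_bp / genome_bp          # fraction of genome per insert
    return 1.0 - (1.0 - f) ** n_clones


N_PLAQUES = 2_000_000   # primary plaques (from the text)
INSERT_BP = 15_000      # average insert size (from the text)
GENOME_BP = 5.0e9       # ASSUMPTION: approximate T. tauschii genome size

coverage = genome_equivalents(N_PLAQUES, INSERT_BP, GENOME_BP)
p_present = probability_represented(N_PLAQUES, INSERT_BP, GENOME_BP)

print(f"Genome equivalents: {coverage:.1f}")
print(f"P(sequence represented): {p_present:.4f}")
```

Under the assumed genome size the library holds about six genome equivalents, in line with the figure quoted above; a different genome size estimate would scale the result proportionally.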

Positive plaques in the genomic library were selected as those hybridising with the 5' end of a maize starch branching enzyme I cDNA (Baba et al, 1991) using moderately stringent conditions as described in Rahman et al, (1997).

Preparation of Total RNA from Wheat Total RNA was isolated from leaves, pre-anthesis pericarp and different developmental stages of wheat endosperm of the cultivars Hartog and Rosella. This material was collected from both the glasshouse and the field. The method used for RNA isolation was essentially the same as that described by Higgins et al (1976). RNA was then quantified by UV absorption and by separation in 1.4% agarose-formaldehyde gels which were then visualized under UV light after staining with ethidium bromide (Sambrook et al, 1989).

DNA and RNA analysis DNA was isolated and analysed using established protocols (Sambrook et al, 1989). DNA was extracted from wheat (cv. Chinese Spring) using published methods (Lagudah et al, 1991). Southern analysis was performed essentially as described by Jolly et al (1996). Briefly, 20 µg wheat DNA was digested, electrophoresed and transferred to a nylon membrane. Hybridisation was conducted at 42°C in 25% or formamide, 2 x SSC, 6% Dextran Sulphate for 16 h and the membrane was washed at 60°C in 2 x SSC for 3 x 1 h unless otherwise indicated. Hybridisation was detected by autoradiography using Fuji X-Omat film.

RNA analysis was performed as follows. 10 µg of total RNA was separated in a 1.4% agarose-formaldehyde gel and transferred to a nylon Hybond N membrane (Sambrook et al, 1989) and hybridized with cDNA probe at 42°C in Khandjian hybridizing buffer (Khandjian, 1989). The 3' part of wheat SBE I cDNA (designated wSBE I-D43, see Table 1) was labelled with the Rapid Multiprime DNA Probe Labelling Kit (Amersham) and used as probe. After washing at 60°C with 2 x SSC, 0.1% SDS three times, each time for about 1 to 2 hours, the membrane was visualized by overnight exposure at -80°C with X-ray film, Kodak MR.

Example 2 Frequency of Recovery of SBE I Type Clones from the Genomic Library An estimated 2 x 10^6 plaques from the amplified library were screened using an EcoRI fragment that contained 1200 bp at the 5' end of maize SBE I (Baba et al, 1991) and twelve independent isolates were recovered and purified.

This corresponds to the screening of somewhat fewer than the 2 x 10^6 primary plaques that exist in the original library (each of which has an average insert size of 15 kb) (Maniatis et al, 1982), because the amplification may lead to the representation of some sequences more than others.

Assuming that the amplified library contains approximately three genomes of T. tauschii, the frequency with which SBE I-positive clones were recovered suggests the existence of about 5 copies of SBE I type genes within the T. tauschii genome.

Digestion of DNA from the twelve independent isolates by the restriction endonuclease BamHI, followed by hybridisation with a maize SBE I clone, suggested that the genomic clones could be separated into two broad classes (Figure 1). One class had 10 members and a representative from this class is the clone λE1 (Figure 1). λE6 (Figure 1, lane 3) is a member of this class, but is missing the 5' end of the E1-SBE I gene because the SBE I gene is at the extremity of the cloned DNA. Further hybridisation studies at high stringency with the extreme 5' and 3' regions of the SBE I gene contained in λE1 suggested that the other clones contained either identical or very closely related genes.

The second family had two members, and of these clone λE7 (Figure 1, lane 4) was arbitrarily selected for further study. These two members did not hybridise to probes from the extreme 5' and 3' regions of the SBE I gene that were contained in λE1, indicating that they were a distinct sub-class.

The DNA from T. tauschii and the lambda clones λE1 and λE7 was digested with BamHI and hybridised with fragment E1.1, as shown in Figure 2. This fragment contains sequences that are highly conserved (85% sequence identity over 0.3 kb between λE1 and λE7), corresponding to exons 3, 4 and 5 of the rice gene. The bands in the genomic DNA at 0.8 kb and 1.0 kb correspond to identical sized fragments from λE1 and λE7, as shown in Figure 2; these are fragments E1.1 and E7.8 of the λE1 and λE7 genomic clones respectively. Thus the arrangement of genes in the genomic clones is unlikely to be an artefact of the cloning procedure. There are also bands in the genomic DNA of approximately 2.5 kb, 4.8 kb and 8 kb in size which are not found from the digestion of λE1 or λE7; these could represent genes such as the 5' sequences of wSBE I-D1 or wSBE I-D3; see below.

Example 3 Tandem Arrangement of SBE I Type Genes in the T. tauschii Genome Basic restriction endonuclease maps for λE1 and λE7 are shown in Figure 3. The map was constructed by performing a series of hybridisations of EcoRI or BamHI digested DNA from λE1 or λE7. The probes used were the fragments generated from BamHI digestion of the relevant clone. Confirmation of the maps was obtained by PCR analysis, using primers both within the insert and also from the arms of lambda itself. PCR was performed in 10 µl volume using reagents supplied by Perkin-Elmer. The primers were used at a concentration of 20 µM. The program used was 94°C, 2 min, 1 cycle, then 94°C, 30 sec; 55°C, 30 sec; 72°C, 1 min for 36 cycles and then 72°C, 5 min; 25°C, 1 min.
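The cycling program given above can be written down as data and its total programmed hold time tallied, as in the following minimal sketch. The temperatures, times and cycle count come from the text; the step labels (denature, anneal, extend and so on) are our interpretation rather than wording from the specification, and ramping time between temperatures is ignored.

```python
# Each entry: (label, temperature in degrees C, hold time in seconds).
# Labels are illustrative; only temperatures and times come from the text.
INITIAL = [("initial denaturation", 94, 120)]
CYCLE = [("denature", 94, 30), ("anneal", 55, 30), ("extend", 72, 60)]
N_CYCLES = 36
FINAL = [("final extension", 72, 300), ("cool-down", 25, 60)]


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum the programmed hold times, repeating the cycled block n_cycles times."""
    once = sum(t for _, _, t in initial) + sum(t for _, _, t in final)
    cycled = n_cycles * sum(t for _, _, t in cycle)
    return once + cycled


total_s = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"Programmed hold time: {total_s} s ({total_s / 60:.0f} min), excluding ramping")
```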

Sequencing was performed on an ABI sequencer using the manufacturer's recommended protocols for both dye primer and dye terminator technologies. Deletions were carried out using the Erase-a-base kit from Promega.

Sequence analysis was carried out using the GCG version 7 package of computer programs (Devereaux et al, 1984).

The PCR products were also used as hybridisation probes. The positioning of the genes was derived from sequencing the ends of the BamHI subclones and also from sequencing PCR products generated from primers based on the insert and the lambda arms. The results indicate that there is only a single copy of a SBE I type gene within λE1.

However, it is clear that λE7 resulted from the cloning of a DNA fragment from within a tandem array of the SBE I type genes. Of the three genes in the clone, which are named wSBE I-D1, wSBE I-D2 and wSBE I-D3, only the central one (wSBE I-D2) is complete.

Example 4 Construction and Screening of cDNA Library A wheat cDNA library was constructed from the cultivar Rosella using pooled RNA from endosperm at 8, 12, 18 and 20 days after anthesis.

The cDNA library was prepared from poly A+ RNA that was extracted from developing wheat grains (cv. Rosella, a hexaploid soft wheat cultivar) at 8, 12, 15, 18, 21 and 30 days after anthesis. The RNA was pooled and used to synthesise cDNA that was propagated in lambda ZapII (Stratagene).

The library was screened with a genomic fragment from λE7 encompassing exons 3, 4 and 5 (fragment E7.8 in Figure 3). A number of clones were isolated. Of these an apparently full-length clone appeared to encode an unusual type of cDNA for SBE I. This cDNA has been termed SBE I-D2 type cDNA. The putative protein product is compared with the maize SBE I and rice SBE I type deduced amino acid sequences in Figure 4. The main difference is that this putative protein product is shorter at the C-terminal end, with an estimated molecular size of approximately 74 kDa compared with 85 kDa for rice SBE I (Kawasaki et al, 1993).

Note that amino acids corresponding to exon 9 of rice are missing in SBE I-D2 type cDNA, but those corresponding to exon 10 are present. There are no amino acid residues corresponding to exons 11-14 of rice; furthermore, the sequence corresponding to the last 57 amino acids of SBE I-D2 type has no significant homology to the sequence of the rice gene.

We expressed SBE I-D2 type cDNA in E. coli in order to examine its function. The cDNA was expressed as a fusion protein with 22 N-terminal residues of β-galactosidase and two threonine residues followed by the SBE I-D2 cDNA sequence either in or out of frame. Although an expected product of about 75 kDa in size was produced from only the in-frame fusion, we could not detect any enzyme activity from crude extracts of E. coli protein.

Furthermore the in-frame construct could not complement an E. coli strain with a defined deletion in glycogen branching, although other putative branching enzyme cDNAs have been shown to be functional by this assay (data not shown). It is therefore unclear whether the wSBE I-D2 gene in λE7 codes for an active enzyme in vivo.

Example 5 Gene Structure in λE7 i. Sequence of wSBE I-D2 We sequenced 9.2 kb of DNA that contained wSBE I-D2. This corresponds to fragments 7.31, 7.8 and 7.18. Fragment 7.31 was sequenced in its entirety (4.1 kb), but the sequence of about 30 bases about 2 kb upstream of the start of the gene could not be obtained because it was composed entirely of Gs. Elevation of the temperature of sequencing did not overcome this problem. Fragments 7.8 (1 kb) and 7.18 (4 kb) were completely sequenced, and corresponded to 2 kb downstream of the last exon detected for this gene. It was clear that we had isolated a gene which was closely related (approximately 95% sequence identity) to the SBE I-D2 type cDNA referred to above, except that the last 200 bp at the 3' end of the cDNA are not present. The wSBE I-D2 gene includes sequences corresponding to rice exon 11 which are not in the cDNA clone. In addition it does not have exons 9, 12, 13 or 14; these are also absent from the SBE I-D2 type cDNA. The first two exons show lower identity to the corresponding exons from rice (approximately 60%) (Kawasaki et al, 1993) than do the other exons. A diagrammatic exon-intron structure of the wSBE I-D2 gene is indicated in Figure 5. The restriction map was confirmed by sequencing the PCR products that spanned fragments 7.18 and 7.8, and 7.8 and E7.31 (see Figure 3), respectively.

ii. Sequence of wSBE I-D3 This gene was not sequenced in detail, as the genomic clone did not extend far enough to include the end of the sequence. The sequence is of a SBE-I type. The orientation of the gene is evident from sequencing of the relevant BamHI fragments, and was confirmed by sequence analysis of a PCR product generated using primers from the right arm of lambda and a primer from the middle of the gene. The sequence homology with wSBE I-D2 is about 80% over the regions examined. The 2 kb sequenced corresponded to exons 5 and 6 of the rice gene; these sequences were obtained by sequencing the ends of fragments 7.5, 7.4 and 7.14 respectively, although the sequences from the left end of fragment 7.14 did not show any homology to the rice sequences. The gene does not appear to share the 3' end of SBE I-D2 type cDNA, as a probe from 500 bp at the 3' end of the cDNA (including sequences corresponding to exons 8 and from rice) did not hybridise to fragment 7.14, although it hybridised to fragment 7.18.

iii. Sequence of wSBE I-D1 This gene was also not sequenced in detail, as it was clear that the genomic clone did not extend far enough to include the 5' sequences. Limited sequencing suggests that it is also a SBE I type gene. The orientation relative to the left arm of lambda was confirmed by sequencing a PCR product that used a primer from the left arm of lambda and one from the middle of the gene (as above). Its sequence homology with wSBE I-D2 ,D3 and D4 (see below) is about in the region sequenced corresponding to a part of exon 4 of the rice gene.

Starch branching enzymes are members of the α-amylase protein family, and in a recent survey Svensson (1994) identified eight residues in this family that are invariant, seven in the catalytic site and a glycine in a short turn. Of the seven catalytic residues, four are changed in SBE I-D2 type. However, additional variation in the 'conserved' residues may come to light when more plant cDNAs for branching enzyme I are available for analysis. In addition, although exons 9, 11, 12, 13 and 14 from rice are not present in the SBE I-D2 type cDNA, comparison of the maize and rice SBE I sequences indicates that the 3' region (from amino acid residue 730 of maize) is much more variable than the 5' and central regions. The active sites of rice and maize SBE I sequences, as indicated by Svensson (1994), are encoded by sequences that are in the central portion of the gene. When SBE II sequences from Arabidopsis were compared by Fisher et al (1996) they also found variation at the 3' and 5' ends. SBE I-D2 type cDNA may encode a novel type of branching enzyme whose activity is not adequately detected in the current assays for detecting branching enzyme activity; alternatively the cDNA may correspond to an endosperm mRNA that does not produce a functional protein.

Example 6 Cloning of the cDNA corresponding to the wSBE I-D4 gene The first strand cDNAs were synthesized from 1 µg of total RNA, derived from endosperm 12 days after pollination, as described by Sambrook et al (1989), and then used as templates to amplify two specific cDNA regions of wheat SBE I by PCR.

Two pairs of primers were used to obtain the cDNA clones BED1 and BED3 (Table 1). Primers used for cloning of BED3 were the degenerate primer 5' GGC NAC NGC NGA G/AGA C/TGG 3' (SEQ ID NO.1), based on the N-terminal sequence of the purified wheat endosperm SBE I protein, in which the 5' end of the primer is at position 168 of wSBE I-D4 cDNA (see Table 1), and the primer NTS3' 5' TAC ATT TCC TTG TCC ATCA 3' (SEQ ID NO.2), in which the 5' end is at position 1590 of wSBE I-D4 cDNA (see Table 1), designed to anneal to the conserved regions of the nucleotide sequences of BED5 and the maize and rice SBE I cDNAs. For clone BED1, the primers used were 5' ATC ACG AGA GCT TGC TCA 3' (SEQ ID NO.3), in which the 5' end is at position 1 of wSBE I-D4 cDNA (see Table 1) and the sequence was based on the wSBE I-D4 gene, and BEC3' 5' CGG TAC ACA GTT GCG TCA TTT TC 3' (SEQ ID NO.4), in which the 5' end is at position 334 of wSBE I-D4 cDNA (see Table 1) and the sequence was based on BED3.
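To make the degenerate primer notation concrete, the following sketch expands the SEQ ID NO.1 primer into the individual sequences it stands for. It assumes the conventional reading of the notation, namely that N means any of A, C, G or T and that "X/Y" means either base at a single position; the parsing function is a hypothetical helper written for this illustration.

```python
from itertools import product


def parse_degenerate(primer):
    """Return one string of allowed bases per primer position.

    ASSUMPTIONS: 'N' stands for A, C, G or T, and 'X/Y' denotes a single
    position that may be either X or Y. Spaces are ignored.
    """
    s = primer.replace(" ", "")
    positions = []
    i = 0
    while i < len(s):
        choices = "ACGT" if s[i] == "N" else s[i]
        i += 1
        # Absorb any '/Y' alternatives that follow this base.
        while i < len(s) and s[i] == "/":
            choices += s[i + 1]
            i += 2
        positions.append(choices)
    return positions


PRIMER = "GGC NAC NGC NGA G/AGA C/TGG"  # degenerate primer, SEQ ID NO.1

positions = parse_degenerate(PRIMER)
variants = ["".join(bases) for bases in product(*positions)]

print(f"Primer length: {len(positions)} nt")
print(f"Degeneracy: {len(variants)} distinct sequences")
```

Read this way, the primer is an 18-mer (six codons) representing a pool of 256 sequences, which is consistent with a primer designed from a short stretch of N-terminal protein sequence.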

Example 7 Identification of the gene from the Triticum tauschii SBE I family which is expressed in the endosperm We have isolated two classes of SBE I genomic clones from T. tauschii. One class contained two genomic clone isolates, and this class has been characterised in some detail (Rahman et al, 1997). The complete gene contained within this class of clones was termed wSBE I-D2; there were additional genes at either end of the clone, and these were designated wSBE I-D1 and wSBE I-D3. The other class contained nine genomic clone isolates. Of these λE1 was arbitrarily taken as a representative clone, and its restriction map is shown in Figure 3; the SBE I gene contained in this clone was called wSBE I-D4.

Fragments E1.1 (0.8 kb) and E1.2 (2.1 kb) and fragments E1.7 (4.8 kb) and E1.5 (3 kb) respectively were completely sequenced. Fragment E1.7 was found to encode the N-terminal of the SBE I, which is found in the endosperm as described in Morell et al (1997). This is shown in Figure 6. Using antibodies raised against the N-terminal sequence, Morell et al (1997) found that the D genome isoform was the most highly expressed in the cultivars Rosella and Chinese Spring. We have thus isolated from T. tauschii a gene, wSBE I-D4, whose homologue in the hexaploid wheat genome encodes the major isoform for SBE I that is found in the wheat endosperm.

Table 1 Location of structural features and probes within wSBE I-D4 sequence.

A. Location of exons by comparison with the cDNA sequence of Repellin et al., (1997). Accession number Y12320.

Exon number    Start posn    End posn
1              4890          4987
2              5082          5149
3              5524          5731
4              5819          5888
5              6149          6318
6              6519          7424
7              7744          7860
8              8015          8077
9              8562          8670
10             9137          9237
11             9421          9488
12             9580          9661
13             9781          9897
14             9990          10480

B. Other features.

Name of feature                        wSBE I-D4 sequence    D4 cDNA sequence
Putative initiation of translation     4900                  11
Mature N-terminal sequence of SBE I    5550                  124
End of translated SBE I sequence       10225                 2431
End of D4 cDNA sequence                10461                 2687
wSBE I-D45                             4870,5860             1,354
wSBE I-D43                             10116,10435           2338,2657
E1.1                                   5680,6400             380,630
BED 1                                  -                     1,354
BED 2                                  -                     169,418
BED 3                                  -                     151,1601
BED 4                                  -                     867,2372
BED 5                                  -                     867,2687
Endosperm box like motif TGAAAAGT      4480,590              -
CAAAT motif                            4863                  -
TATAAA motif                           4833                  -

All nine genomic clones of the λE1 type isolated from T. tauschii appear to contain the wSBE I-D4 gene, or very similar genes, on the basis of PCR amplification and hybridisation experiments. However, the restriction patterns obtained for the clones differ with BamHI and EcoRI, among other enzymes, indicating that either the clones represent near-identical but distinct genes or they represent the same gene isolated in distinct products of the Sau3A digest used to generate the library.
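The exon coordinates in part A of Table 1 can be turned into exon and intron lengths with a few lines of code, which is a convenient check on the table. The sketch below assumes that the fourteen start and end positions pair up in order as exons 1 to 14 and that both coordinates are inclusive; neither convention is stated explicitly in the table.

```python
# Exon start and end positions in the wSBE I-D4 genomic sequence (Table 1, part A).
# ASSUMPTIONS: values pair up in order as exons 1-14; coordinates are inclusive.
STARTS = [4890, 5082, 5524, 5819, 6149, 6519, 7744,
          8015, 8562, 9137, 9421, 9580, 9781, 9990]
ENDS = [4987, 5149, 5731, 5888, 6318, 7424, 7860,
        8077, 8670, 9237, 9488, 9661, 9897, 10480]

exon_lengths = [end - start + 1 for start, end in zip(STARTS, ENDS)]

# Intron i lies between the end of exon i and the start of exon i + 1.
intron_lengths = [STARTS[i + 1] - ENDS[i] - 1 for i in range(len(STARTS) - 1)]

total_exonic = sum(exon_lengths)
gene_span = ENDS[-1] - STARTS[0] + 1

for number, length in enumerate(exon_lengths, start=1):
    print(f"exon {number:2d}: {length:4d} bp")
print(f"total exonic sequence: {total_exonic} bp")
print(f"span from first to last exon: {gene_span} bp")
```

Under these assumptions the exons sum to a length close to the 2687 bp reported for the wSBE I-D4 cDNA, and the span from the first to the last exon is a little under 6 kb.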

Example 8 Investigation of other SBE I genomic clones isolated All ten members of the λE1-like class of SBE I genomic clones were investigated by hybridisation with probes derived from fragment E1.7 (sequence wSBE I-D45, encoding the translation start signal and the first 100 amino acids from the N-terminal end and intron sequences; see Table 1) and from fragment E1.5 (sequence wSBE I-D43, corresponding largely to the 3' untranslated sequence and containing intron sequences, see Table 1). The results obtained were consistent with one type of gene being isolated in different fragments in the different clones, as shown in Figure 7. The PCR products were obtained from the clones λE1, 2, 9, 14, 27, 31 and 52. These hybridised to wSBE I-D45 using primers that amplify near the 5' end of the gene (positions 5590-6162 of wSBE I-D4). Sequencing showed no differences in sequence of a 200 bp product.

Analysis of the promoter for wSBE I-D4 allows us to investigate the presence of motifs previously described for promoters that regulate gene expression in the endosperm. Forde et al (1985) compared prolamin promoters, and suggested that the presence of a motif approximately -300 bp upstream of the transcription start point, called the endosperm box, was responsible for endosperm-specific expression. The endosperm box was subsequently considered to consist of two different motifs: the endosperm motif (EM) (canonical sequence TGTAAAG) and the GCN4 motif (canonical sequence G/ATGAG/CTCAT). The GCN4 box is considered to regulate expression according to nitrogen availability (Muller and Knudsen, 1993). The wSBE I-D4 promoter contains a number of imperfect EM-like motifs at approximately -100, -300 and -400 as well as further upstream. However, no GCN4 motifs could be found, which lends support to the idea that this motif regulates response to nitrogen, as starch biosynthesis is not as directly dependent on the nitrogen status of the plant as storage protein synthesis. Comparison of the promoters for wSBE I-D4 and D2 (Rahman et al, 1997) indicates that although there are no extensive sequence homologies there is a region of about 100 bp immediately before the first encoded methionine where the homology is 61% between the two promoters. In particular there is an almost perfect match in the sequence over twenty base pairs CTCGTTGCTTCC/TACTCCACT, (positions 4723-4742 of the wSBE I sequence), but the significance of this is hard to gauge, as it does not occur in the rice promoter for SBE I. The availability of more promoters for starch biosynthetic enzymes may allow firmer conclusions to be drawn. There are putative CAAT and TATA motifs at positions 4870 and 4830 respectively of wSBE I-D4 sequence. The putative start of translation of the mRNA is at position 4900 of wSBE I-D4.
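A motif scan of the kind described above can be sketched as follows. The canonical EM and GCN4 sequences are those quoted in the text; the one-mismatch tolerance used to find "imperfect" EM-like motifs is our own illustrative choice, and the short test sequence is invented for the example and is not part of the wSBE I-D4 promoter.

```python
import re

EM_MOTIF = "TGTAAAG"                           # canonical endosperm motif
GCN4_PATTERN = re.compile(r"[GA]TGA[GC]TCAT")  # G/A TGA G/C TCAT


def find_with_mismatches(seq, motif, max_mismatches=1):
    """Return (position, matched window, mismatch count) for each near-match.

    Positions are 0-based. The mismatch tolerance is an illustrative choice.
    """
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# HYPOTHETICAL toy sequence, constructed only to exercise the functions.
toy_seq = "CCGGTGAAAAGTTCCTGTAAAGGGCCGTGAGTCATAA"

em_hits = find_with_mismatches(toy_seq, EM_MOTIF)
gcn4_hit = GCN4_PATTERN.search(toy_seq)

for pos, window, mismatches in em_hits:
    print(f"EM-like motif {window} at {pos} ({mismatches} mismatch(es))")
print("GCN4 motif found" if gcn4_hit else "no GCN4 motif found")
```

Applied to a real promoter, both strands would normally be scanned; that step is omitted here to keep the sketch short.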

Figure 5 shows the structure of the wSBE I-D4 gene, compared with the genes from rice and wheat (Kawasaki et al, 1993; Rahman et al, 1997). The rice SBE I has 14 exons compared with 13 for wSBE I-D4 and 10 for wSBE I-D2.

There is good conservation of exon-intron structure between the three genes, except at the extreme 5' end. In particular the sizes of intron 1 and intron 2 are very different between rice SBE I and wSBE I-D4.

Example 9 Isolation of cDNA for SBE I Using the maize starch branching enzyme I cDNA as a probe (Baba et al, 1991), 10 positive plaques were recovered by screening approximately 10^5 plaques from a wheat endosperm cDNA library prepared from the cultivar Rosella, as described in Example 4. On purifying and sequencing these plaques it was clear that even the longest clone (BED5, 1822 bp) did not encode the N-terminal sequence obtained from protein analysis. Degenerate primers based on the wheat endosperm SBE I protein N-terminal sequence (Morell et al, 1997) and the sequence from BED5 were then used to amplify the 5' region: this produced a cDNA clone termed BED3 (Table 1 and Figure 8). This cDNA clone overlapped extensively and had 100% sequence identity with BED5 and BED4 (Figure 8). As almost the entire protein N-terminal sequence had been included in the primer sequence design, this did not provide independent evidence of the selection of a cDNA sequence in the endosperm that encoded the protein sequence of the main form of SBE I. Using BED3 to screen a second cDNA library produced BED2, which is shorter than BED3 but confirmed the BED3 sequence at 100% identity between positions 169 and 418 (Figure 8 and Table 1). In addition the entire cDNA sequence for BED3 could be detected at a 100% match in the genomic clone λE1.

Primers based on the putative transcription start point combined with a primer based on the incomplete cDNAs recovered were then used to obtain a PCR product from total endosperm RNA by reverse transcription. This led to the isolation of the cDNA clone, BED1, of 300 bp, whose location is shown in Figure 8. By analysing this product, a sequence was again obtained that could be found exactly in the genomic clone λE1, and which overlapped precisely with BED3.

The N-terminal of the protein matches that of SBE I isolated from wheat endosperm by Morell et al (1997), and thus the wSBE I-D4 cDNA represents the gene for the predominant SBE I isoform expressed in the endosperm. The encoded protein is 87 kDa; this is similar to proteins encoded by maize (Baba et al, 1991) and rice (Nakamura et al, 1992) cDNAs for SBE I and is distinct from the wSBE I-D2 cDNA described previously, in which the encoded protein was 74 kDa (Rahman et al, 1997).

Five cDNA clones were sequenced and their sequences were assembled into one contiguous sequence using a GCG program (Devereaux et al, 1984). The arrangement of these sequences is illustrated in Figure 8, the nucleotide sequence is shown in SEQ ID No:5, and the deduced amino acid sequence is shown in SEQ ID No:6. The intact cDNA sequence, wSBE I-D4 cDNA, is 2687 bp and contains one large open reading frame (ORF), which starts at nucleotides 11 to 13 and ends at nucleotides 2432 to 2434. It encodes a polypeptide of 807 amino acids with a molecular weight of 87 kDa. Comparison of the amino acid sequence encoded by wSBE I-D4 cDNA with that encoded by maize and rice SBE I cDNAs showed that there is 75-80% identity between any two of these sequences at the nucleotide level and almost at the amino acid level. Alignment of these three polypeptide sequences, as shown in Figure 4, along with the deduced sequences for pea, potato and wSBE I-D2 type cDNA, indicated that the sequences in the central region are highly conserved, and sequences at the 5' end (about 80 amino acids) and the 3' end (about 60 amino acids) are variable.
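The ORF coordinates and polypeptide length quoted above are mutually consistent, as the short calculation below illustrates. It assumes that nucleotides 2432 to 2434 are the stop codon, and it uses the common rule of thumb of roughly 110 Da per amino acid residue to approximate the molecular weight; that average is an assumption introduced here, not a value from the specification.

```python
ORF_START = 11          # first nucleotide of the initiation codon (from the text)
STOP_CODON_END = 2434   # last nucleotide of the ORF; ASSUMED to end the stop codon
AVG_RESIDUE_DA = 110.0  # ASSUMPTION: rule-of-thumb average residue mass

orf_length = STOP_CODON_END - ORF_START + 1
codons_including_stop = orf_length // 3
n_residues = codons_including_stop - 1   # the stop codon encodes no residue

approx_kda = n_residues * AVG_RESIDUE_DA / 1000.0

print(f"ORF length: {orf_length} nt ({codons_including_stop} codons including stop)")
print(f"Encoded residues: {n_residues}")
print(f"Approximate molecular weight: {approx_kda:.0f} kDa")
```

The calculation gives 807 residues, matching the text, and a rough mass in the same range as the 87 kDa stated; an exact figure would require summing the actual residue masses from SEQ ID No:6.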

Svensson et al (1994) indicated that there were several invariant residues in sequences of the α-amylase super-family of proteins to which SBE I belongs. In the sequence of maize SBE I these are in motifs commencing at amino acid residue positions 341, 415, 472, 537 respectively; these are also encoded in the wSBE I-D4 sequence (SEQ ID No:9), further supporting the view that this gene encodes a functional enzyme. This is in contrast to the results with the wSBE I-D2 gene, where three of the conserved motifs appear not to be encoded (Rahman et al, 1997).

There is about 90% sequence identity in the deduced amino acid sequence between wSBE I-D4 cDNA and rice SBE I cDNA in the central portion of the molecule (between residues 160 and 740 for the deduced amino acid product from wSBE I-D4 cDNA). The sequence identity of the deduced amino acid sequence of the wSBE I-D4 cDNA to the deduced amino acid sequence of wSBE I-D2 is somewhat lower (85% for the most conserved region, between residues 285 to 390 for the deduced product of wSBE I-D4 cDNA). Surprisingly, however, wSBE I-D4 cDNA is missing the sequence that encodes amino acids at positions 30 to 58 in rice SBE I (see Figure 4).

This corresponds to residues within the transit peptide of rice SBE I. A corresponding sequence also occurs in the deduced amino acid sequence from maize SBE I (Baba et al, 1991) and wSBE I-D2 type cDNA (Rahman et al, 1997).

Consequently the transit sequence encoded by wSBE I-D4 cDNA is unusually short, containing only 38 amino acids, compared with 55-60 amino acids deduced for most starch biosynthetic enzymes in cereals (see for example Ainsworth, 1993; Nair et al, 1997). The wSBE I-D4 gene does contain this sequence, but this does not appear to be transcribed into the major species of RNA from this gene, although it can be detected at low relative abundance. This raises the possibility of alternative splicing of the wSBE I-D4 transcript, and also the question of the relative efficiency of translation/transport of the two isoforms. The possibility of alternative splicing in both rice and wheat has been considered for soluble starch synthase (Baba et al, 1993; Rahman et al, 1995). Alternative splicing of soluble starch synthase would give a transit sequence of 40 amino acids, which is the same length proposed for the product of wSBE I-D4 cDNA.

We have previously used probes based on exons 4, and 6 (E7.8 and E1.1, see Rahman et al., 1997) of wSBE I-D2 to probe wheat and T. tauschii genomic DNA cleaved with PvuII and BamHI respectively. This region is highly conserved within rice SBE I, wSBE I-D2 and wSBE I-D4 and produced ten bands with wheat DNA and five with T. tauschii DNA. Neither PvuII nor BamHI cleaved within the probe sequences, suggesting that each band represented a single type of SBE I gene. We have described four SBE I genes from T. tauschii: wSBE I-D1, wSBE I-D2, wSBE I-D3 and wSBE I-D4 (Rahman et al, 1997 and this specification), and so we may have accounted for most of the genes in T. tauschii and, by extension, the genes from the D genome of wheat. In wheat, at least two hybridising bands could be assigned to each of chromosomes 7A, 7B and 7D.

Example 10 Tissue specificity and expression during endosperm development The 300 bp of 3' untranslated sequence of wSBE I-D4 cDNA does not show any homology with either the wSBE I-D2 type cDNA that we have described earlier (Rahman et al, 1997) or with BE-I from rice, as shown in Figure. We have called this sequence wSBE I-D43C (see SEQ ID No:9). It seemed likely that wSBE I-D43C would be a specific probe for this class of SBE-I, and thus it was used to investigate the tissue specificity. Hybridization of RNA from endosperm of hexaploid T. tauschii cultures with SBE I, SBE II, SSS I, DBE I, wheat actin, and wheat ribosomal RNA was examined.

RNA was purified at various numbers of days after anthesis from plants grown with a 16 h photoperiod at 13°C (night) and 18°C (day). The age of the endosperms from which RNA was extracted in days after anthesis is given above the lanes in the blot. Equivalent amounts of RNA were electrophoresed in each lane. The probes used are identified in Tables 1 and 2.

The results are shown in Figures 9a to 9g. An RNA species of about 2700 bases in size was found to hybridise. This is very close to the size of the wSBE I-D4 cDNA sequence. RNA hybridising to wSBE I-D43C is most abundant at the mid-stage of endosperm development, as shown in Figure 9a, and in field grown material is relatively constant during the period 12-18 days, the time at which there is rapid starch and storage protein accumulation (Morell et al, 1995).

The sequence contained within the wSBE I-D4 gene appears to be expressed only in the endosperm (Figure 9a, Figure 9b). We could not detect any expression in the leaf. This could be because another isoform is expressed in the leaf, and/or because the amount of SBE I present in the leaf is much less than what is required in the endosperm. Isolation of SBE I clones from a leaf cDNA library would enable this question to be resolved.

Example 11 Intron-Exon Structure of SBE I By comparison of the cDNA sequence of SBE I (Repellin et al, 1997) with that of wSBE I-D4 we can deduce the intron-exon structure of the gene for the major isoform of SBE I that is found in the endosperm. The structure contains 14 exons, the same number as in rice (Kawasaki et al, 1993). These 14 exons are spread over 6 kb of sequence, a distance similar to that found in both rice SBE I and wSBE I-D2. A dotplot comparison of the wSBE I-D4 sequence and the rice SBE I sequence, depicted in Figure 10, shows good sequence identity over almost the entire gene starting from about position 5100 of wSBE I-D4; the identity is poor over the first 5 kb of sequence, corresponding largely to the promoter sequences. The sequence identity over introns (about 60%) is lower than over exons (about

Example 12 Repeated Sequences in SBE I Sequencing of wSBE I-D4 revealed that there was a repeated sequence of at least 300 bp contained in a 2 kb fragment about 600 bp after the 3' end of the gene. We have called this sequence wSBE I-D4R (SEQ ID NO:). This repeated sequence is within fragment E1.5 (Figure 3 and Table 1) and is flanked by non-repetitive sequences from the genomic clone. We have previously shown that the restriction pattern obtained by digesting λE1 with the restriction enzyme BamHI is also obtained when T. tauschii DNA is digested. Thus wSBE I-D4R is unlikely to be a cloning artefact. A search of the GenBank Database revealed that wSBE I-D4R shared no significant homology with any sequence in the database. Hybridisation experiments with wSBE I-D4R showed that all of the other SBE I-D4 type genomic clones (except number 29) contained this repeated sequence (data not shown). The wSBE I-D4R sequence was not highly repeated and occurred in the wheat genome with a similar frequency to the wSBE I-D4 sequence.

When SBE I-D4R was used as the probe on wheat DNA from the nulli-tetra lines, four bands were obtained; two of these bands could be assigned to chromosome 7A and the others to chromosomes 7B and 7D (Figure 11). One of the two BamHI fragments from wheat DNA which could be assigned to chromosome 7A was distinct from the single band from chromosome 7A detected using wSBE I-D43 as the probe; the other three bands coincided in the autoradiograph with bands obtained with wSBE I-D43, and are likely to represent the same fragment. However, one of these fragments was distinct from the BamHI fragment that hybridised to the wSBE I-D43 sequence. In wSBE I-D4 (see SEQ ID No:9), the wSBE I-D43 sequence is only 300 bp upstream of wSBE I-D4R, and occurs in the same BamHI fragment. These results suggest that the wSBE I-D4R sequence can occur independently of wSBE I-D4 in the wheat genome.

Example 13 Isolation of Genomic Clones Encoding SBE II Screening of a cDNA library, prepared from the wheat endosperm as described in Example 4, with the maize BE I clone (Baba et al, 1991) at low stringency led to the isolation of two classes of positive plaques. One class was strongly hybridising, and led to the isolation of wheat SBE I-D2 type and SBE I-D4 type cDNA clones, as described in Example 5 and in Rahman et al (1997). The second class was weakly hybridising, and one member of this class was purified. This weakly hybridising clone was termed SBE-9, and on sequencing was found to contain a sequence that was distinct from that for SBE I. This sequence showed greatest homology to maize BE II sequences, and was considered to encode part of the wheat SBE II sequence.

The screening of approximately 5 x 10^5 plaques from a genomic library constructed from T. tauschii (see Example 1) with the SBE-9 sequence led to the isolation of four plaques that were positive. These were designated wSBE II-D1 to wSBE II-D4 respectively, and were purified and analysed by restriction mapping. Although they all had different hybridization patterns with SBE-9, as shown in Figure 12, the results were consistent with the isolation of the same gene in different-sized fragments.

Example 14 Identification of the N-terminal sequence of SBE II Sequencing of the SBE II gene contained in clone 2, termed wSBE II-D1 (see SEQ ID No:10), showed that it coded for the N-terminal sequence of the major isoform of SBE II expressed in the wheat endosperm, as identified by Morell et al (1997). This is shown in Figure 13.

Example 15 Intron-Exon Structure of the SBE II Gene In addition to encoding the N-terminal sequence of SBE II, as shown in Example 10, the cDNA sequence reported by Nair et al (1997) was also found to have 100% sequence identity with part of the sequence of wSBE II-D1. Thus the intron-exon structure can be deduced, and this is shown in Figure 14. The positions of exons and other major structural features of the SBE II gene are summarized in Table 2.

Example 16 Number of SBE II Genes in T. tauschii and Wheat Hybridisation of the SBE II conserved region with T. tauschii DNA revealed the presence of three gene classes.

However, in our screening we only recovered one class.

Hybridisation to wheat DNA indicated that the locus for SBE II was on chromosome 2, with approximately 5 loci in wheat; most of these appear to be on chromosome 2D, as shown in Figure.

Table 2 Positions of structural features in wSBE II-D1.

A. Positions of exons.

Exon number    Genomic start    Genomic finish
 1              1058             1336
 2              1664             1761
 3              2038             2279
 4              2681             2779
 5              2949             2997
 6              3145             3204
 7              3540             3620
 8              3704             3825
 9              4110             4188
10              4818             4939
11              5115             5234
12              6209             6338
13              6427             6549
14              6739             6867
15              7447             7550
16              8392             8536
17              9556             9703
18              9839             9943
19             10120            10193
20             10395            10550
21             10928            11002
22             11092            11475

B. Other structural features within the sequence

Feature                                   Position in wSBE II-D1 DNA
Putative initiation of translation        1214
Mature N-terminal sequence of SBE II      1681
wSBE II-D13                               11116 to 11448
Endosperm box like motif TGAAAAGT         521
Endosperm box like motif TGAAAGT          565
Endosperm box like motif CGAAAAT          669
Endosperm box like motif TAAATGT          768
CAAAAT motif                              784
TCAATT motif                              1108
TATAAA motif                              799
AATTAA motif                              1110

Example 17 Expression of SBE II Investigation of the pattern of expression of SBE II revealed that the gene was only expressed in the endosperm. However, the timing of expression was quite distinct from that of SBE I, as illustrated in Figures 9a, 9b and 9c.

SBE I gene expression is only clearly detectable from the mid-stage of endosperm development (10 days after anthesis in Figure 9b), whereas SBE II gene expression is clearly seen much earlier, in endosperm tissue at 5-8 days after anthesis (Figures 9a and 9c), corresponding to an early stage of endosperm development. The hybridisation of wheat endosperm mRNA with the actin and ribosomal RNA genes is shown as controls (Figures 9f and 9g, respectively).

Example 18 Cloning of Wheat Soluble Starch Synthase cDNA A conserved sequence region, identified by comparison of the nucleotide sequences encoding soluble starch synthases of rice and pea, was used for the synthesis of primers for amplification of SSS I. A 300 bp RT-PCR product was obtained by amplification of cDNA from wheat endosperm at 12 days post anthesis. The 300 bp RT-PCR product was then cloned, and its sequence analysed. Comparison of its sequence with rice SSS cDNA showed about 80% sequence homology. The 300 bp RT-PCR product was 100% homologous to the partial sequence of a wheat SSS I deposited in the database by Block et al (1997).

The 300 bp cDNA fragment of wheat soluble starch synthase thus isolated was used as a probe for the screening of a wheat endosperm cDNA library (Rahman et al, 1997).

Eight cDNA clones were selected. One of the largest cDNA clones (sm2) was used for DNA sequencing analysis, and gave a 2662 bp nucleotide sequence, which is shown in SEQ ID NO:14. A large open reading frame of this cDNA encoded a 647 amino acid polypeptide, starting at nucleotides 247 to 250 and terminating at nucleotides 2198 to 2200. The deduced polypeptide was shown by protein sequence analysis to contain the N-terminal sequence of a 75 kDa granule-bound protein (Rahman et al, 1995). This is illustrated in Figure 16. The location of the 75 kDa protein was determined for both the soluble fraction and the starch granule-bound fraction by the method of Denyer et al (1995). Thus this cDNA clone encoded a polypeptide comprising a 41 amino acid transit peptide and a 606 amino acid mature peptide (SEQ ID NO:12). The cleavage site LRRL was located at amino acids 36 to 39 of the transit peptide of this deduced polypeptide.

Comparison of wheat SSS I with rice SSS and potato SSS showed 87.4% and 75.9% homology respectively at the amino acid level, and 74.7% and 58.1% homology respectively at the nucleotide level. Some amino acids in the N-terminal sequences of the SSS I of wheat and rice were conserved.

Major features of the SSS I gene are summarized in Table 3.

Example 19 Isolation of Genomic Clone of Wheat Soluble Starch Synthase Seven genomic clones were obtained with a 300 bp cDNA probe by screening approximately 5 x 10^5 plaques from a genomic DNA library of Triticum tauschii, as described above. DNA was purified from 5 of these clones and digested with BamHI and SacI. Southern hybridization analysis using the 300 bp cDNA as probe showed that these clones could be classified into two classes, as shown in Figure 17. One genomic clone, sg3, contained a long insert, and was digested with BamHI or SacI and subcloned into pBluescript KS+ vector.

Table 3 Comparison of exons and introns of soluble starch synthase I genes of wheat and rice

Identity of exons of soluble starch synthase I genes of wheat and rice

Exon    wSSI-D1 (bp)    rSSI (bp)    Identity (%)    Start site (wSSI-D1)    Stop site (wSSI-D1)
1a       255             113          57.52            -253                       0
1b       316             298          58.92               1                     316
2        356             356          82.87            1473                    1828
3         78              78          92.31            2746                    2823
4        125             125          90.40            2906                    3028
5         82              82          89.02            4113                    4194
6        174             174          93.10            4286                    4459
7         82              82          93.90            4562                    4643
8         92              92          92.39            4743                    4835
9         63              63          90.48            4959                    5021
10        90              90          82.22            5103                    5192
11       125             125          88.80            8594                    8718
12       109             109          91.74            8807                    8915
13        53              53          81.13            8992                    9044
14        40              41          80.00            9160                    9199
15a      159             113          79.65            9499                    9657
15b      392             539          46.46            9658                   10098

Identity of introns of soluble starch synthase I genes of wheat and rice

Intron  wSSI-D1 (bp)    rSSI (bp)    Identity (%)    Start site (wSSI-D1)    Stop site (wSSI-D1)
1       1156             907          41.05             317                    1472
2        917             851          41.65            1829                    2745
3         82              87          45.12            2824                    2905
4       1084             835          48.50            3029                    4112
5         91              96          57.78            4195                    4285
6        102             189          52.48            4460                    4561
7         99              96          52.08            4644                    4742
8        123             110          45.46            4836                    4958
9         81              78          58.97            5022                    5102
10      3401             663          37.56            5193                    8593
11        88             124          56.82            8719                    8806
12        76              81          48.68            8916                    8991
13       115             135          45.22            9045                    9159
14       299             830          45.80            9200                    9498

Note: Exon 1a: non-coding region of exon 1. Exon 1b: coding region of exon 1.

Exon 15a: coding region of exon 15. Exon 15b: non-coding region of exon 15. wSSI-D1: wheat soluble starch synthase I gene.

rSSI: rice soluble starch synthase I gene.
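As a consistency check on Table 3, the reported intron lengths for wSSI-D1 can be recomputed from the start and stop coordinates. The following Python sketch assumes that both endpoints are counted (length = stop - start + 1), which the specification does not state explicitly; the values are transcribed from the table, and the function and variable names are illustrative only.

```python
# Intron data for wSSI-D1 transcribed from Table 3:
# (intron number, start site, stop site, reported length in bp).
INTRONS = [
    (1, 317, 1472, 1156),
    (2, 1829, 2745, 917),
    (3, 2824, 2905, 82),
    (4, 3029, 4112, 1084),
    (5, 4195, 4285, 91),
    (6, 4460, 4561, 102),
    (7, 4644, 4742, 99),
    (8, 4836, 4958, 123),
    (9, 5022, 5102, 81),
    (10, 5193, 8593, 3401),
    (11, 8719, 8806, 88),
    (12, 8916, 8991, 76),
    (13, 9045, 9159, 115),
    (14, 9200, 9498, 299),
]


def inclusive_length(start, stop):
    """Length of a feature when both the start and stop positions are counted."""
    return stop - start + 1


def check_lengths(features):
    """Return (number, computed, reported) for each row whose lengths disagree."""
    mismatches = []
    for number, start, stop, reported in features:
        computed = inclusive_length(start, stop)
        if computed != reported:
            mismatches.append((number, computed, reported))
    return mismatches


if __name__ == "__main__":
    problems = check_lengths(INTRONS)
    if problems:
        for number, computed, reported in problems:
            print(f"intron {number}: computed {computed} bp, reported {reported} bp")
    else:
        print("all reported intron lengths agree with the coordinates")
```

The same check can be applied to the exon rows by substituting their coordinates and lengths.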

These subclones were analysed by sequencing. The intron/exon structure of the sg3 rice gene is shown in Figure 18. The SSS I gene from T. tauschii is shown in SEQ ID No:13, while the deduced amino acid sequence is shown in SEQ ID NO:14.

Example 20 Northern Hybridization Analysis of the Expression of Genes Encoding Soluble Starch Synthase Total RNAs were purified from leaves, pre-anthesis material, and various stages of developing endosperm at 5-8, 10-15 and 18-22 days post anthesis. Northern hybridization analysis showed that mRNAs encoding wheat SSS I were specifically expressed in developing endosperm.

Expression of these mRNAs in the leaves and pre-anthesis materials could not be detected by northern hybridization analysis under these experimental conditions. Wheat SSS I mRNAs were expressed at high levels from an early stage of endosperm development, 5-8 days post anthesis, and the expression level in endosperm at 10-15 days post anthesis was reduced.

These results are summarized in Figure 9a and Figure 9d.

Example 21 Genomic Localisation of Wheat Soluble Starch Synthase DNA from chromosome engineered lines was digested with the restriction enzyme BamHI and blotted onto supported nitrocellulose membranes. A probe prepared from the 3' end of the cDNA sequence, from positions 2345 to 2548, was used to hybridise to this DNA. The presence of a specific band was shown to be associated with the presence of chromosome 7A (Figure 19). These data demonstrate the location of the SSS I gene on chromosome 7.

Example 22 Isolation of SSS I Promoter We have isolated the promoter that drives this pattern of expression for SSS I. The pattern of expression for SSS I is very similar to that for SBE II: the SSS I gene transcript is detectable from an early stage of endosperm development until the endosperm matures. The sequence of this promoter is given in SEQ ID

Example 23 Isolation of the Gene Encoding Debranching Enzyme from Wheat The sugary-1 mutation in maize results in mature dried kernels that have a glassy and translucent appearance; immature kernels accumulate sucrose and other simple sugars, as well as the water-soluble polysaccharide phytoglycogen (Black et al, 1966). Most data indicate that in sugary-1 mutants the concentration of amylose is increased relative to that of amylopectin. Analysis of a particular sugary-1 mutation (su-1Ref) by James et al (1995) led to the isolation of a cDNA that shared significant sequence identity with bacterial enzymes that hydrolyse the α-1,6-glucosyl linkages of starch, such as an isoamylase from Pseudomonas (Amemura et al, 1988), i.e. bacterial debranching enzymes.

We have now isolated a sequence amplified from wheat endosperm cDNA using the polymerase chain reaction (PCR). This sequence is highly homologous to the sequence for the sugary gene isolated by James et al, (1995). This sequence has been used to isolate homologous cDNA sequences from a wheat endosperm library and genomic sequences from Triticum tauschii.

Comparison of the deduced amino acid sequences of DBE from maize with spinach (Accession SOPULSPO, GenBank database), Pseudomonas (Amemura et al, 1988) and rice (Nakamura et al, 1997) enabled us to deduce sequences which could be useful in wheat. When these sequences were used as PCR amplification primers with wheat genomic DNA, a product of 256 bp was produced. This was sequenced and compared to the sequence of maize sugary isolated by James et al (1995). The results are shown in Figure 20a and Figure. This sequence has been termed wheat debranching enzyme sequence I (WDBE-I).

Use of WDBE-I to investigate a cDNA library constructed from wheat endosperm (Rahman et al, 1997) enabled us to isolate two cDNA clones which hybridised strongly to the WDBE-I probe. The nucleotide sequence of the DNA insert in the longest of these clones is given in SEQ ID No:16.

Use of WDBE-I to investigate a genomic library constructed from T. tauschii, as described above, has led to the isolation of four genomic clones, designated I1, I2, I3 and I4, respectively, which hybridised strongly to the WDBE-I sequence. These clones were shown to contain copies of a single debranching enzyme gene. The sequence of one of these clones, I2, is given in SEQ ID No:17. The intron/exon structure of the gene is shown in Figure 20c. Exons 1 to 4 were identified by comparison with the maize sugary-1 cDNA, while Exons 5 to 18 were identified by comparison with the cDNA sequence given in SEQ ID No:16. The major features of the DBE I gene are summarized in Table 4.

Hybridization of WDBE-I to DNA from T. tauschii indicates one hybridizing fragment (Figure 21a). The chromosomal location of the gene was shown to be on chromosome 7 through hybridisation to nullisomic/tetrasomic lines of the hexaploid wheat cultivar Chinese Spring (Figure 21b).

We have clearly isolated a sequence from the wheat genome that has high identity to the debranching enzyme cDNA of maize characterised by James et al (1995). The isolation of homologous cDNA sequences and genomic sequences enables further characterisation of the debranching enzyme cDNA and promoter sequences from wheat and T. tauschii. These sequences and the WDBE-I sequences shown herein are useful in the manipulation of wheat starch structure through genetic manipulation and in the screening for mutants at the equivalent sugary locus in wheat.

Figure 9e shows that the DBE I gene is expressed during endosperm development in wheat and that the timing of expression is similar to the SBE II and SSS I genes. Figure 9h shows that the full length mRNA for the gene (3.0 kb) is found only in the wheat endosperm.

Example 24 Transient assays of Promoter-GFP Fusions DNA constructs DNA constructs for transient expression assays were prepared by fusing sequences from the BEII and SSI promoters to the gene encoding the Green Fluorescent Protein. Green Fluorescent Protein (GFP) constructs contained the GFP gene described by Sheen et al. (1995). The nos 3' element (Bevan et al., 1983) was inserted 3' of the GFP gene. The plasmid vector (pWGEM_NZfp) was constructed by inserting the NotI to HindIII fragment from the following sequence:

5' GCGGCCGCTC CCTGGCCGAC TTGGCCGAAG CTTGCATGCC TGCAGGTCGA CTCTAGAGGA TCCCCGGGTA CCGAGCTCGA ATTCATCGAT GATATCAGAT CCGGGCCCTC TAGATGCGGC CGCATGCATA AGCTT 3'

into the NotI and HindIII sites of pGem-13Zf(-) vector (Promega). The sequences at the junction of the wSSSIpro1 and wSSSIpro2 and GFP were identical, and included the junction sequence:

5'....CGCGCGCCCA CACCCTGCAG GTCGACTCTA GAGGATCCAT GGTGAGCAAG 3'.

The sequence at the junction of wsbeIIpro1 and GFP was:

5' GCGACTGGCT GACTCAATCA CTACGCGGGG ATCCATGGTG AGCAAGGGCG 3'.

The sequence at the junction of wsbeIIpro2 and GFP was:

5' GGACTCCTCT CGCGCCGTCC TGAGCCGCGG ATCCATGGTG AGCAAGGGCG 3'.

The structures of the constructs are shown in Figures 22a to 22f.
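To illustrate how the NotI and HindIII sites bounding the insert can be located, the following Python sketch scans the polylinker sequence quoted above for the standard recognition sequences of the two enzymes (GCGGCCGC for NotI, AAGCTT for HindIII). The sequence is transcribed from this example with the spacing removed; the constant and function names are illustrative assumptions, and positions are reported as 0-based offsets.

```python
# Polylinker sequence transcribed from Example 24 (5' to 3'), spaces removed.
POLYLINKER = (
    "GCGGCCGCTC" "CCTGGCCGAC" "TTGGCCGAAG" "CTTGCATGCC" "TGCAGGTCGA"
    "CTCTAGAGGA" "TCCCCGGGTA" "CCGAGCTCGA" "ATTCATCGAT" "GATATCAGAT"
    "CCGGGCCCTC" "TAGATGCGGC" "CGCATGCATA" "AGCTT"
)

# Standard recognition sequences of the two enzymes named in the text.
RECOGNITION_SITES = {
    "NotI": "GCGGCCGC",
    "HindIII": "AAGCTT",
}


def find_sites(sequence, site):
    """Return the 0-based start offsets of every occurrence of `site`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping occurrences are not missed.
        start = sequence.find(site, start + 1)
    return positions


if __name__ == "__main__":
    print(f"polylinker length: {len(POLYLINKER)} nt")
    for enzyme, site in RECOGNITION_SITES.items():
        print(enzyme, site, find_sites(POLYLINKER, site))
```

Each enzyme's recognition sequence occurs twice in the quoted sequence, so the sketch only reports positions and does not decide which pair of sites delimits the fragment that was cloned.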

Table 4 Structural features of wDBEI-D1

A. Position of exons

Exon number    Start position    End position    Comments
 1              1890              2241            (deduced by comparison with maize)
 2              2342              2524            (deduced by comparison with maize)
 3              2615              2707            (deduced by comparison with maize)
 4              3016              3168            (deduced by comparison with maize)
 5              3360              3436
 6              4313              4454
 7              4526              4633
 8              4734              4819
 9              5058              5129
10              5202              5328
11              5558              5644
12              6575              6671
13              7507              7661
14              8450              8527
15              8739              8823
16              8902              8981
17              9114              9231
18              Still being sequenced

Note that following nucleotides 3330, 6330 and 8419 there may be short regions of DNA not yet sequenced.

B.

Feature                               Position
CAAAAT motif                          1833
TCAAT motif                           1838
ATAAATAA motif                        1804
Endosperm box like motif TAAAACG      1463

Preparation of target tissue All explants used for transient assay were from the hexaploid wheat cultivar, Milliwang. Endosperm (10-12 days after anthesis), embryos (12-14 days after anthesis) and leaves (the second leaf from the top of plants containing 5 leaves) were used. Developing seed or leaves were collected, surface sterilized with 1.25% w/v sodium hypochlorite for 20 minutes and rinsed with sterile distilled water 8 times. Endosperms or embryos were carefully excised from seed in order to avoid contamination with surrounding tissues. Leaves were cut into 0.5 cm x 1 cm pieces. All tissues were aseptically transferred onto SD1SM medium, which is an MS based medium containing 1 mg/L 2,4-D, 150 mg/L L-asparagine, 0.5 mg/L thiamine, 10 g/L sucrose, 36 g/L sorbitol and 36 g/L mannitol. Each agar plate contained either 12 endosperms, 12 embryos or 2 leaf segments.

Preparation of gold particles and bombardment Five µg of each plasmid was used for the preparation of gold particles, as described by Witrzens et al. (1998). Gold particle-DNA suspension in ethanol (10 µl) was used for each bombardment using a Bio-Rad helium-driven particle delivery system, PDS-1000.

GFP assay The expression of GFP was observed after 36 to 72 hours incubation using a fluorescence microscope. Two plates were bombarded for each construct. The numbers of expressing regions were recorded for each target tissue, and are summarized in Table 5. The intensity of the expression of GFP from each of the promoters was estimated by visual comparison of the light intensity emitted, and is summarized in Table 6.
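Table 5 reports an average and a standard deviation for each bombarded plate. The specification does not state how these were calculated; judging from the tabulated values, they appear to be the arithmetic mean and the sample (n - 1) standard deviation of the per-explant counts. The following Python sketch, which uses the leaf counts recorded for plates 20 and 35 and illustrative names, shows the calculation under that assumption.

```python
from statistics import mean, stdev

# Numbers of GFP-expressing regions per leaf explant, transcribed from Table 5.
PLATE_COUNTS = {
    20: [6, 0, 0, 0, 0, 0],    # leaf, psbeIIpro1gfpNOT
    35: [12, 0, 0, 0, 0, 0],   # leaf, pwsssIpro1gfpNOT
}


def summarise(counts):
    """Return (mean, sample standard deviation), each rounded to one decimal."""
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    return round(mean(counts), 1), round(stdev(counts), 1)


if __name__ == "__main__":
    for plate, counts in PLATE_COUNTS.items():
        average, deviation = summarise(counts)
        print(f"plate {plate}: Ave. {average}, S.D. {deviation}")
```

Under this assumption the sketch gives 1.0 and 2.4 for plate 20 and 2.0 and 4.9 for plate 35, matching the tabulated entries; the large standard deviations relative to the means reflect the uneven distribution of expressing regions between explants.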

The DNA construct containing GFP without a promoter region (pZLGFPNot) gave no evidence of transient expression in embryo (panel l) or leaf (panel r) and extremely weak and sporadic expression in endosperm (panel f); this construct gave only very weak expression in endosperm with respect to the number (Table 5) and intensity (Table 6) of transient expression regions. The constructs pwsssIpro1gfpNOT (panels b, h and n), psbeIIpro1gfpNOT (panels d, j and p) and psbeIIpro2gfpNOT (panels e, k and q) yielded low numbers (Table 5) of strongly (Table 6) expressing regions in leaves, and there was a very uneven distribution of expressing regions between target leaf pieces (Table). pwsssIpro2gfpNOT (panels c, i and o) gave no evidence of transient expression in leaves (Table). These results show that each of the promoter constructs is able to drive the transient expression of GFP in the grain tissues, endosperm and embryo. The ability of the short SSI promoter (pwsssIpro1gfpNOT, containing a 1042 base pair region 5' of the ATG translation start site) to drive expression in leaves (panel n) contrasts with the inability of the long SSI promoter (pwsssIpro2gfpNOT, containing a 3914 base pair region 5' of the ATG translation start site, panel o), suggesting that regions for controlling tissue specificity are located between -3914 and -1042 of the SSI promoter region (SEQ ID).

Table 5 Transient Assay of GFP based constructs

                                      Explant Number
Tissue    Construct         Plate    1    2    3    4    5    6    7    8    9   10   11   12    Ave.   S.D.
Endosperm pact_jsgfg_nos        1    0    0    1  158  152  148    0    2   12  159   95   64    65.9   71.6
Endosperm pact_jsgfg_nos        2    3   13    2   83   18    9    6  188    0  102    5    3    36.0   58.6
Embryo    pact_jsgfg_nos        3   97   79   77  101  121  176   89  129  139  212  131  138   124.1   40.1
Embryo    pact_jsgfg_nos        4   18   39   89   82    7   52   94  147   19   66  106   85    67.0   41.6
Leaf      pact_jsgfg_nos        5    0    2    0    3    0    0                                   0.8    1.3
Leaf      pact_jsgfg_nos        6    0    0    0    1    0    0                                   0.2    0.4
Leaf      pact_jsgfg_nos        7    3    0    0    2    0    3                                   1.3
Endosperm pZLGFPNot             8   13    0    4    0   14    0    0    0    0    0    0    1     2.7    5.2
Endosperm pZLGFPNot             9    0    0    0    0   14    0    0    5    3    4    6    0     2.7    4.2
Embryo    pZLGFPNot            10    0    0    0    0    0    0    0    0    0    0    0    0     0.0    0.0
Embryo    pZLGFPNot            11    0    0    0    0    0    0    0    0    0    0    0    0     0.0    0.0
Leaf      pZLGFPNot            12    0    0    0    0    0    0                                   0.0    0.0
Leaf      pZLGFPNot            13    0    0    0    0    0    0                                   0.0    0.0
Leaf      pZLGFPNot            14    0    0    0    0    0    0                                   0.0    0.0
Endosperm psbeIIpro1gfpNOT     15  111    0   77  142    0  127    7   35   39  191   95   34    71.5   62.3
Endosperm psbeIIpro1gfpNOT     16   21  101    0    0   34  164  102    5   39  125  147  114    71.0   60.6
Embryo    psbeIIpro1gfpNOT     17   23   67   63    4   12   14    9    8   29   19   24   51    26.9   21.7
Embryo    psbeIIpro1gfpNOT     18   92  144   64   36   31   23  106   43   11    1    9    7    47.3   45.4
Leaf      psbeIIpro1gfpNOT     19    0    0    0    0    0    0                                   0.0    0.0
Leaf      psbeIIpro1gfpNOT     20    6    0    0    0    0    0                                   1.0    2.4
Leaf      psbeIIpro1gfpNOT     21    0    0    0    0    3    5                                   1.3    2.2
Endosperm psbeIIpro2gfpNOT     22   12   18    3    0    0   21   13    0   10   11   10    0     8.2    7.4
Endosperm psbeIIpro2gfpNOT     23   24   25   13   68   11    0    0    0    1    0    0    0    11.8   20.1
Embryo    psbeIIpro2gfpNOT     24    9   13    4    7    6   21    0    9    3    5    2    4     6.9    5.7
Embryo    psbeIIpro2gfpNOT     25    5    0    3    5   23    4    3    1    8   12    8   13     7.1    6.4
Leaf      psbeIIpro2gfpNOT     26    0    2    0    0    0    0                                   0.3    0.8
Leaf      psbeIIpro2gfpNOT     27    0    5    0    8    0    0                                   2.2
Leaf      psbeIIpro2gfpNOT     28    0    0    0    0    0    0                                   0.0    0.0
Endosperm pwsssIpro1gfpNOT     29  121    0    0   28    0    4   81   23    0    2    0    2    21.8   39.2
Endosperm pwsssIpro1gfpNOT     30    3    0    0   92   12    0    0  102    4  159   41   24    36.4   52.8
Embryo    pwsssIpro1gfpNOT     31  112  106   74   54   33   73   77   49   42   38   59   46    63.6   25.6
Embryo    pwsssIpro1gfpNOT     32   97   48  110   22  191  112   53    6    9  145    6   10    67.4   62.4
Leaf      pwsssIpro1gfpNOT     33    0    0    0    0    0    0                                   0.0    0.0
Leaf      pwsssIpro1gfpNOT     34    0    0    0    0    0    0                                   0.0    0.0
Leaf      pwsssIpro1gfpNOT     35   12    0    0    0    0    0                                   2.0    4.9
Endosperm pwsssIpro2gfpNOT     36    0    0   18   81    0    0    0    6    0    0    1    0     8.8   23.3
Endosperm pwsssIpro2gfpNOT     37    0   18   14    6   63    8    8   23   79    7   46   51    26.9   26.1
Embryo    pwsssIpro2gfpNOT     38   15    7   14   57    8    3   26   10   47   34   47    0    22.3   19.4
Embryo    pwsssIpro2gfpNOT     39    9   15   48  103   31   22  107   22   27   82   51   63    48.3   33.8
Leaf      pwsssIpro2gfpNOT     40    0    0    0    0    0    0                                   0.0    0.0
Leaf      pwsssIpro2gfpNOT     41    0    0    0    0    0    0                                   0.0    0.0
Leaf      pwsssIpro2gfpNOT     42    0    0    0    0    0    0                                   0.0    0.0

Table 6 Comparison of the Intensities of Transient Expression

Tissue      pact_jsgfg_nos   pwsssIpro1gfpNOT   pwsssIpro2gfpNOT   psbeIIpro1gfpNOT   psbeIIpro2gfpNOT   pZLGFPNot
Endosperm   10               4                  2.5                3.5                1.5
Embryo      10               5.5                5.5                1.5                1                  0
Leaf        10               20                 0                  10                 10                 0

All intensities are relative to pact_jsgfg_nos transient expression in the target tissue. Relative intensities were independently scored by three researchers and averaged.

Example 25 Stable transformation of rice Stable transformation of rice using Agrobacterium was carried out essentially as described by Wang et al. 1997. The plasmids containing the target DNA constructs containing the promoter-reporter gene fusions are shown in Figure 23. These plasmids were transformed into Agrobacterium tumefaciens AGL1 by electroporation, and the bacteria were cultured on selection plates of LB media containing rifampicin (50 mg/L) and spectinomycin (50 mg/L) for 2 to 3 days, and then gently suspended in 10 ml NB liquid medium containing 100 µM acetosyringone and mixed well. Embryogenic rice calli (2 to 3 months old) derived from mature seeds were immersed in the A. tumefaciens AGL1 suspension. After 3-10 minutes the A. tumefaciens AGL1 suspension medium was removed, and the rice calli were transferred to NB medium containing 100 µM acetosyringone for 48 h. The co-cultivated calli were washed 7 times with sterile Milli-Q H2O containing 150 mg/L timentin to remove all Agrobacterium, plated on to NB medium containing 150 mg/L timentin and 30 mg/L hygromycin, and cultured for 3 to 4 weeks. Newly-formed buds on the surface of rice calli were excised and plated onto NB Second Selection medium containing 150 mg/L timentin and 50 mg/L hygromycin. After 4 weeks of proliferation calli were plated onto NB Pre-Regeneration medium containing 150 mg/L timentin and 50 mg/L hygromycin, and cultured for 2 weeks. The calli were then transferred on to NB-Regeneration medium containing 150 mg/L timentin and 50 mg/L hygromycin for 3 to 4 weeks. Once shooting occurs, shoots are transferred onto rooting medium (MS) containing 50 mg/L hygromycin. Once adequate root formation occurs, the seedlings are transferred to soil, grown in a misting chamber for 1-2 weeks, and grown to maturity in a containment glasshouse.

Example 26 Use of probes from SSS I, SBE I, SBE II and DBE sequences to identify null or altered alleles for use in breeding programmes DNA primer sets were designed to enable amplification of the first 9 introns of the SBE II gene using PCR. The design of the primer sets is illustrated in Figure 24. Primers were based on the wSBE II-D1 sequence (deduced from Figure 13b and Nair et al, 1997; Accession No. Y11282) and were designed such that intron sequences in the wSBE II sequence were amplified by PCR. These primer sets individually amplify the first 9 introns of SBE II. One primer (sr913F) contained a fluorescent label at the 5' end.

Following amplification, the products were digested with the restriction enzyme DdeI and analysed using an ABI 377 DNA Sequencer with Genescan™ fragment analysis software. One primer set, for intron 5, was found to amplify products from each of chromosomes 2A, 2B and 2D of wheat. This is shown in Figure 25, which illustrates results obtained with various wheat lines, and demonstrates that products from each of the wheat genomes from diverse wheats were amplified, and that therefore lines lacking the wSBEII gene on a specific chromosome could be readily identified. Lane (iii) illustrates the identification of the absence of the A genome wSBEII gene from the hexaploid wheat cultivar Chinese Spring ditelosomic line 2AS.

Figure 26 compares results of amplification with an Intron 10 primer set for various nullisomic/tetrasomic lines of the hexaploid wheat Chinese Spring. Fluorescent dUTP deoxynucleotides were included in the amplification reaction. Following amplification, the products were digested with the restriction enzyme DdeI and analysed using an ABI 377 DNA Sequencer with Genescan™ fragment analysis software. In lane (i), Chinese Spring ditelosomic line 2AS, a 300 base product is absent; in lane (ii), N2BT2A, a 204 base product is absent; and in lane (iii), N2DT2B, a 191 base product is absent. These results demonstrate that the absence of specific wSBEII genes on each of the wheat chromosomes can be detected by this assay. Lines lacking wSBEII forms can be used as parental lines in breeding programmes for the generation of new lines in which expression of SBE II is diminished or abolished, with a consequent increase in the amylose content of the wheat grain. Thus a high amylose wheat can be produced.
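The scoring logic of the Intron 10 assay can be sketched in Python as follows. The product sizes and their genome assignments (300 bases for 2A, 204 bases for 2B, 191 bases for 2D) are taken from the description of Figure 26 above; the size tolerance, the function name and the input format are hypothetical and would depend on the fragment analysis output actually used.

```python
# Expected DdeI-digested Intron 10 products, as described for Figure 26.
EXPECTED_PRODUCTS = {"A": 300, "B": 204, "D": 191}


def missing_genomes(observed_sizes, tolerance=2):
    """Return the genomes whose expected product is absent from a lane.

    observed_sizes: fragment sizes (in bases) called for one wheat line.
    tolerance: assumed sizing error in bases (a hypothetical value).
    """
    missing = []
    for genome, expected in EXPECTED_PRODUCTS.items():
        found = any(abs(size - expected) <= tolerance for size in observed_sizes)
        if not found:
            missing.append(genome)
    return missing


if __name__ == "__main__":
    # Hypothetical fragment calls for three lines, each lacking one product.
    lanes = {
        "ditelosomic 2AS": [204, 191],
        "N2BT2A": [300, 191],
        "N2DT2B": [300, 204],
    }
    for line, sizes in lanes.items():
        print(line, "-> missing genome product(s):", missing_genomes(sizes))
```

A line returning an empty list carries all three products, whereas a candidate null line for one genome returns that genome's letter and could then be confirmed by an independent method before use as a breeding parent.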

Table 7 shows examples of primer pairs for SBE I, SSS I and DBE I which can identify genes from individual wheat genomes and could therefore be used to identify lines containing null or altered alleles. Such tests could be used to enable the development of wheat lines carrying null mutations in each of the genomes for a specific gene (for example SBE I, SSI or DBE I) or combinations of null alleles for different genes.

Table 7 PCR Primers for Starch Biosynthesis Genes

Gene    Forward primer   Forward primer sequence           Reverse primer   Reverse primer sequence           Temp (°C)   Product (bp)
SBE I   ZLE1 5d          GGC GGC GGC AAT GTG CGG CTG AG    ZLBE1            CCA GAT CGT ATA TCG GAA GGT CG    57.3        A = 625, B = 600, D = 550
SSS I   sssE01F          GAA CTC GCG CCC GAC CTC CT        ZLSg7            AGC CAC GAT TAT GCT GTC GAT GG    55.0        A = 450, B = 450, 630
SSS I   sssE14F          TTC TCA CCG CTA ACC GTG GAC       ZLSm19           GTC TAC ATG ACG TAG GGT TGG TC    55.8        B = 400, D = 500, no A product
DBE I   DBEE17F          TGG TCT GAG AAT AGC CGA TTC       sr1536F          AAGGCCACATAGATCTCG                56.8        B = 190, D = 190, A = 160; nonspecific product 220 bp

Temp: annealing temperature; bp = length of the product in base pairs

It will be apparent to the person skilled in the art that while the invention has been described in some detail for the purposes of clarity and understanding, various modifications and alterations to the embodiments and methods described herein may be made without departing from the scope of the inventive concept disclosed in this specification.

References cited herein are listed on the following pages, and are incorporated herein by this reference.

REFERENCES

Ainsworth, Clark, J. and Balsdon, J.
Plant Molecular Biology, 1993 22 67-82

Amemura, Chakraborty, Fujita, Noumi, T. and Futai, M.
J. Biol. Chem., 1988 263 9271-9275

Baba, Kimura, Mizuno, Etoh, Ishida, Y., Shida, O. and Arai, Y.
Biochem. Biophys. Res. Commun., 1991 181 87-94.

Baba,T.; Nishihara,M.; Mizuno,K.; Kawasaki,T.; Shimada,H.; Kobayashi,E.; Ohnishi,S.; Tanaka,K.; Arai,Y.
Plant Physiol, 1993, 103 565-573.

Ball,S.; Guan,H.; James,M.; Myers,A.; Keeling,P.; Mouille,G.; Buléon,A.; Colonna,P.; Preiss,J.
Cell, 1996, 86 349-352

Bevan, Barnes, and Chilton, M.
Nucleic Acids Research, 1983, 11 369-385

Black, Loerch, McArdle, F.J. and Creech, R.G.
Genetics, 1966 53 661-668

Block, Loerz, Lutticke, S.
Genbank database Accession number U48227

Burton, Bewley, Smith, Bhattacharyya, M.K., Tatge, Ring, S., Bull, Hamilton, W.D.O. and Martin, C.
The Plant Journal, 1995 7 3-15.

Cangiano, La Volpe, Paulsen, P. and Kreiberg, J.D.
Plant Physiology, 1993 102 1053-1054.

Clarke, Mukai, Y. and Appels, R.
Chromosoma, 1996 105 269-275

Devereaux, Haeberli, P. and Smithies, O.
Nucleic Acids Res., 1984 12, 387-395.

Denyer, Hylton, Jenner, C.F. and Smith, A.M.
Planta, 1995 196 256-265

Doherty, Lindeman, Trent, Graham, M.W. and Woodcock, D.M.
Gene, 1992 124 113-120

Dry, Smith, Edwards, Bhattacharyya, Dunn, Martin, C.
Plant J 1992, 2 193-202

Edwards, Marshall, Sidebottom, Visser, R.G.F., Smith, Martin, C.
Plant J, 1995 8 283-294

Fisher, Boyer, C.D. and Hannah, L.C.
Plant Physiology, 1993 102 1045-1046

Forde, Heyworth, Pywell, J. and Forde, M.
Nucleic Acids Research, 1985 13 7327-7339

Gill, B.S. and Appels, R.
Plant Syst. Evol., 1988 160 77-90.

Higgins, Zwar, Jacobsen, J.V. (1976)
Nature, 1976, 260 166-168

Khandjian, E.W.
Bio/Technology, 1987, 5 165-167

Jahne, Lazzeri, Jager-Gussen, M. and Lorz, H.
Theor. Appl. Genet., 1991 82 47-80

James, Robertson, D.S. and Myers, A.M.
Plant Cell, 1995 7 417-429

Jolly, Glenn, G.M. and Rahman, S.
Proc. Natl Acad. Sci., 1996 93 2408-2413.

Kawasaki, Mizuno, Baba, T. and Shimada, H.
Molec. Gen. Genet., 1993 237 10-16.

Lagudah, Appels, R. and McNeill, D.
Genome, 1991 34 387-395

Lazzeri, Brettschneider, Luhrs, R. and Lorz, H.
Theor. Appl. Genet., 1991 81 437-444

Maniatis, Fritsch, E.F. and Sambrook, J.
Molecular cloning. A Laboratory Manual., New York. Cold Spring Harbor Laboratory, 1982

Marshall,J.; Sidebottom,C.; Debet,M.; Martin,C.; Smith,A.M.; Edwards,A.
The Plant Cell, 1996 8 1121-1135

Martin, C. and Smith, A.
The Plant Cell, 1995 7 971-985.

McElroy, Blowers, Jenes, Wu R.
Mol. Gen. Genet., 1991 231 150-160.

Mizuno, Kawasaki, Shimada, Satoh, Kobayashi, Okumura, Arai, Y. and Baba, T.
J. Biol. Chem., 1993 268 19084-19091.

Muller,M.; Knudsen,S.
Plant J, 1993, 4 343-355

Morell, Blennow, Kosar-Hashemi, B. and Samuel, M.S.
Plant Physiol., 1997 113 201-208.

Morell, Rahman, Abrahams, S.L. and Appels, R.
Aust. J. of Plant Physiol., 1995 22 647-660.

Nair, Baga, Scoles, Kartha, K. and Chibbar, R.
Plant Science, 1997 122 153-163

Nakamura,Y.; Kubo,A.; Shimamune,T.; Matsuda,T.; Harada,K.; Satoh,H.
Plant J, 1997, 12 143-153

Nakamura, Yamamori, Hirano, Hidaka, S. and Nagamine, T.
Molecular and General Genetics, 1995 248 253-259

Nakamura, Takeichi, Kawaguchi, K. and Yamanouchi, H.
Physiologia Plantarum, 1992 84 329-335.

Nakamura, Umemoto, T. and Sasaki, T.
Planta, 1996 199 209-214

Rahman, Kosar-Hashemi, Samuel, Hill, Abbott, Skerritt, Preiss, Appels, R. and Morell, M.
Aust. J. Plant Physiol., 1995 22 793-803.

Rahman, Abrahams, Mukai, Abbott, Samuel, M., Morell, M. and Appels, R.
Genome, 1997 40 465-474

Repellin, Nair, Baga, M. and Chibbar, R.N.
Plant Gene Register PGR97-094 (1997)

Sambrook, Fritsch, E.F. and Maniatis, T.
Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 2nd ed 1989)

Sheen, Hwang, Niwa, Kobayashi, and Galbraith, D.W.
The Plant Journal, 1995 8 777-784

Svensson, B.
Plant Mol. Biol., 1994 25 141-157.

Tanaka, Ohnishi, Kishimoto, Kawasaki, Baba, T.
Plant Physiol 1995, 108 677-683

Tingay, McElroy, Kalla, Fieg, Wang, M., Thornton, S. and Brettell, R.
The Plant Journal, 1997 11 1369-1376

Wan, Y. and Lemaux, P.G.
Plant Physiology, 1994 104 37-48

Wang, Upadhyaya, Brettell, and Waterhouse, P.M.
Journal of Genetics and Breeding, 1997 51 325-334.

SEQUENCE LISTING

GENERAL INFORMATION:

APPLICANT:
NAME: COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH ORGANISATION
STREET: Limestone Avenue
CITY: Campbell
STATE: ACT
COUNTRY: AUSTRALIA
POSTAL CODE (ZIP): 2612

NAME: THE AUSTRALIAN NATIONAL UNIVERSITY
STREET: BRIAN LEWIS CRESCENT
CITY: ACTON
STATE: ACT
COUNTRY: AUSTRALIA
POSTAL CODE (ZIP): 2601

NAME: GOODMAN FIELDER LIMITED
STREET: LEVEL 42, GROSVENOR PLACE
CITY: SYDNEY
STATE: NSW
COUNTRY: AUSTRALIA
POSTAL CODE (ZIP): 2000

NAME: GROUPE LIMAGRAIN PACIFIC PTY LIMITED
STREET: LEVEL 31, 1 O'CONNELL STREET
CITY: SYDNEY
STATE: NSW
COUNTRY: AUSTRALIA
POSTAL CODE (ZIP): 2000

(ii) TITLE OF INVENTION: REGULATION OF GENE EXPRESSION IN PLANTS
(iii) NUMBER OF SEQUENCES: 17
(iv) COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30 (EPO)

INFORMATION FOR SEQ ID NO: 1:
SEQUENCE CHARACTERISTICS:
LENGTH: 17 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
DESCRIPTION: /desc "pcr primer based on the N-terminal sequence of wSBE I 5' end at position 168 of SEQ ID
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GGCACGCGAG AGACTGG 17

INFORMATION FOR SEQ ID NO: 2:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
DESCRIPTION: /desc "pcr primer in which 5' end is at position 1590 of SEQ ID
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
TACATTTCCT TGTCCATCA 19

INFORMATION FOR SEQ ID NO: 3:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
DESCRIPTION: /desc "pcr primer 5' end is at position 1 of SEQ ID
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
ATCACGAGAG CTFTGCTCA 18

INFORMATION FOR SEQ ID NO: 4:
SEQUENCE CHARACTERISTICS:
LENGTH: 23 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
DESCRIPTION: /desc "pcr primer 5' end is at position 334 of SEQ ID
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CGGTACACAG TTGCGTCATT TTC 23

INFORMATION FOR SEQ ID NO:
SEQUENCE CHARACTERISTICS:
LENGTH: 2687 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:
ATCGACGAAG ATGCTCTGCC TCACCGCCCC CTCCTGCTCG CCATCTCTCC CGCCGCGCCC
CTCCCGTCCC GCTGCTGACC GGCCCGGACC GGGGATTTCG GCCAAGAGCA AGTTCTCTGT
TCCCGTGTCT GCGCCAAGAG ACTACACCAT GGCAACAGCT GAAGATGGTG TTGGCGACCT
TCCGATATAC GATCTGGATC CGAAGTTTGC CGGCTTCAAG GAACACTTCA GTTATAGGAT
GAAAAAGTAC CTTGACCAGA AACATTCGAT TGAGAAGCAC GAGGGAGGCC TTGAAGAGTT
CTCTAAAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA
ATGGGCCCCT

GCAGCAATGG

TGGGCACAGG ATGACAAAGG TGGGAAACCT GCCATCCCCC ACTATGGGTC

GATCGGGTTC

TGGAGCTCCA TATGACGGTG GCATCCTCGG CCTCGAAAGC TGGTGAGAGG CCTGAAGTAA AAACGCAAAC AACTACAACA CTTCTTTTGG TACCATGTGA GGACCTCAAA

TATCTTGTTG

TGTCCATAGC

CATGCGAGCA

AAACACACAG

GAGTCCTATT

TCGCCTGTTC

AACTATGCCA

TTGGATGGAC GAATTCATGT TAATCACCAT

GGTATCAATA

TACCGATGTA

GATGCAGTTG

GCCAGAAGCA

ACTGTTGTTG

TGATGAAGGT GGAGTAGGGT TGACTACTTG AAGAACAAAG GACCAACAGG

AGATATACGG

TGTTGGCGAC

AAGACTATGG

AGACTTGCAG

CCTGCTTCAC

CTTCATCACC

ATGGCCCTTG

CCACCCAGAA TGGATTGACT ACGCCAGTGG AGCCTCTCAG TCAAGCAATG

AATGCGCTCG

CAGCGACATG AATGAGGAAA CTTCAATTTT CATCCCAGTA GAAGTACAAG GTAGCTCTGG CCAGTACAAC GATCACTTCA CAACAACCGC CCTAATTCAT TCGCGTCGAG

GAAAAAGCGG

TGCTCCTGGG TACATCGATG

ATGCACAACT

ATAATTATGG

ATAATTCCA

CTGCATGGAT

TTCACTGGGA

CTGACGCTCC

GCACATACAG

CAGTTCAGCT

CGAATTTCTT

ACAAGGCACA

GTAATATGAC

TCCATACAGG

ATTGGGAGGT

TTGACGGCTT

TGTCATTCGC

TTTACATGAT

CAGAAGATGT

TTGACTATCG

ATGACCTTGA

AAAAGTGCAT

CATTTCTCTT

CTACAATTGA

GAGGTGATGG

TTCCAAGAGA

ACATTGATCA

ACGACAAGTT

AGAAGATTAT

AAACTTATGA

ACTCCGATGC

CGTCACCTGA

TCAAAGTCCT

AAAAGCCTAA

TTGAAGCCAC

TATTGGTGAC

TGTTTGGTCA

GGTTAAATTT

TCGTTATGCA

TCCACCTTCT

ACGTATTTAC

AGAATTTGCA

GATGGCAATC

CGCAGTTAGC

TAGCTTAGGG

AGATGGTCTA

AGAAAGGGGT

CTTACGGTAT

CCGATTTGAT

TGGAAATTAC

GCTTGCGAAC

TTCAGGCATG

CCTTGCTATG

ATGGTCAATG

TGCATATGCT

GATGGACAAG

TCGTGGAATT

CTACTTGAAT

AGGCAACAAC

CCTACGATAC

TTCCTTCCTA

TGTATTTGAA

TGGTTACAAA

TCTGATGTTT

AGGAGTACCA

GTCTCCACCC

GGATGAAGGA

TCGTGTCAAA

TTCAACAACT

ATCAGGATTT

CGATTTCACC

ACTTTTGACG

GGTGAAAGGT

GAGGCTCATG

GACAATGTGT

ATGGAACATT

AGCAGATCAG

TTGCGTGTTC

AATGGCTATG

TATCATAAAC

CTTCTTTCTA

GGAGTAACAT

AAGGAATATT

CATTTAATGC

CCAGTGCTTT

GCTATTCCTG

AGTGCAATAG

GAGAGCCACG

GAAATGTATA

GCACTTCAAA

TTTATGGGTA

TGGAGTTATG

AAGTACATGA

TCGTCATCAA

CGTGGAGATC

GTCGGATGTG

GGTGGACATG

GGAGTACCTG

CGCACTTGTG

GCTGCTTCTT

GACGCAGCAG

GGAATGGCTC

CCCATGTCAA

GTGGAGATGG

CCTCTAAATT

ATGTGTTTAA

TGGGGATGAG

TACCGCGCAT

CCATATTATG

GAACACCAGA

TGATGGATGT

ATGTTGGACA

TGTGGGATAG

ATCTGAGATA

CCATGCTATA

TTGGTTTGGA

ACAAAATCTT

GTCGGTCAGT

ATAGATGGAT

CACATACTCT

ATCAGTCTAT

CTGGCATGTC

AGATGATTCA

ATGAGTTTGG

ATAAATGCAG

ACGCATTTGA

AGCAGATTGT

TGGTCTTCGT

ATTTGCCTGG

GAAGAGTGGC

AAACAAACTT

TGGCTTACTA

GGGGCAAAGC

ATGGTGAGGC

420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340
GACTTCTGGT TCCAAAAAGG CGTCTACAGG AGGTGACTCC AGCAAGAAGG GAATTAACTT 2400
TGTCTTCGGG TCACCTGACA AAGATAACAA ATAAGCACCA TATCAACGCT TGATCAGAAC 2460
CGTGTACCGA CGTCCTTGTA ATATTCCTGC TATTGCTAGT AGTAGCAATA CTGTCAAACT 2520
GTGCAGACTT GAGATTCTGG CTTGGACTTT GCTGAGGTTA CCTACTATAT AGAAAGATAA 2580
ATAAGAGGTG ATGGTGCGGG TCGAGTCCGG CTATATGTGC CAAATATGCG CCATCCCGAG 2640
TCCTCTGTCA TAAAGGAAGT TTCGGGCTTT CAGCCCAGAA TAAAAAA 2687

INFORMATION FOR SEQ ID NO: 6:
SEQUENCE CHARACTERISTICS:
LENGTH: 807 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: Protein
LOCATION: 1..807
OTHER INFORMATION:/label= sbeI /note= "deduced amino acid sequence from SEQ ID
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
Met 1 Pro Leu Cys Leu Thr Ala Pro Ser Cys Ser Gly Pro Ser Leu Pro Pro Arg Ser Arg Pro 20 Lys Phe Ser Ala Ala Asp Arg Pro 25 Ala Pro Gly Ile Ser Val Pro Val Thr Ala Glu Ser 40 Asp Pro Arg Asp Tyr Asp Ser Ala Lys Thr Met Ala Leu Asp Pro Asp Gly Val Leu Pro Ile Lys Phe Ala Gly Phe Lys 70 Glu His Phe Ser Tyr Arg Met Lys Lys Leu Asp Gln Lys Ser Ile Glu Lys His Glu Gly Gly Leu Glu Glu Phe Ser Lys Thr Val Tyr 115 Gly 100 Tyr Leu Lys Phe Gly 105 Ala Ile Asn Thr Glu Asn Asp Ala 110 Gln Leu Ile Arg Glu Trp Ala Pro 120 Ala Met Asp Ala 125
Gly Asp Phe Asn Asn Trp Asn Gly Ser Gly His Arg Met Thr Lys Asp 130 135 140
Asn Tyr Gly Val Trp Ser Ile Arg Ile Ser His Val Asn Gly Lys Pro 145 150 155 160
Ala Ile Pro His Asn Ser Lys Val Lys Phe Arg Phe His Arg Gly Asp 165 170 175
Gly Leu Trp Val Asp Arg Val Pro Ala Trp Ile Arg Tyr Ala Thr Phe 180 185 190
Asp Ala Ser Lys Phe Gly Ala Pro Tyr Asp Gly Val His Trp Asp Pro 195 200 205
Pro Ser Gly Glu Arg Tyr Val Phe Lys His Pro Arg Pro Arg Lys Pro 210 215 220
Asp Ala Pro Arg Ile Tyr Glu Ala His Val Gly Met Ser Gly Glu Arg 225 230 235 240
Pro Glu Val Ser Thr Tyr Arg Glu Phe Ala Asp Asn Val Leu Pro Arg 245 250 255
Ile Lys Ala Asn Asn Tyr Asn Thr Val Gln Leu Met Ala Ile Met Glu 260 265 270
His Ser Ile Leu Cys Phe Phe Trp Tyr His Val Thr Asn Phe Phe Ala 275 280 285
Val Ser Ser Arg Ser Gly Thr Pro Glu Asp Leu Lys Tyr Leu Val Asp 290 295 300
Lys Ala His Ser Leu Gly Leu Arg Val Leu Met Asp Val Val His Ser 305 310 315 320
His Ala Ser Ser Asn Met Thr Asp Gly Leu Asn Gly Tyr Asp Val Gly 325 330 335
Gln Asn Thr Gln Glu Ser Tyr Phe His Thr Gly Glu Arg Gly Tyr His 340 345 350
Lys Leu Trp Asp Ser Arg Leu Phe Asn Tyr Ala Asn Trp Glu Val Leu 355 360 365
Arg Tyr Leu Leu Ser Asn Leu Arg Tyr Trp Met Asp Glu Phe Met Phe 370 375 380
Asp Gly Phe Arg Phe Asp Gly Val Thr Ser Met Leu Tyr Asn His His 385 390 395 400
Gly Ile Asn Met Ser Phe Ala Gly Asn Tyr Lys Glu Tyr Phe Gly Leu 405 410 415
Asp Thr Asp Val Asp Ala Val Val Tyr Met Met Leu Ala Asn His Leu 420 425 430
Met His Lys Ile Leu Pro Glu Ala Thr Val Val Ala Glu Asp Val Ser 435 440 445
Gly Met Pro Val Leu Cys Arg Ser Val Asp Glu Gly Gly Val Gly Phe 450 455 460
Asp Tyr Arg Leu Ala Met Ala Ile Pro Asp Arg Trp Ile Asp Tyr Leu 465 470 475 480
Lys Asn Lys Asp Asp Leu Glu Trp Ser Met Ser Ala Ile Ala His Thr 485 490 495
Leu Thr Asn Arg Arg Tyr Thr Glu Lys Cys Ile Ala Tyr Ala Glu Ser 500 505 510
His Asp Gln Ser Ile Val Gly Asp Lys Thr Met Ala Phe Leu Leu Met 515 520 525
Asp Lys Glu Met Tyr Thr Gly Met Ser Asp Leu Gln Pro Ala Ser Pro 530 535 540
Thr Ile Asp Arg Gly Ile Ala Leu Gln Lys Met Ile His Phe Ile Thr 545 550 555 560
Met Ala Leu Gly Gly Asp Gly Tyr Leu Asn Phe Met Gly Asn Glu Phe 565 570 575
Gly His Pro Glu Trp Ile Asp Phe Pro Arg Glu Gly Asn Asn Trp Ser 580 585 590
Tyr Asp Lys Cys Arg Arg Gln Trp Ser Leu Ser Asp Ile Asp His Leu 595 600 605
Arg Tyr Lys Tyr Met Asn Ala Phe Asp Gln Ala Met Asn Ala Leu Asp 610 615 620
Asp Lys Phe Ser Phe Leu Ser Ser Ser Lys Gln Ile Val Ser Asp Met 625 630 635 640
Asn Glu Glu Lys Lys Ile Ile Val Phe Glu Arg Gly Asp Leu Val Phe 645 650 655
Val Phe Asn Phe His Pro Ser Lys Thr Tyr Asp Gly Tyr Lys Val Gly 660 665 670
Cys Asp Leu Pro Gly Lys Tyr Lys Val Ala Leu Asp Ser Asp Ala Leu 675 680 685
Met Phe Gly Gly His Gly Arg Val Ala Gln Tyr Asn Asp His Phe Thr 690 695 700
Ser Pro Glu Gly Val Pro Gly Val Pro Glu Thr Asn Phe Asn Asn Arg 705 710 715 720
Pro Asn Ser Phe Lys Val Leu Ser Pro Pro Arg Thr Cys Val Ala Tyr 725 730 735
Tyr Arg Val Glu Glu Lys Ala Glu Lys Pro Lys Asp Glu Gly Ala Ala 740 745 750
Ser Trp Gly Lys Ala Ala Pro Gly Tyr Ile Asp Val Glu Ala Thr Arg 755 760 765
Val Lys Asp Ala Ala Asp Gly Glu Ala Thr Ser Gly Ser Lys Lys Ala 770 775 780
Ser Thr Gly Gly Asp Ser Ser Lys Lys Gly Ile Asn Phe Val Phe Gly 785 790 795 800
Ser Pro Asp Lys Asp Asn Lys 805

INFORMATION FOR SEQ ID NO: 7:
SEQUENCE CHARACTERISTICS:
LENGTH: 319 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: misc_signal
LOCATION: 1..319
OTHER INFORMATION:/function= "untranslated region of wSBE I-D34 cDNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GCGACTTCTG GTTCCAAAAA GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC
TTTGTCTTCG GGTCACCTGA CAAAGATAAC AAATAAGCAC CATATCAACG CTTGATCAGA 120
ACCGTGTACC GACGTCCTTG TAATATTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA 180
CTGTGCAGAC TTGAGATTCT GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT 240
AAATAAGAGG TGATGGTGCG GGTCGAGTCC GGCTATATGT GCCAAATATG CGCCATCCCG 300
AGTCCTCTGT CATAAAGGA 319

INFORMATION FOR SEQ ID NO: 8:
SEQUENCE CHARACTERISTICS:
LENGTH: 4890 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: promoter
LOCATION: 1..4890
OTHER INFORMATION:/function= "promoter containing sequence of SBE I"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GGGTGGCGGG TCGGGCGGCA

CGGGCGGCAG

GGCTGCGGCT

GGTTGACTTT

GGACTCCAAA

AACATTTATT

AAAATCAAAT

TTCATTATTT

CGGAGAGAAA

TGAAAATAAC

ATCACAAATC

ATTGAAAATT

AATGTAGAAC

TCGTGAACTT

GAAATAACTT

AGTGCCTATG

CTAGTAATGC

TTTTCAAATG

ATGGAAAGTT

TGTCAAAAGA

CCTGATCCTG

CGGTAGATCT

GATTTATGTG

TTTGGACGCG

AATGCCATAG

GAAGAGTGAA

TGCTTCCCTT

AGCCCTTCAT

CGGCGGCTAG

TTAAAGGCCG

AAAAAATAAT

AATCCCGAAG

TGGGCCTAAA

AAAATCCAAA

GTGAAGAAGT

AATAATTAAA

GAAATCCCCA

AAAATGCAAT

TGGGATGTTA

TCTATTTTGT

GAATCAAACC

GTGTTACATA

GTAATATCAA

AATGCAATGC

TATGAGTTGC

CTTCTCTTTT

TAATACCTCA

AAATAAAATC

TGTGGTAGTA

AAGTACATCG

CCTTGTGGGA

CTAGCAAGTA

GTGCTTCAAC

GTCGTAATCC

TGGATTCCCC

AGGCGCGGGG

GGTTTCGCGG

GCCAGGCTGA

AATTCGGACA

TAAATTTTTC

ATGCAATTTT

TAAAATCAAA

CATTTTATCC

ACAAATGATC

ACTCTCTCCG

AAAATATGAT

CATATAACTC

TTTGAAATTG

TTTAAATAAA

GATTTATTAC

TAAATATCTT

ATGCTAAAAG

CAACAAGTGG

TACATGGTTT

TAATACAATT

AGAAGATTTG

CTCATGATGT

TACACCAGAC

GTGTAAAGTA

GGCCTAGTTA

TAAAGGTTAG

TGTGCATTTG

TGGATGTCTT

CGGCGGGGCG

CGGCGGCGAC

GGTGTCCGGG

TGCAAAAAAG

CCCATTCTTA

GAAAAATGCG

TATTTGTTTT

CATCTCATAT

CTATTTTCAA

TGGGTCCTTG

ATGCATGATG

AAATTCTATA

TATTATTTTT

ACAAAGCATA

AATAGCGTTG

GATAGATGTT

AATAGAACCT

CATACTTGGC

AGATTCCAGC

CCACTAAAGT

GTGTCATCAT

AAAATTATCA

ATAGTTGACA

CTACCATGTA

AGGAAATTCT

ACCCACTTAA

TAGGTCCCTC

TTTGTTACAT

GCCGGGGCGG

TTGGGCTGAG

TCGGACACGG

TAAGAAAAGA

AAAATAAGCC

TATTTTTCCT

TAATATTTTT

ATTTTGATAT

AATTTGAGAA

AGTTGCGTGA

ATCTAATGTA

ATTATGAACA

TAGAATTAGT

AAAATGACAA

TATGTGTGTA

TCTACAATTC

TAGTTTCATT

ACTGTTTGTT

ATGTAGCCAC

CACCTAGCCC

CATGACAACA

AGAGGGAGAG

CATCGATTTT

TTAGAAGAGG

TCCTTAGATC

AAAATGTCAC

GGATCTGAGC

TTTATTGAAG

CGCGGCGGCG

GCGGGGCACG

CCCGTAAGGC

AATAATAAAC

GGACAAGATG

AATTCGGAAT

CCTCCAATAT

GAAATATTTT

AACCCAAATA

AATTTCTAGG

TAACATTCCA

CAGAAATATT

CTAGAGCATT

ATTCACATAT

TGTGTGCGTG

ACGGGTCTAA

TAACTAACAA

TGTTCATTTT

AAAATATGAT

AAGTGACCGA

AATTATTAGG

AATGTATGGA

TTAAGATACA

TGAAATGAGA

CCCTTCTCCC

TTTGAATCTT

CCTTTCTCCA

TGAGAGTGAA

120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680
TTATTATATG CCCATAGGAG GTGGGATATA AAGGCTGTTG GTATTCTGCA CCATACATGC 1740
TAGAGTAGGG AGGAGAGGCT GGTGCATGAT

TCCCCCACCC

TAGCCCATTC

TGACTTGAAT

CTCACAACAT

CTTTTATGCT

CCATGCGCTC

TCTATCCTCC

AGGCAAGTAA

GGCATTTTTG

ATTTAGGGAG

GTTGTTTCGC

CTTCCTCCTT

TGCACAGATA

TGCGCCCACT

CTCCGACTAT

GGCATCACCT

ACACAACTAC

TCCTCCCACG

CGAGGCTGAT

TCATACATGT

ATCCTGAGGA

AGTACGTCCT

CCCCTACGCC

GGCGAAGGTA

AAAATGTCGC

AGCAATATAA

CCATAAATTG

ATGAAAGCCG

TACTCCATAT

GCTTTTAGCG

TTCCCCTTTT

AAATTTATAT

TTTTTCTATT

ACTAACAAGT

TTCATCATGG

TTGACAATGT

TAACTTAAAA

AAAAGTTATT

TCTCATGTTT

TTGTTTCTGG

GAGCAACTCT

GCCCAAAATG

CTCGCTCCAC

ACTCACTCCA

GCGCCAACGC

ATCACCGACG

CCCCTCAAAT

AAAATCTCAA

CCCCAAGACA

TGGTAAACCG

ATGGTAGGAT

ATGTCGGCTC

TAACGAGGTC

CCCGTTCGAT

CTAGGAGTTC

TTCCTCGACG

CTGCAGGTAG

TCAACGACTT

GTTTATCACA

AACATACAAA

CATATCGCGA

GACATAGGAC

CCTTTCGTGC

TTCATTTCTT

ACCATTTTTC

ATTAATTTGT

TTTTTTTATT

ACTTATTAAT

GCCTCATATA

AGGATTATTT

CAAACATAGA

ACTCTAGAAA

GAATGAGTCG

AGCAGAGTCG

GCACTTCAGA

AAACAAGCTT

TATCCAACCG

CGGGATTTTA

AGTGGGGTGA

TCATGAGGCA

CGGCATCACC

TCTCCTCCCC

CATACCCAAT

ATTCTCCTCC

CCATGATGGC

ATCCCCATTG

GTCGCAATGC

CGCCCCGCAA

ATCTCTCTTA

TAGAACATAG

TTGAAGTCGC

TTTGATAATG

TTTTTAGCAA

CTATGTGTTT

AACCATACAT

AGTGGTGCCC

TGAAATCTAT

TGTTTCTCGC

GTCTCTTATG

ACATGGTGGA

AGGTCTTCAT

CATGATTAGT

TGGCATGTGG

TTTTGGTGCA

TTTATAAACA

CCATATATCT

GGGAAGGTAA

CGATATGCCC

AGAGTCACCA

CGAGCCTCCA

CAAGCGCATG

CACAGCGCAT

CAAGAAGGAT

GCCATTTGGA

AAAACCATAG

TCTATGCCAC

CATGGTTTAC

TAGAATGGCG

GTGCATCATT

ATGTCGTTGG

GACTCTCCAA

CCATCTATAA

GGAGGACAAC

CAATGTCGAA

AAATAAAATG

ATTTGAACCG

ATGAAAAAAG

GAGCCGCAGC

ATGAAGACTC

ATAGGGAGTG

TTTATTTTTT

TATTTTTTGT

AGAAGTCCAG

CTAGCCCATA

CCTCTGATTT

TTCTTGGATT

GACTGATAGG

GTCGTAAAGA

AAGGATATCA

CTTTGTTGCA

TCTTAGGGAA

AATCGCCATA

TATCCCTTCG

AATATGGAGG

CATGAGGGAA

TACAGGTACA

AAGCACCCTC

TGGTCATCGC

CTGCCGCCTC

AATGTCATCA

CGGCAGTGCG

CGTGTGGCGC

GATTTGGCGC

TCCCCTTGCC

ACTCAAAGCT

GGAGGAGCAA

GGCTAGACGA

TGGCGACATT

TAGTGTGACT

GTGTGGTTCA

AAACAAGTAA

TGCCAAGTAC

TACTAGAGTT

AGGGTAGTTG

TTCTCTTTTG

TGTTATATTC

ACTTGCATAT

TATTTACCCC

GTTTTTCTGT

TTTGTTTACT

AAGATATATT

AAACTACTTT

CCATGCATGA

AAATATTTAA

GGTTAAAGTG

ATGCCAATAT

GATAGCCATA

CCATGGATTC

GTTTTAGCTT

TGAACCAGCA

CCATTAGTGG

GTGGCATAAG

CCCCTTCCTC

TTATGGAGAG

AACCCCACCT

TTCCTCCTCC

TTCGGGTCCA

CCCCAGTCGG

CACAATGAGG

CGATAGCTCT

CGGCGGCGGC

GCATATTTTG

ACTTTTGGCC

ACTAAATGTA

GACCACAAAT

ATATGAAGCG

CTCTAAGGCC

GACTGTTCGT

TAGGTTTCCC

TAGTTTCATA

GGAGGTGCAC

1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780

ACACAAACAT

ATCATCAAAA

CATTTAAAAT

CAAACCACAT

GAAACCGAAA

CGATGCTTAG

GGGGACATTG

GGCGAGGCGG

GAGGAGTAGC

TCCCCCACCC

CAAAAAGTAA

TAGAAAAAGC

TGTTCTTTTA

TTTCACAAAA

GGCCCAGCGG

CTCCCTCGCC

TTCCTGTCCA

GCCACACTCC

GATTCCCGTC

ATAAAGTATA

CCTGCCAATA

GTGAACAATT

GCTATACACT

GTAATGTTAG

ACGTGAGATG

CAGTTGACAC

ACGTCGGGCT

CTGCAAAACA

TGACAAGCAA

TTGTTCTTGC

AATCCTTTTA

GAGCAAATAT

CTGACGAAGG

CGCACGACCG

CCCGTTTCCC

AAGCGGCCAC

TCCTCCGGCC

CGCCGCCATG

AATACTAACT

TGAGATATAG

GTTTTTTTAG

TGCTCCATAT

CCGTTTTTCT

GGGATGACCA

GAGAGCGGTG

GGCAGGTAGG

TGGTACACCA

CAACCAACCA

TGGACAGCGC

TAGTTCTTTT

CTTCTTTTTT

CTGAAAGTGG

TCCACGTGCA

CCTCCCTCCC

GGACCGGAAA

GATATAAAGC

GAGGAAGATG

TGAGAAGTAT

TTTTGAATAT

AAAAAATATA

GAAACCATGT

ATTCAAAGAA

CAACGTCCCT

AGGGGTTGCG

GGGGAGGGGG

GTTTTCTGCC

TCGCAGTCCC

AAAGAGTAAA

GTGAAAGTAA

TTTTAGGGAA

CGAGACAGTG

CCCCGGCCCT

TCTCGTTGCT

AAAATCACGC

GCGCGGGGCC

GTTTGCGTGG

ATCAATATGA

AGAAATAACT

TTGCTATTGG

GAAGGAGAGT

ACAGAGACCT

ATGCGTGTGC

AAGGACCGGG

CTACGAAAAC

ACATGTCCCT

CTTTTGTTAG

TGCTTTTATA

AAGAGCAAAT

AGGGCCCATA

CCCGGGCCCG

TCCACTCCAC

CTTTCCGTTG

TCAAAAAAAC

GCAACGCAAC

CCAACCCAGC

GCAGTTGCCT

CGAGGTGACG

CACCGGAGAT

GGCAACATGT

GGAGGAAGAA

CTCATTTCAT

CTGGTCTTTG

TTTTCATTTC

GTGATTGGGA

ATCTTCCACT

GCTTTCGTCC

CAGATCCGTT

TGTTCTCCTC

GGTCTCCGGC

3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800
ACGGGCCCGG CGCAAAATGG 4860 4890

INFORMATION FOR SEQ ID NO: 9:
SEQUENCE CHARACTERISTICS:
LENGTH: 6228 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: misc_feature
LOCATION: 1
OTHER INFORMATION:/product= "coding region of wSBE I-D4 gene"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ACCGGCCCGG

CCGCCCCCTC

CCGGACCGGG

CCGGCTCCGT

ATGTGCGGCT

TGAGCCCTCT

ACGGAATCTG

GGGATTCGTC

GATCCGTACG

AAATGTGTAT

TCCTGTGTTG

TCATTTATCG

CATGGCAACA

TGCCGGCTTC

GATTGAGAAG

GTTTGAAACA

TTTTGTAGGC

ATGGGCCCCT

GGTTAACTTA

TCTAGCTAGT

GGTTGGCTGG

CATGTATTTA

TTCAACAACT

ATCAGGATTT

CGATTTCACC

ACTTTTGATG

GGTGAAAGGT

AACTTACATT

GCATCCTCGG

TGGTGAAAAG

AAAGGCAAAC

TTCTTTTGGG

ACCTCAATAT

CCATAGCCAT

CGCAAAATGG

CTGCTCGCCA

GATCTCGGTG

TCTGCCGGGG

GAGCGCGGTG

CCCCTGTCTA

ATCCACGGTG

CACTGAGGAA

CAGAATATCC

AATCTGTGCT

TGTCTCTACT

AAGGCCAAGA

GCTGAAGATG

AAGGAACACT

CACGAGGGAG

ATAGTTACAT

TATTTGAAGT

GCAGCAATGT

TGAAGTGCTG

AAAGAGTAGA

TATTCATTTC

CTTGTGAGTC

GGAATGGCTC

CCCATGTCAA

GTGGAGATGG

CCTCTAAATT

CTACTTTTAG

AATGTGGAGA

CCTCGAAAGC

CCTGAAGTAA

AACTACAA

TACCATGTGA

CTTGTTGACA

GCGAGCAGTA

GATTCCCGTC

TCTCTCCCGC

AGTCAGTCGG

TTTCCCTGAT

CCCGCGCCCT

CCCAGATTTG

GTTATTGGAA

CAAGTGGATG

CTCCTGCAGT

GAATGTATCA

ACTTGTTCAG

GCAAGTTCTC

GTGTTGGCGA

TCAGTTATAG

GCCTTGAAGA

CTTGTGGCGT

TTGGGATCAA

AAGTTCTAGT

ATGAAACTGT

TAAATATGAA

TTTTATGGCA

ATTACTTTAT

TGGGCACAGG

TGGGAAACCT

ACTATGGGTC

TGGAGCTCCA

TGGCTCGAGA

CATGATACTT

CTGACGCTCC

GCACATACAG

CAGTTCAGCT

CGAATTTCTT

AGGCACATAG

ATAAGACAGA

CGCCGCCATC

CGCGCCCCTC

GATCTTCATT

GCGATGCCGC

CTTCGCTCCG

CGACCGTGAT

ATAGTATATA

CGATTTCGAT

GTCTCAACCG

ACCAATAATT

TCCTGATCTG

TGTTCCCGTG

CCTTCCGATA

GATGAAAAAG

GTTCTCTAAA

CCGCAGCACA

CACAGAAAAT

GTTGTCACGC

CTTAAGAGTT

ATATGTTTTC

ATACTTGCTT

GGGTGTAGGG

ATGACAAAGG

GCCATCCCCC

GATCGGGTTC

TATGACGGTG

GCAAGAAATC

TTATTGCTCG

ACGTATTTAC

AGAATTTGCA

GATGGCAATC

CGCAGTTAGC

TTTACGGTTG

TGGTCTTAAT

GACGAAGATG

CCGTCCCCCT

TCTTTTCTTT

GCGCGCGCAG

CTGGTCGTGG

CCCCTGTTGT

CTACTAATAA

TGGATTTCTC

TATTACTGGA

GCTGCATTGT

CCGCTTATCC

TCTGCGCCAA

TACGATCTGG

TACCTTGACC

GGTTAGCTTT

AAAGACATAA

GACGCAACTG

AACTAATTGC

TATGGCTTGT

CCTTTTCTAG

CTAACTATCT

ATGCACAACT

ATAATTATGG

ATAATTCCAA

CTGCATGGAT

TTCACTGGGA

TAAGTAAAAC

TTTTGCAGGT

GAGGCTCATG

GACAATGTGT

ATGGAACATT

AGCAGATCAG

CGTGTTCTGA

GGCTATGATG

CTCTGCCTCA

GCTGACCGGC

TCTTTCGTTT

GGCGGCGGCA

CCGCGGAAGG

CGCCGGGCAA

ACTTGAGGCT

TGCTTTATGC

TGTACAACCC

GAAAACATAA

TAACTTTTGT

GAGACTACAC

ATCCGAAGTT

AGAAACATTC

TGTTTCATGT

TGCGACTCTG

TGTACCGGGA

AATGGTCGTT

CTTTTCTGAT

TTATGGTCAT

TTAGTAGATT

TATTGGTGAC

TGTTTGGTCA

GGTTAAATTT

TCGTTATGCA

TCCACCTTCT

CCACACAATT

ATGTGTTTAA

TGGGGATGAG

TACCGCGCAT

CATATTATGC

AACGCCAGAG

TGGATGTTGT

TTGGGCAAAA

120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040

CACACAGGAG

CCTGTTCAAC

ATGGACGAAT

CACCATGGTA

GATGTAGATG

GAAGCAACTG

GAAGGTGGAG

TACTTGAAGA

AACAGGAGAT

CCCTCCTTTG

CATACAGTTC

TTCGCTTGAT

CTAGTGATAG

GTTATATATC

AAATTTTCAG

GTATACTGGC

TCAAAAGGTT

TTGTAATGTT

CCTTTTGGTA

TGGCCCTTGG

TGTCAAAACT

AGGGCGAAAA

TGTTATCACG

TTTTAGCGTG

CACACTTATG

ATGATGCAAA

TCACGACAAG

GTTTATATCT

GAATGGATTG

TGGAGCCTCG

GTTTCTGGTC

GTAGCTTATT

TTCATATTPA

TCCTATTTCC

TATGCCAATT

TCATGTTTGA

TCAATATGTC

CAGTTGTTTA

TTGTTGCAGA

TAGGGTTTGA

ACAAAGATGA

ATACGGAAAA

TCGCTGTGCG

AAAGGTGAGA

GACTTTTAGT

TACCCACTAA

GTTGACTTTG

TCTATTGTTG

ATGTCAGACT

CGATTCGTTT

CGTTGTTACT

CATTTGGCTT

AGGTGATGGC

TATTTCTGAT

GTTTAAACAT

TATCATTTAG

GCAGTCTATT

AATATTCCCT

CATGATAGAG

CTTCTTGCAG

GTTTTCTAAC

ACTTTCCAGA

CAGACATTGA

TGGTAGCTCT

TACACTGTGT

GCCTTTCAAA

ACACAGGAGA

GGGAGTCTTA

TGGCTTCCGA

ATTCGCTGGA

CCTGATGCTT

AGATGTTTCA

CTATCGCCTG

CCTTGAATGG

GTGCATTGCA

TGAGTATGTG

CACTTTCTTT

TGCTTCACAA

CCAGCTATTA

TGTTCATCTA

GCGACAAGAC

TGCAGCCTGC

TAAGTATTCC

CAGAGTTCTG

ATTTTGTTAC

TACTTGAATT

CAATATGTTT

CTGTTTTCTA

CTGTGCCGGT

GTTGGATCCT

GTTTAAAAGA

ATGTTAGCAT

AAAATCAGCA

TCATACTGAC

AGAAGGCAAC

TCACCTACGA

CTTGGGATCT

TCCAACTTCT

CTAAACTAAA

AAGGGGCTAT

CGATTTCTTC

TTTGATGGGG

AGTTACAAGG

GCGAACCATT

GGCATGCCAG

GCTATGGCTA

TCAATGAGTG

TATGCTGAGA

TTCTTTTTTT

GCCTGGTAGA

GTTCGAATTA

CGGACCATGT

TTGAAACAAC

TATGGCATTT

TTCGCCTACA

TGAATTTGAT

CTTAGTCCTT

AAATATTTCA

TTATGGGTAA

CGGGATTCCC

TGATAGCCAA

AGTTAATCTT

CTTATTCCAA

TTTTTATTTT

GTCTTTCTTA

GTATATGGCA

GGTGCAATTT

AACTGGAGTT

TACAAGGTTA

TGACCTCACT

GTCTTGTGGA

TTGCTGATCT

CATAAACTGT

TTTCTAATCT

TAACATCCAT

AATATTTTGG

TAATGCACAA

TGCTTTGTCG

TTCCTGATAG

GAATAGCACA

GCCATGATCA

ATGGGGCACT

CAAATTTGAG

AGTTAGTTAT

AAGAATGTCC

TTAGTAGTTA

CTCTTGATGG

ATTGATCGTG

GTTCTAGTTC

GAAGATAATG

GATGATTCAC

TGAGGTAATA

TCGAAAAAAA

GTACTCCCCA

TATTCTAATT

TTACATATAT

ATACCAATGT

ACCTACTCAT

AATTGCTGCA

CCTTTTAGTT

ATGATAAATG

TGCCTATGTA

TAGTTCCTTC

TAAATTCTCC

ACTACTAGTT

GGGATAGCCG

GAGATATTGG

GCTATATAAT

TTTGGATACT

ACTCTTGCCA

GTCAGTTGAT

ATGGATCGAC

TACTCTGACC

GGTATGTTTT

GGTCTAAGAA

AAATAAACAT

ATTCTGATAA

GAAGACTGCA

ACTTTCACGC

ACAAGGAAAT

GAATTGCACT

CAGACGAGTA

TATTCCAGTC

TTCATCACCA

TCTGGTTATC

TCCTTTGGGC

GCTATTTCCA

CATTGTTGTT

GCCGACATCA

TTCTCCGTAA

GTTTTACATA

ACCTGACAAC

TGGCCACCCA

CAGACGCCAG

TATTTTTACA

ATCTCTGACT

CTTCTAACGT

GCTCAGTACG

2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020
ATGACCAAAT CTTGCCTGTG TGCAGTGCAT ACATTATCCA CTGTCTTCTT TTGTTAACAG TCACGATTTT TCTCATATTT CTAACATATA TAATTTGAAC CGACAAATTT TCCTTCCTAT GAAGTAGTTA ACTATACAAT ACTCTTAAGA ATAGCAACTC ATCTGTTTGA TATGAACCAT TTCCAGATTA TTGTATTTGA AAACTTATGA TGGGTAACTG ATGACTAATG TGCTTAATCT GACTTGCCTG GGAAGTACAA GGAAGAGTAA GCAATGTTAA AGAAGGGGCC ATCAAGGCTG CCCAGGTGGC CCATGACAAC AAACAAACTT CAACAACCGC TGGTAATGCT AATTACTAGG TGCAGTACGA TCTCACAAAA GGAAAAGCCC AAGGATGAAG TGTTGAAGCC ACTGGCGTCA GGCGTCTACA GGAGGTGACT CAAAGACAAC AAATAAGCAC TAATACTCCT GCTATTGCTA GGCTTGGACT TTGCTGAGGT GGTCGAGTCC AGCTATATGT GTTTCGGGCT TCCATCCCAG CATAGTTACA TGATAATTGA ATAACTCCAG GGCCAAGAAA GGGAAGCTTC AGTCCTTGTT TAAGCCATCA TCTTATCAAG TTCCAGGTGT TGGTTCCTCC CACTGACCAT CGAAGCCACG TCAAAATATC ACAAACTGCC

GTAACCTAGT

TATAAATTGA

GAAGTTATTT

TATCCAACTT

AGTACATGAA

CATCATCAAA

GTTTAGTCAG

TGACTTGTGC

TGTTGTCTCA

ACGTGGAATC

ATCTCTTGCA

CGTTTCCACT

GGTAGCTCTG

TGATGTTCAA

CATCAGATAA

GATCACTTTA

CCTAACTCAT

AGGATTTAGT

TGCTCTCTTG

GAGCTGCTTT

AAGACGCAGC

CCAGCAAGAA

CATATCAACG

GTAGTAGCAA

TACCTACTAT

GCCAAATATG

AATAAAAACA

TGCATATTGC

GCCTAGATTG

TCCGTTCTCG

TCCCAAAATT

ACAACCAAAA

GTGGGCATGA

ATGGCATCTT

AATTTTCTTG

CATTGCAATT

TCTCTGCATC

TTCTGCATTC

CGCATTTGAT

GCAGATTGTC

GGCAGCTGTT

GTTTTATGTT

AAATGGGCTA

TGGTCTTCGT

AGCTTTGCCT

TTTAAAACAC

GACTCTGATG

GATCTGTTTT

TCTTATTTGC

CGTCACCTGA

TCAAAATCCT

AACAATAAAT

CCAGGCTTAC

CTTGGGGGAA

AGATGGTGAG

GGGAATTAAC

CTTGATCAGG

TACTGTCAAA

ATAGAAAGAT

CGCCATCCCG

GTTGTCTGTT

TATAAGCCTG

TATCTTTTTT

AGACAAGGCG

CTCTGGTTGA

GGCGACCATC

AATGCGCATC

CTGCCAAAGG

ATTCTTACAC

TCCCAAATAT

TGATAAATAA

AAGCATTTTT

CAAGCAATGA

AGCGACATGA

GCATCATTTG

ACCAAATAAG

TGGACTCAAT

CTTCAATTTT

TTCAATATTT

GCAGTTACAA

CTCTGATGTT

GCAACACTAT

AGTGTTGATC

AGGAGTACCA

GTCTCCATCC

AAATAACAGC

TATCGCGTCG

ACTGCTCTCG

GCGACTTCTG

TTTGTCTTTC

ACCGTGTGCC

CTGTGCAGAC

AAATAAGCGG

AGTCCTCTGT

TGCAATTTCT

GATTGCATCT

TGCTAATAAC

TCATGTTTGG

AAGAAACCAT

GTCGTCATCA

GCCCAAGACT

CTGCACTGCA

ATTAGTGATA

TATTTGAAGG

TAATAGCCTT

TGTTTCTCGC

ATGCGCTCGA

ATGAGGAAAA

ATTCACTCCT

TTGAAACCGT

CCAACTTCCT

CATCCCAGTA

CTTCTGCTTA

AGTCGGATGT

TGGTGGACAT

GTTCTTCTAT

TGTGCTGCAT

GGAGTACCTG

CGCACTTGTG

AAAAGATATC

AGGAGAAAGC

GGTACATCGA

GTTCCGAAAA

TGTCACCCGA

GACGTCCTTG

TTGAAATTCT

TGATGGTGCG

CATAAAGAAA

TTTTGTCTTG

TCTTTTGCTA

TGCAGTGCTG

CGCACAAAGG

CACTAACTTG

TCGCTCACAG

TGGGACCGTT

CCTTTGGCAT

4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 6000 6060
GAACAGAAGC AACAGGGGCT TGGAACTGAA CGCCGAAAAT AAAGTCAAAC CGGCTGGCC 6120
GGATTGAAAG GGGAAACGCC AAAATCCACT TAATTTGAAT GGAAGGAGGA ATGGTTCTTG 6180
CTGGTTTCAA CTCTGCAGGC TTCCCTCTGA ATTTCACACG GAGCCATT 6228

INFORMATION FOR SEQ ID NO:
SEQUENCE CHARACTERISTICS:
LENGTH: 11463 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: misc_feature
LOCATION: 1..11463
OTHER INFORMATION:/product= "complete sequence of the starch branching enzyme II gene"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:

AGAAACACCT

TTAGCGTCTA

CCAGTCCAAA

AGGCGCATTC

ACAGCGGACG

GACGCACATG

TTTCCCCTCT

ACATGCATTT

CCATGCACCG

GTCGAGTCGA

GTACAAATAC

TTTTTACACG

ACGACGGACA

AAACAAAATA

ACGCGCTCCC

CCATTTTAGA

GTTTTCTTAA

ACGTCTGCTA

GAACTGGACA

TGAGTGCGTG

AACACCATGA

GGAAATTCAT

TCAAACAAGG

ACGAGTCCAT

GAAGAGGATG

ATAATGTACC

AAAATGCCAT

ATCAGACACT

AATACTTATA

AGCCGTTGGT

TTTTTTTTTT GTTCTTTTCG

AAGAACAGGC

GGATCACCAG

GACGCTCACG

ACACATGGGG

TGATGCTATC

AGCTCACACT

AAAATTAATT

GCGAGGTGGA

ACACTGAAAG

CTACAATTTG

AGCTGGCCCG

CACCAACTGC

AACGAGGGTA

TTGCGATCTC

CATTTAGGCC

CTGCAAAGTT

CAGGAGCCCA

TCATCTATGG

AGGCCTGATG

TTTTTTTAAT

CTCAAACCAC

AACGAAGAAC

TATGCGTATT

TTTTTTGGAG

CATGCGTGCA

TTTTGTCTGG

CTAGAGGCCG

GTCCTCCCGC

GACGGTGGGT

CTGCTTTACA

AAGCGCGAGA

GCACCACAGG

GCGTCGGAGC

GAGGGAGCAA

GGAAGCAAGA

CATGACATGC

TGAAAATCAA

ACGATTTCAT

CAGAGTGGTG

GATCGGATGA

GACACAATAA

CTAACGGCAT

ACGCAGCGTC

CGTGGAGAGA

AAAGGCTCAA

CCACCAAAAC

CTTGAGCCTG

AAGGAAGAGA

CCATGCACCT

GTTGGCAAAC

AATTCTCAAA

CATCCCAGTT

TTACATACAT

TGGTCTTTTT

TCGGTCGGAG

ATGTTTTTGT

GGCCAGGTAA

GCCTCCACCG


TCCGTCCGTC

CACACACACT

CCTCTCCCCC

CACGTTGCTC

TGCATTTCGG

GGGATGGCGA

GTGGCGCGGG

AAGAAGGACT

CCTTCTCTCT

GAGTGAGAGA

CGGGGAAATG

TTTCATTCTG

TGTGGCGTTT

GCGGCCTCTC

CGCAACCTGA

TTCACTTACC

ATCAGCATTG

CTCTTGGGCC

TGCACCGTTT

CTGAAGATAT

TTCAATCTTC

GAGTTAAGGA

AGAAAATATA

AATGCCTACC

GGAACATCAA

AGAATATGCT

GAATAACTGT

GGGTTATAGA

TTGGGAAACT

ATTCGAATGA

TCGTGCTGCT

GCTTGGATTT

TAATTGCATA

CTGAAGGTAT

GCTGCCACCT

CACACACGGC

GCCCATCCCC

CCCCTTCTCA

CCGGCGGGTT

CGTTCGCGGT

CCGGCTCGGA

CCTCTCGTAC

CCTCTGCGCG

GATAGCTGGA

CGTTAGTGTC

ATATATATTT

TTTCACTATT

CAGGGAAGGT

AGAATTACAG

AAATGCCGGA

TGCAGTACTG

ACTGAAAAAA

GGGGTTTCGT

CGAGGAGCAA

AGAACCGACT

ACTAGTCGTG

CGAGATTGAC

CGCTGCTTTC

AGAGACAAAG

GGGAAGTAAA

CTCCGATCAT

TTTTACTTTG

TAGTTTCTTA

TTTTGGGTAT

ATTGACCAAC

ACCCGCAGGT

TCTTATAAGA

CGTCTAATTG

CTGCTGTGCG

ACACTCCCCG

ATGCACTGCA

TCGCTTCTCA

GAGTGAGATC

GTCCGGCGCG

GCGGAGGGGC

GCCTCGCTCT

CGCATGGCCT

TTAGGCGATC

ACCCAGGCCC

TCTCATTCTT

GTAGTCATCC

CCTGGTGCCT

GTACACACAC

TGAAACCAAC

CACTGCCTTG

TCAGATGGAT

CAGTCTGCTC

ACGGCGGAAG

CAGGGCATTG

GGGGAGAAAC

CCAACACTGA

GCTCATTTTG

ACTAGGGACC

TGTATAATTG

TACAATTAAA

CTAATTCCTC

TCTTTGTGGC

ACCTCGGTGG

ATGAAGGTGG

AAATTTAAAG

AAATTTATAA

CATATCTTAT

CGCGCACGAA

TGGGTCCCCT

CCGTACCCGC

ATTAATATCT

TGGGCGACTG

ACTCTCGGTG

GGGGCGGACT

CTCGAATCTC

GTTCGATGCT

GCGCTTCCTG

TGGTGTTACC

TTTCTTCCTG

TTGCATTTTG

GACGGCGAGA

TCGTGCCGGT

CACGGATGCG

TTCATTTTGT

GTGCATTCTA

TACAATTGCT

TGAACATGAC

TGGAAACAAT

CGCGAGTTGT

AAGATTTTCG

AATTAAGGTC

ACCATTTCAT

ATGGCTACAA

GAGTGGCAAA

TACCAAATTC

CTTTTTCTTT

ATTCAACAGA

ATTGGAAGCA

CTTTATTATT

TTCCTGTTTT

AAGAAAATTT

GGGAGGAAGA

TTCCGGCTTG

CAGCTTCCAC

CCATCACTCG

GCTGACTCAA

TGGCGCGGGC

TGCCGTCCCT

CCCCGTCTGG

GTTCCCCAAT

AACCTGTATT

ACGGCTTTGA

TTCTTGCTGT

CAGGCCCCGT

GGACGACTTG

AAATCTTCAT

TCAGGTTTCG

TAGCCTTGGC

GCAAGAACTT

ATTTTTCGTG

ACGGGGGACT

CACTGATGGT

CCCAAAACCA

GAGCCATCTT

CTTTCATCAT

ACAGATCCCT

TTTGCTCAAA

CTGATGAAAA

CTAGGGGGGA

TGGGGAAAAC

TACAGCGAAT

TTTTCTCGTG

ATGAAACGCC

CCCCTCTCTT

ATATTCCTGT

ACGAACGCCG

GCGTCTATCT

CCCCGCCGCA

GGTTCCGCGC

TCACTACGCG

CGGCGTCGGA

GCTCCTCAGG

CTTTGGCTCC

TGATCTCCAT

TTTTCCCCCG

TCATTCCTCG

AACTGCAAGT

CCTGAGCCGC

GCAAGTCCGG

ACAATCGTTA

AGCTTCTTCT

CCCGTGCTGG

CACAACATAA

CTGTAGATAC

GCAGAGAAAC

GTAACCAAAG

GGAGATGGGC

GACTACCGGT

GCAAATTTGG

TCGTGGTCTG

ATTGCAATAC

TGTGGTGGAT

AATCTACCAG

ACATTGCTAA

ACAAGAGAAT

GTTATGAAAA

TCCACTAGTC

TTTTCCAGTG

TTTCCCCTAT

960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940

TTTCCAGTGC

TTTTAAGTTC

AGCTTACTGG

CTGGCTTTTG

CAAATGCAGA

ATAACTACTG

CCTCAACTAC

ATTCTACTGA

ATACGTGCAA

CATGCTATCA

TGATTATGGT

TCATGGCTCA

AACTCTGCCT

TCCGGTGTGA

ATACCTTTCA

ATTATTAAAT

AGTCAAGACA

TAAGGGGCAA

TCTTTTATGT

AAGGATATTT

AGAGTCACTA

TATTTCACCT

TGTTAACATA

ATATTGGCAA

GATTTCCATT

AGAGCATAGT

ACTAGCTTAA

GCCATTTCCT

TCATTATTCT

TCTCCCATTA

GAGGCATCGC

GTGATCTTGC

CAAGAATTA\

TGAAGGTATC

CTTAACGAGA

ACTTACAAAT

CACCCTGTTA

TACTATGACC

ATACATCTAT

ATCATATCAA

ATTTAGTCCA

CACTCCCATC

GTGAAGGTTT

GTTTGGGAGA

CGTGTAAAGG

ACTAAGGGTC

AGGATTCAAT

ATGGCATATA

GAAATTTCCA

ATACTTTTGA

CCAACCTTGG

GTTCTCTGTT

ACATGCAAAT

AGGATTTATG

GTTTCTGGTC

TTACATGGTG

GTGCAAAACT

GCATTTGGAG

TATATGAATT

GATTTCCCAC

ACCTTATTAA

GCGAGCGATT

TGAAGAGGAT

TAATATATAC

ACAGGAACCG

AAGGCTTGGA

ACTTACCGAG

CACCTTCCAA

TAGCTTACTG

CAGTCTGCAG

AGAGTATGTC

TTGTATTTAT

AATGGTATAA

TCTTTTTGAG

TGCATTATGT

GCTCCTATTG

TTTTCCTCCC

TAAGCTGGCC

CCTTTTCCTC

TTCTGCTTGG

TTATGATCCA

GTGTTACAGT

ATTTGGAAGT

TGATGTGTGT

GGTTAGGATA

GCAGGAGAAG

AATCACACAT

TGATGGTTTA

CATTCACTTG

TTGCTTCCTC

GCAGTGGGCA

CCATTGTTGT

TTAGGATGTA

TGAGAGAGAG

CAAAAACTTC

ATAGTTAATT

TATCATCACA

AAGATAAATT

TACAATGCAG

AATGGGCTCC

TTTATTGTTA

AATACTGACC

CATTAGTAGG

TACAGCTTGG

TTAGCTGTTT

TTTGTCAGTG

ATTGAAAATG

GTGCTTTTCC

ATGCAGATAT

TAACAACGCT

AATTATTTAG

TCTGTTTTTT

ATCAAGTTCT

CCTGAAGAGG

TTTTTAATAC

GACATATGCA

ATGCTTGTGT

TTCCATTTTG

TATGTCTTCC

TGGAATGAGC

TTCTATGGAT

ACAACCTCGA

TTTGTCTGCT

TGTGAAAGTC

TGCAATAGCT

AGAAATATTG

ACAAGGGGGG

CATTGTTCTG

CTTTGTAACC

ATACTTAGAG

CATATGCTAA

TGCAGATAAT

CTGGAGCGCA

ATGGTCACTA

AGTTACTATA

TGACTTCAAC

CAATTTTCCA

GCACATTCCT

TCTTAAGCTT

AGTATATTAA

ATCTACAATG

TTGATATGGT

GATGGATCCT

TCGAGGATGT

AGATACGGAT

CTGTGCAGGC

TAAGTATCGA

CCACTTCTTA

TTAATTCACC

GTGACATAAG

GCCTTTTGTG

AACATCTCAA

AGCCCGGTAT

TTTCTAGTTC

TTTTATTTTC

TGTTCTTTTG

ATATCTATTT

CGGTATAATG

CATTGGAGCG

GGGGGGGGGG

AGGTGTACGT

TACTTGGAAA

GATGCATCTG

TTTTAGGGAT

GGCAATCCAG

TGTTATGTTC

TTCACCAACT

AATTTATGAT

AATTGGAATC

CCTTTGCTTC

TAAAGTTGAG

CAGCCCAAAG

GGATGAATGA

AGCATATTTC

CTTTTCAGGA

CAGCTATTCC

AGCATTTTCG

GGATACTCCA

TCCAGGTGAA

TCTACATTAC

CTGACATGTG

TTCTAAGGGC

ATCTTATAGC

ACCATTTACT

CTAAACGACC

GTCAATAAGT

TGTTATGTAC

TAATGTCTTC

TCTTCTGTAA

TTTTTTTGTC

TAACCATGTT

TCTCCAGCAA

GGGGTTCCCT

ACTGCAGGGA

CTTGAGTCTT

AAATTTTAGT

GAGGTGTTGC

GAGCATTCAT

3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920

ACTATGCAAG

TTGGAGCTAT

AGATATATAG

GTTCCAGGTA

ACTTAAAATC

TTCATAGGTA

GGGAAATTCA

AAAATCTAGA

CCATCCTAAA

GTGACTTCTT

TAGTAAACTG

GATGTGGATT

GGCACTACAT

AGGCCCCACT

TTACTTAAAG

GAAAATATAT

ATTATATTAG

AATATAGAGA

ACATCAAATA

ATATGCATCA

TAATCTACTT

ATGATTTTGT

CGATGGCACT

TTCTCGTCTA

TTGGCTAACT

TATTGAGATT

TTCGATTTGA

AAGTGGTTTC

CATGATCAGG

ACCTGATGAG

TCTTGTTCTA

TTGATGCGGT

CTGTATCCAT

AGTTTTATTT

CTTTGGGTAT

TACATCCTAA

TACAACTACA

CCATGTTACT

CTTGATCGAT

ATTAGTCCAA

GGCAATTATG

GTGGCATAAG

TGGCAGGGCC

TTTTCTCAGA

ACAGTTCCAT

GAGAAGTTCA

ATAGTTTGCA

TGCCAGCTTC

TTCTTCATTT

CAACATCTAC

CACATCTTTG

AGTTTGACTT

ATATAGATAG

GACCATCTGT

TTCCTTCTAC

GTACCCTGCA

GATACACATT

TTCAACTATG

GTTCCTGTTA

CTTACTGTCA

TGGGGTGACC

AGTAACTTTT

ACTTGTGCTA

ATCATGGAAG

GATGACATTT

AGTTTACTTG

TGGTGAAGAT

TGGGGATCAC

TCACACAATC

TGCTTCATGC

CTTAGTATTC

AATTTTTTTG

AGAGCACATG

TTTAATTTTA

ATACATTGTC

GAAAATTGGC

CTATCGCCGA

TGTATTAAAC

AGAATATCGT

GATGCTATCA

AGTTGGAAAA

ATACTAGATG

GTCCTAAGTC

AACACCAAAT

ATGTTGTAGA

AGGACAAATC

ATGTCAACAC

TTGCTTTAGC

TTGGTTTGGT

GTCATTCGTC

ACTTCCACGG

GGAGTTGGGA

ATCTGTTCTT

AACGCGAGAT

TCCATGATGT

TTAGGGCACT

CGGAGTCTTA

ATTGGAAGTG

ACTGGGAACT

ATGCTGGTCA

GTAAGTGCTT

TCTGTTACAC

CATTTTTTTC

ACATAAAATA

TGAAAAAGAT

CACCAAGTAG

AGCTTGGTTT

GCTGTTTTAC

AAAAGCTAAG

AAAAACTAGA

ATATTTTTCC

CAGTTGGACA

TTTGTAATGG

ATAGAATTAA

CTGACAGCAA

TTACTTCCCT

AAACTTCTTT

TACTTTGATC

TATCAGCACA

TAGAACTTCA

TTCAACAAAA

CACTTGCTTT

TGATTCTATT

AAATAATACC

TGGTCCACGC

AGTATGTAGC

ACACATGTTG

GGTGGCTTGA

ATACTCACCA

GAAACAATTG

GATAGTTCCC

ATTATTATTT

ATGGCGAATA

ACGATCTAAT

ACAGTATTTA

TTTTTGTTAG

TGTATACACT

TTTGGATATA

CATTTTATTG

CCGTTTTGGA

GCTTGTTCTT

TGTTTATCTG

AGTGGCGAAA

GTGGCAAAAA

ATTCTATATA

TGAAATGTAT

CAACACAATT

TCAACTGGCC

TACCTCACTG

GTTGAATTCA

AAGTTTGACC

AGATTAACAA

TTTTTCTATA

ATCAATTTGG

AAATCAGACC

CATATTTATG

TCAGTTGCAT

CTTGACGGTT

GGCCATCATT

TCTGACTTCT

ATATTCTATT

AGAATATAAG

TGGATTACAA

CTATGCATCA

TAGTATGCTT

ATTTTCTTTC

TTTTGGATTT

TCATGGACTT

TGATTTTTAA

GGGTAAAATC

CTTCACCCAT

ATCCTTTATT

TTGTTGGCTT

ACTCCAGAGG

ATGGATATTG

GTATTCTAAA

GTGAAATGTC

TAAAATTTTC

ATTGTGCTAC

TTGGTACATG

TGATGCCATA

ATGTACTCGT

ATAAGTGGCC

TTTGAACATA

AAGTCTATTG

TTTTTATTTT

GACTTGGTCA

ATCAGAGGGA

TTGTCACCAT

TGTTTGTACC

TGCTTCATCA

TGAATGGTTT

GGATGTGGGA

GTCACCATAT

CTTATGCAGG

TTTGATGGAT

GTAAGTCATC

TAACATGTAT

GTACAATTTT

TAAGTTTGTT

GCTACTGATG

TATCCTGATG

CTAGTTAAGT

TCTCTTTTCA

4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 6000 6060 6120 6180 6240 6300 6360 6420 6480 6540 6600 6660 6720 6780 6840 6900 6960

TAACAATGCT

GGGCATTCAA

TTTGGGTTTT

AAGAAATGGG

GGAAATGAGA

TTGTAATAGT

TGTACACATC

TCTGCCAGCA

TGGAATGCCT

GCATATGGCT

ATTACATGCG

GTGAATATCT

AAATAGCAAA

CTGTAGCAGG

TATATGAGAA

TAAGGATGGG

AAAGGAATAT

GGCTCGGTAA

CAAATACTTA

TATTTAGATT

TTTGGTGGAT

CTCTGCCGTT

TTAAGCATGT

AATATGCTGC

ATGGGCGATA

GCAGAAAGTC

AAGGTACTAG

CTTTGTAGAG

ACTCATTTTG

CGGAGGGAGT

TTGGCTGCCT

AACGTACCAT

TTCTTATTTA

AATTTATACC

GCTTACGAGC

TCAATAAGTG

CAACCTTGTC

GGCTATTCCC

GGTTTAACTT

CACAAACAAG

TTAACTGTTC

ACATTTTGCA

GTAGCAGATA

CACAATGATC

AGACATTTGC

TCTCGGAAAT

CCAGTCAACA

AGTTAGTATA

CAGTAGGTAA

ACAGGGTCAT

AAAAAACTTT

TTGCTACTAC

TAAATACTAA

ACCAGAATTT

ACAAAAGCTG

TTTTTGAAGC

AGTGTAATTT

TTGTGCACAC

ATGATCAAGC

CTGTTACTTT

ATTCCACTAT

CTTCGTATGT

ACATAATTGA

CACCCATCAC

GTGGTACTGT

TTTGATTGCT

TTGTATGATA

ATATTTTTTG

GGAGTGTGTG

AATTGCTTCA

AAGGACATGA

AAATTCCTGC

TAATCCTGAG

ACAGTTCTAA

TCCCTGTTCC

AATGGATTGA

TAGATTACAT

CTGTTATCAG

GTAATGGCTA

CAGTTAGCAA

TAAACTGTGG

TAAATTTAGC

GTACCATATC

ATGATGATCC

ACAGCTGCCA

CTCGATACAT

CTGCCCTCTT

TTTTCAGTTT

TGTGAGCTGT

AGCATTTCTT

CCTAACAAAT

ACTAGTTGGT

TGGACAAAAG

GGACCACATA

AGTCCATAGT

TTTGTCTCAT

CAGCTATTTC

GGCGGCTTGT

TATGTTACCG

ATGCATCACT

ATGGCTGTAA

ACTAATGTTG

GAAGGCTAAC

ATTATACTTC

ACTGCTATGC

CTTTCAACTC

TTTGTGTAAC

AGATGGTGGT

ACTCCTCAAG

TTTCTAAATG

CTTGAATACG

GTGTCTTTAT

TATTTTCAGA

TCATTAATTG

CAGATAAAAT

TAGTTGTAAT

AGATAGATAT

ATCTGTCATG

TGGCAATAAT

GTTAGTAATG

TTTGCATCAT

TGGTACTTAA

TAACACAGGC

AGAAGGTGGC

GACAAGACTA

AATTACTCCC

GTATATAGAT

GAAATCTCTA

CAGATTGCTA

CCAACTGTTA

GAACTTTGAC

TTCATTTGCT

TAGTAATTTG

TTTATTTGAT

TATTATTTAT

TTTGATTCCA

AGTGTGTTCT

AATCTCACTG

ATGAGAAAAT

TGTGAAATTG

GTTGGTTTTG

TAAGTGCAGG

GTAAAAAGGA

AGAAGTCAAA

GCTGGGCAGT

AACAATATTA

TGTTCACCTT

AAATCGTTAT

TAATGAAAAG

GCAGGAACGC

ATCTGTGTTC

AAACTTAACT

ATGTGCTCCC

TATTTTTGTG

TACATTCTTG

AAAGTGACGA

TTGAGAAGTG

TTGCATTCTG

TCCCGTTCCT

GCATTTTAGA

CAGAGACTTA

GTGTTTTCTT

CTTGAGCAGA

AGTTATGTTG

CATTCCTTTC

AAAAGTGCAA

AGTATGCTTG

TTAATTGCGG

TAAACGCTTT

GTACATGTAT

TATGTTGTAG

AGAGTCCGCT

TTCAGGTCAG

ACTACCGCCT

AATATTGGTG

AAATATGTAT

TACATGATTT

GTACATTGCG

TTTATATCCG

TTGTCCTGTT

TAGGTTTACA

GCTGACAAAA

GACTAAAGCT

TGCTTTGTGC

ATTCAACCAA

TGCTGCTGTT

TGTGAGTAGT

GAAGTGTCCA

ATCTTGGAAA

TGTAACTTAT

GTTGATGGAT

AAATATAAGT

GTGTAGATTC

TATTTAGGAA

GTGATAAAGA

ATTTGCTGAA

CAATTTTCTG

CGAGACCAGC

7020 7080 7140 7200 7260 7320 7380 7440 7500 7560 7620 7680 7740 7800 7860 7920 7980 8040 8100 8160 8220 8280 8340 8400 8460 8520 8580 8640 8700 8760 8820 8880 8940
CAAAGTCACG TGTTAGCTGT GTGATCTGTT

GCTAAAATCC

CTTTCTTAGA

CCGCCTAAAG

CATTTTGGAG

CGCTCCTTGA

CATTCAAAAG

TGTTCCAATT

TATAACTACA

CTGTGCTGTA

CAGGATATGT

TAGCATTACA

ACTTCATGGG

ATGATTGTGA

ATCTGTTGCT

CAAGAGGCCC

GATAAATGCC

TAGATCTTTA

CATTGCTTTT

AAAAATATCA

AGTTCGATCA

TTGTTGCATA

ATTAATTCCT

AAACTATTTT

AGTTTATGAC

TCCTCAAAAG

ACTACCGTGT

CACCCTTCAC

TTTGTTGGTC

CTCGAGCTGT

TTCATATTAT

ACTAGAACTA

GGACAATTGG

CTTGGACTCT

AACGAATTAT

TGATTACCAT

GAGTGATTTT

ATATGCTTAG

GGTTTTATTA

GAAACGGTCA

TTATGAGTTT

GTTGTTTTTA

AATTATTTAT

ATGATTTCAT

TAAAATGATC

AAATGAGTTT

TTTACTGTAA

TCCAAGGAGG

ACAAACTCTT

GCCGTAGATT

TTGGCCATTT

GTAGTTTTGT

TGATTTTTTG

GGCAATGCAG

ACAAGTCACA

GTAATGAGAT

CTTAAGTGCT

ATCTGAGCAC

AGGAGATTTG

TGGGTGTTCC

CAGTAGGGTT

GTGCAGCTAT

TGTAGCCATA

CTACTTAAGT

TTTTCCGAAT

CTGGGTTTTT

GACGATGCAC

TTGCTTGAAT

AGTGCCTGAA

TATTGGATAG

TAACAGCTCT

TGGCGCCATC

CATCATTCTA

TTGGGACTCC

TACCAGTGTA

CCGACATAGA

GGCTCTGGAT

AGGCTTGTCA

GGGCATCCTG

TTTGAACCAT

AAGTTAACTT

CCAACCGGCA

TGATCTTGTA

ATTTCTTGAT

AGACGTTAAC

CAGGGAGATG

CATCTTGAGG

GTTTAACGTC

GAAAACTGTG

TGTGTATTGA

CAGTATGTTT

GTATTTGTTT

AAGCCTGGGA

AGTGGGGGCT

CAATATAAAG

GGAAGGTTGT

GTTTGTTTCA

CTACCCTAAC

GTTAGTTGTG

TCTTTGGTGG

ATCTGAATCT

TTAAATATAC

GGCTGAAATA

ATTCCTGGCC

GGGAAGTTTG

TTTGTAACTA

ATCAGGACCA

AAAGGGAACA

GTTTTATTCC

ACAGCATGAA

AGGCTTCAAC

CCATGGGTTT

GTCAGTCTTT

GCTTTTCTTT

CTATTTACTT

AAGTTCTCCC

AGTTTTAGCT

GAAATCATAA

ATAAGTATGT

CAGATTTTCT

AAAAATATGG

AGTCTCTTCA

CAAAGGCGGA

TACATATACC

CACGGAAACA

TCAACTTCCA

AGTACAAGGT

TCTACAACTT

AATAGGGTAA

TCTTAACAGC

ATCTTTATGC

CATCCTAGCA

ACAGTTTCTG

ATTCAGCAGG

TGAGCAAATT

AGACGTATAG

GTTTTGGTGT

GAGTCTTCGT

GTCACAAGTC

GTGGCACCTG

CCATACTAAG

AAAGTGTCTC

AGGACAGTTG

CATATCAAGC

TCTTCGCATT

AGGTGGTGAA

ACAACATTAT

CACATTGTAT

GGCAGAATGG

CTGGAAATAA

GTGCTATTAC

TGTTTGTTAG

GTTGAGAGTT

TAGATATCGT

GGTATGTCAC

AGTGGTAAAA

GCTGGAATTG

AGCACTGACA

TGAGGAAGAT

CTGGAGCAAT

ATGCTTGCCT

TTAATTCCAC

TTTGTAAAGA

CCCGAAGCAC

TCAGTTGGAC

GTTTTAGAGC

CTATTTCTTA

CTTGATCATG

TTATTAATAG

TCACCTGGCT

TTCTTGGATG

TACPLACATAA

TGCATCTACA

TAAGGAAACA

AGCAAGATTC

ATATTGTGCT

ATACTTGGTA

TCTCTTTGTG

GATCGTGGCA

GGCTATCTTA

TGCATTCTGC

GTATTATGTA

ATAGATTTTC

CAATAGTTAT

ATTCCCTCAC

GAAAGATCAA

GTTGATCATT

GGTATGCAAG

TGGTTTGTCT

AAAGTGTAGA

CTTTTCACCA

ATGTAACTGC

AAGGTGATCA

AGCTTTTTTG

TTTCATTGTC

ATGGATAGAG

AAAGAATTTG

ATACCATTCA

TCGGTCTAAT

AGCCCCATTT

ATCAGGTGGC

ATGTCGACTA

9000 9060 9120 9180 9240 9300 9360 9420 9480 9540 9600 9660 9720 9780 9840 9900 9960 10020 10080 10140 10200 10260 10320 10380 10440 10500 10560 10620 10680 10740 10800 10860 10920 10980

CTTCACAACC

AAATCTGAAT

CCGCGCTCTT

TAAGAACCAG

TGAGCGAAGC

GTGCCTCTTC

GAAAGAAAAT

CACATTCCCG

TCC

GTAAGTCTGG

CAACTTCCCA

TCTCGGTGTA

CAGCGGCTTG

GACGGGCAAC

CCCAGATGCC

GGACGGGCCT

GTTGTTTTTG

GCTCAAGCGT

ATTGCTGATG

CACTCCGAGC

TTACAAGGCA

GGCGCGAGGC

AGGAGGAGCA

GGGTGTTTGT

TACATATAAC

11463

CACTTGACTC

CCCTTGCAGG

AGAACTGCGG

AAGAGAGAAC

TGCTCCAAGC

GATGGATAGG

TGTGCTGCAC

TAATAATTGC

GTCTTGACTC

AACATCCGCA

TCGTGTATGC

TCCAGAGAGC

GCCATGACTG

TAGCTTGTTG

TGAACCCTCC

CCGTGCGCTC

AACTGCTTAC

TGACAACAGG

CCTTACAGAG

TCGTGGATCG

GGAGGGGATC

GTGAGCGCTC

TCCTATCTTG

AACGTGAAAA

11040 11100 11160 11220 11280 11340 11400 11460

INFORMATION FOR SEQ ID NO: 11:
SEQUENCE CHARACTERISTICS:
LENGTH: 2662 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: misc-feature
LOCATION: 1..2651
OTHER INFORMATION: /product= "nucleotide sequence of cDNA wheat SSS I"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TCTCCCACTC TTCTCTCCCC

GGCCTCGGCC

GGAGCGCGCC

GCACCGAGCG

CCGCGCGCCC

CCGCCTGCGC

GCTCCGGCGC

GCGCCCCGCG

GCCGCCCGCG

GGGGGAACTC

ACCGGCAAAC

CCGCGGCAGC

GGGCGATCCA

ACACCCATGG

GCCGATCCGG

TTGGCGCGGG

CAGCAGCAGC

CCCGCCCAGT

GCGCCCGACC

GCGCACACCG

CCCCCGATCC

AGCAGCACCG

CCGTCCGTGC

CGGCGACGGG

CGACGGCGGC

GCCGCTACGT

AACTGGCCCC

CGCCGGCCCC

TCCTGCTCGA

AGTCGGCACC

GCTTTTGCAG

CAGTGGGAGA

GTCCGCACCT

CGTCGGCGCC

CCGGGCGTCC

TGCCGAGCTC

GCCGCTCGTG

GACGCAGCCG

AGGGATTGCT

GGCTCATCAC

GCAGCGCACT

GAGAGGCTTC

CCTCCGCCTC

GGGTGCCTCG

GCCTGCGTCG

AGCAGGGAGG

CCAGGCTTCC

CCCCTGCCGG

GAGGATTCCA

CCATCACCTC

AAAACCCCGG

GCCCCGGCCC

CTCCCCTGTC

CCCCCAGCGT

TCCGCGCGCG

GCCCCGCGGC

TCGCGCCGCC

ACGCCGGCGT

TCGACAGCAT

120 180 240 300 360 420 480 540 600

AATTGTGGCT

TAAAGTTACA

GGGGCTGGGA

GATGGTTGTA

ATACACTGGG

TCATGAGTAT

AGGAAGTTTA

CCTTTGCTAT

ACAGAATTGC

TGCAAAATAT

TTTAGCACAT

ATGGTATGGA

GGGTGAGGCA

CAGTCAGGGT

CTTAAGCTCC

GAACCCCACC

GGCCAAATGT

TCTGATTGGC

CATTCCAGAG

TTTTGAAGGC

TGGATTTAGT

ATCCAGGTTT

TGTAGTTCAT

AAAAGGAGAG

GGCATTGCGA

GAAGCGAGGC

CTTCGAATGG

AGCGCGGGTC

ACCCCTGTAC

CGCCGGTTCG

ACAGTTACAG

CCACTCAGAT

TCTTTAGCCT

GCAAGTGAGC

CGTAGCATCG

GATGTTTGTG

ATGCCAAGAT

AAGCACATTA

AGAGACAACG

TATGGAGATA

GCTGCATGCG

ATGTTTGTTG

AGACCATACG

CAGGGTCTGG

GCTTTAGAAT

GTTAACTTTT

TATTCATGGG

CGAAAAAGTG

ACAGACAAGT

AAAGCTGAAT

TTTATTGGAA

CTCATGAGGG

TGGATGAGAT

GTTCCAGTTT

GAACCTTGTG

GGAACTGGGG

GAGCGTACAG

ACCGCGATGT

ATGACGAAAG

GCCTTCGTGG

TCCTTGAGCT

ATTGCGTTGT

AGAGTAGATG

TTTTGGGGAA

GGCAGCCTCT

TAGCGATTGT

AGGATTCTGA

TGTTTGTGAC

GTTCGTTACC

ACTTGAATGG

AGATTCCATG

TCGATTGGGT

ATTTTGGTGC

AGGCCCCACT

TGAACGATTG

GTGTTTACAG

AGCCTGCAAG

GGGTATTTCC

TGAAAGGAGC

AGGTCACAAC

TATTGAATGG

GTCTCCCTCA

TGCAGAAGGA

GACTGGATTA

AGGACGTGCA

CTACCGAGTC

CCCACAGAAT

GTCTTAATCA

GCCTCCGAGA

GGTGGGCGTT

CGACATTCAG

ACCATACGTG

ACCAACCCTA

CTGAAGACAT

CCTGCTACAG

ACGGCTGTGC

TAAGGAAGGG

CTGTCCGTGT

GAAGTTTGTT

GATCATGGAT

TGGTGAAGCT

AATTGCTCTT

GTCCTCTGAT

CTTTGGGGGA

GTTTGTCGAT

TTTTGGTGAT

AATCCTTGAA

GCATGCCAGC

AGATTCCCGC

TACATATCCT

AGAATGGGCA

AGTCGTGACA

TGCTGAAGGT

AATTGTAAAT

TCATTATTCT

GCTGGGTTTA

CCAGAAAGGC

GTTTGTCATG

GAGTTACAAG

AACTGCAGGT

GCTATATGCT

CACAGTCGAG

CTCACCGCTA

GGAGCACAAG

GGACCATGCC

CGTCATGTAG

GTTCCTCATC

TAGAGTCGCA

TGCTGCGGCG

ATGTGCTGCA

TACAGCTGAA

GCATTCTGTG

GCGAATGAGC

GCTCCTTATG

GCTGCTCGTG

AAAAACTATG

TCACATGAAG

CATCCGTCAT

AATCAGTTCA

TTGGGAGGAT

CTTGTGCCAG

AGCACCCTTG

GATCTGGGAT

AGGAGGCATG

GCAGATCGAA

GGACAGGGCC

GGAATTGACA

GTCGATGACC

CCTGTAAGGG

ATTGATCTCA

CTTGGATCTG

GATAAATTCC

TGCGATATAT

ATGCAATATG

ACCTTCAACC

ACCGTGGACA

CCGTCCTGGG

GCCGAGCAGT

ACGGGGACTG

CTTCCGCGGC

ATGCGCCTGC

GTGACAGCTT

GGATGGTTAA

ATCAGAAACC

TATGTTGTCT

AACCTCAAGC

CAAAGTCAGG

GTCACCGTGT

CAAAGGCATT

TGACCTTTTT

ATCATAGACC

GATACACACT

ATATTTATGG

TCCTTCTTGC

TTATACATAA

TGCCACCTGA

CCCTTGACAA

TTGTGACCGT

TCAATGAGCT

TTAATGATTG

TCTCTGGAAA

AGGATGTTCC

TTAAAATGGC

GGGATCCAAT

GTGGATGGGT

TGTTAATGCC

GTACAGTTCC

CTTTTGGTGC

AGATGTTGTG

AGGGGCTCAT

ACGAGCAGAT

GGGAGGTCGA

CCGGAAGGAT

TTGCTTGGTC

CGGGTGGATG

CAGCAAAGCA

AACTGGTGAC

TGTCCTTAGC

660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580
TGACAAATAT TAGACCTGTT GGAGAATTTT ATTTATCTTT GCTGCTGTTG TTTTTGTTTT 2640
GTTAAAAAAA AAAAAAAAAA AA 2662

INFORMATION FOR SEQ ID NO: 12:
SEQUENCE CHARACTERISTICS:
LENGTH: 768 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
(ix) FEATURE:
NAME/KEY: Protein
LOCATION: 1..768
OTHER INFORMATION: /product= "deduced amino acid sequence SBE II"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
Met Ala Thr Phe Ala Val Ser Gly Ala Thr Leu Gly Val Ala Arg Pro
1 5 10
Pro Ala Ala Ala Gln Pro Glu Glu Leu Gln Ile Pro Glu Asp Ile Glu
20 25
Glu Gln Thr Ala Glu Val Asn Met Thr Gly Gly Thr Ala Glu Lys Leu
40
Glu Ser Ser Glu Pro Thr Gln Gly Ile Val Glu Thr Ile Thr Asp Gly
55
Val Thr Lys Gly Val Lys Glu Leu Val Val Gly Glu Lys Pro Arg Val
70 75
Val Pro Lys Pro Gly Asp Gly Gln Lys Ile Tyr Glu Ile Asp Pro Thr
90
Leu Lys Asp Phe Arg Ser His Leu Asp Tyr Arg Tyr Ser Glu Tyr Arg
100 105 110
Arg Ile Arg Ala Ala Ile Asp Gln His Glu Gly Gly Leu Glu Ala Phe
115 120 125
Ser Arg Gly Tyr Glu Lys Leu Gly Phe Thr Arg Ser Ala Glu Gly Ile
130 135 140
Thr Tyr Arg Glu Trp Ala Pro Gly Ala His Ser Ala Ala Leu Val Gly
145 150 155 160
Asp Tyr Ala Gly Pro 225 Glu Arg Asn Leu Tyr 305 Arg Glu Asn Tyr Leu 385 Ala Gly Thr Val Asp 465 Ile Phe Gly Ile Val 210 Gly Lys Ile Ser Gly 290 Ala Phe Leu Asn Phe 370 Phe Arg Val Gly Val 450 Ala Pro Asn Val Pro 195 Lys Glu Tyr Tyr Tyr 275 Tyr Ser Gly Gly Thr 355 His Asn Trp Thr Asn 435 Tyr Val Val Asn Trp 180 His Asp Ile Val Glu 260 Ala Asn Phe Thr Leu 340 Leu Gly Tyr Trp Ser 420 Tyr Leu Ser Pro Trp 165 Glu Gly Ser Pro Phe 245 Ser Asn Ala Gly Pro 325 Leu Asp Gly Gly Leu 405 Met Gly Met Ile Asp
485 Asn Ile Ser Ile Phe 230 Gln His Phe Val Tyr 310 Glu Val Gly Pro Ser 390 Glu Met Glu Leu Gly 470 Gly Pro Phe Arg Ser 215 Asn His Ile Arg Gln 295 His Asp Leu Leu Arg 375 Trp Glu Tyr Tyr Val 455 Glu Gly Asn Leu Val 200 Ala Gly Pro Gly Asp 280 Ile Val Leu Met Asn 360 Gly Glu Tyr Thr Phe 440 Asn Asp Val Ala Asp 170 Pro Asn 185 Lys Ile Trp Ile Ile Tyr Gln Pro 250 Met Ser 265 Glu Val Met Ala Thr Asn Lys Ser 330 Asp Ile 345 Gly Phe His His Val Leu Lys Phe 410 His His 425 Gly Phe Asp Leu Val Ser Gly Phe 490 Thr Met Asn Ala Arg Met Lys Phe 220 Tyr Asp 235 Lys Arg Ser Pro Leu Pro Ile Gln 300 Phe Phe 315 Leu Ile Val His Asp Gly Trp Met 380 Arg Phe 395 Asp Gly Gly Leu Ala Thr Ile His 460 Gly Met 475 Asp Tyr Thr Asp Asp 205 Ser Pro Pro Glu Arg 285 Glu Ala Asp Ser Thr 365 Trp Leu Phe Gln Asp 445 Gly Pro Arg Arg Asp 175 Gly Ser 190 Thr Pro Val Gln Pro Glu Glu Ser 255 Pro Lys 270 Ile Lys His Ser Pro Ser Arg Ala 335 His Ser 350 Asp Thr Asp Ser Leu Ser Arg Phe 415 Met Thr 430 Val Asp Leu His Thr Phe Leu His 495 Asp Pro Ser Ala Glu 240 Leu Ile Arg Tyr Ser 320 His Ser His Arg Asn 400 Asp Phe Ala Pro Cys 480 Met Ala Val Ala Asp Lys Trp 500 Trp Glu Asp 545 Met Leu Tyr Phe Asn 625 Ala Gln Val Asp Tyr 705 Ser Asp Lys Lys 530 Lys Ala His Leu Pro 610 Asn Asp His Ser Leu 690 Arg Asp Tyr Met 515 Cys Thr Leu Lys Asn 595 Arg Asn Phe Leu Arg 675 Val Val Asp Phe Gly Val Ile Asp Met 580 Phe Gly Ser Leu Glu 660 Lys Phe Gly Ala Thr 740 Ile Tyr Phe 550 Pro Arg Gly Gln Asp 630 Tyr Lys Glu Phe Ser 710 Phe Glu Ile Val Ala 535 Trp Ser Leu Asn Thr 615 Lys His Tyr Glu Asn 695 Arg Gly His Glu His 520 Glu Leu Thr Val Glu 600 Leu Cys Gly Gly Asp 680 Phe Pro Gly Pro Thr 760 Leu 505 Thr Ser Met Pro Thr 585 Phe Pro Arg Met Phe 665 Lys His Gly Phe His 745 Ala Leu Leu His Asp Arg 570 Met Gly Thr Arg Gln 650 Met Val Trp Lys Ser 730 Asp Val Lys Thr Asp Lys 555 Ile Gly His Gly Arg 635 Glu Thr Ile Ser Tyr 715 Arg Asn Val Gln Asn Gln 540 Asp Asp Leu Pro Lys 620 Phe Phe Ser Ile
Asn 700 Lys Leu Arg Tyr Ser Arg 525 Ala Met Arg Gly Glu 605 Val Asp Asp Glu Phe 685 Ser Val Asp Pro Ala 765 Asp 510 Arg Leu Tyr Gly Gly 590 Trp Leu Leu Gln His 670 Glu Phe Ala His Arg 750 Leu Glu Trp Val Asp Ile 575 Glu Ile Pro Gly Ala 655 Gln Arg Phe Leu Asp 735 Ser Thr Ser Leu Gly Phe 560 Ala Gly Asp Gly Asp 640 Met Tyr Gly Asp Asp 720 Val Phe Glu Ser Val Tyr Thr Pro Ser Arg

INFORMATION FOR SEQ ID NO: 13:
SEQUENCE CHARACTERISTICS:
LENGTH: 10550 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
(ix) FEATURE: NAME/KEY: exon LOCATION: 1..316 OTHER INFORMATION: /product= "exon 1"
(ix) FEATURE: NAME/KEY: exon LOCATION: 1472..1828 OTHER INFORMATION: /product= "exon 2"
(ix) FEATURE: NAME/KEY: exon LOCATION: 2766..2823 OTHER INFORMATION: /product= "exon 3"
(ix) FEATURE: NAME/KEY: exon LOCATION: 2906..3028 OTHER INFORMATION: /product= "exon 4"
(ix) FEATURE: NAME/KEY: exon LOCATION: 4113..4194 OTHER INFORMATION: /product= "exon 5"
(ix) FEATURE: NAME/KEY: exon LOCATION: 4286..4459 OTHER INFORMATION: /product= "exon 6"
(ix) FEATURE: NAME/KEY: exon LOCATION: 4562..4643 OTHER INFORMATION: /product= "exon 7"
(ix) FEATURE: NAME/KEY: exon LOCATION: 4744..4855 OTHER INFORMATION: /product= "exon 8"
(ix) FEATURE: NAME/KEY: exon LOCATION: 4999..5021 OTHER INFORMATION: /product= "exon 9"
(ix) FEATURE: NAME/KEY: exon LOCATION: 5102..5192 OTHER INFORMATION: /product= "exon 10"
(ix) FEATURE: NAME/KEY: exon LOCATION: 8593..8718 OTHER INFORMATION: /product= "exon 11"
(ix) FEATURE: NAME/KEY: exon LOCATION: 8807..8915 OTHER INFORMATION: /product= "exon 12"
(ix) FEATURE: NAME/KEY: exon LOCATION: 8992..9104 OTHER INFORMATION: /product= "exon 13"
(ix) FEATURE: NAME/KEY: exon LOCATION: 9161..9199 OTHER INFORMATION: /product= "exon 14"
(ix) FEATURE: NAME/KEY: exon LOCATION: 9498..9713 OTHER INFORMATION: /product=
"exon (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: ATGGCGGCGA CGGGCGTCGG CGCCGGGTGC CTCGCCCCCA GCGTCCGCCT GCGCGCCGAT CCGGCGACGG CGGCCCGGGC GTCCGCTI'GC GTCGTCCGCG 100 CGCGGCTCCG GCGCTFGGCG CGGGGCCGCT ACGTCGCCGA GCTCAGCAGG 150 GAGGGCCCCG CGGCGCGCCC CGCGCAGCAG CAGCAACTGG CCCCGCCGCT 200 CGTGCCAGGC TFTCCTCGCGC CGCCGCCGCC CGCGCCCGCC CAGTCGCCGG 250 CCCCGACGCA GCCGCCCCTG CCGGACGCCG GCGTGGGGGA ACTCGCGCCC 300 GACCTCCTGC TCGAAGGTAA AAAACAAGGC TGAATCCTCA GATCACTCCG 350 CGTCTJTCGTT 1TACCAAATA CGGTACTGCG AAGTGGTGCT GTATATGTGA 400 AG1TCTGTC GATF7CY17CC TGACGGATGT TCAGTCGAYI' CAG17GTATA 450 TATGTGATAC G'ITCGTTGTT CATCGATCGT ACAGA'T'AC CAGCACACTA 500 GATAGAAATC GAGACCGAC.G CGGGCAGATC AATAGATITI TCTAGACG'IT 550 'ITATTGGATC GTGAGATGAT TGATTrGGGGT GGCGTGTCGA TACGATAGCG 600 GTGCACCGCC GATGTATCGG GGCATGTGCA CGTGGTTGGG TCTCAGCAGA 650 CATATCACTA GACTGGTATC GTAATITACT AGTACTACTG GAAAGAGGAC 700 TAAAAAGGCT AGGCCAAGTG CACGCATGT T GGGAACGTI7G TI'AAATTGAT 750 GAGTGTCC TITGCTTGGG CTGGTAY17AT TACCAAAAAA TGGTG'TAGT 800 WO 99/14314 WO 9914314PCT/AU98/00743 88 CCCTGTACr-r ATFAATGGGA AAATCTFAAC ATGACACTGG GGTFATGAG 850 TCTCCAA1TG TATAFTCTCA GCACTCAACT GATIIACTG ATACTGTAGT 900 GGAAATGACA CGTGAGCACC CCCC'T7CAAG GAATGCAATG CTTICTITCTG 950 TMATAYFA CAGGAACTAG AAGGAGCTT'C CACCTITTGAG TACAGAAGTA 1000 CTCCCTCCGT TCCAAAATAG ATGACTCAAC 'ITGTACTAA TIIGTACTA 1050 TAGTITAGTAC AAAGTTrGAGT CATCTA=II AGAACGGAGG GAGTAGTATC 1100 GAAATrGAAG ACCCTrGTAT TACTGTCTrG FTITCAATG AAAATGGGAG 1150 GCCCATGCAG TAAGTCACAT GGGCACCTGG GAGGCTGGGA TCATGTGTGC 1200 MrIGCAGAGT ACTAGACCCA GCTCACCCTC TGTTAGA1TA CT1TGTrGGGC 1250 TGCTACMTG TGMIIGCTGT GCAGTATATC AGACATCCTG AAMTGGCAT 1300 CTAGCTGAGA ACAGAATGCA GGTITGCACCA TrCT7ATTAT TGCTAAACTG 1350 'ITGTCACGCA A'IMATAAAG AATGTGATCT TCTGAGTATr AAITrAATCAT 1400 G'17CTGCTAA TATCTGTCCT CGCTCTGGTG YrGACAAATA TACCATATGA 1450 ATA=IICCA TITGCAACC AGGGATI'GCT GAGGATTCCA TCGACAGCAT 1500 AATCGTGGCT GCAAGTGAGC AGGATTrCTGA GATCATGGAT GCGAATGAGC 1550 AACCTCAAGC TAAAGTrACA CGTAGCATCG TGTITGTGAC TGGTGAAGCT 1600 
GCTCCTrATG CAAAGTCAGG GGGGCTGGGA GATGT'TTGTG GT-rCGT-TACC 1650 AA1TTGCTcT-r GCTGCTCGTG GTCACCGTGT GATGGTTGTA ATGCCAAGAT 1700 ACT-rGAATGG GTCCTCTGAT AAAAACTATG CAAAGGCATFT ATACACTGCG 1750 AAGCACATTA AGATTCCATG CMIIGGGGGA TCACATGAAG TGACCT1T 1800 TCATGAGTAT AGAGACAACG TCGA17GGGT GGGTACACAA TCACCTTrC~r 1850 ATITCTCTGT-r GAAYrGTAGC AACTGMIIAT CCTFTG'FIAC ACTFTCYIIA 1900 GCCCTGCAAA GACATATGTG AMIICCATAC TYT=GT-rA MICCCTrGT 1950 ACTCT-rCCTC ATGAAGGTCA AAATATCATA TATCCATGGA AGTCATGCAT 2000 GTGCCTAGTA ITrTTGGTGT CGGTGCC1TI7 AACTFICAGG GAT-TAATACG 2050 TGGAATrTGA TAACTAAAGT T-rA=IIATT GAAAAAAAT-r GTAGGTrGG 2100 TGAGCCCACA.GCCACGCAGT GGCACCACTG C'rrGCACATG ATM~GCATT 2150 TCTGITIGCA CCGAGCAcTTr CATGTGAATA AGGTGTAAAA TCATAAAGTA 2200 WO 99/14314 WO 9914314PCTIAU98/00743 89 CCAATI=AT TCTGCCAATr GCACITAAGA GTATATACAT TFATCVFGGC 2250 CTCAATCATG GGAGTACTGT GCATTCAGTG CACCATCArr7 GFTCTAAGGA 2300 GAAAATGTGG GTGCAAGGAA GACACTFTIG TCCC1TAATA AAAGGCAGGC 2350 ACTCTGflGT CATATAGATA GAAAGCAACA AACTTfATIC AAAGAGCTAA 2400 CAATGGCAAA AGAACCAAAA AAAGCATGCT AAGGCGGTGA CACCAAAAGG 2450 TGAGGGGGGC C'ITGTGACTG ACAGCACCCC AAACTATrGC CATFFGTFFIA 2500 CTAAATGAAG ATCATITAG AAGCTCTCAG GAACTTCGAA AACAGTGGCT 2550 'TTCCGTCCAC AGATCGTCTG TrAATA1Tr TGTCCAGTGA TACTIFFTI 2600 GCTCCTI'ACA AGAGTGCCTA TGTrTGACATA TACATfGTFA AGI7GTrCAT 2650 AAG1TT'ACT7 CTTATrCTAA ACAGCAAGTG CCTAATGCTF7 GCAMIIAM 2700 TGGCTAnTTA T1TFATrCT CATITCAATC AACACTITG TTrCAGGTGTr 2750 TGTCGATCAT'CCGTCATATC ATAGACCAGG AAGTFTrATAT GGAGATAAT-r 2800 TFIGGTGCTIT TGGTGATAAT CAGGTACACT ACACTATACT AAGCTCCTAG 2850 TTGACTAAGT CGTAAGTFrGT ACCTCCTCGC TGACCGGCTG CTCTATGTCG 2900 TGCAGTITCAG ATACACACTC CTI17GCTATG CTGCATGCGA GGCCCCACTA 2950 ATCCTTrGAAT TGGGAGGATA TArTATGGA CAGAATrGCA TGTITTGTrrGT 3000 GAACGATTGG CATGCCAGCC TTGTGCCAGT GTACGTFTGT-r TGTGGATCTG 3050 AAAGTCCAAT CCTITrAT7CA TTCTCTGCTr TGCAGTGTGC CCATGTCTAC 3100 AT1C=f~A TGCT-TITIC ATGTCTGTTC TITATATrGCA TATATGCTrA 3150 TGGAGTCTAA AAGTTACCGG AGGGAATAAC TCTTAAGGAT TfCCTCAATC 3200 AATTATCTIT 
AGC'FIAGTT AACATIACT GTGGCAAACA TAATGTGTI 3250 TGAGA1TIAC AAGT7CAGAG ATrGCACTrC ACTAGTCGT AGCTAATCTG 3300 ATG'TTICCC CGAGAAAATG CCTAAAGcTr TGTGTCT-rGA TGCATTrGATA 3350 GAAAAAGAGT TfATGTACAC TCCCAAA GAG GGGACCCAAA AT7ACAACAC 3400 CACACCCCTG AGAACTAGGC GCTGCCGGAA GAAGCGATGC AAGCCCCACT 3450 GCCCCTGCCT TAGCTCAAAG CCGGGCGTCA GCTGATTGT GTCAAGTAAG 3500 CTAGCAGTGC TAGATTGCGC AAGGTCGATT CGTCGAAGAT GACAGTGTTG 3550 CGCTGCTFcC AAATCCACCA AACTATGAGC ATGATCACTG GAGAAGTACC 3600 WO 99/14314 WO 99/ 4314PCT/AU98/00743 YTTCTCGCG GCTGAGGGGG TGGACTGGTG GTCTGCTGCT GCCAG1TIC 3650 AGATAATCTG AAAAATGCAT G I IIrIGATGA IT=AGTATC TTGCGGACCC 3700 TGGGTACCAC CTAAGCMIC ACACAGTAAT T-rGCAGTFAC ACCTATAAAA 3750 GTAACGGTCA TGATATGCAT GTG'TITGGG TAGATCATGG TGCATGCATT 3800 TFIAGGAATIA GGACATGCCA GAACCACGTG AGGCTFATGG GGCAATITCAT 3850 TFFGTrCCA-IT ATACGAGTCA TGAATATGGT TCAGCATGTT1 TGGACGCTAC 3900 TI'GT-rTGGGG CAATITICAGA TGGTGAATITG TAGCTGCTT'G ATGTFFGGCTA 3950 GCTGGC-FrAT T1IGTACAAG TATCGATGTF AGATGCATAT TTCCT17GT 4000 TCTTGTGCTG TI=GCCATGT TGTATrCCCC FTTCTGTCG CCAGTGTTGC 4050 ATGTrFAAAT-T GG=TICA'rr ACATAATCAA CT17GTI'GCT GACATCAGTC 4100 ATFIATI'C AGCCTTFrG CTGCAAAATA TAGACCATAC GGTGTITACA 4150 GAGATFCCCG CAGCACCC~F GTTATACATA AI7rAGCACA TCAGGT17GG 4200 GTCTATCACC T17CATTATC CGTACATGGC TIGTAAGTC GGTrCACACG 4250 TATCGTCATA CTGTATGITA T-17CAATGTC ATrAGGGTGT GGAGCCTGCA 4300 AGTACATATC CTGATCTGGG ATT'GCCACCT GAATGGTATG GAGCT1TAGA 4350 ATGGGTAMII CCAGAATGGG CAAGGAGGCA TGCCCTrGAC AAGGGTGAGG 4400 CAG'17AACT-r 1TIGAAAGGA GCAGFTGTGA CAGCAGATCG.AAT-rGTGACC 4450 GTCAGTCAGG TGAAATACTC AATACTTCTC T-IT1TT7F GCGGGATGTT 4500 CFTCAGTTCA ATTGCCCTGT CTFTCACCCA ATrAAGAAAT GATI1AATCT 4550 1IG'ICTA GGGTTCA TGGGAGGTCA CAACTGCTGA AGGTGGACAG 4600 GGCCTCAATG AGCTCTTAAG CTCCCGAAAA AGTGTATI'GA ATGGTAACTA 4650 TATITGAATC CACTTATCT-r CTrCTGAAAC ATAFFIACAG AAATAGATGG 4700 ATGGGTI'GCA AGAATAAAYI' CAG=~GCTC TITCGGTATG AAGGAATTGT 4750 AAATGGAATT GACATTAATG ATI'GGAACCC CACCACAGAC AAGTGTCTCC 4800 CTCATCATrA TTCTGTCGAT GACCTCTCTG 
GAAAGGTGTG TGGATAGTAC 4850 CCTATATAAT AACATGTATA TCTGATCTAG TACT7CTI TTCTIITGCTA 4900 G1TI-rGCTTCC CATGATGTrC TCACTAACTA ATCCTATGTG GMFIGGCATA 4950 cTTrGTCAGGC CAAATGTAAA GCTGAATFIGC AGAAGGAGCT GGGTTIACCT 5000 WO 99/14314 WO 99/ 4314PCT/AU98/00743 91- GTAAGGGAGG ATGT-rCCTCT GG1TAGATAC AAACCCCTAA GATATATAIT 5050 TF=AAATCC CTAAAAAAAA CYFGCCGATC ATCTCATTAG CTTGATrTCAC 5100 AGATTGGCT-T TATFGGAAGA CTGGATrfACC AGAAAGGCAT TGATCTCATJT 5150 AAAATGGCCA TFICCAGAGCT CATGAGGGAG GACGTGCAGT T-rGTAAGTTC 5200 ATATTTCITI TCTrGAGACT AGAGTATAAA TCAAACATGT AGGTGTGGGG 5250 TGGTATAATA CAGACATAAG TTrCCAGCTAT TGCTTfCCATG AGAA=IIAA 5300 TGCTAFFCAG TAATATGCTA CTGCAAG'lM TGAAACAAAG TTGGAAGCAA 5350 TAAATATATG TGTAGCACTG ACCATGCAGT GCCACTATAG CTGGAATGTC 5400 CTGTAGTCTA TGTGATCTAA CACACTCAAC AACATGIT1 CGCATACAAA 5450 CACATGCGTG CGCGCAACAA ACATACTCTA CAATAAAATT GGCT-rGGTGA 5500 ACTGCAGACA TGCTCTFATC TCCAYFCCAA CAFIII T7GT TTCAACATrG 5550 GCTGAAGACT AAGAGAAGGG GGACCCAGGG TGATGTAGCC AACTAGATCC 5600 AGTAAGGAAG CTAGCCGAGC CTAGGAGGAT TCGCTTAGGT AGCTGGAACG 5650 TAGGGTCTCT GACAGGGAAG CTITCGGGAGC TAGTCGATGC AGTGGTGAGG 5700 AGAGGTGTTG ATATCCTTIG CGTCCAAGAA ACCAAATGTA GGGGACAGAA 5750 GGCGAAGGAG GTGGAGGATA CCGGCITrCAA GCTGTGGTAC ATGGGACGGC 5800 TGCAAACAGA AATGGCGTAG GCATCTITGAT CAACAAGAGC C'rrAAGTATG 5850 GAGTGGTAGA CGTCAAGAGA CGTGGGGACC GGATTATCCT CGTCAAGCTG 5900 GTAG1TGGGG ACTTAGTTCT CAATG'ITATC AGCGTGTATG CCCCGCAAGT 5950 AGGCCACAAT GAGAACGCCA AGAGGGAGTT CTGGGAAGGC CTGGAAGACA 6000 TGG~rAGGAG TGTACCGA'Tr GGCGAGAAGC TCTTCATAGG AGGAGACCTC 6050 AATGGCCACG TGGGTACATC TAACATAGGT T17GAAGGGG CACATGGGGG 6100 CTITGGCTAT GGCATCAAGA ATCAAGAAGA AGATGTCTrA CGCTFGCTC 6150 TAGCCTACGA CATGATTGTA GCTAACACCC TCTIMAGAAA GAGAGAATCA 6200 CATCTGGTGA C=flAGTAG TGGCCAACAC TAGCCAGATC GAMIICATCC 6250 TCTCGAGAAG AGAAGATAGG TGTGCGCGCC TAGACTGCAA GGTGATACCT 6300 TCGGATFrCGT GTCCAGCGGG ATAAGCGTGC CAAAGTCGCT AGAATGAAGT 6350 GGTGGAAGCT CAAGGGGGAG GTAGCTCAGG CGTT'CAAGGA GAGGGTCATFT 6400 WO 99/14314 WO 99/ 4314PCT/AU98/00743 92 AGGGAGGGCC 
C17GGGAGGA AGGAGGGGAT GCGGACAATG TGTGGATGAA 6450 GATGGCGACT TGCATTCGTA AGGTGGCCTC GGAGGAGTGT GGAGTGTCCA 6500 GGGGATGGAG AAGCGAAGAT AAGGATACCT GGTGGTGGAA TGATGATGTC 7000 CAGAAGGCAA TrAAAGAGAA GAAAGATTGC TTTAGACGCC TATACTFGGA 7050 TAGGAGTGCA GTCAACATAG AAAAGTACAA GATGGCGAAG AAGGCCGCAA 7100 AGCGAGCTGT CAGTGAAGCA AGGGGTCGGG CATATGAGGA TCTCTACCAA 7150 CGG1TAGGCA CGAAGGAAGG CGAAAGGGAC ATCTATAAGA TGGCCAAGAT 7200 CCGAGAGAGA GGAAGACGAG GGATATTGGC CAAGTCAAAT GCATCAAGGA 7250 TGGAGCAGAC CAACTCTFGG TGAAGGACGA GGAGATITAAG CATAGATGGC 7300 GGGAGTACTFF CGACAAGCTG TTCAATGGGG AGGATGAGAG TCCTACCA1T 7350 GAACTTGACG ACTCC1TrGA TGAGACCATC ATGCG=IJA TGCGGCGAAT 7400 CCAGGAGTCC GAGGTCAAGG AGGCT17AAA AAGGAGGCAA GGCGATGGGC 7450 CCTGATrGTA TCCCCATITGA GGTGTGGAAA GGCCTCGGGG ACATAGCGAT 7500 AGTATGGCTA ACCAAGCTAT TCAACCTCAT '1ICGGGCA AACAAGATGC 7550 CAGAAGAATG GAGACGAAGT ATATI7AGTAC CAATCATCAA ACAGGGGGGA 7600 TGTTCAGAGT TGTACTAATT ACCATGGAAT TAAGCTGATG AGCCATACAA 7650 TGAAGCTATG GGAGAGAATC ATFrGAGCACC GCITAAGAAG AATGACAAGC 7700 GTGACCAAAA ATCAGMIGG TICATGCCT GGGAGGTCGA CCATGGAAAC 7750 CATITITYG GTACGACAAC TrATGGAGAG ATACAGGGAG CAAAAGAAGG 7800 ACYI'GCATAT GGTGIrCATr GAC1TGAAGA AGGCCTATAA TAAGATACCG 7850 CGGAATGTCA TGTGGTGGGC CTITGGAGAAA CACAAAGTCC CAGCAAAGTA 7900 CATFrACCCTC ATCAAGGACA TGTACGATAA TGTrGTGACA AGTGTrCGAA 7950 CAAGTGATGT CGACACTAAT GACTrCCCGA 'ITAAGATAGG ACTGCATCAG 8000 GGGTCAGC'Tr TGAGCCCITA TCITITGCC TFGGTGATGG ATGAGGTCAC 8050 AAGGGATATA CAAGGAGATA TCCCATGGTG TATGCTCYI GTGGATGATT 8100 TGGTGCTAGT TGACGATAGT CGGGCGGGGG TAAATAACAA GTTAGAGT7IA 8150 TGGAGACAAA CCITGGAATC GAAAGGGMr AGGCTTAGTA GAACTAAAAC 8200 CGAGTACATG ATGTGCGGTT TCAGTACTAC TAGGTGTGAG GAGGAGGAGG 8250 WO 99/14314 WO 99/ 4314PCT/AU98/00743 93- 7TAGCCTFGA TGGGCAGGTG GTACCCCAGA AGGACACCT-r TCGATAMfG 8300 GGGTCAATGC TGCAGGAGGA TGGGGGTATr GATGAAGATG TGAACCATCG 8350 AATCAAAGCT GGATGGATGA AGTGGCGCCA AGCTTCTGGC ATTCThFGTG 8400 ACAAGAGAGT GCCACAAAAG CTAAGGCAAG TFCTACAGGA CGGCGGTTCG 8450 ACCCGCAATG TrGTATGGCG CTGAGTGTTG GCCGACTAAA 
AGGCGACATG 8500 TrCAACAGTr AGGTGTGGCG GAGATGCGTA TG17GAGATG GATGTGTGGC 8550 CACACGAGGA AGRATCGAGT CCGGAATGAT GATATACGAG ATAGAGTITGG 8600 GGTAGCACCA ATI'GAAGAGA AGCTFFGTCCA ACATCGTCTG AGATGGTTTG 8650 GGCATATFCA GCGCACGCCT CCGAAAACTC CAGTGCATAA CGGACGGCTA 8700 AAGCGTGCGG AGAATGTCAA GAGAGGGCGG GGTAGACCGA ATTGACATG 8750 GGAGGAGTCC G1TAAGAGAG ACCTGAAGGT TrGGAGTA'17 ACGAAAGAAC 8800 TAGCTATGGA CARGGGTGCG TGGAAGC'rTG TTATCCATGT GCCAGAGCCA 8850 TGAGTTGATC ACGAGATCT-r ATGGGTFF7CA CCTCTAGCCT ACCCCAACTF 8900 GTI1rGGGACT AAAGGC'TTrG T7GTTGTrGT TGTTGTTGTT GTTrGTAGCCA 8950 ACTAAATCCA GTITGATCAGT GGTFTTIACT CTrATITTA CAGGTCATGC 9000 TI7GGATCTGG GGATCCAAT7r TIGAAGGCT GGATGAGATC TACCGAGTCG 9050 AGTrACAAGG ATAAA-FrCCG TGGATGGG'Fr GGAYIAGTG 'TrCCAGTTITC 9100 CCACAGAATA ACTGCAGGGT ATGCCGAGAA CTFTCITAACA AGACCTFI'CGT 9150 TATCAGCTTG GATATAT7AT AATGT7CAAA ACATITATGT CTCTC I II 1 9200 GTGCAGTTGC GATATATTGT TAATGCCATC CAGG'FIGAA CCTTGTGGTC 9250 TITAATCAGCT ATATGCTATG CAATATGGTA CAGITCCTGT AGT-rCATGGA 9300 ACTGGGGGCC TCCGAGTAAG ACAACTGCCT TGAAAATrAT CGT-FATC1rrG 9350 GCTCCAACGC AAATGTflCTA AT-rGGCTCGT GTAITCAACA GGACACAGTC 9400 GAGACCTICA ACCC1TITGG TGCAAAAGGA GAGGAGGGTA CAGGGTACGC 9450 ACTGCTCAAT T1AGCTAAC MFCAGT-ITA TC1TFIGCA ATGTCTTTGGG 9500 GGTrCAT7GC GCCATAAATC AACTI'GTGAT AATrAACTGT TACTGTI'CTG 9550 TACTrGCAGG TGGGCGTITCT CACCGCTAAC CGTGGACAAG ATG1TGTGGG 9600 TAAG1TI=G CTGAGCTCTT GTCCGG'17AT AGGATCGACC TITGGCTGTAG 9650 WO 99/1 4314 PTA9/04 PCT/AU98/00743 94- CATGGTACCT TAGTGCCCCT TGTATATAGA CCTAACCTGA TGGACTCACT 9700 YIGTCTACAC TAATCATAGT AGTCGATTGC CCGGAGGCGT TITGCTTGGA 9750 17TCTGCTAAT TI'AATITFCA TGACGATAAC TCATACCATG UITGGTTCT 9800 CCGATGGGGG CCAGAATGGC GTCTAGTGTC TGCGATCTGT GTAACTAGCC 9850 AATGCCGGGT TG'TTCCAAGT GAAAATITFAC CTTGACCA TTGTGCAGGC 9900 ATTGCGAACC GCGATGTCGA CATFTCAGGGA GCACAAGCCG TCCTGGGAGG 9950 GGCTCATGAA GCGAGGCATG ACGAAAGACC ATACGTGGGA CCATGCCGCC 10000 GAGCAGTACG AGCAGATC1T CGAATGGGCC TTCGTGGACC AACCCTACGT 10050 CATGTAGACG GGGACTGGGG AGGTCGAAGC GCGGGTCTCC 
TTGAGCTCTG 10100 AAGACATGrI' CCTCATCCT-r CCGCGGCCCG GAAGGATACC CCTGTACATT 10150 GCGTFIGTCCT GCTACAGTAG AGTCGCAATG CGCCTGCTTG CTFTGGTCCGC 10200 CGGTrFCGAGA GTAGATGACG GCTGTGCTGC TGCGGCGGTG ACAGCTTCGG 10250 GTGGATGACA GTFACAGT'Ir TGGGGAATAA GGAAGGGATG TGCTGCAGGA 10300 TGGTITAACAG CAAAGCACCA CTCAGATGGC AGCCTCTCTG TCCGTGTIAC 10350 AGCTGAAATC AGAAACCAAC TGGTGACTCT TrAGccTITAG CGATTGTGAA 10400 GTITGTTGCA 'TTCTGTGTAT GTTGTCTrGT CCTITAGCTGA CAAATATrITG 10450 ACCTGTrFGGA TAATTCTATC TTTGCTGCTG, T7nTrCTri GGTCAAAAGA 10500 GGGGT-rCCCT CCGA'1CAT TAACGAAACC ACCAAAATAA CAGCACCCAG 10550 TGCAGGTCTC AGGTITCAGAT ATACTTAAGA CTACTAAATC TAACAGCAGC 10600 TAAAAAGCT-' AAAGATFCAG GCGACATAAC CGAACAAAAT CCACAACCGA 10650 AGGGACCAAA GCAGGACAAG TAAAAAGGCA GNCGACACAA AGCGCAGGTC 10700 GCTGAAAAGG CAAGCAGACA GAGGTCTGCA 'ITCTGTCAAC ACCACTITGTG 10750 AAAAATGAAG AGAAGATCGA GAA7TCCCGG GAATCCG 10787 INFORMATION FOR SEQ ID NO: 14: SEQUENCE CHARACTERISTICS: LENGTH: 647 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (iii) HYPOTHETICAL:

NO

WO 99/14314 PCT/AU98/00743 95 (vi) ORIGINAL SOURCE: ORGANISM: triticum tauschii TISSUE TYPE: Endosperm (ix) FEATURE: NAME/KEY: Protein LOCATION:1..647 OTHER INFORMATION:/product= "deduced amino acid sequence for SSS I" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: Met Ala Ala Thr Gly Val Gly Ala Gly Cys Leu Ala Pro Ser Val Arg 1 5 10 Leu Arg Ala Asp Pro Ala Thr Ala Ala Arg Ala Ser Ala Cys Val Val 25 Arg Ala Arg Leu Arg Arg Leu Ala Arg Gly Arg Tyr Val Ala Glu Leu 35 40 Ser Arg Glu Gly Pro Ala Ala Arg Pro Ala Gln Gln Gln Gln Leu Ala 55 Pro Pro Leu Val Pro Gly Phe Leu Ala Pro Pro Pro Pro Ala, Pro Ala 70 75 Gln Ser Pro Ala Pro Thr Gln Pro Pro Leu Pro Asp Ala Gly Val Gly 90 Glu Leu Ala Pro Asp Leu Leu Leu Glu Gly Ile Ala Glu Asp Ser Ile 100 105 110 Asp Ser Ile Ile Val Ala Ala Ser Glu Gln Asp Ser Glu Ile Met Asp 115 120 125 Ala Asn Glu Gln Pro Gln Ala Lys Val Thr Arg Ser Ile Val Phe Val 130 135 140 Thr Gly Glu Ala Ala Pro Tyr Ala Lys Ser Gly Gly Leu Gly Asp Val 145 150 155 160 Cys Gly Ser Leu Pro Ile Ala Leu Ala Ala Arg Gly His Arg Val Met 165 170 175 Val Val Met Pro Arg Tyr Leu Asn Gly Ser Ser Asp Lys Asn Tyr Ala 180 185 190 Lys Ala Leu Tyr Thr Gly Lys His Ile Lys Ile Pro Cys Phe Gly Gly 195 200 205 Ser His Glu Val Thr Phe Phe His Glu Tyr Arg Asp Asn Val Asp Trp 210 215 220 Val Phe Val Asp His Pro Ser Tyr His Arg Pro Gly Ser Leu Tyr Gly 225 230 235 240 Asp Asn Phe Gly Ala Phe Gly Asp Asn Gin Phe Arg Tyr Thr Leu Leu 0 245 250 255 Cys Tyr Ala Ala Cys Glu Ala Pro Leu Ile Leu Glu Leu Gly Gly Tyr 260 265 270 WO 99/14314 WO 9914314PCT/AU98/00743 96 Ile Tyr Leu Val 290 Arg Asp 305 Leu Giu Tyr Gly Leu Asp Ala Asp 370 Thr Ala 385 Ser Val Pro Thr Ser Gly Pro Val 450 Tyr Gin 465 Arg Giu Giu Gly Gly Trp Cys Asp 530 Gin Leu 545 Gly Gly Gly Giu Met Leu Gly Gin 275 Pro Val Ser Arg Pro Ala Ala Leu 340 Lys Gly 355 Arg Ile Glu Gly Leu Asn Thr Asp 420 Lys Ala 435 Arg Giu Lys Gly Asp Vai Trp Met 500 Vai1 Gly 515 Ile Leu Tyr Aia Leu Arg Giu Gly 580 Trp Ala 595 Asn Leu Ser Ser 325 Glu Glu Val1 
Gly Gly 405 Lys Lys Asp Ile Gin 485 Arg Phe Leu Met Asp 565 Thr Leu Cys Leu Thr 310 Thr Trp Ala Thr Gin 390 Ile Cys Cys Val1 Asp 470 Phe Ser Ser Met Gin 550 Thr Gly Arg Met Ala 295 Leu Tyr Val Val1 Vai 375 Gly Val Leu Lys Pro 455 Leu Val1 Thr Vai Pro 535 Tyr Val Trp Thr Phe 280 Ala Val Pro Phe Asn 360 Ser Leu Asn Pro Ala 440 Leu Ile Met Giu Pro 520 Ser Gly Giu Ala Ala 600 Val1 Lys Ile Asp Pro 345 Phe Gin Asn Giy His 425 Giu Ile Lys Leu Ser 505 Val1 Arg Thr Thr Phe 585 Met Val1 Tyr His Leu 330 Giu Leu G ly Giu Ile 410 His Leu Gly Met Gly 490 Ser Ser Phe Val1 Phe 570 Ser Ser Asn Arg Asn 315 Gly Trp Lys Tyr Leu 395 Asp Tyr Gin Phe Al a 475 Ser Tyr His Giu Pro 555 As n Pro Thr Asp Pro 300 Leu Leu Ala Gly Ser 380 Leu Ile Ser Lys Ile 460 Ile G ly Lys Arg Pro 540 Val1 Pro Leu Phe Trp 285 Tyr Ala Pro Arg Ala 365 Trp Ser Asn Val Giu 445 Giy Pro Asp Asp Ile 525 Cys Val1 Phe Thr Arg 605 His Gly His Pro Arg 350 Val1 Giu Ser Asp Asp 430 Leu Arg Giu Pro Lys 510 Thr Gly His Gly Val1 590 Giu Ala Val1 Gin Giu 335 His Val1 Vai Arg Trp 415 Asp Gly Leu Leu Ile 495 Phe Ala Leu Giy Ala 575 Asp His Ser Tyr Gly 320 Trp Aia Thr Thr Lys 400 Asn Leu Leu Asp Met 480 Phe Arg Giy Asn Thr 560 Lys Lys Lys WO 99/14314 PTA9/04 PCT/AU98/00743 97 Pro Ser Trp Giu Gly Leu Met Lys Arg Gly Met Thr Lys Asp His Thr 610 615 620 Trp Asp His Ala Ala Giu Gin Tyr Giu Gin Ile Phe Glu Trp Ala Phe 625 630 635 640 Val Asp Gin Pro Tyr Val Met 645 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 5072 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL:

NO

(vi) ORIGINAL SOURCE: ORGANISM: Triticum tauschii TISSUE TYPE: Endosperm (ix) FEATURE: NAME/KEY: promoter LOCATION: 1..4993 OTHER INFORMATION:/function= "region containing promoter of SSS I" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:

TCTAGATGCA

TTCGGTCTAC

AACTATGCTC

GATCTTGAGA

ATCTTCCAAT

TTATTATAAA

GTGACCGTGA

GTGTACGTGC

TGAGAAAAAT

ACGTAGTATT

TCCGGCAGGT

GCCGCCCCAG

GAGAAGGCTT

GGGTCATGCC

TGCTGGATAG

TTGTCGCGGA

AATTCTATCA

GAAGTCACTA

ACTTAGTTAT

AAATACCAAA

AGGGATTGAC

GTGTGACTCG

ACTTACGCTA

CAAGAGGTAG

TGATGCGGGC

CGGCGTCTTG

GTCGATGAAC

CTTCGAGAAA

CGGTCGATGT

CGTGATGCCT

ATTGCTCGAC

GTGAAACCTA

TTCCATTGCC

AATATTATCT

AACCCCTTTA

CACGTCTCCT

CTTTACTGCA

CACGCTACCA

CGGGGAAGAA

AACCTGTCCA

TCCAGCTGTT

CCCACCTTGG

GTGGAGTAAT

ATATACATGA

AGTAATTCGT

TGCCCCCCAG

GTTTATTTTA

TATCATATCT

TCGTGTTGGT

ACTGGATTGA

TAACCCTTTC

TCCTCTCCAA

CTCCAGCTGC

CGTAGCGCTC

GTGCCAGCCT

CCACCCTTGT

AGTAGTAGAT

TCATACCTAG

TTACCCACCG

GTCTATTTTG

CTTTGTATCT

ATCAGATCTC

TGCGAGGTTC

TACCTTGGGT

CTCTTTAAAA

CAGGAGCGCG

CTTGGCCAGC

CCTGACACGC

AGCTTGCGCC

GCTTGAGCGG

GCAGAATCGT

ATATTCTCAT

TAATACTTAT

CATCATATTA

TTATTTCTTT

ATTCTCGTAA

TTGTTTGTTT

TTTCAAAAAC

AAAAAAACCA

GAGATCTTTG

TTGGTCGTGA

GGCGTGAACT

TTCTTCTGCT

CGCGCCACCT

CAGCAGGCGG CGGCGTGGGG ATGAAGAGGG TGTCTGCTTC CGGAGCAGGC GGGTCGGCGT

TGAACTTGAA

GGAAAGTGGT

CGATCTCTGG

GGAGCGGCAG

GCAGGCAA

CTCAACCTCG

AACCGCCTCG

GTCGAGCTCG

CAGCAGGAAG

ATCAGCAGCG

GTCGACGGCA

CGAGCGGCTG

ATGGCGACGC

CCCGTCCTCG

GCCAGGGCGG

TTCGTCATCG

CGGCTCAGCC

TTGGGGGTCG

TGTGGTGTCC

GCAGCCCTCC

CCCCCTTTTG

CGTGTTGGAG

CGCATCCTGC

GCAGTAGTAC

GAGCGCGACG

GTGGACGAGC

GGCGACGCGG

GAGTCATCCG

TCCTGCTGA).

GGAAAAAGAC

AAACAGTGAC

TGTCGTTTCT

CATACAAAAT

AGGCGGTGGC

GTTGGCGTCC

CTCCGGCTGG

CTCTGGCTGA

GCCACCCGGA

GTGAGGTTCG

TCCGCCCCGA

AGCAGCAGAG

GGGGACTGGT

TTTGCACCAG

GCAAAACGTG

TAGGAGCGCT

CGGCCACCAC

CGGGCAGCTC

CTGCGGCGAC

TGGCGCCTCG

CGCCCTTCCT

TTGGCGCGCG

AGGTGGATGA

GGCAGCGTCT

GTGGGGATGT

AGGGAGGTCG

CCCGCCTCCT

CGCCAGACAC

CCCCAGCAGG

CCAGAGATGG

CCGGAGTGAA

AGAGAGGTGT

TAACCACACA

GGCGAAAAAT

ACGTCGTTTT

TTCTTTTCTC

CAAATGAATG

CCCATGATGG

ACCTCCAGTG

AAGGAGGCTC

GCAGACCCCG

TGGGGGCGCG

CCCCAGACCA

AACTGTCCAG

GGTCCGTGCG

CCATCGCCCC

GGGGAGCAGC

GCTGGAGCAA

CGGTGCCCTC

TGGACGTGCC

CACCTGAGCG

GGCGACGGCC

GACAAGGATG

CGACGTGGCG

GCATCTCGGG

GCAGAGAGAA

GGCGGCCCCT

CGTCCGGACT

TCTGCCGCTC

TGTTCCAGGA

GGCGGTGGCC

AGACGACCCC

CCAGGCGCAT

CCGCGGCGTG

ATCAGTGGCT

TGTGTACTGT

TCACGGACAC

GCGTTGTCGG

CAAATCGACA

CATTCAAGGG

ATGGGGGGAG

CCTGCAGTTT

GACGCTCCGG

CGCCCATGTA

AGGTGGACTG

GGGCGGCAGG

GACAGACGGC

GGTGATGTCT

TGGCCAAGCC

CACACCTTGG

GTTGCCGTCG

AGACTCGGAC

ATGGCGCTGG

GCACCCGAGG

GCGGTCGCGG

CTCGCTGTCA

AGCCCTGCGG

GTCGCGGTCA

ATCCGGCCCC

GGGGTCCAGG

CCATGCCCAC

CAACCAGTCG

CTGCACCGGC

GTGTGCCGAT

AGCGTCGAAA

TGACGCGGGG

GTGGCCGACG

CTGCACAATA

CGTTAAATAA

ACGACTAGTA

CCGGTGTTGT

AACCGTTTCT

CCGGTAATCC

CATGCCAAAG

GGAAGCCAGA

TGTGCCAGAA

CTCTGCATTG

CGCACCGGAG

CTCGGGTCCA

GGACGACGGA

TGCCAAATGG

ACTGGTACGC

AGGACAGGGA

CGTGCCGGCC

AGTGCGCCAG

TCCTGACGGC

AGCACACCCC

TCTGCACCAT

CCGACGCGAG

ATATGCTCCT

GCTATCGGGG

TCTAGCCCCT

GGTCGATCGA

ACCCAGGCAA

ACGTGGCATG

ATGTTCTCGA

GGTGACCAGG

GCGATGTCCC

AAGGGGAAGG

GGGCTGGAGA

CCCAGTGTCG

ATCATTGGTC

GTACCCAATA

CGAGTCATTG

CTTTGGTTA

AATTCTGAGC

ACTTGGTTGA

CGATTGGCGT

CGCAAAGGGA

GGCCAAGGCT

GAAGGCCAAG

CAAAGGGCCA

AGGCCGTGTC

ACTCCACCTC

CAAAGATGGC

GGGTGCGGAC

TCGGCGAGCG

TGGGAGAGCC

GCCTGGATGG

GCCAAGCTGG

CATCTTCATC

GGACGTGAGC

CGAGCGGCCA

TGTAGTCCTT

CGTCCCGGGG

TGATGGAGAA

AGAGGCAGGC

TCTTCCCGAG

CGGCGATGCG

CCGACAGGGA

GGTGCCTGAA

AGTTAGGATG

GGCAGAGGCG

CCACATCATA

ACGCGAACCC

TACTCGGCAA

TACTATGTTT

AAAACAGAAA

CCAGGCTCAG

960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 CTACACCCGC CCTTACAAAA AAATCAAAAT AAATACTAGA AAAATTCAAA AAATTCCAAT 2940

TTGTTTGTGC

GACATCTGAG

TCATGCACTG

AGCTAAAATT

ATTTTTTGAA

GGCACCGAAA

TTTCCCTCTG

ATTAGAGGAT

GGTCACGATT

GACAACCCAC

CCGCGGACAC

TGAGGGGTGA

AAGTCTGATT

TCTCCCCTCC

GGCGGCTGAT

GTACTTTTTT

TTTGTCTTCC

CCGGCATAGA

CAGATTTGGT

CAGAGCGATG

AAGCCGGCTC

TAGTGGTCGT

GTACTTCCGA

TTCAACTGAA

CTTTTTTTTG

AAAAAACCCA

CATTTTCGTG

ACGCTGCTAC

CCTTTCCAGG

CAGCCAGCCA

GCTCTTCCGT

CCACTCTTCT

TCGGCCACCG

GTGGTAGATA

CAGCTCTCAG

TTCTGACCCG

TTGAGCGGAG

TTTTTTTTAT

CGCCGCACTC

TTTTTTGAGC

CTACAGCCGG

TTTTGACCCA

ATATCCAGCC

CACACATCTT

CCGGTCCCTG

AGAGATGCTC

AAACTTTGTC

GGCGGGGAGG

AGTCCTCGCA

GGGCTCTGAT

TTCCTATCAT

GCCAGATGCT

GTCCTTAAGG

CGATAGGGGA

TCGGTGCTCT

TCGATGAACT

AACAAAAGAG

CGAGAAGTCG

CGCAACTGTC

TGCGAGGCAT

ATGTCGGCAC

CTCACCACGG

GATATGGCAA

CCGTCCGTCC

CTCCCCGCGC

GCAAACCCCC

ATTTGATGCG

CAAAAAAGAC

ATTTGTCTTT

CTTACGTGAT

TTTTTGTGAT

AGGCTCATCC

AAGGGGCACC

GCGTCTCAAA

GACGGGCCCC

CAAATATGGG

CAGTTTCTAA

TCCGTGGATG

TTAGGTGTTC

GGCGAGCCTG

AGAATCCCGG

GGTGCGGCGT

CCTCCTCGAG

CGTCTTGGTG

TCATATCTAT

GCACGTGCAC

GCAGCGACAG

CGGAACCTCG

CTGATAATAG

TTTCACTAGT

AGTTTCACTA

TGGATCCATC

CCTCTCATTT

TCCACGCAAA

AAAAAAATAC

CGGAGGCACG

CTCCGCCCGT

ACACCGAGTC

CGATCCGCTT

TGAGGTACGC

AAATTCGGGG

TTTGCTGAGA

AAAATGTCTA

TTGTTTCCTG

TTTTCTATAA

ACCCACCAAA

CCAGCCTCAT

TCAAACGGTC

GTGGATATGG

TTTGAGATAT

CGCCCGGACG

CACCCCCATC

TGGATTCTTC

TGTCTTCGCT

TCGGACGTAT

TTCGTCCATC

AGGTGAGGTT

TCAAGGGTTC

GAAGACTTCA

CGGCGCGTCA

ATGTAATTTT

ATATCTCTTC

TCTTCTTTTA

AGTACTAAAC

TTCGTTTTTT

TGCACGGCCC

CAAAAAGAAG

CACGCGCCGC

GGCCGCACAC

CCGCGCCACT

GGCACCGGCT

TTGCAGGCAG

TTCAATTTTC

TCTGTAAAAA

GCTTCTCAGA

TCATGCAAAA

GACGGGTGCA

AAGAAAAGAA

GAGTTTTCAA

ACGCTTGAGC

CTTAAACGCC

GGGCGCCCGG

CCGGATGTGG

TTTGAGGGGT

CCTTGATGGC

TCTCCTCTGC

TGGTTAGTTG

GGTCGTGCTT

TGGACGTACT

ATGGTTTCTT

AGCGGCAACA

CGGCTGTTAT

ACCGCTCGTT

TATGATTTTA

TCTCGCAAAA

GAAACAGAGT

CCACGCAATT

CCCCGAGAAT

AGCTCTCTTC

CCCAACCGAA

TCACGAGCAA

AGCCACTGAA

CCACTCGCCT

CATCACCCAT

CGCACTAAAA

AAATTATTTG

TGTTTACTGT

AGTCCAAATG

AAGGATTGGA

GATAAGCCTG

ATACATACAA

CTCACATGGT

GGGTCGCCTT

CAGGCTGACC

GCACGCCAGC

AATGCGTTTT

TGGATTTGCC

TAGGGCAAAC

CCGCTGCTCC

TTTAAGTTAC

CTTTTTTGAG

CGACGGAGCT

GTCATGTGGG

ACTGCGGCTC

CGACAAGGTC

CTGGCGGCAG

GAGATGCTTT

AAAGAGAGTT

TTCACTAGCA

ATTCTCAAAA

CGTCTGGATC

TCGCCGGCGT

AACGCACGCG

ACCGTGACAA

AACCGCAGCT

TGCCCCACTC

CACCTCGGCC

CCCCGGGGAG

3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 CGCGCCCCGC GGCAGCAGCA GCACCGCAGT GGGAGAGAGA GGCTTCGCCC CGGCCCGCAC 4980 CGAGCGGGGC GATCCACCGT CCGTGCGTCC GCACCTCCTC CGCCTCCTCC CCTGTCCCGC 5040 GCGCCCACAC CCATGGCGGC GACGGGCGTC GG 5072 INFORMATION FOR SEQ ID NO: 16: SEQUENCE CHARACTERISTICS: LENGTH: 1706 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (vi) ORIGINAL SOURCE: ORGANISM: Triticum tauschii TISSUE TYPE: Endosperm (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..1706 OTHER INFORMATION:/product= "partial cDNA for hexaploid wheat DBE" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GCT

Ala 1 GTG TCG AAG CTT GAC TAT TTG AAG Val Ser Lys Leu Asp Tyr Leu Lys 5

GAG

Glu 10 CTT GGA GTT AAT Leu Gly Val Asn TGT ATT Cys Ile GAA TTA ATG Glu Leu Met TCT TCC AAG Ser Ser Lys 35 TGC CAT GAG TTC Cys His Glu Phe GAG CTG GAG TAC Glu Leu Glu Tyr TCA ACC TCT Ser Thr Ser TTC TTT TCA Phe Phe Ser ATG AAC TTT TGG Met Asn Phe Trp

GGA

Gly 40 TAT TCT ACC ATA Tyr Ser Thr Ile

AAC

Asn CCA ATG Pro Met ACG AGA TAC ACA TCA GGC GGG ATA AAA Thr Arg Tyr Thr Ser Gly Gly Ile Lys TGT GGG CGT GAT Cys Gly Arg Asp ATA AAT GAG TTC Ile Asn Glu Phe ACT TTT GTA AGA Thr Phe Val Arg GCT CAC AAA CGG Ala His Lys Arg

GGA

Gly ATT GAG GTG ATC CTG GAT GTT GTC TTC Ile Glu Val Ile Leu Asp Val Val Phe

AAC

Asn 90 CAT ACA GCT GAG His Thr Ala Glu GGT AAT Gly Asn GAG AAT GGT Glu Asn Gly TAT ATG CTT Tyr Met Leu 115 ATA TTA TCA TTT Ile Leu Ser Phe

AGG

Arg 105 GGG GTC GAT AAT Gly Val Asp Asn ACT ACA TAC Thr Thr Tyr 110 GGC TGT GGG Gly Cys Gly GCA CCC AAG GGA Ala Pro Lys Gly

GAG

Glu 120 TTT TAT AAC TAT Phe Tyr Asn Tyr

TCT

Ser 125

AAT

Asn

TGT

Cys 145

GAT

Asp

AAC

Asn

CCT

Pro

CTT

Leu

TAT

Tyr 225

GGG

Gly

TTT

Phe

CAG

Gln

CAT

His

AAT

Asn 305

AGC

Ser

AGA

Arg

TCT

Ser

AAA

Lys

ACC

Thr 130

TTA

Leu

CTT

Leu

GTG

Val

CTT

Leu

GGA

Gly 210

CAA

Gln

AAG

Lys

GCT

Ala

GCA

Ala

GAT

Asp 290

TTA

Leu

TGG

Trp

TTG

Leu

CAA

Gln

GGG

Gly 370 TTC AAC TGT AAT CAT CCT GTG GTT CGT CAA TTC Phe

AGA

Arg

GCA

Ala

TAT

Tyr

GTT

Val 195

GGC

Gly

GTA

Val

TAC

Tyr

GGT

Gly

GGA

Gly 275

GGA

Gly

CCA

Pro

AAT

Asn

AGG

Arg

GGA

Gly 355

GGC

Gly Asn

TAC

Tyr

TCC

Ser

GGA

Gly 180

ACT

Thr

GTC

Val

GGT

Gly

CGG

Arg

GGT

Gly 260

GGA

Gly

TTT

Phe

AAT

Asn

TGT

Cys

AAG

Lys 340

GTT

Val

AAC

Asn Cys

TGG

Trp

ATA

Ile 165

GCT

Ala

CCA

Pro

AAG

Lys

CAA

Gln

GAC

Asp 245

TTT

Phe

AGG

Arg

ACA

Thr

GGG

Gly

GGG

Gly 325

AGG

Arg

CCA

Pro

AAC

Asn Asn

GTG

Val 150

ATG

Met

CCA

Pro

CCA

Pro

CTC

Leu

TTC

Phe 230

ATT

Ile

GCC

Ala

AAA

Lys

CTG

Leu

GAG

Glu 310

GAG

Glu

CAG

Gln

ATG

Met

AAT

Asn His 135

ATG

Met

ACC

Thr

ATA

Ile

CTT

Leu

ATT

Ile 215

CCT

Pro

GTG

Val

GAA

Glu

CCT

Pro

GGT

Gly 295

AAC

Asn

GAA

Glu

ATG

Met

TTT

Phe

ACA

Thr 375 Pro

GAA

Glu

AGA

Arg

GAA

Glu

ATT

Ile 200

GCT

Ala

CAC

His

CC

Arg

TGT

Cys

TGG

Trp 280

GAT

Asp

AAT

Asn

GGA

Gly

CGC

Arg

TAC

Tyr 360

TAC

Tyr Val

ATG

Met

GGT

Gly

GGT

Gly 185

GAC

Asp

GAA

Glu

TGG

Trp

CAA

Gln

CTT

Leu 265

CAC

His

TTG

Leu

AGA

Arg

GAA

Glu

AAT

Asn 345

ATG

Met

TGC

Cys Val

CAT

His

TCC

Ser 170

GAC

Asp

ATG

Met

GCA

Ala

AAT

Asn

TTC

Phe 250

TGT

Cys

AGT

Ser

GTA

Val

GAT

Asp

TTC

Phe 330

TTC

Phe

GGC

Gly

CAT

His Arg

GTT

Val 155

ACT

Ser

ATG

Met

ATC

Ile

TGG

Trp

GTT

Val 235

ATT

Ile

GGA

Gly

ATC

Ile

ACA

Thr

GGA

Gly 315

GCA

Ala

TTT

Phe

GAT

Asp

GAT

Asp Phe

GGT

Gly

TGG

Trp

ACA

Thr

AAT

Asn 205

GCA

Ala

TCT

Ser

GGC

Gly

CCA

Pro

TTT

Phe 285

AAT

Asn

AAT

Asn

TTG

Leu

TGT

Cys

TAT

Tyr 365

TAT

Tyr

ATT

Ile

TTT

Phe

GAT

Asp

ACA

Thr 190

GAC

Asp

GGA

Gly

GAG

Glu

ACT

Thr

CAC

His 270

GTA

Val

AAC

Asn

CAC

His

TCT

Ser

CTC

Leu 350

GGC

Gly

GTC

Val

GTA

Val

CGT

Arg

CCA

Pro 175

GGG

Gly

CCA

Pro

GGC

Gly

TGG

Trp

GAT

Asp 255

CTA

Leu

TGT

Cys

AAG

Lys

AAT

Asn

GTC

Val 335

ATG

Met

CAC

His

AAT

Asn

GAT

Asp

TTT

Phe 160

GTT

Val

ACA

Thr

ATT

Ile

CTC

Leu

AAT

Asn 240

GGA

Gly

TAC

Tyr

GCA

Ala

TAC

Tyr

CTT

Leu 320

AAA

Lys

GTT

Val

ACA

Thr

TAT

Tyr 432 480 528 576 624 672 720 768 816 864 912 960 1008 1056 1104 1152

TTT

Phe 385 CGC TGG GAT AAA Arg Trp Asp Lys

AAA

Lys 390 GAA CAA TAC TCT Glu Gln Tyr Ser

GAC

Asp 395 TTG CAC AGA TTC Leu His Arg Phe

TGC

Cys 400 TGC CTC ATG ACC Cys Leu Met Thr TTC CGC AAG GAG Phe Arg Lys Glu GAG GGT CTT GGC Glu Gly Leu Gly CTT GAG Leu Glu 415 GAC TTT CCA Asp Phe Pro AAG CCT GAT Lys Pro Asp 435

ACG

Thr 420 GCC GAA CGG CTG Ala Glu Arg Leu

CAG

Gln 425 TGG CAT GGT CAT Trp His Gly His CAG CCT GGG Gln Pro Gly 430 TCC ATG AAA Ser Met Lys TGG TCT GAG AAT Trp Ser Glu Asn CGA TTC GTT GCC Arg Phe Val Ala

TTT

Phe 445 GAT GAA AGA CAG GGC GAG Asp Glu Arg Gln Gly Glu

ATC

Ile 455 TAT GTG GCC TTC Tyr Val Ala Phe ACC AGC CAC TTA Thr Ser His Leu

CCG

Pro 465 GCC GTT GTT GAG Ala Val Val Glu CCA GAG CGC GCA Pro Glu Arg Ala

GGG

Gly 475 CGC CGG TGG GAA Arg Arg Trp Glu

CCG

Pro 480 1200 1248 1296 1344 1392 1440 1488 1536 1584 1632 1680 1706 GTG GTG GAC ACA Val Val Asp Thr

GGC

Gly 485 AAG CCA GCA CCA Lys Pro Ala Pro GAC TTC CTC Asp Phe Leu TTA CCT GAT Leu Pro Asp TCC AAC CTC Ser Asn Leu 515 CGC CCT GAT Arg Pro Asp 530 GCT CTC ACC ATA Ala Leu Thr Ile

CAC

His 505 CAG TTC TCT CAT Gin Phe Ser His ACC GAC GAC Thr Asp Asp 495 TTC CTC AAC Phe Leu Asn 510 CTA GTA TTG Leu Val Leu TAC CCC ATG CTC Tyr Pro Met Leu

AGC

Ser 520 TAC TCA TCG GTC Tyr Ser Ser Val GTT TGA GAG Val Glu

ACA

Thr 535 AAT ATA TAC AGT Asn Ile Tyr Ser TAA TAT GTC TAT Tyr Val Tyr

ATG

Met 545 TAG TCC TTT GGC Ser Phe Gly TTA TCA GTG TGC Leu Ser Val Cys

ACA

Thr 555 ATT GCT CTA TTG Ile Ala Leu Leu

CCA

Pro 560 GTG ATC TAT TCG Val Ile Tyr Ser

ATA

Ile 565 GCG GCC GCG AA Ala Ala Ala INFORMATION FOR SEQ ID NO: 17: SEQUENCE CHARACTERISTICS: LENGTH: 9289 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (vi) ORIGINAL SOURCE: ORGANISM: Triticum tauschii TISSUE TYPE: Endosperm (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..9289 OTHER INFORMATION:/product= "genomic sequence of DBE" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: CGG GAC Arg Asp 570 CGT CCC TTG GCA Arg Pro Leu Ala TGG GTT ACG TTG Trp Val Thr Leu CCT GAC GCT TCG Pro Asp Ala Ser

CTT

Leu 585 ATC CGG TGT GCC Ile Arg Cys Ala

CTG

Leu 590 AGA CGA GAT ATG Arg Arg Asp Met AGC TCC TAT CGG Ser Ser Tyr Arg 48 96 144 TGT CGG CAC ATT Cys Arg His Ile

CGG

Arg 605 CGG CTT TGC TGG Arg Leu Cys Trp TGT TTT ACC ATT Cys Phe Thr Ile GTC GAA Val Glu 615 ATG TCT TAT AAA Met Ser Tyr.Lys 620 AGG TTT ATC CTT Arg Phe Ile Leu 635 CCG GGA TTC CGA Pro Gly Phe Arg TGA TCG GGT CTT Ser Gly Leu CCC GGG AGA Pro Giy Arg 630 TAA GTT GGG Val Gly CGT TGA CCG Arg Pro

TGA

640 GAG CTT ATA ATG Glu Leu Ile Met

GGC

Gly 645 ACA CCC Thr Pro 650 CTG CAG GGT ATT Leu Gin Gly Ile

ATC

Ile 655 TTT CGA AAG CCG Phe Arg Lys Pro CCG CGG TTA TGA Pro Arg Leu

GGC

Gly 665 AGA TGG GAA TTT Arg Trp Giu Phe AAT GTC CGA TTG Asn Val Arg Leu AGA ACC TGT CAC Arg Thr Cys His

TTG

Leu 680 ACT TAA TTT AAA Thr Phe Lys

ATT

Ile 685 CAT CAA CCG TGT His Gin Pro Cys

GTG

Val 690 TAG CCG TGA TGG Pro Trp TCT CTT Ser Leu 695 TTC GGC GGA Phe Giy Gly AAG TAG TTT Lys Phe 715 CGG GAA GTG AAC Arg Giu Val Asn

ACG

Thr 705 GTT TGA GTT ATG Vai Vai Met CAT GAA CGT His Glu Arg 710 GCG ACC GTT Ala Thr Val CAG GAT CAC TCC Gin Asp His Ser ATC ACT TCT AGC Ile Thr Ser Ser

TCC

Ser 725 GCG TTG Ala Leu 730 TTT CTC TTC TCG Phe Leu Phe Ser

CTC

Leu 735 TCA TTT GCG TAT GTT AGC CAC CAT ATA Ser Phe Aia Tyr Vai Ser His His Ile 740

TGC

Cys 745 TTA GTG TCT GCT Leu Vai Ser Ala GCT CCA CCT CAT Ala Pro Pro His

TAC

Tyr 755 CCC TTC CTT TCC Pro Phe Leu Ser

TAT

Tyr 760 528 576 624 AAG CTT AAA TAG Lys Leu Lys

TCT

Ser 765 TGA TCT CGC GGG Ser Arg Gly GAG ATT GCT GAG Glu Ile Ala Glu TCC TCG Ser Ser 775 TGA CTT ACA Leu Thr

GAT

Asp 780 TCT ACC AAA ACA Ser Thr Lys Thr

GTT

Val 785 GCA GGT GTC GAC Ala Gly Val Asp GAT GCC AGT Asp Ala Ser 790 GCA GGT GAC Ala Gly Asp 795 CGT TAC TAT Arg Tyr Tyr 810 GCA ACC GAG CTC Ala Thr Glu Leu

AAG

Lys 800 TGG GAG TTC GAC GAG GAA CGT GGT Trp Glu Phe Asp Glu Glu Arg Gly 805 GTT TCT TTT Val Ser Phe

CCT

Pro 815 GAT GAT CAG TAG Asp Asp Gln

TGG

Trp 820 AGC CCA GTT GGG Ser Pro Val Gly

ACG

Thr 825 ATC GGG GAT CTA Ile Gly Asp Leu TTT GGG GTT ATC Phe Gly Val Ile

TTA

Leu 835 ATT TCT TTT AGA Ile Ser Phe Arg

TTT

Phe 840 GAC CGT AAT CGG Asp Arg Asn Arg

TCT

Ser 845 ATG TGT GGA TTT Met Cys Gly Phe ATG ATG TAT GAA Met Met Tyr Glu TTA TTT Leu Phe 855 ATG TAT TGT Met Tyr Cys AAG TGG CGA TTG Lys Trp Arg Leu

TAA

865 GCC AAC TCT CGT Ala Asn Ser Arg TAT CCC ATT Tyr Pro Ile 870 TGC GAC AAA Cys Asp Lys CTT GTT CAT Leu Val His 875 ACC ACA ATG Thr Thr Met 890 TAC ATG GGA TTG Tyr Met Gly Leu

TGT

Cys 880 GAA GAT GAC CCT Glu Asp Asp Pro CGG TTA TGC Arg Leu Cys TAA GTC GTG CCT Val Val Pro

CGA

Arg 900 CAC GTG GGA GAT His Val Gly Asp

ATA

Ile 905 GCC GCA TCG TGG Ala Ala Ser Trp

GCG

Ala 910 TTA CAC GCA AGT Leu His Ala Ser

CTT

Leu 915 CAT AGC AAC CAA His Ser Asn Gin

AAC

Asn 920 TCC TCT CCG CAT Ser Ser Pro His AAG CCA CCA ATC Lys Pro Pro Ile

GCA

Ala 930 GCC ACC ATG ACT Ala Thr Met Thr TTC TTC Phe Phe 935 ACC ACT GTC Thr Thr Val TCG GCA AGA Ser Ala Arg 955

AAT

Asn 940 GCC ATG AAA ATC Ala Met Lys Ile

TAT

Tyr 945 ATG TAG ACA TGT Met Thr Cys CCC ATT GCA Pro Ile Ala 950 GCC TCT CTG Ala Ser Leu 960 1008 1056 1104 1152 1200 1248 1296 1344 1392 AAG CGA AGC TTC Lys Arg Ser Phe

ACG

Thr 960 GCA CAC CTT CAT Ala His Leu His GCC GAA Ala Glu 970 GAC AAG GAT GCG Asp Lys Asp Ala

CCC

Pro 975 GAC CGG ATC AAT Asp Arg Ile Asn

TCC

Ser 980 TAT CTA GAT ACC Tyr Leu Asp Thr

TAG

985 985 TGG AGC CAT GCG Trp Ser His Ala ATA GCG GAG ATC Ile Ala Glu Ile

TCC

Ser 995 GAG AGG AAG ACC Glu Arg Lys Thr

GGA

Gly 1000 ACT CGT CGG ACG Thr Arg Arg Thr TCG GCG TCC AAA TCG Ser Ala Ser Lys Ser 1005 AGG AGG CCG GCA TGA Arg Arg Pro Ala 1010 AGC ACA Ser Thr 1015 TCG AGG ATG Ser Arg Met GTG ATC CCC ATA CGG Val Ile Pro Ile Arg 1020 GTA GAT CGG Val Asp Arg 1025 GTC GGC CGC CAT CTC Val Gly Arg His Leu 1030 WO 99/14314 PCT/AU98/00743 105 ACA CCG AGA TTA GGA TGC TTA AAA CGG TTT TTT TGG CAC TAG CAT TAT 1440 Thr Pro Arg Leu Gly Cys Leu Lys Arg Phe Phe Trp His His Tyr 1035 1040 1045 TTT GCA TCA TCC GTT GGA GAG AAC ATG AGA GAG CCC CAT TTC TTC CAC 1488 Phe Ala Ser Ser Val Gly Glu Asn Met Arg Glu Pro His Phe Phe His 1050 1055 1060 GGT TCT ACC TAT GGG ATC TTG TTC TGC TTG CAA CCG GGC CTC ACG GAA 1536 Gly Ser Thr Tyr Gly Ile Leu Phe Cys Leu Gin Pro Gly Leu Thr Glu 1065 1070 1075 1080 AAC CCG CGC CAG CGG ACC CAC CCC ATG CTA GCA GGG CAC GGC ACC CGC 1584 Asn Pro Arg Gin Arg Thr His Pro Met Leu Ala Gly His Gly Thr Arg 1085 1090 1095 AGC GGC CGG TCC AAA TGG ACG GTG AGA ACC GCA ACG CGA CAC GCC CGG 1632 Ser Gly Arg Ser Lys Trp Thr Val Arg Thr Ala Thr Arg His Ala Arg 1100 1105 1110 CAC TGT CAG CAA AGC GAG AGC GCG CGC ACG GCA CAC GCA CGC TCG GAC 1680 His Cys Gin Gin Ser Glu Ser Ala Arg Thr Ala His Ala Arg Ser Asp 1115 1120 1125 GAA CGG ACG GTG CGA TCG ATC CCT CCC CCC TCG CTC AAC CAC AGT AGT 1728 Glu Arg Thr Val Arg Ser Ile Pro Pro Pro Ser Leu Asn His Ser Ser 1130 1135 1140 ACC CTG CCA CAC TAT CAC GCA CGC ACT CGA GTC ACA CCT CCC ACG AAG 1776 Thr Leu Pro His Tyr His Ala Arg Thr Arg Val Thr Pro Pro Thr Lys 1145 1150 1155 1160 AAC CAA CAG GAG GCG CGG ATC CCA CCG ATA AAT AAC CCC GCC TCG CCG 1824 Asn Gin Gin Glu Ala Arg Ile Pro Pro Ile Asn Asn Pro Ala Ser Pro 1165 1170 1175 CTC CTC CCC AAA ATC AAT CAC CGA TCG CTC GGG GTT CCC GGC ATG ACG 1872 Leu Leu Pro Lys Ile Asn His Arg Ser Leu Gly Val Pro Gly Met Thr 1180 1185 1190 ATG ATG GCC ATG GCC AAG GCG CCC TGC CTC TGC GCG CGC CCG TCC CTC 1920 Met Met Ala Met Ala Lys Ala Pro Cys Leu Cys Ala Arg Pro Ser Leu 1195 1200 1205 GCC GCG CGC GCG AGG CGG CCG GGG CCG GGG CCG 
GCG CCG CGC CTG CGA 1968 Ala Ala Arg Ala Arg Arg Pro Gly Pro Gly Pro Ala Pro Arg Leu Arg 1210 1215 1220 CGG TGG CGA CCC AAT GCG ACG GCG GGG AAG GGG GTC GGC GAG GTG TGC 2016 Arg Trp Arg Pro Asn Ala Thr Ala Gly Lys Gly Val Gly Glu Val Cys 1225 1230 1235 1240 GCC GCG GTT GTC GAG GCG GCG ACG AAG GCC GAG GAT GAG GAC GAC GAC 2064 Ala Ala Val Val Glu Ala Ala Thr Lys Ala Glu Asp Glu Asp Asp Asp 1245 1250 1255 GAG GAG GAG GCG GTG GCG GAG GAC AGG TAC GCG CTC GGC GGC GCG TGC 2112 Glu Glu Glu Ala Val Ala Glu Asp Arg Tyr Ala Leu Gly Gly Ala Cys 1260 1265 1270 AGG GTG CTC GCC GGA ATG CCC GCG CCG CTG GGC GCC ACC GCG CTC GCC 2160 Arg Val Leu Ala Gly Met Pro Ala Pro Leu Gly Ala Thr Ala Leu Ala 1275 1280 1285 WO 99/14314 PCT/AU98/00743 106 GGC GGG GTC AAT TTC GCC GTC TAC TCC GGT GGA GCC ACC GCC GCG GCG 2208 Gly Gly Val Asn Phe Ala Val Tyr Ser Gly Gly Ala Thr Ala Ala Ala 1290 1295 1300 CTC TGC CTC TTC ACG CCA GAA GAT CTC AAG GCG GTG GGG TTG CCT CCC 2256 Leu Cys Leu Phe Thr Pro Glu Asp Leu Lys Ala Val Gly Leu Pro Pro 1305 1310 1315 1320 GAG TAG AGT TCA TCA GCT TTG CGT GCG CCG CGC GCC CCC TTT TCT GGC 2304 Glu Ser Ser Ser Ala Leu Arg Ala Pro Arg Ala Pro Phe Ser Gly 1325 1330 1335 CTG CGA TTT AAG TTT TGT ACT GGG GGA AAT GCT GCA GGA TAG GGT GAC 2352 Leu Arg Phe Lys Phe Cys Thr Gly Gly Asn Ala Ala Gly Gly Asp 1340 1345 1350 GGA GGA GGT TTC CCT TGA CCC CCT GAT GAA TCG GAC TGG GAA CGT GTG 2400 Gly Gly Gly Phe Pro Pro Pro Asp Glu Ser Asp Trp Glu Arg Val 1355 1360 1365 GCA TGT CTT CAT TGA AGG CGA GCT GCA CGA CAT GCT TTA CGG GTA CAG 2448 Ala Cys Leu His Arg Arg Ala Ala Arg His Ala Leu Arg Val Gin 1370 1375 1380 GTT CGA CGG CAC CTT TGC TCC TCA CTG CGG GCA CTA CCT TGA TAT TTC 2496 Val Arg Arg His Leu Cys Ser Ser Leu Arg Ala Leu Pro Tyr Phe 1385 1390 1395 1400 CAA TGT CGT GGT GGA TCC TTA TGC TAA GGT GAT CAT ACT TTA GCT TTA 2544 Gin Cys Arg Gly Gly Ser Leu Cys Gly Asp His Thr Leu Ala Leu 1405 1410 1415 CCT GCA TCT TGG TAT TTA CAG TAG AAA TTG TTA CGT GGA CCC TTA TTT 2592 Pro Ala Ser Trp Tyr Leu Gln 
Lys Leu Leu Arg Gly Pro Leu Phe 1420 1425 1430 GTT GCC TTT TGT GTT GCT CTA GGC AGT GAT AAG CCG AGG GGA GTA TGG 2640 Val Ala Phe Cys Val Ala Leu Gly Ser Asp Lys Pro Arg Gly Val Trp 1435 1440 1445 CGT TCC GGC GCG TGG TAA CAA TTG CTG GCC TCA GAT GGC TGG CAT GAT 2688 Arg Ser Gly Ala Trp Gin Leu Leu Ala Ser Asp Gly Trp His Asp 1450 1455 1460 CCC TCT TCC ATA TAG CAC GGT ATG CCT GAT TGC TGA AAA TAT TGG CTG 2736 Pro Ser Ser Ile His Gly Met Pro Asp Cys Lys Tyr Trp Leu 1465 1470 1475 1480 CAT TTG TTT CTC TCT TTT TCT CAT ATT TTT CTC CTG TCT TTC ACT TGT 2784 His Leu Phe Leu Ser Phe Ser His Ile Phe Leu Leu Ser Phe Thr Cys 1485 1490 1495 ACT ACA TTG CCT CAG ACA GTC ATG ATC AAA GAG AGC AGT GTC ATT AGA 2832 Thr Thr Leu Pro Gin Thr Val Met Ile Lys Glu Ser Ser Val Ile Arg 1500 1505 1510 CAT TTG TAG TTG TCT GCT GAC TTT GAC CAA AAC TTG TAA TTT ACT GTT 2880 His Leu Leu Ser Ala Asp Phe Asp Gin Asn Leu Phe Thr Val 1515 1520 1525 GTT AAA GGT CCT TGA ATC ATA TTT TTT TAT AAT ATT ATG TTT GCA AGT 2928 Val Lys Gly Pro Ile Ile Phe Phe Tyr Asn Ile Met Phe Ala Ser 1530 1535 1540 WO 99/14314 PCT/AU98/00743 107 GGA AGT AAA GTG AAA TTG CAT CTA GTA TTT GTT GTT GCT GTC TTA GTC 2976 Gly Ser Lys Val Lys Leu His Leu Val Phe Val Val Ala Val Leu Val 1545 1550 1555 1560 GTT TAA TTG GAC ATG CAG TAA AAA GGT TTG CAT CTG CAG TTT GAT TGG 3024 Val Leu Asp Met Gin Lys Gly Leu His Leu Gin Phe Asp Trp 1565 1570 1575 GAA GGC GAC CTA CCT CTA AGA TAT CCT CAA AAG GAC CTG GTA ATA TAT 3072 Glu Gly Asp Leu Pro Leu Arg Tyr Pro Gin Lys Asp Leu Val Ile Tyr 1580 1585 1590 GAG ATG CAC TTG CGT GGA TTC ACG AAG CAT GAT TCA AGC AAT GTA GAA 3120 Glu Met His Leu Arg Gly Phe Thr Lys His Asp Ser Ser Asn Val Glu 1595 1600 1605 CAT CCG GGT ACT TTC ATT GGA GCT GTG TCG AAG CTT GAC TAT TTG AAG 3168 His Pro Gly Thr Phe Ile Gly Ala Val Ser Lys Leu Asp Tyr Leu Lys 1610 1615 1620 GTA CAG CTG TAC TTG CTG ACT ACA TAG GAT AAT TTT TAA AGA AAG CTA 3216 Val Gin Leu Tyr Leu Leu Thr Thr Asp Asn Phe Arg Lys Leu 1625 1630 1635 1640 CAT ATT AGC CAG AAT 
TTG GGT TAT TAC AAA AAC TAC TGC ATA CTA TAG 3264 His Ile Ser Gin Asn Leu Gly Tyr Tyr Lys Asn Tyr Cys Ile Leu 1645 1650 1655 CAG TTA CAT GCT CAT TAT CGA GGA GAT GCT CAC ACG CAT CTT ATT TGG 3312 Gin Leu His Ala His Tyr Arg Gly Asp Ala His Thr His Leu Ile Trp 1660 1665 1670 ATT TAA TAC CCA ATT CTG TTT TGA TAT TGG ACT GTT CCC TCT ACA GGA 3360 Ile Tyr Pro Ile Leu Phe Tyr Trp Thr Val Pro Ser Thr Gly 1675 1680 1685 GCT TGG AGT TAA TTG TAT TGA ATT AAT GCC CTG CCA TGA GTT CAA CGA 3408 Ala Trp Ser Leu Tyr Ile Asn Ala Leu Pro Val Gin Arg 1690 1695 1700 GCT GGA GTA CTC AAC CTC TTC TTC CAA GTA AGG ACA TGA ATT TAG TAT 3456 Ala Gly Val Leu Asn Leu Phe Phe Gin Val Arg Thr Ile Tyr 1705 1710 1715 1720 TAG CCT GCC AGC ACT GTT TGA GTG AGA GTT CAT ACA CAT TTT GTG CCT 3504 Pro Ala Ser Thr Val Val Arg Val His Thr His Phe Val Pro 1725 1730 1735 GCA TAA CTG ATA TTT GTT CAA ACT ATT TTT TTT AGC AGT CAC TCA ACA 3552 Ala Leu Ile Phe Val Gin Thr Ile Phe Phe Ser Ser His Ser Thr 1740 1745 1750 GTT TTA CAT ATA TAT ATA ATA TAG ACT ATT CGT CAC CCT GGG TGA GGA 3600 Val Leu His Ile Tyr Ile Ile Thr Ile Arg His Pro Gly Gly 1755 1760 1765 ATA GTT ATT CTT CAC CCA CCT CTA TTT TAA CAT CTA TGC ACC GTA ATT 3648 Ile Val Ile Leu His Pro Pro Leu Phe His Leu Cys Thr Val Ile 1770 1775 1780 TTA CGT TTC GTA AAT TTG TCT TAT TTT AGA GAT AAA AAG AGA ACG TAA 3696 Leu Arg Phe Val Asn Leu Ser Tyr Phe Arg Asp Lys Lys Arg Thr 1785 1790 1795 1800 WO 99/14314 PCT/AU98/00743 108 GAA AAC CTA TAA TCG Glu Asn Leu Ser 1805 ATG TAA AAA CAT AGT Met Lys His Ser 1820 TAT TTT TTT TGT TAA Tyr Phe Phe Cys 1835 TCG TAA AAA AAA ATA TGT Ser Lys Lys Ile Cys 1810 GTA AAA TGT ACA TAA AAT Val Lys Cys Thr Asn 1825 TGC CAA ATT TTA TAC AGT Cys Gin Ile Leu Tyr Ser 1840 TAC GTA AAA Tyr Val Lys TTA CAA Leu Gin 1815 ACA TTT TTT GAC CTA Thr Phe Phe Asp Leu 1830 AAA TCA ATA TGA ATG Lys Ser Ile Met 1845 TAA CTA TTT Leu Phe 1850 GTA TTT CAA Val Phe Gin ATG TAA Met 1855 TTT ATT TAT Phe Ile Tyr GAA ATG Glu Met 1860 GTC GTA AGA Val Val Arg TTA CCT CGG GTG 
AAG AAT AAC TTA TTC TGC ACC CTG GGT GAT GAA Leu Pro Arg Val Lys Asn Asn Leu Phe Cys Thr Leu Gly Asp Glu

TAG

1880 1865 1870 1875 TAA CAC TAT ATA His Tyr Ile TAT ATA TAT ATA TAT Tyr Ile Tyr Ile Tyr 1885 ATA TAT Ile Tyr 1890 ATA TAT ATA Ile Tyr Ile CCG GCT Pro Ala 1895 GCT GCT AAT Ala Ala Asn GAT GTT Asp Val 1900 AAT ATT TCG Asn Ile Ser CAA GTA CCT Gin Val Pro 1905 AAG CTG GAT TTT TCT Lys Leu Asp Phe Ser 1910 CCA TGA GAC ATC AAT CCA TAA TTG AAA TTG GTC ACG ACA GTT GAA TAG Pro Asp Ile Asn Pro Leu Lys Leu Val Thr Thr Val Glu 1915 1920 1925 3744 3792 3840 3888 3936 3984 4032 4080 4128 4176 4224 4272 4320 4368 4416 4464 TTG ATA GCT Leu Ile Ala 1930 GAA AAT GAA ATC CAG Glu Asn Glu Ile Gin 1935 CAT GCT ACT GTC His Ala Thr Val 194C TTG CCA TCT CCA Leu Pro Ser Pro GAC TTG CTA ACA TGA Asp Leu Leu Thr 1945 ATT TTG Ile Leu 1950 TCT GCC TAC Ser Ala Tyr CTG TCA Leu Ser 1955 TTT GTA CCA Phe Val Pro

ACG

Thr 1960 TTC CCA ATT GCC CTC TCA TTA TTC GTG TGT ACC ATG CAT ATG TGT TTT Phe Pro Ile Ala' Leu Ser Leu Phe Val Cys Thr Met His Met Cys Phe 1965 1970 1975 AAC ATG ATT Asn Met Ile ATC ACC CGT Ile Thr Arg 1995 ATT GTT Ile Val 1980 GGC TAT ATT Gly Tyr Ile TCT CTT Ser Leu 1985 TGG AAA CAT Trp Lys His GAC TAA TTT Asp Phe 1990 TTT GTA TAA ACT Phe Val Thr GCT TGT Ala Cys 2000 TTT CAT ATC Phe His Ile AGG ATG AAC TTT Arg Met Asn Phe 2005 ACG AGA TAC ACA Thr Arg Tyr Thr 0 TGG GGA Trp Gly 2010 TAT TCT ACC ATA Tyr Ser Thr Ile AAC TTC Asn Phe 2015 TTT TCA CCA Phe Ser Pro

ATG

Met 202( TCA GGC GGG ATA AAA Ser Gly Gly Ile Lys 2025 AAC TGT GGG CGT GAT Asn Cys Gly Arg Asp 2030 GCC ATA Ala Ile 2035 AAT GAG TTC Asn Glu Phe

AAA

Lys 2040 ACT TTT GTA AGA Thr Phe Val Arg GAG GCT CAC AAA CGG Glu Ala His Lys Arg 2045 GGA ATT GAG GTA AGC Gly Ile Glu Val Ser 2050 AAG TCG Lys Ser 2055 WO 99/14314 WO 99/ 4314PCT/AU98/00743 109 TAC GAG TTA GTT GCT Tyr Giu Leu Val Ala 2060 CCT TTT GAA CTT ATC Pro Phe Glu Leu Ile 2065 AAT TTG ATG Asn Leu Met CGA AGA CAT Arg Arg H-is 2070 CAG CTG AGG Gin Leu Arg GTT ACT Vai Thr GCT AGG TGA TCC TGG Aia Arg Ser Trp 2075 ATG TTG Met Leu 2080 TCT TCA ACC Ser Ser Thr

ATA

Ile 2085 GTA ATG AGA Vai Met Arg 2090 ATG GTC CAA Met Val Gin TAT TAT Tyr Tyr 2095 CAT TTA GGG His Leu Giy GGG TCG ATA ATA CTA Gly Ser Ile Ile Leu 2100 CAT ACT ATA TGC TTG His Thr Ile Cys Leu 2105 CAC CCA His Pro 2110 AGG TGA CAG Arg Gin ATC TTT Ile Phe 2115 CTT GCT GCG Leu Ala Ala

TAA

2120 TTG TTC TTT CAT AGA TOT ATA GAG CAT AGA TGT OTT ATG TAG TAG TTC Leu Phe Phe His Arg Cys Ile Giu His Arg Cys Vai Met Phe 2125 2130 2135 TTT TTC AAG Phe Phe Lys GGG ATT Oly Ile 2140 ATG TTC ATG Met Phe Met CAG GGA GAG TTT TAT Gin Giy Giu Phe Tyr 2145 AAC TAT TCT Asn Tyr Ser 2150 GCC TGT GGG AAT Giy Cys Gly Asn 2155 ACC TTC AAC Thr Phe Asn TGT AAT CAT CCT GTG Cys Asn His Pro Vai 2160 GTT CGT Val Arg 2165 CAA TTC Gin Phe ATT GTA GAT Ile Val Asp 2170 TGT TTA AGG Cys Leu Arg TAC AGA Tyr Arg 2175 TAT ACA TTT Tyr Thr Phe TAC TTC TAG AAC TAC Tyr Phe Asn Tyr 2180 4512 4560 4608 4656 4704 4752 4800 4848 4896 4944 4992 5040 5088 5136 5184 5232 TTT TTC Phe Phe 2185 ATT TCT TTT Ile Ser Phe GCT GCT TGT CAT TTT Ala Ala Cys His Phe 2190 GAT ATG ATT AAT TTG Asp Met Ile Asn Leu 2195

CAA

Gin 2200 GCT TGT GGG GGT Ala Cys Gly Oly AAA TCT Lys Ser 2205 TTT GGT CAG Phe Gly Gin CAT ATT His Ile 2210 GTA TCT TTA Vai Ser Leu AAT GTC Asn Val 2215 ACA AAT ACT Thr Asn Thr AAT GTC Asn Val 2220 CTO GTG CTT Leu Val Leu ATT GAT Ile Asp 2225 TTG GCA TCT Leu Ala Ser TCA AAT TCT Ser Asn Ser 2230 AAC TAA TTT Asn Phe TCT CCA ATG AAA Ser Pro Met Lys 2235 AGG GAA AAA Arg Glu Lys TCT ACT Ser Thr 2240 GTA TGT CTC Val Cys Leu

GTC

Val1 2245 ACT TTT Thr Phe 2250 GTT TTG CAG ATA VJal Leu Gin Ile CTG GGT Leu Gly 2255 GAT GGA AAT Asp Gly Asn GCA TGT TGA TGG TTT Ala Cys Trp Phe 2260 TCG TTT TGA TCT TGC Ser Phe Ser Cys 2265

ATC

Ile 2270 CAT AAT GAC CAG His Asn Asp Gin TGT TGC CTT TTC Cys Cys Leu Phe 229( AGG TTC Arg Phe 2275 CAG GTA ATT Gin Val Ile

TOT

Cys 2280 ATT TAT TGT TTG Ile Tyr Cys Leu TGT TTC TTT TAC Cys Phe Phe Tyr 230( TTT GCG Phe Ala 2285 AAG TCT Lys Ser 0 AGA AGA TTC TTA Arg Arg Phe Leu AAA GAA Lys Glu 2295

GTG

Val1 GGA TCC AGT TAA CGT GTA Gly Ser Ser Arg Val 2305 TGG AGC TCC Trp Ser Ser 2310 WO 99/14314 PCT/AU98/00743 110 AAT AGA AGG TGA CAT GAT CAC AAC AGG GAC ACC TCT TGT TAC TCC ACC 5280 Asn Arg Arg His Asp His Asn Arg Asp Thr Ser Cys Tyr Ser Thr 2315 2320 2325 ACT TAT TGA CAT GAT CAG CAA TGA CCC AAT TCT TGG AGG CGT CAA GGT 5328 Thr Tyr His Asp Gin Gin Pro Asn Ser Trp Arg Arg Gin Gly 2330 2335 2340 ACT TGT TTC ATC CAA CAC CTG TTG TCT GTG TGC ATT CAA TTG TTT TAA 5376 Thr Cys Phe Ile Gin His Leu Leu Ser Val Cys Ile Gin Leu Phe 2345 2350 2355 2360 TAT GGT AAT GAT CAA TTT CCC AAT GTT GAT AAG GAA AAA AAA TGC AAG 5424 Tyr Gly Asn Asp Gin Phe Pro Asn Val Asp Lys Glu Lys Lys Cys Lys 2365 2370 2375 TAG CTC TCT TTA TCT GCT TCT TGT GAG TTA TGC TAA ACA TGT AGA TAC 5472 Leu Ser Leu Ser Ala Ser Cys Glu Leu Cys Thr Cys Arg Tyr 2380 2385 2390 TAC TAT ATT TCA ACT GTA TAT ACT TGA CAT ATT ATT GCT TCC TTG GGA 5520 Tyr Tyr Ile Ser Thr Val Tyr Thr His Ile Ile Ala Ser Leu Gly 2395 2400 2405 GGC TCT CTT ATT CCT TTC CCC CGT TGC AAT TAT AGC TCA TTG CTG AAG 5568 Gly Ser Leu Ile Pro Phe Pro Arg Cys Asn Tyr Ser Ser Leu Leu Lys 2410 2415 2420 CAT GGG ATG CAG GAG GCC TCT ATC AAG TAG GTC AAT TCC CTC ACT GGA 5616 His Gly Met Gin Glu Ala Ser Ile Lys Val Asn Ser Leu Thr Gly 2425 2430 2435 2440 ATG TTT GGT CTG AGT GGA ATG GGA AGG TAA GGT ACC TGT TAA AAG TTT 5664 Met Phe Gly Leu Ser Gly Met Gly Arg Gly Thr Cys Lys Phe 2445 2450 2455 GAA TGG CAA ATA CTG ATA GAA ATA TAA CTT ATA TTT GCG ACA TAT ATA 5712 Glu Trp Gin Ile Leu Ile Glu Ile Leu Ile Phe Ala Thr Tyr Ile 2460 2465 2470 GAT AAA GCA AAA TAA TAC GCA TTC CAC CTG AAC TTT AAA GGG GCA CGC 5760 Asp Lys Ala Lys Tyr Ala Phe His Leu Asn Phe Lys Gly Ala Arg 2475 2480 2485 AGA ATT ATC CCG CAT CTG TCT ACA AGA ATG ATA ACA CAT GTG CTG AAT 5808 Arg Ile Ile Pro His Leu Ser Thr Arg Met Ile Thr His Val Leu Asn 2490 2495 2500 AGT GAA GTA CTA CTT CTC AAA TGT CTG AAT GAA CGC ACT AAC TCT TGT 5856 Ser Glu Val Leu Leu Leu Lys Cys Leu Asn Glu Arg Thr Asn Ser Cys 2505 2510 2515 
2520 GAG TGT CAA CCG AGC AAG AAA TAT TTG AGT TTT CTG CAA GAA ATT GTT 5904 Glu Cys Gin Pro Ser Lys Lys Tyr Leu Ser Phe Leu Gin Glu Ile Val 2525 2530 2535 CAT GTT GTG CTG TAT TAT ACT CCC TCC GTC CGA AAT TAT TTG TCG GAG 5952 His Val Val Leu Tyr Tyr Thr Pro Ser Val Arg Asn Tyr LeA Ser Glu 2540 2545 2550 AAA TGG ATG TAT CTA GAC GTA TTT TAG TTC TAG ATA CAT CCA TTT TTA 6000 Lys Trp Met Tyr Leu Asp Val Phe Phe Ile His Pro Phe Leu 2555 2560 2565 WO 99/14314 PCT/AU98/00743 111 TCC ATT TCT Ser Ile Ser 2570 GCA ACA AGT Ala Thr Ser AGT TCC Ser Ser 2575 GGA CGG AGG Gly Arg Arg GAG TAT Glu Tyr 2580 CAT TTA ACA His Leu Thr AAT ATA Asn Ile 2585 TGC ATG TTC Cys Met Phe GAA GTA AAT CCC CAC Glu Val Asn Pro His 2590 GAA TAA Glu 2595 GCA TAT AAG Ala Tyr Lys

ACG

Thr 2600 ATA TTG CTT TTT Ile Leu Leu Phe GAC TTG Asp Leu 2605 CAA CAC CTA Gin His Leu AAC CTC Asn Leu 2610 ATT GTT TTC Ile Val Phe TCC TAG Ser 2615 GAT TTT GGG Asp Phe Gly TGT TCG AAG CAA GCA Cys Ser Lys Gin Ala 2620 GCT GGT Ala Gly 2625 GAT ATT TAA Asp Ile TTT ACC TTT Phe Thr Phe 2630 GCC TTT ATT TGT AGC TTG ATT TGA GGG TGC GGC AAA GGT TTT AGC TTA Ala Phe Ile Cys Ser Leu Ile Gly Cys Gly Lys Gly Phe Ser Leu 2635 2640 2645 GTA GTG TTT TGT AAA TTA TTA TAG TTT ATG TAT ATA CTC CTC ATT TGG Val Val Phe Cys Lys Leu Leu Phe Met Tyr Ile Leu Leu Ile Trp 2650 2655 2660

GCA

Ala 2665 CTT CCG TAC Leu Pro Tyr TGG TCC CAT Trp Ser His 2670 AGA AGA TAA Arg Arg AAA TGG Lys Trp 2675 AAT GAT GTC Asn Asp Val

TGG

Trp 2680 CCA ATA ATT GTT Pro Ile Ile Val GAC AAC Asp Asn 2685 ACT GTT GCG Thr Val Ala CAT TTG ATT TTT ATC His Leu Ile Phe Ile 2690 AGG GAA Arg Glu 2695 6048 6096 6144 6192 6240 6288 6336 6384 6432 6480 6528 6576 6624 6672 6720 6768 TGG AAA ATT Trp Lys Ile GAA ATC GGT AAG AAA Glu Ile Gly Lys Lys 2700 CAT TGC GAT ATT AAG His Cys Asp Ile Lys 2705 CTT GTA TAT Leu Val Tyr 2710 CGT GTG CAT Arg Val His i GCT AAT GCT GGT Ala Asn Ala Gly 2715 GGA TCT TTA Gly Ser Leu AGA GGG AAC ATA TGA Arg Gly Asn Ile 2720

TCT

Ser 2725 CCA TCT TCA Pro Ser Ser 2730 ACT AAA AAA Thr Lys Lys ATA TGT Ile Cys 2735 TGC ACA TCT Cys Thr Ser CCC ACG TCA CTT ACT Pro Thr Ser Leu Thr 2740 AGC TAT Ser Tyr 2745 TTC ATC CAA Phe Ile Gin GTA CTA Val Leu 2750 ACT TGT GTG Thr Cys Val GTT GTC Val Val 2755 TCC TCA GTA Ser Ser Val

CCG

Pro 2760 GGA CAT TGT GCG Gly His Cys Ala CCA ATT Pro Ile 2765 CAT TAA AGG His Arg CAC TGA TGG ATT TGC His Trp Ile Cys 2770 TGG TGG Trp Trp 2775 TTT TGC CGA 2 Phe Cys Arg TGG CAA TAC Trp Gin Tyr 2795 ATG TCT Met Ser 2780 TTG TGG AAG Leu Trp Lys TCC ACA Ser Thr 2785 CCT ATA CCA Pro Ile Pro GGT AAG TTG Gly Lys Leu 2790 ATT TTT TAT Ile Phe Tyr TTG GAA ATG GGT Leu Glu Met Gly TGA GTG AAT GTC ACA Val Asn Val Thr 2800

TGG

Trp 2805 ATA TAC CAC ATG ATG Ile Tyr His Met Met 2810 ATA CAC ATG TAA ATA TAT Ile His Met Ile Tyr 2815 AAC GAT TAT AGT GTA Asn Asp Tyr Ser Val 2820 WO 99/14314 PCT/AU98/00743 112 TGC ATA Cys Ile 2825 TGC ATT TGG CTA AGA AGT ACT Cys Ile Trp Leu Arg Ser Thr 2830 CCC TCC CTT Pro Ser Leu 2835 AGT AAA AGT Ser Lys Ser

TAG

2840 TAC AAA GTT GAG Tyr Lys Val Glu TCA TCT Ser Ser 2845 ATT TTG GAA Ile Leu Glu CGG AGG Arg Arg 2850 GAG TAT AAG Glu Tyr Lys TGT ATA Cys Ile 2855 CAC TAG TGC His Cys CAT AGG GCT His Arg Ala 2875 AAT ATA Asn Ile 2860 TAG GTT TTA Val Leu ACA CCC Thr Pro 2865 AAC TTG CCA Asn Leu Pro ATG AAG GAA Met Lys Glu 2870 ATA ATC CAC Ile Ile His TTC TAG TTA TCT Phe Leu Ser TAT TTA TTT GTC TGG Tyr Leu Phe Val Trp 2880

TGA

2885 TGA AAA ATT CCA GCC ATG TCA TTT TTT AGG GGG Lys Ile Pro Ala Met Ser Phe Phe Arg Gly GGA GAA GAA ACT ACA Gly Glu Glu Thr Thr 2900 2890 2895 TTG ATT TTT CCC CCT Leu Ile Phe Pro Pro 2905 AAA AAA Lys Lys 2910 AGC CAT CTC Ser His Leu AGA TTT CAT AGG TAA Arg Phe His Arg 2915

CTT

Leu 2920 GCT TTT CTG TAA Ala Phe Leu AGA AAT GAA AAC GAC Arg Asn Glu Asn Asp 2925 TTC ATA CTT TCT GTC Phe Ile Leu Ser Val 2930 GAT TAT Asp Tyr 2935 AAG TGT ATA Lys Cys Ile CAC TAG His 2940 TGC AAT ATA Cys Asn Ile

TAG

2945 GTT TTA ACA Val Leu Thr CCC AAC TTG CCA Pro Asn Leu Pro 2950 TTT GCT GGT GAA Phe Ala Gly Glu 2965 6816 6864 6912 6960 7008 7056 7104 7152 7200 7248 7296 7344 7392 7440 7488 7536 ATG AAG GAA CAT Met Lys Glu His 2955 AGG GCT TTC Arg Ala Phe TAG TTA TCT TAT TTA Leu Ser Tyr Leu 2960 TAA TCC Ser 2970 AAC TAT Asn Tyr 2985 ACT GAA AAA TTC Thr Glu Lys Phe CAG CCA Gin Pro 2975 TGT CAT TTT Cys His Phe TTA GGG GGG AGA AGA Leu Gly Gly Arg Arg 2980 ATT GAT TTT Ile Asp Phe TCC CCC TAA AAA AAG Ser Pro Lys Lys 2990 CCA TCT CAG ATT CAT Pro Ser Gin Ile His 2995

AGG

Arg 3000 AAC TTG CTT TTC Asn Leu Leu Phe TGT AAA Cys Lys 3005 GAA ATG AAA Glu Met Lys ACG ACT Thr Thr 3010 TCA TAC TTT Ser Tyr Phe CTG CGG Leu Arg 3015 TTA TTT Leu Phe CGC TTA CTT Arg Leu Leu AGC TCG Ser Ser 3020 ATG GAT ATT Met Asp Ile TGT AAG ATG AAT GCC AAA Cys Lys Met Asn Ala Lys 3025 3030 GGC GGG ATT Gly Gly Ile 3035 CAA CCC AGT Gin Pro Ser 3050 TGA TCG TTA TTC Ser Leu Phe ACC TTG TTA TTG Thr Leu Leu Leu 3055

CAA

Gin 3040 ATT TCA TTT GGT Ile Ser Phe Gly TTC TCT AGC AAT Phe Ser Ser Asn 3045 GCA CTG CAA TTT Ala Leu Gin Phe CTT ATT GAT TAA TCA Leu Ile Asp Ser 3060 GGC AGG AGG AAG GAA ACC TTG GCA CAG TAT Gly Arg Arg Lys Glu Thr Leu Ala Gin Tyr 3065 3070 CAA CTT GGT ATG TGC Gin Leu Gly Met Cys 3075

ACA

Thr 3080 WO 99/14314 PCT/AU98/00743 113 TGA TGG ATT Trp Ile TAC ACT GGG Tyr Thr Gly 3085 TGA TTT GGT ACA TAT Phe Gly Thr Tyr 3090 AAT ACC AAG TCA ATT Asn Thr Lys Ser Ile 3095 S TAC CAA ATG Tyr Gin Met GGA ATT GTG Gly Ile Val 3115 GGG AGA Gly Arg 3100 CCA ATA GAG ATG GAG Pro Ile Glu Met Glu 3105 AAA ATC ACA ATC TTA GCT Lys Ile Thr Ile Leu Ala 3110 CTT TTT TTT TGA AAT TTT Leu Phe Phe Asn Phe 3125 GGG AGG TAA TTC Gly Arg Phe TGA ACT Thr 3120 CAT GCT TTA CAT AAT AGT His Ala Leu His Asn Ser 3130 CAA ATG Gin Met 3135 GCT GAC AAA Ala Asp Lys TGT CGT TGT ATG GTT Cys Arg Cys Met Val 3140 CTC TCT ACC TAA ACC GTT AAG GCA GTA AGA GTT TCC CTA CAA GAT CTC Leu Ser Thr Thr Val Lys Ala Val Arg Val Ser Leu Gin Asp Leu 34 3150U 3155 3160 TTT GTT CGT ATA ATT GTA TTT TCT AGA Phe Val Arg Ile Ile Val Phe Ser Arg 3165 GAA AAG TTG CCT TCA Glu Lys Leu Pro Ser 3170 ATT TTG Ile Leu 3175 TGC ACG CGG Cys Thr Arg CAG TAC AGG AAT TGT Gin Tyr Arg Asn Cys 3180 GGT TAT AAA TAT TGA Gly Tyr Lys Tyr 3185 ACC ATC GTT ACT Thr Ile Val Thr 3195 AAT AGG GGG Asn Arg Gly AAC AAT AAG CAC ATT Asn Asn Lys His Ile 3200

TTT

Phe 3205 TAC AGG CTG Tyr Arg Leu 3190 TTA ATA GCA Leu Ile Ala ATC CGA ACC Ile Arg Thr 7584 7632 7680 7728 7776 7824 7872 7920 7968 8016 8064 8112 8160 8208 8256 8304 AAG GCA Lys Ala 3210 ATA AGT Ile Ser 3225 TCA CCC TTG TTC Ser Pro Leu Phe CGT TTC CAA TGA AAT Arg Phe Gin Asn 3215 CAC AGT His Ser 3220 TTT ACA AGT Phe Thr Ser ATG CGT Met Arg 3230 AGA GAG AAA Arg Glu Lys TAA AGT ATC AAC CCG Ser Ile Asn Pro 3235

GCA

Ala 3240 GAA ACA GTT GTT Glu Thr Val Val TCA GGC GCA AAG AGA Ser Gly Ala Lys Arg 3245 AAA GGA AAC GAT ATG Lys Gly Asn Asp Met 3250 CTC TAT Leu Tyr 3255 TAC ATC AAC Tyr Ile Asn CTT TTA Leu Leu 3260 GCA TTT AGG Ala Phe Arg GAC GAC CAG CAT CAT Asp Asp Gin His His 3265 AAT CAA CTG GAG Asn Gin Leu Glu 3275 CGA GGT CAC Arg Gly His CTC CAA Leu Gin 3280 TCT TCT CAG Ser Ser Gin

CAG

Gin 3285 CCC ATC TTC Pro Ile Phe 3270 CCT CAG AGT Pro Gin Ser GGG GTT GGG Gly Val Gly GGT GAC Gly Asp 3290 CTC CCA AGC AAG Leu Pro Ser Lys TGC ATC Cys Ile 3295 AGC ATC CAT Ser Ile His CAT CTG His Leu 3300 CAC ATA His Ile 3305 CCA TGA GCA Pro Ala CAA TCA CCT GAA TTT Gin Ser Pro Glu Phe 3310 GAT GAA TTT TCC TCT Asp Glu Phe Ser Ser 3315

GTT

Val 3320 TAC CTT GCA GCA Tyr Leu Ala Ala GAC CCC TGC CGT ATA Asp Pro Cys Arg Ile 3325 AAT GGT TTT AAA TGA Asn Gly Phe Lys 3330 CAG CAT Gin His 3335 WO 99/14314 PCT/AU98/00743 114 GTT CTT TCA GTT TGA GCA AAA TTT GTG CAA TTG CAA AGA AGC TTT AGA 8352 Val Leu Ser Val Ala Lys Phe Val Gin Leu Gin Arg Ser Phe Arg 3340 3345 3350 ATC ATG TGG AAC ATG CAC TTA CAT TTC ATC TGA CAA TAT AGG AAG GAG 8400 Ile Met Trp Asn Met His Leu His Phe Ile Gin Tyr Arg Lys Glu 3355 3360 3365 AGC CCG ACG TCG CAT GCT CCT CTA GAC TCG AGG AAT TCG CAA GAT TGT 8448 Ser Pro Thr Ser His Ala Pro Leu Asp Ser Arg Asn Ser Gin Asp Cys 3370 3375 3380 CTG TCA AAA GAT TGA GGA AGA GGC AGA TGC GCA ATT TCT TTG TTT GTC 8496 Leu Ser Lys Asp Gly Arg Gly Arg Cys Ala Ile Ser Leu Phe Val 3385 3390 3395 3400 TCA TGG TTT CTC AAG TAA GAC TTA TAT CTG ATC TCT TCA ATT TTT GAG 8544 Ser Trp Phe Leu Lys Asp Leu Tyr Leu Ile Ser Ser Ile Phe Glu 3405 3410 3415 ATT GCC TGT TTT TCA CAA TGG CAT ATG TTG TCA GGT GAA ACA TCC AAT 8592 Ile Ala Cys Phe Ser Gin Trp His Met Leu Ser Gly Glu Thr Ser Asn 3420 3425 3430 CCC AGT ATT AAT AGA GCC AAC ATG AAG GGA TTG CTT ATC TGA GAT ATC 8640 Pro Ser Ile Asn Arg Ala Asn Met Lys Gly Leu Leu Ile Asp Ile 3435 3440 3445 TGC CAA AGT TGA ATT CTT AGA TTC ACC TTC TTC AGT ATT TCA GAC CTT 8688 Cys Gin Ser Ile Leu Arg Phe Thr Phe Phe Ser Ile Ser Asp Leu 3450 3455 3460 CTA AGC ATT TTC ATT TTT TTT TTC AAT TGT TAG GGA GTT CCA ATG TTT 8736 Leu Ser Ile Phe Ile Phe Phe Phe Asn Cys Gly Val Pro Met Phe 3465 3470 3475 3480 TAC ATG GGC GAT GAA TAT GGC CAC ACA AAA GGG GGC AAC AAC AAT ACA 8784 Tyr Met Gly Asp Glu Tyr Gly His Thr Lys Gly Gly Asn Asn Asn Thr 3485 3490 3495 TAC TGC CAT GAT TCT TAT GTC AGT ACA ATT TGG TCA CAT ATT GTT GTT 8832 Tyr Cys His Asp Ser Tyr Val Ser Thr Ile Trp Ser His Ile Val Val 3500 3505 3510 CTA AGT AAC TAT CTT CAA ATC TTT GCA TTC ATC CGT CAT GGC TCT TCT 8880 Leu Ser Asn Tyr Leu Gin Ile Phe Ala Phe Ile Arg His Gly Ser Ser 3515 3520 3525 GTA GGT CAA TTA TTT TCG CTG GGA TAA AAA AGA ACA ATA CTC TGA 
CTT 8928 Val Gly Gin Leu Phe Ser Leu Gly Lys Arg Thr Ile Leu Leu 3530 3535 3540 GCA AAG ATT CTG CTG CCT CAT GAC CAA ATT CCG CAA GTA AGT ATT CCG 8976 Ala Lys Ile Leu Leu Pro His Asp Gin Ile Pro Gin Val Ser Ile Pro 3545 3550 3555 3560 TTG AAT AAT TTC TGT GTA GAA CCA CTG AAG GTG CCT CCA AAC GCT AAG 9024 Leu Asn Asn Phe Cys Val Glu Pro Leu Lys Val Pro Pro Asn Ala Lys 3565 3570 3575 CGA GCA AGG TCA ATT TCA CAC CCT AAT CAA GTT GGT GTT GTC-TAT TTG 9072 Arg Ala Arg Ser Ile Ser His Pro Asn Gin Val Gly Val Val Tyr Leu 3580 3585 3590 WO 99/14314 WO 99/ 4314PCT/AU98/00743 115 TGT ATT TGA TCT Cys Ile Ser 3595 GCT GCA CTG TAG GGA Ala Ala Leu Gly 3600 GTG CGA GGG TCT TGG CCT TGA Val Arg Gly Ser Trp Pro* 3605 GGA CTT TCC AAC GCC CGA Giy Leu Ser Asn Gly Arg 3610 ACG GCT Thr Ala 3615 GCA GTG GCA Ala Val Ala TGG TCA Trp Ser 3620 TCA GCC TGG Ser Ala Trp GAA GCC Glu Ala 3625 TGA TTG GTC Leu Val TGA GAA TAG CCG ATT Glu Pro Ile 3630 CGT TGC CTT TTC CAT Arg Cys Leu Phe His 3635

GGT

G ly 3640 9120 9168 9216 9264 9289 ACA CAT ATA OTT Thr His Ile Val CTG ACA CTT CAC TAT Leu Thr Leu His Tyr 3645 AGT TGT TTT AAA AAA Ser Cys Phe Lys Lys 3650 GAA AAT Giu Asn 3655 TTA ACT CAA AAG TAA ATT ATG GAG A Leu Thr Gin Lys Ile Met Giu 3660
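The listing above pairs each nucleotide triplet with a three-letter amino acid code and leaves stop codons (TAA, TAG, TGA) without a residue. The following is a minimal sketch of that convention, assuming the standard genetic code; the function name `translate_listing_row` and the whitespace-separated input format are illustrative assumptions and are not part of the specification.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# '*' marks a stop codon.
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_1)
}


def translate_listing_row(row: str) -> list[str]:
    """Translate space-separated codons into three-letter residue codes.

    Stop codons are skipped, mirroring the listing, where they carry
    no amino acid label. (Hypothetical helper, for illustration only.)
    """
    residues = []
    for codon in row.upper().split():
        aa = CODON_TABLE[codon]
        if aa != "*":
            residues.append(ONE_TO_THREE[aa])
    return residues


row = "CAG TTA CAT GCT CAT TAT CGA GGA GAT GCT CAC ACG CAT CTT ATT TGG"
print(" ".join(translate_listing_row(row)))
```

Applied to the row of the listing ending at nucleotide 3312, this reproduces the residues printed beneath it, with `Gln` where the extracted text reads `Gin`.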

Claims

1. A nucleic acid sequence encoding an enzyme of the starch biosynthetic pathway in a cereal plant, wherein the enzyme is selected from the group consisting of starch branching enzyme I, starch branching enzyme II, starch soluble synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize.

2. A sequence according to claim 1, wherein the sequence is a genomic DNA or cDNA sequence.

3. A sequence according to claim 1 or claim 2, wherein the sequence is functional in wheat.

4. A sequence according to any one of claims 1 to 3, wherein the sequence is derived from a Triticum species.

5. A sequence according to claim 4, wherein the Triticum species is Triticum tauschii.

6. A sequence according to any one of claims 1 to wherein the sequence encodes starch branching enzyme I or a biologically-active fragment thereof, and wherein the sequence has at least 70% sequence homology with the sequence shown in SEQ ID NO:5 or SEQ ID NO:9.

7. A sequence according to claim 6, wherein the homology is at least

8. A sequence according to any one of claims 1 to wherein the sequence encodes starch branching enzyme II or a biologically-active fragment thereof, and wherein the sequence has at least 70% sequence homology with the sequence shown in SEQ ID

9. A sequence according to claim 8, wherein the homology is at least

10. A sequence according to any one of claims 1 to wherein the sequence encodes soluble starch synthase or a biologically-active fragment thereof, and wherein the sequence has at least 70% sequence homology with the sequence shown in SEQ ID NO:11 or SEQ ID NO:13.

11. A sequence according to claim 10, wherein the homology is at least

12. A sequence according to claim 11, wherein the sequence encodes a 75 kD soluble starch synthase of wheat.

13. A sequence according to claim 12, which encodes an amino acid sequence at least 70% homologous to that shown in SEQ ID NO:14.

14. A sequence according to any one of claims 1 to wherein the sequence encodes debranching enzyme or a biologically-active fragment thereof, and wherein the sequence has at least 70% sequence homology with the sequence shown in SEQ ID NO:17.

15. A sequence according to claim 14, wherein the homology is at least

16. A promoter of an enzyme selected from the group consisting of starch branching enzyme I, starch branching enzyme II, starch soluble synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize.

17. A promoter according to claim 16, wherein the promoter is a starch branching enzyme I promoter or biologically-active fragment thereof, and wherein the promoter sequence has at least 70% sequence homology with the sequence shown in SEQ ID NO:8.

PCT/AU98/00743 Received 21 June 1999

18. A sequence according to claim 17, wherein the homology is at least

19. A promoter according to claim 16, wherein the promoter is a starch soluble synthase I promoter or biologically-active fragment thereof, and wherein the promoter sequence has at least 70% sequence homology with the sequence shown in SEQ ID

20. A sequence according to claim 19, wherein the homology is at least

21. A nucleic acid construct comprising a nucleic acid sequence encoding an enzyme of the starch biosynthetic pathway in a cereal plant, operably linked to one or more nucleic acid sequences facilitating expression of the nucleic acid sequence in a plant, wherein the enzyme is selected from the group consisting of starch branching enzyme I, starch branching enzyme II, starch soluble synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize, a biologically-active fragment thereof, and that starch branching enzyme II does not have the N-terminal amino acid sequence: AASPGKVLVPDGEDDLASPA.

22. A nucleic acid construct for targeting a gene to the endosperm of a cereal plant, comprising one or more promoter sequences selected from the group consisting of SBE I promoter, SBE II promoter, SSS I promoter, and DBE promoter, operatively linked to a nucleic acid sequence encoding a protein, wherein the expression of the targeted gene in the endosperm of a cereal plant is modified.

AMENDED SHEET (Article 34) (IPEA/AU)

23. A construct according to either claim 21 or claim 22, wherein the promoter or nucleic acid sequence is also operatively linked to one or more additional targeting sequences and/or one or more 3' untranslated sequences.

24. A construct according to claim 23, wherein the nucleic acid encoding the protein is either in the sense or antisense orientation.

25. A construct according to claim 24, wherein the protein is an enzyme of the starch biosynthetic pathway.

26. A construct according to claim 25, wherein the nucleic acid encoding the protein is in the antisense orientation, and the enzyme is selected from the group consisting of GBSS, starch debranching enzyme, SBE II, low molecular weight glutenin, and grain softness protein I.

27. A construct according to claim 25, wherein the nucleic acid encoding the protein is in the sense orientation, and the enzyme is selected from the group consisting of bacterial isoamylase, bacterial glycogen synthase, and wheat high molecular weight glutenin Bx17.

28. A construct according to any one of claims 21 to 27, wherein the plant is a cereal plant.

29. A construct according to claim 28, wherein the cereal plant is either wheat or barley.

30. A construct according to claim 29, wherein the cereal plant is wheat.

31. A construct according to any one of claims 21 to wherein the construct is either a plasmid or a vector.

32. A construct according to claim 31, wherein the plasmid or vector is suitable for use in the transformation of a plant.

33. A construct according to claim 32, wherein the plasmid is selected from the group consisting of those depicted in Figures 22a to 22f.

34. A construct according to claim 32, wherein the vector is a bacterium of the genus Agrobacterium.

35. A construct according to claim 34, wherein the vector is Agrobacterium tumefaciens.

36. A method of modifying the characteristics of starch produced by a plant, comprising the steps of: (a) introducing a nucleic acid sequence encoding an enzyme of the starch biosynthetic pathway into a host plant, and/or (b) introducing an anti-sense nucleic acid sequence directed to a gene encoding an enzyme of the starch biosynthetic pathway into a host plant, wherein the enzyme is selected from the group consisting of starch branching enzyme I, starch branching enzyme II, starch soluble synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize, and wherein if both steps (a) and (b) are used, the enzymes in the two steps are different.

37. A method according to claim 36, wherein the plant is a cereal plant.

38. A method according to claim 37, wherein the cereal plant is wheat or barley.

39. A method of targeting expression of a gene to the endosperm of a cereal plant, comprising the step of transforming the plant with a construct according to any one of claims 21 to

40. A method of modulating the time of expression of a gene in endosperm of a cereal plant, comprising the step of transforming the plant with a construct according to any one of claims 21 to

41. A method according to claim 40, wherein when expression at an early stage following anthesis is desired, the construct comprises either the SBE II, SSS I, or DBE promoter.

42. A method according to claim 40, wherein when expression at a later stage following anthesis is desired, the construct comprises the SBE I promoter.

43. A plant transformed with a construct according to any one of claims 21 to

44. A plant according to claim 43, wherein the plant is a cereal plant.

45. A plant according to claim 44, wherein the cereal plant is wheat or barley.

46. A method of identifying variations in the starch synthesis characteristics of a cereal plant, comprising the step of identifying a variation in nucleic acid sequence in the intron regions of the SBE I, SBE II, SSS I or DBE genes.

47. A method of identifying variations in the starch synthesis characteristics of a cereal plant, comprising the step of identifying a variation in nucleic acid sequence compared to the sequence shown in one or more SEQ ID SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:16, or SEQ ID NO:17.

48. A method according to claim 47, in which a mutation or absence of a SBE I, SBE II, SSS I or DBE gene is detected.

49. A method according to either claim 47 or claim 48, in which the cereal plant is wheat or barley.

50. A product comprising plant material propagated from a plant transformed with a nucleic acid sequence encoding an enzyme of the starch biosynthetic pathway in a cereal plant, operably linked to one or more nucleic acid sequences facilitating expression of the nucleic acid sequence in a plant, wherein the enzyme is selected from the group consisting of starch branching enzyme I, starch branching enzyme II, starch soluble synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize, a biologically-active fragment thereof.

51. A product comprising plant material propagated from a plant in which a gene was targeted to the endosperm of a cereal plant, by a nucleic acid construct comprising one or more promoter sequences selected from the group consisting of SBE I promoter, SBE II promoter, SSS I promoter, and DBE promoter, operatively linked to a nucleic acid sequence encoding a protein, wherein the expression of the targeted gene in the endosperm of a cereal plant is modified.

52. A product according to claim 50 or claim 51, wherein the product is a food product.
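Several of the claims above recite a threshold of "at least 70% sequence homology" with a reference sequence. The claims do not state how homology is to be computed, so the following is only a sketch of one common reading, percent identity over a pre-computed pairwise alignment. The function names, the use of `-` for gaps, and the choice to count gap columns as mismatches are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two already-aligned sequences.

    Assumptions (not taken from the claims): both strings have the same
    length, '-' marks a gap, columns that are gaps in both sequences are
    ignored, and a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """Return True when percent identity is at or above the threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example with invented sequences: 9 of 10 aligned positions match.
print(percent_identity("ATGGCTAGCT", "ATGGCAAGCT"))
print(meets_threshold("ATGGCTAGCT", "ATGGCAAGCT"))
```

Other definitions (for example, scoring conservative amino acid substitutions as similar, or normalising by the shorter sequence) would give different values for the same pair of sequences.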