HIGH-INDUCTION ALPHA-AMYLASE GENE PROMOTERS
Field of the Invention
The present invention relates to monocot α-amylase promoters which are inducible by gibberellins (GA) in germinating seeds, and in particular, to mutated promoters whose GA-induction levels are severalfold those of the corresponding wildtype promoters.
References
Chandler, P.M. (1991). Primer extension studies on α-amylase mRNAs in barley aleurone. II. Hormonal regulation of expression. Plant Mol. Biol. 16:637-645.
Chandler, P.M., and Huiet, L. (1991). Primer extension studies on α-amylase mRNAs in barley aleurone. I. Characterization and quantitation of the transcripts. Plant Mol. Biol. 16:627-635.
Chrispeels, M.J., and Varner, J.E. (1967). Gibberellic acid-enhanced synthesis and release of α-amylase and ribonuclease by isolated barley aleurone layers. Plant Physiol. 42:398-406.
Gubler, F., and Jacobsen, J.V. (1992). Gibberellin-responsive elements in the promoter of a barley high-pI α-amylase gene. Plant Cell 4:1435-1441.
Higgins, T.J.V., Jacobsen, J.V., and Zwar, J.A. (1982). Gibberellic acid and abscisic acid modulate protein synthesis and mRNA levels in barley aleurone layers. Plant Mol. Biol. 1:191-215.
Jacobsen, J.V., and Beach, L.R. (1985) Control of transcription of α-amylase and rRNA genes in barley aleurone layer protoplasts by gibberellic acid and abscisic acid. Nature 316:275-277.
Jones, R.L., and Jacobsen, J.V. (1991). Regulation of synthesis and transport of secreted proteins in cereal aleurone. In International Review of Cytology, Jeon KW, Friedlander M, eds (San Diego: Academic Press), pp. 49-88.
Khursheed, B., and Rogers, J.C. (1988). Barley α-amylase genes. Quantitative comparison of steady-state mRNA levels from individual members of the two different families expressed in aleurone cells. J. Biol. Chem. 263:18953-18960.
Lanahan, M.B., Ho, T.-H.D., Rogers, S.W., and Rogers, J.C. (1992). A gibberellin response complex in cereal α-amylase gene promoters. Plant Cell 4:203-211.
Rogers, J.C., Lanahan, M.B., and Rogers, S.W. (1994). The cis-acting gibberellin response complex in high-pI α-amylase gene promoters. Plant Physiol. 105:151-158.
Rogers, J.C. (1985). Two barley α-amylase gene families are regulated differently in aleurone cells. J. Biol. Chem. 260:3731-3738.
Rogers, J.C., and Milliman, C. (1983). Isolation and sequence analysis of a barley α-amylase cDNA clone. J. Biol. Chem. 258:8169-8174.
Rogers, J.C., and Milliman, C. (1984). Coordinate increase in major transcripts from the high-pI α-amylase multigene family in barley aleurone cells stimulated with gibberellic acid. J. Biol. Chem. 259:12234-12240.
Rogers, J.C., and Milliman, C. (1985). Coordinate increase in major transcripts from the high-pI α-amylase multigene family in barley aleurone cells stimulated with gibberellic acid (Correction). J. Biol. Chem. 260:3233.
Whittier, R.F., Dean, D.A., and Rogers, J.C. (1987). Nucleotide sequence analysis of α-amylase and thiol protease genes that are hormonally regulated in barley aleurone cells. Nucl Acids Res. 15:2515-2535.
Background of the Invention
A variety of plant cell systems for producing selected polypeptides, e.g., bacterial and mammalian peptides and proteins, have been proposed. In general, protein production is carried out in cell culture using transgenic plant cells, or in a selected tissue of transgenic plants. The proteins are preferably produced in a form, e.g., mature protein secreted from the plant cells, to facilitate recovery of the foreign protein in a commercially useful form.
It is generally desirable, in a transgenic plant system for producing a selected protein, to place the protein-encoding gene under the control of a promoter that is inducible under selected protein-production conditions. For example, in producing a foreign or native protein in plant cell culture, the promoter is preferably inducible by an easily produced change in cell culture conditions, e.g., sucrose depletion, as described in US Patent No. 5,460,952. As another example, where a foreign gene is to be expressed in germinating seeds, as described, for example, in PCT WO 95/14099, published May 25, 1995, the gene promoter should be one that is inducible to high levels during the germination process, either by natural promoter induction that occurs during seed germination, or under the control of gibberellic acid (GA3), which induces seed germination.
To date, several monocot promoters that are inducible during seed germination have been described (e.g., Chrispeels and Varner, 1967; Higgins et al., 1982; Rogers and Milliman, 1983; Rogers and Milliman, 1984; Rogers and Milliman, 1985; Rogers, 1985; Jacobsen and Beach, 1985; Whittier et al., 1987; Khursheed and Rogers, 1988; Chandler and Huiet, 1991; Chandler, 1991; Jones and Jacobsen, 1991). Typically, these promoters show a two- to tenfold or greater increase in the level of mRNA and/or protein expression in the induced state relative to an uninduced state, allowing protein expression under the control of the promoter to be highly stimulated during seed germination.
Malt is cereal grain that has been allowed to germinate under controlled conditions. The function of malting is to produce enzymes, particularly amylases, which hydrolyse the starch reserve of the cereal endosperm to make sugars available for fermentation.
It would be desirable, in the production of proteins, native or heterologous, in germinating seeds, for example in malting, to employ promoters with very high levels of inducibility, e.g., in the range of 100-500 fold induction. Such promoters would lead to a significantly greater percentage of protein production, with consequent advantages in terms of protein yield, cost, and ease of purification.
Summary of the Invention
The invention includes, in one aspect, an improvement in a monocot α-amylase gene promoter of the type that (i) contains a pyrimidine box sequence, a GARE sequence, and an amylase box sequence, and (ii) shows, in wildtype form, at least a 2 fold induction in expression levels over non-induced levels when exposed to GA3 in germinating seeds.
The improvement is effective to enhance the level of GA3 inducibility of the promoter at least another 5-fold over the level of GA3 inducibility of the wildtype gene promoter, and includes one or both of the following modifications (a) and (b) to the wildtype promoter:
(a) a mutation in the downstream region 5-15 bases 3' of the amylase box sequence of said promoter, and
(b) a tandem duplicate of the promoter region which includes the amylase box,
where the promoter contains or has been modified to contain an O2S sequence 5' to the pyrimidine box.
In various embodiments: (i) the wildtype promoter shows at least a tenfold level of inducibility; (ii) the tandem duplicate includes the pyrimidine box, the GARE element, and the amylase box; and (iii) the embodiment with the tandem duplicate may also include the modification (a) in at least one of the duplicates, 5-15 bases 3' of the amylase box. An exemplary modified promoter of this embodiment has the sequence identified by SEQ ID NO:31.
Where the improvement includes at least modification (a), the mutation may encompass, for example, 1-3 bases in the downstream region 7-10 bases 3' of the amylase box sequence. The promoter in this embodiment may be, for example, from one of the following low-pI α-amylase genes: BLYAMY2A, HVAMY32B, AFAMY254, TAAAM253, TAAAM254, and TAAAM28.
In another aspect, the invention includes a method of enhancing the inducibility by gibberellins in a germinating or malted seed in a monocot α-amylase gene promoter of the type described above, by at least a 2-fold induction in expression levels over non-induced levels when exposed to GA3 in germinating seeds. The method includes modifying the promoter by one or both of the above modifications (a) and (b). Exemplary modifications are as above.
Also disclosed is a vector for use in transforming a monocotyledonous plant. The vector includes a chimeric gene having, operatively linked in sequence in a 5' to 3' direction, (i) the improved monocot α-amylase gene promoter described above, (ii) a gene encoding a protein, under the control of said promoter, and (iii) a 3' untranslated terminator region. Exemplary modified promoters are as given above.
In still another embodiment, the invention includes a monocotyledonous plant whose cells are transformed with a chimeric gene having, operatively linked in sequence in a 5' to 3' direction, (i) the improved monocot α-amylase gene promoter described above, (ii) a gene encoding a protein, operatively linked to the promoter, and (iii) a 3' untranslated terminator region. Exemplary modified promoters are as given above.
These and other objects and features of the invention will be more fully understood when the following detailed description of the invention is read in conjunction with the accompanying drawings.
Brief Description of the Figures
Fig. 1 shows a schematic representation of a GA-inducible monocot α-amylase promoter, with the promoter sequence elements pyrimidine box 12, GA response element (GARE) 14, amylase box 16, and TATA box 18 shown as shaded boxes. The numbers in parentheses between the boxes indicate the range of nucleotides which separate each sequence element in various promoters of this type.
Figs. 2A-2D are schematic representations of portions of modified barley Amy32b promoters which include the pyrimidine box 12, GARE 14, amylase box 16, and downstream sequence 3' of the amylase box, flanked by XbaI sites. Nucleotides unchanged from the wild-type barley Amy32b promoter are shown as dots, except as noted below:
Fig. 2A shows a schematic of the wild-type barley Amy32b promoter construct TS010 (SEQ ID NO:28), which contains XbaI restriction sites flanking the promoter region from the pyrimidine box through the downstream region 1-18 bases 3' of the amylase box, wherein the CCT nucleotides in the downstream region are nucleotides present in the wildtype Amy32b promoter;
Fig. 2B shows a schematic of a TS010 mutant promoter construct, designated TS022 (SEQ ID NO:29), which contains a substitution of the wildtype nucleotides CCT to TGG in the downstream region, 7-9 bases 3' of the amylase box;
Fig. 2C shows a schematic of a tandem duplication of the TS010 promoter construct, designated TS010D (SEQ ID NO:30);
Fig. 2D shows a schematic of a tandem duplication TS010D containing the TS022 mutation in the first occurrence of the downstream region, the resulting construct designated TS022D (SEQ ID NO:31).
Figs. 3A, 3B and 3C show portions of GA-inducible monocot α-amylase promoter sequences (SEQ ID NOs:1-21) extending from the pyrimidine box sequence downstream to the GARE sequence (Fig. 3A); from the GARE sequence through the amylase box sequence to the TATA box sequence (Fig. 3B); and from the TATA box sequence downstream to the ATG translation start codon (Fig. 3C).
Fig. 4 shows a scheme for introducing 12 different nucleotide substitutions into TS010.
Fig. 5 shows a series of nucleotide substitutions introduced into the Amy32b promoter/GUS fusion construct TS010, and the levels of GA3 induction observed in transgenic seeds containing the corresponding mutant promoters.
Fig. 6 shows the construction of a vector designed to express lysozyme in germinated seeds.
Brief Description of the Sequences
SEQ ID NO:1 is the sequence of the BLYAMY1A promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:2 is the sequence of the BLAMY2A promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:3 is the sequence of the BLYAMY46 promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:4 is the sequence of the BLYAMYABD promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:5 is the sequence of the BLYAMYG41 promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:6 is the sequence of the HV18 promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:7 is the sequence of the HVAMY152 promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:8 is the sequence of the HVAMY1G promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:9 is the sequence of the HVAMY56 promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:10 is the sequence of the HVAMY32B promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:11 is the sequence of the AFAMY254 promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:12 is the sequence of the OSALAM promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:13 is the sequence of the OSAMYC promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:14 is the sequence of the OSOSAMYBG promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:15 is the sequence of the OSRAMY3B promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:16 is the sequence of the OSRAMY3C promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:17 is the sequence of the RICAMY1B promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:18 is the sequence of the TAAAM234 promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:19 is the sequence of the TAAAM253 promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:20 is the sequence of the TAAAM254 promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:21 is the sequence of the TAAAM28 promoter between the pyrimidine box and the translation start codon.
SEQ ID NO:22 is the sequence of the pyrimidine box promoter element.
SEQ ID NO:23 is the sequence of the GARE promoter element.
SEQ ID NO:24 is the sequence of the amylase box promoter element.
SEQ ID NO:25 is the sequence of the TATA box promoter element.
SEQ ID NO:26 is the sequence of the O2S promoter element.
SEQ ID NO:27 is the sequence of ML022, a truncated form of the Amy32b promoter, between the HindIII and BamHI restriction sites.
SEQ ID NO:28 is the sequence of TS010.
SEQ ID NO:29 is the sequence of TS022.
SEQ ID NO:30 is the sequence of TS010D.
SEQ ID NO:31 is the sequence of TS022D.
SEQ ID NO:32 is primer TS022XbaF.
SEQ ID NO:33 is primer TS022XbaR.
SEQ ID NO:34 is primer OL1.
SEQ ID NO:35 is primer OL2.
SEQ ID NO:36 is primer OL3.
SEQ ID NO:37 is primer 32bClaF.
SEQ ID NO:38 is primer 32bClaR.
SEQ ID NO:39 is the sequence of HV18 which includes the region upstream of the pyrimidine box.
SEQ ID NO:40 is primer HV18ClaF.
SEQ ID NO:41 is primer HV18ClaR.
SEQ ID NO:42 is primer HV18XbaF.
SEQ ID NO:43 is primer HV18XbaR.
SEQ ID NO:44 is primer OL-Ax2.
Detailed Description of the Invention
I. Definitions
The terms below have the following meaning, unless indicated otherwise in the specification.
"Cell culture" refers to cells and cell clusters, typically callus cells, growing on or suspended in a suitable growth medium.
"Codon optimization" refers to changes in the coding sequence of a gene to replace native codons with those corresponding to optimal codons in the host plant.
A DNA sequence is "derived from" a gene, such as a barley α-amylase gene, if it corresponds in sequence to a segment or region of that gene. Segments of genes which may be derived from a gene include the promoter region, the 5' untranslated region, and the 3' untranslated region of the gene.
"Germination" refers to the breaking of dormancy in the seed and resumption of metabolic activity in the seed, including the production of enzymes effective to break down starches in the seed endosperm.
"Malting" is the process by which monocot grains are germinated under controlled conditions, typically in contained facilities, to produce a product that can be used for human consumption, animal feed and the brewing of alcoholic beverages. The process typically begins by steeping seeds in heated water followed by a several-day germination of the grain in malting bins or drums.
"Gibberellins" (GA) refer to a family of major plant hormones which are produced during germination and malting. Gibberellins are terpenoids consisting of 19 or 20 carbons derived from four isoprenoid units. More than 90 gibberellins have been isolated from higher plants, fungi, and related sources. Gibberellin A1 (GA1) and gibberellic acid (gibberellin A3; GA3) are the most active in inducing α-amylase genes in cereal grains. Although both GA1 and GA3 are produced in germinating seeds, GA3 is the most commonly used gibberellin in experimental conditions.
"GA-inducible", with reference to a monocot α-amylase gene in a seed, means that the promoter for that gene is up-regulated in germinating or malted seeds. An up-regulation of N-fold over the non-induced level means that the level of protein expressed by the gene in germinating or malted seeds is N times, on a per seed weight basis, that measured in the non-induced state. Examples of promoters that are inducible during germination are presented below.
"Heterologous DNA" refers to DNA which has been introduced into plant cells from another source, and/or which is under the control of a plant promoter which does not control that gene in a non-transgenic plant.
A "promoter" refers to nucleic acid sequences that influence and/or promote initiation of transcription. Promoters are typically considered to include regulatory regions, such as enhancer or inducer elements. A promoter from a monocot α-amylase gene includes, as separate regulatory regions, a pyrimidine box, a GARE element, an amylase box, and a TATA box. The α-amylase promoter may also include an O2S region upstream of the pyrimidine box. Included are α-amylase promoters which have a promoter region extending from the pyrimidine box to a point 15 bases downstream of the amylase box whose sequence is at least 80% homologous to the corresponding region in the following monocot α-amylase promoters: BLYAMY1A, BLAMY2A, HVAMY32B, BLYAMY2A, BLYAMY46, BLYAMYABD, HV18, HVAMY152, HVAMY1G, HVAMY56, AFAMY254, OSALAM, OSAMYC, OSOSAMYBG, OSRAMY3B, OSRAMY3C, RICAMY1B, TAAAM234, TAAAM253, TAAAM254, and TAAAM28.
"Operably linked" refers to components of a chimeric gene or an expression cassette that function as a unit to express a heterologous protein. For example, a promoter operably linked to a DNA which encodes a protein promotes the production of functional mRNA corresponding to the DNA. The promoter may be operatively linked to the coding region, for example, through a 5' UTR sequence between the transcription start site and the translation start codon of the coding region.
A "product" encoded by a DNA molecule includes, for example, RNA molecules and polypeptides.
"Stably transformed" as used herein refers to a cereal cell or plant that has foreign nucleic acid stably integrated into its genome which is transmitted through multiple generations.
"Tandem duplication" refers to two or more tandemly arranged promoter elements or groups of promoter elements. "Tandem duplicates" may include, for example, two or more tandemly arranged amylase box elements; two or more GARE-amylase box units; or two or more pyrimidine box-GARE-amylase box units.
II. GA-inducible Promoters
Fig. 1 shows a composite monocot α-amylase gene promoter 10 assembled from the promoter sequences of a variety of known monocot α-amylase promoters, identified in Figs. 3A-3C by SEQ ID NOS: 1-21. The sequences of representative GA-inducible monocot α-amylase promoters include those derived from the following genes:
BLYAMY1A, SEQ ID NO:1; BLAMY2A, SEQ ID NO:2; BLYAMY46, SEQ ID NO:3;
BLYAMYABD, SEQ ID NO:4; BLYAMYG, SEQ ID NO:5; HV18, SEQ ID NO:6;
HVAMY152, SEQ ID NO:7; HVAMY1G, SEQ ID NO:8; HVAMY56, SEQ ID NO:9;
HVAMY32B, SEQ ID NO:10; AFAMY254, SEQ ID NO:11; OSALAM, SEQ ID NO:12;
OSAMYC, SEQ ID NO:13; OSOSAMYBG, SEQ ID NO:14; OSRAMY3B, SEQ ID NO:15;
OSRAMY3C, SEQ ID NO:16; RICAMY1B, SEQ ID NO:17; TAAAM234, SEQ ID NO:18;
TAAAM253, SEQ ID NO:19; TAAAM254, SEQ ID NO:20; and TAAAM28, SEQ ID NO:21.
The composite promoter includes a pyrimidine box 12 having a consensus sequence CCTTTT (SEQ ID NO:22), a GA-response element (GARE) 14, having a consensus sequence TAAC(A/G)(A/G)A (SEQ ID NO:23), an amylase box 16 having a consensus sequence TATCCA(T/C) (SEQ ID NO:24), and a TATA box 18 having a consensus sequence TATA(T/A)A (SEQ ID NO:25). The promoter contains a region 19 upstream (5') of the pyrimidine box, a 4-60 base region 20 between the pyrimidine box and GARE element, a 6-17 base region 22 between the GARE element and amylase box, a 40-80 base region 24 between the amylase box and TATA box, and an 83-124 base region 26 between the TATA box and the ATG start codon of the gene.
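By way of illustration only, the composite layout just described can be expressed as a single regular expression. The following is a minimal sketch, assuming that the bracketed alternatives in the consensus sequences translate directly into character classes and that the stated spacer ranges are hard limits; the names `ELEMENTS`, `SPACERS`, `build_pattern`, `find_elements` and the sequence `TOY` are hypothetical and are not taken from SEQ ID NOS:1-21.

```python
import re

# Consensus sequences from the text; (A/G)-style alternatives become regex classes.
ELEMENTS = {
    "pyrimidine_box": "CCTTTT",
    "GARE": "TAAC[AG][AG]A",
    "amylase_box": "TATCCA[TC]",
    "TATA_box": "TATA[TA]A",
}

# Spacer lengths in bases between consecutive elements (regions 20, 22 and 24).
SPACERS = [(4, 60), (6, 17), (40, 80)]


def build_pattern():
    """Compile one regex for the four elements in order, with bounded spacers."""
    names = list(ELEMENTS)
    parts = [f"(?P<{names[0]}>{ELEMENTS[names[0]]})"]
    for name, (lo, hi) in zip(names[1:], SPACERS):
        parts.append(f"[ACGT]{{{lo},{hi}}}?")  # lazy spacer of lo..hi bases
        parts.append(f"(?P<{name}>{ELEMENTS[name]})")
    return re.compile("".join(parts))


def find_elements(seq):
    """Return the 0-based start offset of each element, or None if absent."""
    match = build_pattern().search(seq.upper())
    if match is None:
        return None
    return {name: match.start(name) for name in ELEMENTS}


# Hypothetical toy sequence built only to exercise the pattern.
TOY = ("GG" + "CCTTTT" + "ACGTA" + "TAACAAA" + "CTCGAGG"
       + "TATCCAT" + "G" * 40 + "TATAAA")
print(find_elements(TOY))
```

A real promoter may contain several candidate matches, so a fuller analysis would likely enumerate all of them rather than report only the first.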
Figures 3A, 3B, and 3C show portions of the representative α-amylase promoter sequences extending from the pyrimidine box to the GARE (Figure 3A); from the GARE through the amylase box to the TATA box (Figure 3B); and from the TATA box to the ATG start codon (Figure 3C). In each figure the indicated regulatory elements are underlined.
Each of these genes is representative of monocot α-amylase genes that are characterized by at least a 2-fold induction in expression levels over non-induced levels when exposed to GA in germinating seeds. That is, the level of the controlled α-amylase enzyme is induced more than 2-fold during germination by exposure of the germinating seeds to GA3, measured as described in Example 3. Some of the promoters, e.g., the promoters from the HVAMY32B, BLYAMY46, BLYAMYABD, HVAMY152, and TAAAM254 α-amylase genes, have levels of inducibility in the wildtype of ten-fold or higher.
The GA-inducible monocot α-amylases are categorized as low pI and high pI (Jones and Jacobsen, 1991). In general, the promoters of low-pI α-amylase genes, such as from barley HVAMY32B, have fewer nucleotides separating the regulatory elements than promoters from high-pI α-amylase genes such as barley HV18. The low-pI promoters also contain an O2S sequence CTTGACCATCATC (SEQ ID NO:26) within region 19 upstream of the pyrimidine box element (Fig. 1). Representative promoters of the low-pI class include those from the BLYAMY2A, HVAMY32B, AFAMY254, TAAAM253, TAAAM254, and TAAAM28 monocot α-amylase genes.
Lanahan et al. (1992) demonstrated the importance of sequences within the GARE and a 6 base pair region located 3' of and immediately adjacent to the amylase box for high-level GA-regulated expression from a low-pI α-amylase gene promoter. These authors also demonstrated the effect of the O2S sequence on amylase expression in low-pI promoters (Lanahan et al., 1992) and high-pI promoters (Rogers et al., 1994). Similar results were obtained by Gubler and Jacobsen (1992) using a high-pI α-amylase promoter.
A. High-Induction α-Amylase Promoters
The present invention is based on the discovery that GA-inducible α-amylase promoters from monocots, as defined herein, can be enhanced to show at least another 5-fold level of inducibility by GA3, over wildtype promoters, as measured by the method of Example 3. Thus, for example, if a wildtype GA-inducible promoter shows a 10-fold induction by GA3, a five-fold enhancement over the wildtype inducibility would lead to a 50-fold increase in level of expressed protein over the non-induced state. At the same time, the low basal level of expression characteristic of the inducible promoters is retained.
According to the invention, this enhancement in GA-inducibility is achieved by making one or both of the following modifications to the promoter, as diagrammed generally in Fig. 2:
Modification (a): a mutation in the downstream region 5-15 bases 3' of the amylase box sequence of said promoter, and
Modification (b): a tandem duplicate of the promoter region which includes the amylase box, and preferably the pyrimidine box sequence, the GARE sequence, the amylase box sequence, and a region about 15 bases downstream of the amylase box,
where the promoter contains or has been modified to contain an O2S sequence 5' to the pyrimidine box.
B. Constructing modification (a) promoters
In modification (a), a base-substitution, addition, or deletion mutation is made in the region 5-15 bases 3' of the amylase box. The mutation is preferably one or more base-substitution mutations, and the mutations are preferably localized in the region 7-10 bases 3' of the amylase box. Fig. 2B illustrates one exemplary modification, in accordance with this embodiment. Here the bases CCT at positions 7-9 downstream of amylase box 16 in promoter 29 are replaced by bases TGG at the same positions, to yield a mutated promoter 30 having a 3-base mutation at region 33.
Mutations of this type may be readily constructed using standard recombinant methods. Construction of the exemplary modification of Fig. 2B is described in Example 1A.
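The position bookkeeping for a modification (a) substitution can be sketched in a few lines. This is an in-silico illustration only, assuming that downstream positions are counted from 1 starting at the first base 3' of the amylase box consensus; the function `mutate_downstream` and the sequence `TOY` are hypothetical, and the flanking bases of `TOY` are invented rather than taken from SEQ ID NO:28.

```python
import re

# Amylase box consensus from the text: TATCCA(T/C).
AMYLASE_BOX = re.compile("TATCCA[TC]")


def mutate_downstream(seq, first, last, replacement):
    """Substitute bases first..last (1-based, counted 3' of the amylase box)."""
    match = AMYLASE_BOX.search(seq)
    if match is None:
        raise ValueError("no amylase box found")
    if len(replacement) != last - first + 1:
        raise ValueError("replacement length must equal the targeted span")
    start = match.end() + first - 1  # 0-based index of downstream base `first`
    end = match.end() + last         # exclusive end of the targeted span
    if end > len(seq):
        raise ValueError("targeted span runs past the end of the sequence")
    return seq[:start] + replacement + seq[end:]


# Invented toy sequence: amylase box, six filler bases, then CCT at positions 7-9.
TOY = "GATC" + "TATCCAT" + "GCAGTG" + "CCT" + "GATTGA"
MUTANT = mutate_downstream(TOY, 7, 9, "TGG")  # the CCT to TGG change of Fig. 2B
print(TOY)
print(MUTANT)
```

The length check restricts this sketch to base substitutions; addition or deletion mutations, which the text also contemplates, would need a separate routine.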
To demonstrate the ability of mutations in this region to enhance GA-inducibility, a series of mutations were introduced into the Amy32b promoter as diagrammed in Figure 4. The promoter was operatively linked to the β-glucuronidase (GUS) gene such that the promoter controlled the expression of the gene. The constructs were delivered by particle bombardment into intact aleurone layers, which were incubated in the presence and absence of GA3 (Example 3A). An oat ubiquitin promoter/firefly luciferase reporter construct, the expression from which is unaffected by GA3, was used as an internal standard. The transcriptional activity of the promoter constructs was assayed as a function of GUS activity (Example 3B). Figure 5 shows relative GUS activity for a series of promoter mutations as diagrammed on the left-hand side of the figure. The columns represent the mean (with SE) of samples incubated in the absence (light bars) or the presence (dark bars) of GA3. The numbers to the right of the bars represent the amount of induction in the presence of GA3 over the non-induced level for each construct (e.g., 20x = twenty-fold induction in expression over the non-induced level).
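The fold-induction figures can be illustrated with a short calculation. The sketch below assumes that each sample's GUS reading is divided by the luciferase reading of the co-delivered internal standard before averaging; that normalization is an assumption for illustration and is not stated in this form in the text. The function names and all numerical readings are hypothetical.

```python
from statistics import mean


def normalized(gus, luc):
    """Divide each GUS reading by its paired luciferase reading (assumed scheme)."""
    if len(gus) != len(luc):
        raise ValueError("GUS and luciferase readings must be paired")
    return [g / l for g, l in zip(gus, luc)]


def fold_induction(gus_minus, luc_minus, gus_plus, luc_plus):
    """Mean normalized activity with GA3 divided by that without GA3."""
    basal = mean(normalized(gus_minus, luc_minus))
    induced = mean(normalized(gus_plus, luc_plus))
    return induced / basal


# Invented readings, chosen only so that the arithmetic is easy to follow.
fold = fold_induction(
    gus_minus=[10.0, 12.0], luc_minus=[100.0, 100.0],
    gus_plus=[200.0, 240.0], luc_plus=[100.0, 100.0],
)
print(f"{fold:.1f}x induction")
```

With these invented readings the basal mean is 0.11 and the induced mean is 2.2, giving a 20-fold induction, the same order as that reported for the wild-type constructs.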
The wild-type Amy32b promoter constructs ML022 and TS010 (the latter containing XbaI sites flanking a portion of the promoter region as shown in Fig. 2A) showed approximately 20-fold induction in the presence of GA3. Promoter constructs containing mutations of 1 to 3 contiguous bases located in the region from about 3 bases 5' of the GARE to about 6 bases 3' of the amylase box showed induction levels equivalent to or lower than those of the wild-type promoter constructs (Fig. 5), consistent with deleterious sequence changes within and surrounding the conserved sequence elements (GARE, amylase box, and pyrimidine box). The most likely explanation for these results is the disruption of binding sites and/or contexts necessary for correct assembly of the transcription factors/complexes mediating the hormone- and tissue-specific response of these promoters. In contrast, mutant promoter construct TS022, in which bases 7-9 in the downstream region 3' of the amylase box are mutated from CCT to TGG, in combination with a specific sequence duplication described in the next section, resulted in nearly 200-fold induction in the presence of GA3 (Fig. 5). The induction level of this improved promoter is enhanced 10-fold over that of the wild-type promoter, while low basal (non-induced) expression is maintained in the absence of gibberellic acid, in accordance with the invention. It has been determined that the sequence alteration alone, in the absence of a duplication (the construct designated TS022, SEQ ID NO:29), is responsible for less than 20% of the 200-fold enhanced expression originally observed with the combination of both the duplication and the sequence change, as shown in the embodiment represented in Fig. 2D and designated TS022D (SEQ ID NO:31).
Mutant promoter construct TS022, in which bases 7-9 in the downstream region 3' of the amylase box are mutated from CCT to TGG, showed nearly 200-fold induction of expression in the presence of GA3 (Figure 5). The induction level of this improved promoter is enhanced 10-fold over that of the wildtype promoter, in accordance with the invention. It is noted that this construction also contains the duplication modification described in the next section.
C. Constructing modification-(b) promoters
In modification (b), the promoter is modified to include a tandem duplication of the promoter region which includes the amylase box, and preferably the region including the pyrimidine box, GARE, and amylase box, and preferably including the region 15 bases 3' of the amylase box. The promoter also includes, or is modified to include, the O2S region defined by SEQ ID NO: 26 positioned upstream of the pyrimidine box.
A typical duplication of this type is illustrated in Fig. 2C. The modified promoter 32, designated TS010D (SEQ ID NO:30), has a tandem repeat which includes the pyrimidine box sequence, the GARE sequence, the amylase box sequence, and the region 1-18 bases downstream of the amylase box. In the embodiment shown in Fig. 2D, the promoter 34, designated TS022D (SEQ ID NO:31), includes both modification types (a) and (b), namely a tandem duplication which includes the pyrimidine box sequence, the GARE, the amylase box sequence, and the mutation 33 in at least one of the downstream regions 5-15 bases 3' of the associated amylase box.
As above, these promoter modifications may be made by standard recombinant manipulations as outlined in Examples 1 and 2.
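The tandem arrangement of modification (b) can likewise be sketched as a string operation. This is a minimal sketch, assuming the duplicated unit is given by 0-based, end-exclusive coordinates and that the copies abut directly; any junction sequence contributed by the flanking XbaI sites in the actual constructs is ignored. The function `tandem_duplicate` and the toy sequence are hypothetical.

```python
def tandem_duplicate(seq, start, end, copies=2):
    """Return seq with seq[start:end] present `copies` times in tandem."""
    if not 0 <= start < end <= len(seq):
        raise ValueError("start/end must delimit a non-empty region of seq")
    if copies < 2:
        raise ValueError("a tandem duplication needs at least two copies")
    unit = seq[start:end]
    return seq[:start] + unit * copies + seq[end:]


# Invented toy sequence: two flanking bases, a six-base unit, two flanking bases.
TOY = "AA" + "CCTTTT" + "GG"
DUPLICATED = tandem_duplicate(TOY, 2, 8)
print(DUPLICATED)
```

A construct of the TS022D type would correspond, in this sketch, to applying a downstream substitution to one copy of the unit after duplication.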
III. Plant Transformation
For transformation of plants, a chimeric gene comprising the improved α-amylase promoter of the present invention, a gene encoding a protein to be expressed, and a 3' untranslated terminator region, is placed in a suitable expression vector designed for operation in plants. The vector includes suitable elements of plasmid or viral origin that provide necessary characteristics to the vector to permit the vectors to move DNA from bacteria to the desired plant host. Suitable transformation vectors are described in related application PCT WO 95/14099, published May 25, 1995, which is incorporated by reference herein.
A. Transformation Vector
Vectors containing the promoter of the present invention may also include selectable markers for use in plant cells, such as the nptII kanamycin resistance gene, for selection in kanamycin-containing medium, or the phosphinothricin acetyltransferase gene, for selection in medium containing phosphinothricin (PPT).
The vectors may also include sequences that allow their selection and propagation in a secondary host, such as sequences containing an origin of replication and a selectable marker such as antibiotic or herbicide resistance genes, e.g., HPH (Hagio et al., Plant Cell Reports 14:329 (1995); van den Elzen, Plant Mol. Biol. 5:299-302 (1985)).
Typical secondary hosts include bacteria and yeast. In one embodiment, the secondary host is Escherichia coli, the origin of replication is a ColE1-type, and the selectable marker is a gene encoding ampicillin resistance. Such sequences are well known in the art and are commercially available as well (e.g., Clontech, Palo Alto, CA; Stratagene, La Jolla, CA).
The vectors of the present invention may also be modified to intermediate plant transformation plasmids that contain a region of homology to an Agrobacterium tumefaciens vector, a T-DNA border region from Agrobacterium tumefaciens, and chimeric genes or expression cassettes (described above). Further, the vectors of the invention may comprise a disarmed plant tumor inducing plasmid of Agrobacterium tumefaciens.
One exemplary expression vector, designated pAPI237, is illustrated in Fig. 6. This plasmid is constructed as follows: Plasmid pAPI226, containing the TS022D promoter, the HV18 (barley α-amylase gene) signal peptide, and the Nos terminator, was digested with EheI/XhoI. Both of these restriction sites are located between the HV18 signal peptide and the Nos terminator. The resulting 3.5 kb band was gel-purified. Plasmid pLys-ger, containing a codon-optimized human lysozyme gene, was digested with DraI/XhoI in order to isolate the lysozyme gene. The resulting fragment was ligated to the 3.5 kb fragment from pAPI226 to form plasmid pAPI237.
B. Expressed Protein
The gene encoding the protein to be expressed may include a) the monocot α-amylase gene which is under the control of the promoter from which the improved promoter was derived; b) a gene encoding a monocot α-amylase normally controlled by a different promoter, or a gene encoding a monocot protein other than α-amylase; or c) a gene encoding a protein from a source other than a monocot plant. The proteins of categories b) and c) are collectively referred to as heterologous proteins. The expressed protein, if other than a monocot α-amylase, may include a fusion of an N-terminal region corresponding to a portion of a monocot α-amylase signal sequence peptide and, immediately adjacent to the C-terminal amino acid of said portion, the protein. The expressed protein, with or without an adjacent signal sequence, may include therapeutic proteins and peptides such as factor VIII or α1-antitrypsin; vaccines; industrial enzymes such as subtilisin; and proteins and peptides of nutritional importance. The coding sequences for these proteins are available from a variety of reference and sequence database sources. The coding sequence in a fusion protein gene is constructed such that the final codon in the signal sequence is immediately followed by the codon for the N-terminal amino acid in the mature form of the expressed protein. The coding sequence for a heterologous protein may be codon-optimized for optimal expression in plant cells.
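The in-frame junction between the signal sequence and the mature protein can be illustrated with a minimal Python sketch. The two short sequences are hypothetical placeholders (they are not the HV18 signal sequence or any real coding sequence), and the helper names are assumptions introduced only for illustration. The sketch checks that the signal portion is a whole number of codons with no in-frame stop, so that its final codon is immediately followed by the first codon of the mature protein.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split a coding sequence into codons; reject partial codons."""
    cds = cds.upper().replace(" ", "")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fuse_signal_to_mature(signal_cds, mature_cds):
    """Join a signal-peptide coding sequence directly to a mature-protein one."""
    signal = split_codons(signal_cds)
    mature = split_codons(mature_cds)
    if signal[0] != "ATG":
        raise ValueError("signal sequence should begin with a start codon")
    if any(codon in STOP_CODONS for codon in signal):
        raise ValueError("signal sequence contains an in-frame stop codon")
    if any(codon in STOP_CODONS for codon in mature[:-1]):
        raise ValueError("mature sequence contains a premature stop codon")
    return "".join(signal + mature)


# Hypothetical placeholder sequences, for illustration only.
signal_cds = "ATGGCTAAGCACTCC"   # five codons, no stop
mature_cds = "AAGGTGTTCGAGTAA"   # four codons followed by a stop codon

fusion = fuse_signal_to_mature(signal_cds, mature_cds)
n = len(signal_cds)
# Last codon of the signal sequence and first codon of the mature protein.
junction = (fusion[n - 3:n], fusion[n:n + 3])
print(junction)
```

A real construct would also need the restriction sites used for cloning to preserve this reading frame; that check is outside the scope of the sketch.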
C. 3' Untranslated Region
The expression vector also includes, downstream of the protein coding sequence, the 3' untranslated region (3' UTR) from an inducible monocot gene, such as one of the monocot α-amylase genes described above. The transcriptional termination region may be selected, particularly for stability of the mRNA, to enhance expression. Polyadenylation tails (Alber and Kawasaki, 1982, Mol. and Appl. Genet. 1:419-434) are also commonly added to the expression cassette to optimize high levels of transcription and proper transcription termination. Polyadenylation sequences include but are not limited to the Agrobacterium octopine synthetase signal (Gielen, et al., EMBO J. 3:835-846, 1984) or the nopaline synthase of the same species (Depicker, et al., Mol. Appl. Genet. 1:561-573, 1982).
D. Transformation of plant cells
The plants used in the process of the present invention are derived from monocots, particularly the members of the taxonomic family known as the Gramineae. This family includes all members of the grass family of which the edible varieties are known as cereals. The cereals include a wide variety of species such as wheat (Triticum sps.), rice (Oryza sps.), barley (Hordeum sps.), oats (Avena sps.), rye (Secale sps.), corn (Zea sps.) and millet (Pennisetum sps.). In the present invention, preferred family members are rice and barley.
Plant cells or tissues derived from the members of the family are transformed with expression constructs (i.e., plasmid DNA into which the chimeric gene of the invention has been inserted) using a variety of standard techniques (e.g., electroporation, protoplast fusion or microparticle bombardment). In the present invention, particle bombardment is the preferred transformation procedure.
Various methods for direct or vectored transformation of plant cells, e.g., plant protoplast cells, have been described, e.g., in above-cited PCT application WO 95/14099. As noted in that reference, promoters directing expression of selectable markers used for plant transformation (e.g., nptII) should operate effectively in plant hosts. One such promoter is the nos promoter from native Ti plasmids (Herrera-Estrella, et al., Nature 303:209-213, 1983). Others include the 35S and 19S promoters of cauliflower mosaic virus (Odell, et al., Nature 313:810-812, 1985) and the 2' promoter (Velten, et al., EMBO J. 3:2723-2730, 1984).
In one preferred embodiment, the embryo and endosperm of mature seeds are removed to expose scutellum tissue cells. The cells may be transformed by DNA bombardment or injection, or by vectored transformation, e.g., by Agrobacterium infection after bombarding the scutellar cells with microparticles to make them susceptible to Agrobacterium infection (Bidney et al., Plant Mol. Biol. 18:301-313, 1992). One preferred transformation follows the methods detailed generally in Sivamani, E., et al., Plant Cell Reports 15:465 (1996); Zhang, S., et al., Plant Cell Reports 15:465 (1996); and Li, L., et al., Plant Cell Reports 12:250 (1993).
The following barley transformation method is representative. Immature embryos about 1.5-2.5 mm in size were harvested from barley plants, sterilized in 50% commercial bleach, isolated and placed on callus induction medium (DC) in the dark for three weeks to obtain callus. The callus was then transferred to DBC2 medium in dim light to obtain green callus. Numerous shoots were also generated. The shoots were removed and the green callus was transferred onto fresh DBC2 medium every three weeks. Non-green callus derived from green callus was also observed after each subculture. However, greater than 90% of the callus remained green in each subculture. The green callus was then placed onto regeneration medium (FHG) to obtain plants.
The green calli were selected and transformed with pAPI166 by particle bombardment. The plasmid pAPI237 contains the TS022D promoter, the HV18 signal-peptide coding sequence, the codon-optimized human lysozyme gene and the Nos terminator. One day after bombardment, the transformed callus was placed on DBC2 medium containing bialaphos at a concentration of 5 mg/l. The callus was sub-cultured onto fresh DBC2 medium with the selective agent every three weeks. After three months, transgenic calli were obtained. Some of the transgenic calli were able to produce transgenic plants. Twelve plants were analyzed by PCR, and four of the twelve were shown to carry the lysozyme expression plasmid pAPI237.
IV. Cell culture production of mature protein
Transgenic cells, typically callus cells, are cultured under conditions that favor plant cell growth, until the cells reach a desired cell density, then under conditions that favor expression of the mature protein under the control of the improved promoter. Purification of the mature protein secreted into the medium is by standard techniques known by those of skill in the art.
V. Production of mature protein in germinating seeds
Monocot cells transformed as above are used to regenerate plants, seeds from the plants are harvested and then germinated, and the mature protein is isolated from the germinated seeds.
Plant regeneration from cultured protoplasts or callus tissue is carried out by standard methods, e.g., as described in Evans et al., HANDBOOK OF PLANT CELL CULTURES Vol. 1 (MacMillan Publishing Co., New York, 1983); and Vasil I.R. (ed.), CELL CULTURE AND SOMATIC CELL GENETICS OF PLANTS, Acad. Press, Orlando, Vol. I, 1984, and Vol. III, 1986, and as described in the above-cited PCT application.
The transgenic seeds obtained from the regenerated plants are harvested, and prepared for germination by an initial steeping step, followed by malting in the presence of gibberellic acid, as detailed, for example, in above-identified PCT application WO 95/14099.
The mature protein secreted from aleurone cells into the endosperm tissue of the seed can be isolated by standard methods. Typically, the seeds are mashed to disrupt tissues, the seed mash is suspended in a protein extraction buffer, and the protein is isolated from the buffer by conventional means.
The following examples are provided by way of illustration only and not by way of limitation. Those of skill will readily recognize a variety of noncritical parameters which could be changed or modified to yield essentially similar results.
General Methods
Generally, the nomenclature and laboratory procedures with respect to standard recombinant DNA technology can be found in Sambrook, et al., MOLECULAR CLONING - A LABORATORY MANUAL, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989 and in S.B. Gelvin and R.A. Schilperoort, PLANT MOLECULAR BIOLOGY, 1988. Other general references are provided throughout this document. The procedures therein are known in the art and are provided for the convenience of the reader.
Example 1
Construction of Wild-type and Mutant Amy32b Promoters
The Amy32b promoter construct ML022 contains a truncated form of the Amy32b promoter (Lanahan et al., 1992). ML022 is digested with HindIII-BamHI and the 536 bp fragment (SEQ ID NO:27), which contains the promoter region of interest, is inserted between the HindIII-BamHI sites of the pBI221 GUS fusion vector (Clontech). XbaI restriction sites are introduced 3 bases upstream of the 5' side of the pyrimidine box and 19 bases downstream of the 3' side of the amylase box (as indicated in Fig. 2A) using standard oligonucleotide-directed mutagenesis techniques (Sambrook) to yield the TS010/GUS fusion construct. Additional modified promoter constructs as diagrammed in Fig. 4 and assayed in Fig. 5 were prepared in a similar manner. This construct is introduced into intact aleurone layers as described below.
A. Construction of Modification (a)-Promoters
Mutagenic primers and standard techniques are used to create the promoter mutations shown in Figures 4 and 5. To generate the TS022 mutant (Fig. 2B), forward and reverse primers designated TS022 XbaF (SEQ ID NO:32) and TS022 XbaR (SEQ ID NO:33) are obtained from commercial sources (e.g., Operon Technologies, Alameda, CA). The forward and reverse primers contain the first and second XbaI sites, respectively, and the reverse primer also introduces the TGG substitution. These primers are used for PCR amplification with ML022 as template, using Pfu polymerase (Stratagene) under the conditions recommended by the manufacturer. The resulting PCR-amplified fragment is purified, digested with XbaI and ligated into the corresponding site in XbaI-digested TS010/GUS. The modified promoter/GUS fusion constructs are introduced into intact aleurone layers as described below.
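As an in-silico illustration of what the reverse primer encodes, the following Python sketch reverse-complements TS022 XbaR (SEQ ID NO:33) and compares it, base by base, with the corresponding wild-type stretch of ML022 (taken from SEQ ID NO:27). The helper and variable names are assumptions made for this sketch and are not part of the specification; the comparison is a simple ungapped alignment starting at the first base.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


XBAI_SITE = "TCTAGA"

# SEQ ID NO:33 (TS022 XbaR), as given in the sequence listing.
TS022_XBAR = "CAGTTCTAGAGTTGCTTGGCCACACTGCATGGATACCAGAC"

# Wild-type stretch of the Amy32b promoter covered by the primer
# (copied from SEQ ID NO:27).
WILD_TYPE = "GTCTGGTATCCATGCAGTGCCTCCAAGCAAC"

# Top-strand sequence specified by the reverse primer.
top_strand = reverse_complement(TS022_XBAR)
aligned = top_strand[:len(WILD_TYPE)]

# Positions (0-based) at which the primer-encoded sequence differs
# from the wild type, reported as (index, wild-type base, mutant base).
mismatches = [
    (i, wt, mut)
    for i, (wt, mut) in enumerate(zip(WILD_TYPE, aligned))
    if wt != mut
]
xbai_position = top_strand.find(XBAI_SITE)

print(mismatches)
print(xbai_position)
```

The three mismatches are contiguous and replace CCT with TGG, and the XbaI recognition sequence follows immediately after the aligned promoter stretch, consistent with the description of the reverse primer above.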
Example 2
Construction of Modification (b) Promoters
A. Construction of TS010D
The modified promoter as diagrammed in Fig. 2C, which contains a tandem repeat of the pyrimidine box sequence, the GARE sequence, the amylase box sequence, and the region 1-18 bases downstream of the amylase box, is prepared using oligos OL-1 (SEQ ID NO:34), OL-2 (SEQ ID NO:35), and OL-3 (SEQ ID NO:36). The oligos are kinased and annealed as described in Ausubel et al. (1997). Single-stranded gaps between the annealed oligos are filled in using Pfu polymerase (Stratagene) and ligated with T4 DNA ligase, using procedures detailed in Ausubel et al. (1997). The product is digested with XbaI and the duplication is ligated in place of the XbaI fragment in TS010/GUS. An alternative method of constructing TS010D would be to use TS022D as a template in site-directed mutagenesis using a 28-mer oligonucleotide such as 5'-AGA GTT GCT TGG AGG CAC TGC ATG GAT A-3'. The modified promoter/GUS fusion construct is introduced into intact aleurone layers as described below.
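The way the three oligos fit together can be checked in silico. The Python sketch below is only a consistency check and does not model the enzymatic steps: it assumes OL-1 and OL-2 lie on the top strand and OL-3 bridges them on the bottom strand, finds the two overlaps by exact suffix/prefix matching, and fills the remaining top-strand gap from the complement of OL-3. The function and variable names are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overlap(a, b):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0


# Oligo sequences as given in SEQ ID NO:34, 35 and 36.
OL1 = "ACGTTCTAGACACCTTTTCTCGTAACAGAGTCTGGTATCCATGCAGTGCCTCC" "AA"
OL2 = "ACCTTTTCTCGTAACAGAGTCTGGTATCCATGCAGTGCCTCCAAGCAACTCTA" "GAACGT"
OL3 = "CTCTGTTACGAGAAAAGGTGTCTATGCAGAGTTGCTTGGAGGCACTGCATGGA" "TA"

# Top-strand equivalent of the bridging oligo.
bridge = reverse_complement(OL3)

ov1 = overlap(OL1, bridge)   # 3' end of OL-1 pairs with OL-3
ov2 = overlap(bridge, OL2)   # OL-3 pairs with the 5' end of OL-2

# Bases of the top strand that are templated only by OL-3 (the gap to fill).
gap = bridge[ov1:len(bridge) - ov2]
top_strand = OL1 + gap + OL2

# Promoter unit expected to be duplicated (pyrimidine box through the
# region just downstream of the amylase box, copied from SEQ ID NO:28).
unit = "CACCTTTTCTCGTAACAGAGTCTGGTATCCATGCAGTGCCTCCAAGCAAC"

print(ov1, ov2, gap)
print(top_strand.count("TCTAGA"), top_strand.count(unit))
```

Under these assumptions the assembled strand carries one XbaI site near each end and two copies of the promoter unit between them, which matches the tandem arrangement shown for TS010D in SEQ ID NO:30.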
B. Construction of Other Modification (b) Promoters
A variation of the technique used to construct the pyrimidine box - GARE - amylase box tandem duplication described above may be used to construct modified promoters which contain tandem duplications of one or more promoter elements. In this example, a tandem duplication of the amylase box is prepared using oligos OL-1 (SEQ ID NO:34) and OL-Ax2 (SEQ ID NO:44). The oligos are kinased and annealed, and gaps filled in using Pfu polymerase (Stratagene), as described in Ausubel et al. (1997). The product is digested with XbaI and the tandem duplication is ligated in place of the XbaI fragment in TS010/GUS. The modified promoter/GUS fusion construct is introduced into intact aleurone layers as described below.
C. Addition of the O2S Sequence to High-pI Promoters
In one embodiment of the invention, the promoter is modified to include an O2S box 5' of the pyrimidine box. The present example outlines the addition of an O2S box to the high-pI HV18 promoter.
a) A ClaI restriction site is introduced into the low-pI promoter construct TS010, 49 base pairs 5' of the XbaI site in SEQ ID NO:28, using TS010 as template and the following mutagenic primers:
5' CTGGGTGATCCCAGCTTGGATCGATCTATCTTTTCCCATGGAATTTG (32bClaF, SEQ ID NO:37) and its complement
5' CAAATTCCATGGGAAAAGATAGATCGATCCAAGCTGGGATCACCCAG (32bClaR, SEQ ID NO:38)
by the QuikChange mutagenesis technique (Stratagene; La Jolla, CA) to yield X10+Cla.
b) A ClaI restriction site is introduced into the high-pI HV18 promoter (positions 536-541 of SEQ ID NO:39), which is fused to the GUS reporter gene, using the following mutagenic primers:
5' AACTGACGGTCGTATTGAATCGATCCTTCTTATGGAAGGCGA (HV18ClaF, SEQ ID NO:40) and its complement
5' TCGCCTTCCATAAGAAGGATCGATTCAATACGACCGTCAGTT (HV18ClaR, SEQ ID NO:41)
to yield HV18+Cla.
Additionally, an XbaI site is introduced at positions 589-594 of SEQ ID NO:39 with the mutagenic primers
5' CATCTACATCACTTGGGCATCTAGACGCCTTTTGAGCTCACCG (HV18XbaF, SEQ ID NO:42) and its complement
5' CGGTGAGCTCAAAAGGCGTCTAGATGCCCAAGTGATGTAGATG (HV18XbaR, SEQ ID NO:43)
to form HV18+Cla/Xba.
c) The 61 bp ClaI-XbaI fragment of X10+Cla, which contains the O2S site, replaces the existing 58 bp ClaI-XbaI fragment of HV18+Cla/Xba to yield a high-pI promoter with an O2S sequence.
Type (a) promoter modifications (Example 1) and type (b) promoter modifications (Example 2) are then introduced individually and in combination into this O2S-containing high-pI promoter using minor variations of the procedures described above.
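Each mutagenic primer pair in this example is described as a primer "and its complement", and each is meant to introduce a ClaI (ATCGAT) or XbaI (TCTAGA) site. A minimal Python sketch of that check is given below; it only confirms that the two primers of each pair are exact reverse complements and that both carry the intended recognition sequence. The dictionary layout and function names are assumptions introduced for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# (forward primer, reverse primer, recognition site to be introduced),
# copied from SEQ ID NO:37/38, 40/41 and 42/43.
PRIMER_PAIRS = {
    "32bCla": (
        "CTGGGTGATCCCAGCTTGGATCGATCTATCTTTTCCCATGGAATTTG",
        "CAAATTCCATGGGAAAAGATAGATCGATCCAAGCTGGGATCACCCAG",
        "ATCGAT",  # ClaI
    ),
    "HV18Cla": (
        "AACTGACGGTCGTATTGAATCGATCCTTCTTATGGAAGGCGA",
        "TCGCCTTCCATAAGAAGGATCGATTCAATACGACCGTCAGTT",
        "ATCGAT",  # ClaI
    ),
    "HV18Xba": (
        "CATCTACATCACTTGGGCATCTAGACGCCTTTTGAGCTCACCG",
        "CGGTGAGCTCAAAAGGCGTCTAGATGCCCAAGTGATGTAGATG",
        "TCTAGA",  # XbaI
    ),
}


def check_pair(forward, reverse, site):
    """Return (primers are complementary, site present in both primers)."""
    complementary = reverse_complement(forward) == reverse
    # Both sites are palindromic, so they read the same on either strand.
    has_site = site in forward and site in reverse
    return complementary, has_site


results = {
    name: check_pair(fwd, rev, site)
    for name, (fwd, rev, site) in PRIMER_PAIRS.items()
}
for name, (complementary, has_site) in results.items():
    print(name, complementary, has_site)
```

The sketch does not verify the stated positions within SEQ ID NO:28 or SEQ ID NO:39, nor the 61 bp and 58 bp fragment sizes.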
The modified promoter/GUS fusion constructs are introduced into intact aleurone layers as described below.
Example 3
Transformation of Aleurone Layers and Assay of Transcriptional Activity
A. Particle Bombardment
For transient expression experiments, each test construct was mixed with plasmid pAHC18 (ubiquitin promoter/luciferase internal standard; Bruce, W.B., et al. (1989) Proc. Natl. Acad. Sci. USA 86:9692-9696) at a mass ratio of 2:1 with a final plasmid concentration of 0.6 μg/μl. Ten microliters of this DNA mixture was precipitated onto tungsten particles using CaCl2 and spermidine (free base) and introduced into intact aleurone layers by particle bombardment (Bruce, W.B., and Quail, P.H. (1990) Plant Cell 2:1081-1089; Lanahan et al., 1992). For individual bombardments, sterile deembryonated half grains of Himalaya barley, which had been imbibed for 2 days and had their pericarp/testa layers removed, were arranged in a small circle (~2.5 cm diameter) so that all would be in the path of the tungsten particles. After each shot, eight grains were incubated in 5 ml of 20 mM CaCl2, 20 mM Na succinate pH 5.0 buffer containing 10-15% glycerol, 10 μg/ml chloramphenicol, plus no GA3 or 2 x 10"°M GA3 for 36 to 40 hours in a Petri plate. The eight half grains were then homogenized using a mortar and pestle in 1 ml of 100 mM NaPO4 pH 7.4, 5 mM DTT, and 20 μg/ml leupeptin. The homogenates were centrifuged for 10 min at 10,000g and the supernatants were retained for enzyme assays.
B. Transcription Activity Assays
GUS and luciferase activities of the supernatants were measured as described in Lanahan et al. (1992). GUS activity was expressed as fluorescence units (minus background control) generated in 4 hr at 37°C. Relative GUS activity was normalized for luciferase activity in the same transformed sample. The normalized GUS activity indicated how efficiently the test promoter construct drove transcription of the GUS reporter gene as compared to the transcription efficiency of the ubiquitin/luciferase construct. The average GUS activity and SE were calculated using four to six replicate bombardments per construct.
While the invention has been described with reference to specific methods and embodiments, it is appreciated that various modifications and changes may be made without departing from the invention.
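The normalization described in Example 3, section B, can be sketched in Python as follows. The readings are invented placeholder numbers, not data from this specification, and the function names are assumptions. The sketch subtracts the background from each GUS reading, divides by the luciferase activity of the same sample, and reports the mean and standard error (sample standard deviation divided by the square root of the number of replicates) across replicate bombardments.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_gus(gus_fluorescence, background, luciferase):
    """Background-corrected GUS activity relative to the luciferase standard."""
    return (gus_fluorescence - background) / luciferase


def mean_and_se(values):
    """Mean and standard error of the mean for replicate measurements."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for an SE")
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical replicate bombardments for one construct:
# (GUS fluorescence units, background units, luciferase units).
replicates = [
    (1100.0, 100.0, 10.0),
    (2100.0, 100.0, 20.0),
    (1300.0, 100.0, 10.0),
    (900.0, 100.0, 10.0),
]

normalized = [normalized_gus(g, b, l) for g, b, l in replicates]
average, se = mean_and_se(normalized)
print(normalized)
print(round(average, 3), round(se, 3))
```

Dividing by the co-bombarded luciferase signal corrects for shot-to-shot differences in delivery, so constructs can be compared across bombardments.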
Sequences not shown in Figs. 3A-3C or in the specification:
SEQ ID NO:27 ML022
AAGCTTGATACAGATACATTTTGCATCGAGTTCGTTACCACGGAAACATGAAA
AATCTTGCAAAATCTTGCAAAATCTTGCACTATGTTCAATTGGAAAATCAGTT
ACTCAAAAAAGTAAAGGGCAACGCTGGGTGATCCCAGCTTGGATAGTGCTAT
CTTTTCCCATGGAATTTGTGCCGGCCCGGATTGACTTGCCATCATCTGTTGCAC
CTTTTCTCGTAACAGAGTCTGGTATCCATGCAGTGCCTCCAAGCAACACTCCA
CGGGGACGTAGCTCGTGTTAAATACCGCTGTGGCATCGACTTCCTATAAATAC
CAAGCACGTAGAACTCTTGTAACCATCAATCACCAGTCTTGTGAATCATTCAT
CCACAGAACAAGAGTGCAGCGAACAGTGTGAGATCGACAGTAGCGCGCCTTT
CAGGTACACATGCTCGTGGTTTTTGATTTGTCCGGGTTGGCTCAGCTTGTTTCT
GTGATCTCAGGAGCTTAATTAACTGTGGGAGCTGGAATTGATGTTGCAGGGGA
TCCACC
SEQ ID NO:28 TS010
AAGCTTGATACAGATACATTTTGCATCGAGTTCGTTACCACGGAAACATGAAA
AATCTTGCAAAATCTTGCAAAATCTTGCACTATGTTCAATTGGAAAATCAGTT
ACTCAAAAAAGTAAAGGGCAACGCTGGGTGATCCCAGCTTGGATAGTGCTAT
CTTTTCCCATGGAATTTGTGCCGGCCCGGATTGACTTGCCATCATtctagaCACCT
TTTCTCGTAACAGAGTCTGGTATCCATGCAGTGCCTCCAAGCAACtCTagACGG
GGACGTAGCTCGTGTTAAATACCGCTGTGGCATCGACTTCCTATAAATACCAA
GCACGTAGAACTCTTGTAACCATCAATCACCAGTCTTGTGAATCATTCATCCA
CAGAACAAGAGTGCAGCGAACAGTGTGAGATCGACAGTAGCGCGCCTTTCAG
GTACACATGCTCGTGGTTTTTGATTTGTCCGGGTTGGCTCAGCTTGTTTCTGTG
ATCTCAGGAGCTTAATTAACTGTGGGAGCTGGAATTGATGTTGCAGGGGATCC
ACC
SEQ ID NO:29 TS022
AAGCTTGATACAGATACATTTTGCATCGAGTTCGTTACCACGGAAACATGAAA
AATCTTGCAAAATCTTGCAAAATCTTGCACTATGTTCAATTGGAAAATCAGTT
ACTCAAAAAAGTAAAGGGCAACGCTGGGTGATCCCAGCTTGGATAGTGCTAT
CTTTTCCCATGGAATTTGTGCCGGCCCGGATTGACTTGCCATCATtctagaCACCT
TTTCTCGTAACAGAGTCTGGTATCCATGCAGTGTGGCCAAGCAACtCTagACGG
GGACGTAGCTCGTGTTAAATACCGCTGTGGCATCGACTTCCTATAAATACCAA
GCACGTAGAACTCTTGTAACCATCAATCACCAGTCTTGTGAATCATTCATCCA
CAGAACAAGAGTGCAGCGAACAGTGTGAGATCGACAGTAGCGCGCCTTTCAG
GTACACATGCTCGTGGTTTTTGATTTGTCCGGGTTGGCTCAGCTTGTTTCTGTG
ATCTCAGGAGCTTAATTAACTGTGGGAGCTGGAATTGATGTTGCAGGGGATCC
ACC
SEQ ID NO:30 TS010D
AAGCTTGATACAGATACATTTTGCATCGAGTTCGTTACCACGGAAACATGAAA
AATCTTGCAAAATCTTGCAAAATCTTGCACTATGTTCAATTGGAAAATCAGTT
ACTCAAAAAAGTAAAGGGCAACGCTGGGTGATCCCAGCTTGGATAGTGCTAT
CTTTTCCCATGGAATTTGTGCCGGCCCGGATTGACTTGCCATCATtctagaCACCT
TTTCTCGTAACAGAGTCTGGTATCCATGCAGTGCCTCCAAGCAACTCTGCATA
GACACCTTTTCTCGTAACAGAGTCTGGTATCCATGCAGTGCCTCCAAGCAACtC
TagACGGGGACGTAGCTCGTGTTAAATACCGCTGTGGCATCGACTTCCTATAAA
TACCAAGCACGTAGAACTCTTGTAACCATCAATCACCAGTCTTGTGAATCATT
CATCCACAGAACAAGAGTGCAGCGAACAGTGTGAGATCGACAGTAGCGCGCC
TTTCAGGTACACATGCTCGTGGTTTTTGATTTGTCCGGGTTGGCTCAGCTTGTT
TCTGTGATCTCAGGAGCTTAATTAACTGTGGGAGCTGGAATTGATGTTGCAGG
GGATCCACC
SEQ ID NO:31 TS022D
AAGCTTGATACAGATACATTTTGCATCGAGTTCGTTACCACGGAAACATGAAA
AATCTTGCAAAATCTTGCAAAATCTTGCACTATGTTCAATTGGAAAATCAGTT
ACTCAAAAAAGTAAAGGGCAACGCTGGGTGATCCCAGCTTGGATAGTGCTAT
CTTTTCCCATGGAATTTGTGCCGGCCCGGATTGACTTGCCATCATtctagaCACCT
TTTCTCGTAACAGAGTCTGGTATCCATGCAGTGTGGCCAAGCAACTCTGCATA
GACACCTTTTCTCGTAACAGAGTCTGGTATCCATGCAGTGCCTCCAAGCAACtC
TagACGGGGACGTAGCTCGTGTTAAATACCGCTGTGGCATCGACTTCCTATAAA
TACCAAGCACGTAGAACTCTTGTAACCATCAATCACCAGTCTTGTGAATCATT
CATCCACAGAACAAGAGTGCAGCGAACAGTGTGAGATCGACAGTAGCGCGCC
TTTCAGGTACACATGCTCGTGGTTTTTGATTTGTCCGGGTTGGCTCAGCTTGTT
TCTGTGATCTCAGGAGCTTAATTAACTGTGGGAGCTGGAATTGATGTTGCAGG
GGATCCACC
SEQ ID NO:32
ACGTTCTAGACACCTTTTCTCGTAACAGAGTC
SEQ ID NO:33
CAGTTCTAGAGTTGCTTGGCCACACTGCATGGATACCAGAC
SEQ ID NO:34
ACGTTCTAGACACCTTTTCTCGTAACAGAGTCTGGTATCCATGCAGTGCCTCC
AA
SEQ ID NO:35
ACCTTTTCTCGTAACAGAGTCTGGTATCCATGCAGTGCCTCCAAGCAACTCTA
GAACGT
SEQ ID NO:36
CTCTGTTACGAGAAAAGGTGTCTATGCAGAGTTGCTTGGAGGCACTGCATGGA
TA
SEQ ID NO:39 HV18 including region upstream of pyrimidine box
GGATCCTAGCTACGGACAGCGCCCCGGTTATGGAGGCCGACAGCCGCGGCGC
GCGGCTGCGTAGCAGTGCAGCGTGAAGTCATAGATAGACTGTAGAGGGCATG
GCGGCAAGTGAAAACACACTTCCGTTTGTTCTGTTGAGTCAGTTGGATCTGCT
TTGGCCTGGCGATAACGTCTCCGGCCATTGTTTATCACGGCGCCTGCTTATCC
CTCCGAAAGTTTGAGCAAAAGGTGCAGCTTCTTTCTAGTACAGAAATGACGTC
CAGAGTTGCAGCAACCCATTCGGAACTCCTGGTGGATGCCAACGAAATTAAAT
GGGATAAAACTTAGTGAAGAATCTATATTTTCTTGCAACAACATACTCCTACC
CTCACGAATTGAATGCTCATCGAACGAATGAATATTTGGATATATGTTGATCT
CTTCGGACTGAAAAAGTTTGAACTCGCTAGCCACAGCACACTATTCCATGAAA
AATGCTCGAATGTTCTGTCCTAGAAAAACAGAGGTTGAGGATAACTGACGGTC
GTATTGACCGGTGCCTTCTTATGGAAGGCGAAGGCTGCCTCCATCTACATCAC
TTGGGCATTGAATCGCCTTTTGAGCTCACCGTACCGGCCGATAACAAACTCCG
GCCGACATATCCACTGGCCCAAAGGAGCATTCAAGCCGAGCACACGAGAAAG
TGATTTGCAAGTTGCACACCGGCAGCAATTCCGGCATGCTGCAGCACACTATA
AATACTGGCCAGACACACAAGCTGAATGCATCAGTTCTCCATCGTACTCTTCG
AGAGCACAGCAAGAGAGAGCTGAAGAAC
SEQ ID NO:44
ACGTTCTAGAGTTGCTTGGCCACACTGCATGGATACCAGATGCTTGGAGGCAC
TGCATGGATA