AU725002B2

AU725002B2 - Gene expression in plants

Info

Publication number: AU725002B2
Application number: AU31704/97A
Authority: AU
Inventors: Marcus Cornelissen; Veronique Gossele; Frank Meulewaeter; Piet Soetaert; Roel Van Aarssen
Original assignee: Plant Genetic Systems NV
Current assignee: Bayer CropScience NV
Priority date: 1996-06-21
Filing date: 1997-05-30
Publication date: 2000-10-05
Anticipated expiration: 2017-05-30
Also published as: AU3170497A; EP0922104A1; WO1997049814A1; JP2000513217A; CA2255057A1

Description

WO 97/49814 PCT/EP97/02832 GENE EXPRESSION IN PLANTS

FIELD OF THE INVENTION

The invention relates to the efficient expression in plants of AT-rich genes, especially Bacillus thuringiensis (Bt) genes encoding insecticidal crystal proteins (ICP). The invention thus relates to a process that comprises the RNA polymerase II independent production of predominantly uncapped, non-polyadenylated RNA transcripts of the native coding sequences of AT-rich genes, preferably Bt ICP genes, said transcripts comprising translation enhancing sequences, particularly those derived from the 5' region and 3' region of positive-stranded RNA plant viruses, preferably of necroviruses, that enable efficient cap- and poly(A)-independent translation of the RNA transcripts in plant cells to yield high levels of proteins specified by the AT-rich genes, more particularly insecticidal levels of Bt ICPs.

BACKGROUND OF THE INVENTION

Recent developments in plant genetic engineering allow routine introduction of recombinant DNA into a wide range of plants. Transcription and translation were observed for most of the chimeric genes; however, suboptimal expression is often encountered when expression of AT-rich genes is attempted. One of the prime examples of such difficulties was the expression of Bt ICPs.

Numerous publications teach the expression of different Bt ICPs in a wide range of plant species. Truncating the Bt ICP genes so as to encode a smaller and more soluble protein that retained full toxicity was found to be critical to obtain insect controlling amounts of Bt ICP in the plants [Vaeck et al., Nature, 328: 33-37 (1987); Fischhof et al., Bio/Technology 807-813 (1987); Carozzi et al., Plant Molecular Biology 20: 539-548 (1992)].

Subsequent publications described the enhancement of the expression levels of Bt ICP genes in plant species, in order to be able to target also less susceptible insect species. Different approaches were followed to modify the introduced bacterial DNA sequences encoding Bt ICPs to avoid the presence of sequences that could negatively affect expression in the plant cells. To this end, nucleic acid sequences were provided that encode a Bt ICP with essentially the same amino acid sequence as an existing Bt ICP but wherein one or more of the following modifications were included:

the nucleic acid sequence surrounding the translation initiation codon was changed to more closely resemble the translation initiation sequences preferably used by plants.

the overall codon usage was modified to better reflect the preferred codon usage of a particular plant species.

cryptic promoter signals were removed.

nucleic acid sequences that target the hnRNA into an abortive splicing pathway were eliminated.

potential termination signals for DNA-dependent RNA polymerase II within the coding sequence were removed.

putative mRNA destabilizing sequences were replaced.

presumptive alternative polyadenylation sites were avoided.

[Perlak et al., Proc. Natl. Acad. Sci. USA 88: 3324-3328 (1991); Adang et al., Plant Mol. Biol. 21: 1131-1145 (1993); Murray et al., Plant Mol. Biol. 16: 1035-1050 (1991); WO 91/16432; WO 93/09218].

Recently, McBride et al. described the introduction of a native Bt ICP coding sequence under control of a T7 promoter or a plastid expression signal in the chloroplasts of tobacco plants in an attempt to circumvent the problem of poor expression of full-length protoxin genes from the nucleus of plants, particularly those with a high AT-content. The regenerated plants from these transplastomic lines were reported to express Bt ICP at a high level in mature leaves using the prokaryotic-like transcriptional and translational machinery of the plastid (McBride et al., Bio/Technology 13: 362-365 (1995); WO 95/24492, WO 95/24493).

However, the transformation process set forth in these references is complicated because it requires the use of plastid transformation vectors and/or the transport of appropriate polymerases from the cytoplasm to the chloroplasts. Furthermore, the references remain silent on the level of ICPs in tissues other than mature leaves, such as root or stem tissue, which constitute important targets for pests such as corn rootworm (Diabrotica spp.), European corn borer (Ostrinia nubilalis) or cutworms (Agrotis spp.).

Unique features of eukaryotic mRNA are the presence of the m7G cap at its 5' end and a 3' poly(A) tract. Several functions at different stages of gene expression have been attributed to the cap at the 5' end, which is added shortly after transcription elongation has started, including a role in RNA stabilization, splicing, transport and translation. The cap structure supposedly binds to the translation initiation factor eIF-4F, allowing the ribosomal subunits and proper factors to bind and initiate at the first AUG codon in a favourable sequence context. Absence of this 5' cap structure in naturally capped plant viral RNA or cellular mRNA decreases the translational efficiency substantially [Fletcher et al., J. Biol. Chem. 265: 19582-19587 (1990)].

The poly(A) tail found at the 3' end of most eukaryotic mRNAs has been implicated in mRNA stability, transport of the mRNA into the cytoplasm, and its efficient translation [Jackson and Standart, Cell, 62: 24,1990]. The poly(A) tail, complexed with poly(A)-binding protein, is believed to enhance the formation of 40S translational initiation complexes, presumably through promoting some sort of interaction between 5' and 3' proximal elements of the mRNA [Tarun and Sachs, Genes and Dev. 9: 2997-3007 (1995)].

Whereas the majority of eukaryotic mRNAs have capped 5' ends and poly(A) tails at the 3' ends, the genomic or subgenomic RNAs of plant viruses often lack one or both. For positive-strand RNA viruses, the RNAs are translated early upon infection, even though cellular templates are prevalent. It is often due to the presence of alternative terminal structures that viral RNA templates exhibit high translational efficiency.

US Patent (US) 4,820,639 describes a process and means for increasing production of protein translated from eukaryotic messenger ribonucleic acid comprising transferring a regulatory nucleotide (nt) sequence from a viral coat protein mRNA to the 5' terminus of a gene or complementary deoxyribonucleic acid (cDNA) encoding the protein to be produced to form a chimeric DNA sequence.

US 5,489,527 and the European patent publication (EP) 0270611 both describe the use of 5' regions of RNA viruses as enhancers of translation of mRNA, especially 5' regions derived from plant RNA viruses.

Publication of the PCT patent application (WO) 91/00905 and US 5,135,855 describe the use of untranslated regions from an encephalomyocarditis virus to confer cap-independent translation to RNAs in mammalian cells, particularly when a prokaryotic transcription system is used in these eukaryotic cells.

EP 0589841 provides a dual method for producing male-sterile plants, as well as compositions and methods for high level expression of a coding region of interest in a plant by expression of a T7 RNA polymerase in a plant cell that contains a second expression cassette comprising a T7 regulatory region linked to the coding region of interest.

SUMMARY

In accordance with the invention chimeric genes are provided that comprise:

a first promoter recognized by a DNA-dependent RNA polymerase different from a eukaryotic RNA polymerase II, particularly a T3 or T7 RNA polymerase specific promoter;

a DNA region encoding a chimeric RNA which comprises a 5' UTR, a heterologous coding sequence, preferably an AU-rich coding sequence, and a 3' UTR; and

optionally a terminator sequence recognized by said RNA polymerase,

wherein the chimeric RNA, produced by the RNA polymerase, is uncapped and comprises:

i) a first translation enhancing sequence derived from the 5' region of genomic or subgenomic RNA of a positive-stranded RNA plant virus, preferably a necrovirus, especially STNV-2 or TNV-A, located in the 5' region of the chimeric RNA;

ii) a second translation enhancing sequence derived from the 3' region of genomic or subgenomic RNA of a positive-stranded RNA plant virus, preferably a necrovirus, especially STNV-2 or TNV-A, located in the 3' region of the chimeric RNA;

and which is capable of being translated in the cytoplasm of a plant cell, to produce the protein or polypeptide. The transcribed uncapped RNA coding sequence may be polycistronic.

Also provided in the invention are plant cells and plants, particularly corn plant cells and plants, comprising these chimeric genes, integrated in their nuclear DNA, whereby the plant cell produces the RNA polymerases corresponding to the used promoters and terminators.

More particularly, it is a further objective of the invention to provide plant cells and plants, comprising these chimeric genes, integrated in their nuclear DNA, wherein the first promoter is a single subunit bacteriophage RNA polymerase specific promoter, such as a T3 or T7 RNA polymerase specific promoter, and wherein such plant cells or plants further comprise a chimeric polymerase gene including: a second plant-expressible promoter; a DNA sequence encoding a single subunit bacteriophage RNA polymerase such as a T3 or T7 RNA polymerase functionally linked to a nuclear localization signal; operably linked so that upon expression of the chimeric polymerase gene a functional and properly located RNA polymerase is produced.

The invention further provides a process for producing a plant expressing a protein or polypeptide encoded by a heterologous gene, preferably an AT-rich gene, especially a Bt ICP encoding gene, which comprises the steps of: transforming the nuclear genome of a plant cell with the abovementioned chimeric genes; and regenerating a transformed plant from the transformed cell.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1A schematically represents the relative protein accumulation profiles in plant protoplasts obtained by translation of a capped chimeric RNA comprising the translation enhancing sequences of the invention, in reference to an efficiently translated capped and polyadenylated RNA.

Figure 1B schematically represents the relative protein accumulation profiles in plant protoplasts obtained by translation of an uncapped chimeric RNA comprising the translation enhancing sequences of the invention, in reference to the capped version of the same chimeric RNA comprising the translation enhancing sequences of the invention.

Figure 2A depicts schematically different possible locations of first and second translation enhancing sequences with regard to the homologous coding sequence and untranslated regions of a viral genomic or subgenomic RNA.

Figure 2B is a schematic representation of different possible locations of first and second translation enhancing sequences with regard to the heterologous coding sequence and untranslated regions of the chimeric RNAs encoded by the cap-independently expressed chimeric genes of the invention.

DETAILED DESCRIPTION OF INVENTION

The difficulties associated with the expression of Bt ICP genes in plant cells are also often encountered when expressing other heterologous genes with high AT-content. AT-rich genes have an enhanced probability of harbouring cryptic signals interfering with efficient transcription and translation in plant cells, especially in monocotyledonous cells, such as corn cells. Expression problems are magnified when the AT content of the coding region of the heterologous gene surpasses significantly the mean AT content of the coding regions of the host plant in which expression is attempted. These expression problems might already arise when the coding sequence of the gene of interest, although not particularly AT-rich when taken as a whole, contains an AT-rich nucleotide stretch of about 400 residues.

Accordingly, it was a main object of the present invention to provide a reliable method for efficient expression in plant cells of AT-rich genes, particularly Bt ICP genes, without having to rely on expensive, laborious and time-consuming methods to implement the various approaches that have been described.

The present invention provides a new method to promote expression to a high level of coding sequences, preferably coding sequences of AT-rich genes such as Bt ICP genes, particularly native coding sequences of Bt ICP genes which are integrated in the plant's nuclear genome. It was realized that problems associated with the expression of coding sequences of heterologous AT-rich genes at the transcriptional and/or post-transcriptional level can be overcome by using an RNA polymerase different from the eukaryotic DNA-dependent RNA polymerase II, to produce uncapped RNAs encoding the protein or polypeptide of interest.

These uncapped RNAs are then efficiently translated into the desired protein or polypeptide, by using the translation enhancing sequences provided in this invention.

The invention is based on the realization that transcription by an RNA polymerase different from the eukaryotic DNA-dependent RNA polymerase II, of AT-rich genes such as Bt ICP genes, particularly native coding sequences of Bt ICP genes, integrated in the nuclear genome of a plant, generates sufficiently large amounts of RNA, without suffering from the mentioned transcriptional and post-transcriptional problems. The resulting RNA is, however, uncapped and non-polyadenylated.

The invention is further based on the finding by the applicants, that when uncapped RNAs comprising native coding sequences of heterologous genes and suitable translation enhancing sequences derived from 5' and 3' regions of the genomic RNA coding for the coat protein of a necrovirus, such as STNV-2, are introduced in plant cells, these RNAs are translated efficiently.

The invention thus provides the means and methods to transcribe AT-rich genes by an RNA polymerase different from the eukaryotic DNA-dependent RNA polymerase II, to produce uncapped RNAs encoding the protein or polypeptide of interest, which are efficiently translated by the inclusion of translation enhancing sequences from 5' and 3' regions of RNA viruses which allow efficient translation of uncapped RNAs in a cap-independent manner. To this end, cap-independently expressed chimeric genes are provided comprising an AT-rich coding sequence and DNA encoding translation enhancing sequences of a necrovirus, under control of a promoter recognized by an RNA polymerase different from eukaryotic RNA polymerase II. Integration of such chimeric genes in a plant cell expressing the alternative RNA polymerase results in the production of predominantly uncapped and non-polyadenylated RNA transcripts which are translated efficiently due to the presence of the translation enhancing sequences.

As used herein, both "leader" and "5'UTR" refer to the part of a protein-encoding RNA molecule, preceding the initiation codon of the coding sequence. These terms are employed interchangeably and may also be used to refer to a DNA, encoding such a leader. Similarly, "trailer" and "3'UTR" refer to the part of a protein-encoding RNA molecule, downstream of the stop codon of the coding sequences. Again, these terms are employed interchangeably and may also be used to refer to a DNA encoding such a trailer. Generally, but not exclusively, the 5'UTR and 3'UTR of an RNA plant virus mentioned in this specification flank the coding sequence of the coat protein of that virus.

As defined herein, the "5' region" of a protein-encoding RNA molecule refers to the extreme 5' end of that RNA and comprises at least the 5'UTR of that RNA but may include several nucleotides extending immediately downstream of the initiation codon of the homologous coding region. Similarly, the "3' region" of a protein-encoding RNA molecule refers to the extreme 3' end of that RNA and comprises at least the 3'UTR of that RNA but may include several nucleotides extending immediately upstream of the stop codon of the homologous coding region.

As used herein "coding region" or "coding sequence" refers to an RNA molecule or sequence which can be translated into a continuous sequence of amino acids of a biologically active protein or peptide (e.g., an enzyme or a protein toxic to insects) or to the DNA molecule or sequence encoding such an RNA. Whether the "coding region" refers to an RNA or DNA molecule will be readily understood from the context. A coding sequence to be utilized in a cap-independently expressed chimeric gene will generally be derived from the coding region of a heterologous gene, and an appropriate initiation codon has to be provided, if necessary.

A "DNA region encoding an RNA region" may refer to any part of a DNA molecule that is transcribed and thus can relate to the entire transcribed region of a gene, but also to parts thereof, such as part of a coding sequence, a DNA region corresponding to a first or second translation enhancing sequence, a 5' or 3' UTR, or a 5' or 3' region.

Whenever cited in this application, "expression" of a gene refers at least to the combination of phenomena (transcriptional, post-transcriptional and translational events) which result in the production of the primary translation product, a protein or a polypeptide. However, in some instances it will be clear that the term also relates to the effect the translation product or its derivative may have on the phenotype of the cell or of the plant.

A cap-independently-expressed chimeric gene (CIG) of this invention generally comprises

a) a first promoter recognized by a DNA-dependent RNA polymerase, different from eukaryotic DNA-dependent RNA polymerase II,

b) a DNA encoding an RNA molecule which comprises

1) an untranslated leader sequence;

2) a coding region encoding a heterologous protein or polypeptide, preferably an AU-rich coding region; and

3) an untranslated trailer sequence, and, optionally,

c) a terminator sequence recognized by the same RNA polymerase which recognizes the first promoter.

These elements are provided as operably linked components in the 5' to 3' direction.
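As an illustration only, the element order described above can be sketched as a small Python data structure. The class name `CIG`, its field names and the placeholder strings used with it are hypothetical and are not taken from the specification; real promoter, leader, coding, trailer and terminator sequences would replace them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CIG:
    """Hypothetical container for the operably linked elements of a CIG."""
    promoter: str                      # a) first promoter (not an RNA polymerase II promoter)
    leader: str                        # b1) untranslated leader sequence (5' UTR)
    coding_region: str                 # b2) heterologous coding region
    trailer: str                       # b3) untranslated trailer sequence (3' UTR)
    terminator: Optional[str] = None   # c) optional terminator

    def elements_5_to_3(self) -> List[Tuple[str, str]]:
        """Return the elements in 5' to 3' order, skipping an absent terminator."""
        parts = [
            ("promoter", self.promoter),
            ("leader", self.leader),
            ("coding_region", self.coding_region),
            ("trailer", self.trailer),
        ]
        if self.terminator is not None:
            parts.append(("terminator", self.terminator))
        return parts

    def transcribed_region(self) -> str:
        """Leader, coding region and trailer joined in order (element b)."""
        return self.leader + self.coding_region + self.trailer
```

The terminator is modelled as optional because element c) is optional in the text; the promoter is kept separate from the transcribed region in this sketch purely for clarity.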

The CIGs of this invention are further characterized in that they comprise DNAs encoding first and second translation enhancing sequences.

In the uncapped RNA that is encoded by the CIG, the first translation enhancing sequence is generally located in the untranslated leader sequence, but it may overlap with the coding region, i.e., it may extend downstream of the initiation codon of the coding region. Preferably, the first translation enhancing sequence is located around that translation initiation codon.

In the RNA that is encoded by the CIG, the second translation enhancing sequence is generally located in the untranslated trailer sequence, but it may also overlap with the coding region, i.e., it may extend upstream of the stop codon of the coding region. Preferably, the second translation enhancing sequence is located around that stop codon.

Preferred cap-independently expressed chimeric genes of the invention are CIGs as described above, wherein the DNA encoding a heterologous protein or polypeptide is AT-rich. "AT-rich" DNA coding sequences as referred to herein, are those coding DNA sequences, comprising a continuous nucleotide sequence of at least 400 nucleotides, preferably of at least 600 nucleotides in length, with an AT content of at least 55%, preferably of at least 57.5%, particularly of at least 60%, more particularly of at least 62 It goes without saying that "AT rich" coding sequences also include those coding sequences, where the entire coding sequence has an AT content of at least 55%, preferably of at least 57.5%, particularly of at least 60%, especially of at least 62 Evidently, coding sequences smaller than 400 nucleotides are considered AT-rich when the entire coding sequence has an AT content of at least 55%, preferably of at least 57.5%, particularly of at least 60%, especially of at least 62 AT rich coding sequences thus include but are not limited to coding sequences of Bt ICP genes, but also sequences encoding fusion proteins between a Bt ICP and a protein encoded by a GC-rich coding sequence.
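The minimum criteria stated above (a continuous stretch of at least 400 nucleotides with an AT content of at least 55%, or, for shorter sequences, the whole coding sequence at 55% or more) can be checked mechanically. The following Python sketch is illustrative only: the function name, its parameters and the treatment of any character other than A or T as non-AT are assumptions, not part of the specification. It tests every stretch of at least `min_window` nucleotides rather than only fixed 400-nucleotide windows, since a longer stretch can meet the threshold even when no single 400-nucleotide window does.

```python
def is_at_rich(coding_seq: str, min_window: int = 400,
               threshold_permille: int = 550) -> bool:
    """Return True if a DNA coding sequence meets the minimum AT-rich criteria.

    threshold_permille=550 corresponds to an AT content of 55%.
    Integer scores are used so that the comparison is exact.
    """
    seq = coding_seq.upper()
    n = len(seq)
    if n == 0:
        return False

    # Score each base so that a stretch reaches the threshold exactly when
    # the sum of its scores is >= 0.
    scores = [(1000 if base in "AT" else 0) - threshold_permille for base in seq]

    # Sequences shorter than the window are judged on their overall AT content.
    if n < min_window:
        return sum(scores) >= 0

    # Prefix sums: prefix[i] is the total score of seq[:i].
    prefix = [0]
    for score in scores:
        prefix.append(prefix[-1] + score)

    # For every end position, compare against the smallest prefix that lies
    # at least min_window nucleotides upstream.
    min_prefix = prefix[0]
    for end in range(min_window, n + 1):
        min_prefix = min(min_prefix, prefix[end - min_window])
        if prefix[end] - min_prefix >= 0:
            return True
    return False
```

Stricter preferences from the text (for example 57.5% or 60%, or a 600-nucleotide stretch) could be tested by passing other values for `threshold_permille` and `min_window`.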

It is clear that a coding RNA sequence referred to as "AU rich" is defined by the same criteria as an "AT rich DNA", except that thymine is replaced by uracil.

Another class of preferred CIGs are those CIGs wherein the first and second translation enhancing sequences are derived from a TNV strain, particularly from TNV-A, especially from TNV sg RNA 2.

In accordance with the invention, the CIGs are integrated in the nuclear genome of cells of a host plant. In order to transcribe the CIGs independently from the host-encoded RNA polymerase II, so as to produce predominantly uncapped, non-polyadenylated RNA transcripts, these genes contain promoters recognized by the endogenous RNA polymerase I or III of the host, or recognized by a bacteriophage single subunit RNA polymerase. In the latter case, the gene encoding the single subunit RNA polymerase is also introduced and expressed in a functional and properly located form in the same plant cell. It goes without saying that the choice of the RNA polymerase will depend on the particular promoter of the CIG and vice versa.

As used herein, the term "heterologous" with regard to a coding sequence refers to any coding sequence which is different from the coding sequence naturally associated with a 5' UTR or 3' UTR from a viral RNA from which the first or second translation enhancing sequences are derived. Preferably a heterologous coding region does not contain a region of more than 20, preferably not more than 15 codons of the viral RNA coding region. "Homologous" on the contrary means that such a coding sequence is naturally associated with a 5' UTR or 3' UTR from a viral RNA from which the first or second translation enhancing sequences are derived. A heterologous, respectively homologous protein is thus a protein encoded by a heterologous, respectively homologous coding sequence.

As used herein, the term "necrovirus" refers to any plant virus isolate normally included in this taxonomic group, as well as their satellite viruses, exemplified by, but not limited to, tobacco necrosis virus strains, satellite tobacco necrosis virus strains, chenopodium necrosis virus, carnation yellow stripe virus, and lisianthus necrosis virus.

As used herein, the term "native DNA" or "native DNA sequence" refers to a DNA as found in its natural state, as well as a DNA containing small modifications whereby the overall AT content of that DNA is essentially retained, and the amount of modified bases, preferably of modified adenine or thymine, is limited to maximally particularly less than A native DNA with small modifications should have at least preferably 99% sequence identity with respect to that native DNA without such modifications. Examples of such modifications include, but are not limited to, the modification of the nucleotide sequence to introduce or remove a restriction enzyme recognition site or to change one or more amino acids in order to make a protein protease-resistant. For the purpose of the invention, the term native DNA will be used predominantly with regard to all or part of the heterologous coding sequence encoding a biologically functional protein or polypeptide, such as a Bt ICP coding region. In this regard, the native Bt ICP encoding sequence may thus be a truncated version comprising the minimal toxic fragment.

"Viral RNA" as used herein designates any genomic or subgenomic RNA of, or produced by a positive stranded RNA plant virus in nature.

This invention makes use of an RNA polymerase that generates uncapped, non-polyadenylated RNA transcripts of a CIG. The nature of the RNA polymerase evidently determines the first promoter to be included in the CIG and vice versa.

A useful RNA polymerase is a bacteriophage single subunit RNA polymerase such as the RNA polymerases derived from the E. coli phages T7, T3, 41, 411, W31, H, Y, A1, 122, cro, C21, C22, and C2; Pseudomonas putida phage gh-1; Salmonella typhimurium phage SP6; Serratia marcescens phage IV; Citrobacter phage ViIII; and Klebsiella phage No.11 [Hausmann, Current Topics in Microbiology and Immunology, 75: 77-109 (1976); Korsten et al., J. Gen Virol. 43: 57-73 (1975); Dunn et al., Nature New Biology, 230: 94-96 (1971); Towle et al., J. Biol. Chem. 250: 1723-1733 (1975); Butler and Chamberlin, J. Biol. Chem., 257: 5772-5778 (1982)]. Especially preferred are the T3 RNA polymerase and the T7 RNA polymerase. Obviously, when these RNA polymerases are used the first promoter should be a T3 RNA polymerase specific promoter and a T7 RNA polymerase specific promoter, respectively. For the sake of convenience, a T3 RNA polymerase specific promoter and a T7 RNA polymerase specific promoter are referred to as a T3 promoter and a T7 promoter, respectively.

A T3 promoter to be used as a first promoter in the CIG can be any promoter of the T3 genes as described by McGraw et al., Nucl. Acid Res. 13: 6753-6766 (1985). Alternatively, a T3 promoter may be a T7 promoter which is modified at nucleotide positions -10, -11 and -12 in order to be recognized by T3 RNA polymerase [Klement et al., J. Mol. Biol. 215: 21-29 (1990)]. A preferred T3 promoter is the promoter having the "consensus" sequence for a T3 promoter, as described in US Patent 5,037,745.

A T7 promoter which may be used according to the invention, in combination with T7 RNA polymerase, comprises a promoter of one of the T7 genes as described by Dunn and Studier, J. Mol. Biol. 166: 477-535 (1983). A preferred T7 promoter is the promoter having the "consensus" sequence for a T7 promoter, as described by Dunn and Studier (supra).

It should be noted that T3 or T7 promoters as described above include nucleotides immediately downstream of the transcription initiation site. At the 3' end of the described T3 or T7 promoter for use in this invention, up to six nucleotides can be removed to prevent the incorporation of additional nucleotides in the 5' UTR of the transcripts from the CIGs. Particularly preferred are the T3 promoter of SEQ ID No. 18 between the nucleotide positions 14 and 32 and the T7 promoter of SEQ ID No. 30 between nucleotide positions 22 and 39. Another particularly preferred promoter is the T7 promoter of SEQ ID No. 30 between nucleotide positions 22 and 39 followed by 4 nucleotides of the consensus sequence (GGAG) as described by Dunn and Studier (supra).

Another useful RNA polymerase for application in this invention is RNA polymerase I. Accordingly, the CIG of this invention may comprise an RNA polymerase I promoter. RNA polymerase I normally transcribes the tandemly repeated rRNA genes in eukaryotic cells such as plant cells, and the promoter signals are located in the intergenic spacer sequences between the rRNA gene repeats. It is preferred that the RNA polymerase I promoter used in the CIG of this invention originates or is derived from the plant species to be transformed with the CIG, although this is not required.

In a preferred embodiment, a functional RNA polymerase I specific rRNA promoter region from corn derived from the 3 kb intergenic spacer as described for Black Mexican Sweet Maize [McMullen et al., Nucl. Acids Res. 14: 4953-4968 (1986)] is used. A preferred promoter region comprises the nucleotide sequence of the EMBL nucleotide sequence database under accession number X03990 (EMBL X03990, which is herein incorporated by reference) between nucleotide positions 2160 and 2296, particularly a promoter region including all subrepeats of the intergenic spacer, such as a promoter region comprising the nucleotide sequence of EMBL X03990 between nucleotide positions 154 and 3118. Especially preferred is a promoter region wherein some of the subrepeats have been deleted, such as a promoter region comprising the nucleotide sequence of EMBL X03990 between nucleotide positions 939 and 3118. More particularly preferred are promoter regions wherein some or all of the nucleotides downstream of the transcription initiation point have been deleted, such as a promoter region comprising the nucleotide sequence of EMBL X03990 between nucleotide positions 154 and 2590 or a promoter region comprising the nucleotide sequence of EMBL X03990 between nucleotide positions 2160 and 2296. It is clear that for the purpose of the invention corresponding promoter regions from another isolated rRNA intergenic repeat from the same maize variety can be used, as can those from an isolated rRNA intergenic repeat from another maize variety, such as A619 [Toloczyki and Feix, Nucl. Acids Res. 14: 4969-4986 (1986); EMBL Accession No. X03989, incorporated herein by reference].

Particularly preferred are the corresponding RNA polymerase I promoter regions derived from the 3 kb intergenic region of the maize line B73.
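The promoter regions named above are all defined by nucleotide positions in EMBL X03990. A short Python sketch, given purely as an illustration, records those coordinate pairs and shows how a region could be cut out of a sequence string. The dictionary keys, the function names and the assumption that positions are 1-based and inclusive are hypothetical; the sequence itself is not reproduced here and would have to be obtained from the database entry.

```python
from typing import Dict, Tuple

# Coordinate pairs quoted in the text for EMBL X03990 (labels are hypothetical).
PROMOTER_REGIONS: Dict[str, Tuple[int, int]] = {
    "minimal": (2160, 2296),
    "all_subrepeats": (154, 3118),
    "some_subrepeats_deleted": (939, 3118),
    "downstream_nucleotides_deleted": (154, 2590),
}


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return sequence[start..end], assuming 1-based inclusive positions."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("positions fall outside the sequence")
    return sequence[start - 1:end]


def contains(outer: Tuple[int, int], inner: Tuple[int, int]) -> bool:
    """True if the inner coordinate pair lies entirely within the outer pair."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def region_length(region: Tuple[int, int]) -> int:
    """Length in nucleotides of a 1-based inclusive coordinate pair."""
    return region[1] - region[0] + 1
```

Under these assumptions, each of the longer regions contains the 2160 to 2296 region, which is consistent with the text presenting the longer regions as particular forms of the preferred promoter region.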

Other rRNA intergenic spacers, comprising RNA polymerase I promoters which may be used according to the invention, are known in the art for rye [Appels et al., Can J Genet Cytol 28: 673-685 (1986)], wheat [Barker et al., J. Mol. Biol. 201: 1-17 (1988)], radish [Delcasso-Tremousaygue et al., Eur. J. Biochem 172: 767-776 (1988)], rice [Takaiwa et al., Plant Mol. Biol. 15: 933-935 (1990)], mung bean [Gerstner et al., Genome 30: 723-733 (1988), Schiebel et al., Mol Gen Genet 218: 302-307 (1989)], potato [Borisjuk and Hemleben, Plant Mol Biol. 21: 381-384 (1993)], tomato [Schmidt-Puchta et al., Plant Mol Biol 13: 251-253 (1989)], Vicia faba [Kato et al., Plant Mol. Biol. 14: 983-993 (1990)], Pisum sativum [Kato et al., supra (1990)] and Hordeum bulbosum [Procunier et al., Plant Mol Biol. 15: 661-663 (1990)].

Yet another useful RNA polymerase for application in this invention is RNA polymerase III. Accordingly, the cap-independently expressed chimeric gene of this invention may comprise an RNA polymerase III promoter. RNA polymerase III normally transcribes the majority of small RNAs, such as tRNAs, 5S RNAs and small nuclear RNAs (snRNAs) involved in mRNA processing, in eukaryotic cells such as plant cells.

Suitable promoters for this invention recognized by RNA polymerase III are the promoters transcribing snRNAs of plants such as U3 or U6 snRNA from Arabidopsis thaliana [Waibel and Filipowicz, Nucl. Acids Res. 18: 3451-3458 (1990), Marshallsay et al., Nucl. Acids Res. 18: 3459-3466 (1990)] or the promoter transcribing tRNAs of plants such as tRNAmet from soybean [Bourque and Folk, Plant Mol. Biol. 19: 641-647 (1992)].

According to the invention, the transcribed region of a CIG comprises a heterologous AT-rich coding sequence, as defined above. In a preferred embodiment of the invention the transcribed region comprises a sequence encoding a Bt ICP having insecticidal activity to at least one insect species. Especially preferred is a transcribed region comprising a sequence encoding a truncated Bt ICP, which lacks nucleotides either at the 5' end or the 3' end of the coding sequence, or both, but still comprises the sequence coding for the minimal toxic fragment. Particularly preferred Bt ICP encoding sequences for use in this invention are cry1Ab5, cry9C, cry1Ba, cry3C, cry3A, cry1Da and cry1Ea. As used herein, cry1Ab5 represents the cryIAb gene described by Hofte et al., Eur. J. Biochem. 161: 273-280 (1986); cry9C represents the cryIH gene described by Lambert et al., Appl. and Env. Microbiol. 62: 80-86 (1996); cry1Ba represents the cryIB gene described by Brizzard and Whitely, Nucl. Acids Research 16: 4168-4169 (1988); cry3C represents the cryIIID gene described by Lambert et al., Gene 110: 131-132 (1992); cry3A represents the cryIIIA gene described by Hofte et al., Nuc. Acids Res. 15: 7183; cry1Da and cry1Ea represent the bt4 and bt18 genes, respectively, described in WO 90/02801, according to the classification proposed by Crickmore et al., Abstract presented at the 28th annual meeting of the Society for Invertebrate Pathology, 16-21 July 1995. CIGs of the invention may further include the use of genes encoding a Bt ICP fused to a protein allowing selection, such as gentamycin acetyl transferase (GAT) encoded by aac(6') or phosphinothricin acetyl transferase (PAT) encoded by bar. CIGs encoding chimeric toxins, wherein a domain of the toxic Bt ICP fragment has been exchanged for a similar domain of another Bt ICP, as described by Bosch et al. [BIO/TECHNOLOGY 12: 915-918 (1994)] are also encompassed by the invention.

The CIGs according to the invention may be polycistronic, comprising between the first and second translation enhancing sequence at least 2 and up to 5 cistrons, although more cistrons may be possible.

Transcription of such a polycistronic CIG yields polycistronic RNA that should preferably comprise an internal ribosome entry site [Jackson and Kaminski, RNA 1: 985-1000 (1995); Levis and Astier-Manifacier, Virus Genes 7: 367-379 (1993); Basso et al., J. Gen. Virology 75: 3157-3165 (1994)] between the cistrons. For the purpose of this invention it is preferred that at least one cistron is AT-rich.

The CIGs used in the invention may further include a terminator recognized by the RNA polymerase which is used to enable transcription of the CIG. Suitable terminators are known in the art and should preferably be chosen according to the specific promoter that is used. For instance, when a T3 promoter is used, a T3 specific terminator such as described by Sengupta et al., J. Biol. Chem. 264: 14246-14255 (1989), preferably in a duplicated form, can be used. Since a T7 RNA polymerase terminates as efficiently on a T3 terminator (T3-Tφ) as on a T7 terminator (T7-Tφ) [Macdonald et al., J. Mol. Biol. 232: 1030-1047 (1993)], a terminator region comprising T3-Tφ may be used both for CIGs containing a T3 promoter and for those containing a T7 promoter.

Alternatively when promoters specifically recognized by RNA polymerase I are used, the terminator regions used should comprise the corresponding species-specific RNA polymerase I terminators which are present in the intergenic regions between the rRNA repeats [Reeder and Lang, Molecular Microbiology 12: 11-15 (1994)].

When promoters specifically recognized by RNA polymerase III are used, the terminator regions used may comprise the corresponding trailer sequences associated with genes normally transcribed by RNA polymerase III, such as the genes encoding U3 or U6 snRNA from Arabidopsis thaliana [Waibel and Filipowicz, supra, Marshallsay et al. supra] or the gene encoding tRNAmet from soybean [Bourque and Folk, supra].

According to the invention, the CIG integrated in the nuclear genome of a plant cell, is transcribed in an RNA polymerase II independent manner.

This can be achieved in accordance with the invention by incorporating in the CIG a promoter and terminator as described above. Whenever the transgenic plant cells do not naturally contain the RNA polymerase required for the recognition of the promoter and transcription of the CIG, these cells need to comprise a second chimeric gene encoding that RNA polymerase, further referred to as the chimeric polymerase gene. When promoters recognized by single subunit RNA polymerases of bacteriophages (e.g., T7 or T3 promoters) are used, a chimeric polymerase gene encoding a T7 or T3 RNA polymerase [US 5,102,802] should also be incorporated in the nuclear DNA of the host plant cell. Further, mutant bacteriophage RNA polymerases, as exemplified for T7 RNA polymerase by Macdonald et al., J. Mol. Biol. 238: 145-148 (1994), may be used in this invention. Such mutant bacteriophage T7 RNA polymerases no longer recognize the rare termination signals encountered in heterologous genes under control of a T7 promoter, while still terminating at bona fide T7 RNA polymerase termination signals. Also, hybrid bacteriophage RNA polymerases as described by Joho et al., J. Mol. Biol. 215: 31-39 (1990), with altered specificity and promoter preference, may be used according to the invention.

Methods to express such bacteriophage RNA polymerases in plant cells, in a functional and properly located form, have been described [Lassner et al., Plant Mol. Biol. 17: 229-234 (1991); EP 0589841]. The chimeric polymerase gene comprises a 5' regulatory region, i.e., the promoter region, necessary for expression in plant cells. This plant-expressible promoter may be a constitutive promoter, such as a 35S promoter [Odell et al., Nature 313: 810-812], or may be regulated in a tissue-specific way, such as the promoters disclosed in WO 92/13957, WO 92/13956 or EP 0344029. Another suitable regulated promoter is a light-inducible promoter such as the promoter of the small subunit of Rubisco.

The expression of the single subunit bacteriophage RNA polymerase may also be temporally regulated using promoters which are only expressed at a certain developmental stage, or are induced by external stimuli such as nematode-feeding (WO 92/215757), or fungus-infection (WO 93/19188).

Further suitable promoters are plant-expressible promoters regulated by the presence of plant-growth regulators such as abscisic acid, steroid-inducible promoters or copper-inducible promoters.

The spatial or temporal regulation of the promoter used in the chimeric polymerase gene will of course be reflected in the expression pattern of the single subunit bacteriophage RNA polymerase in the transformed plants of this invention, and ultimately in the expression pattern of the CIG comprising the corresponding promoter.

In order to be expressed in a properly located form according to the invention, the single subunit bacteriophage RNA polymerase should be operably linked to a nuclear localization signal (NLS) [Raikhel, Plant Physiol. 100: 1627-1632 (1992) and references therein], such as the NLS of SV40 large T-antigen [Kalderon et al., Cell 39: 499-509 (1984)]. It is known that the NLS can be operably linked to the polymerase in different ways. Preferably, the NLS is joined to the amino-terminus of the polymerase, or located within the N-terminal region of the polymerase, particularly within the first 20 amino acids of the polymerase, more particularly between amino acids 10 and 11 of the T7 polymerase.

The chimeric polymerase gene may further include any other necessary regulatory sequences such as terminators [Guerineau et al., Mol. Gen. Genet. 226: 141-144 (1991); Proudfoot, Cell 64: 671-674 (1991); Sanfaçon et al., Genes Dev. 5: 141-149 (1991); Mogen et al., Plant Cell 2: 1261-1272 (1990); Munroe et al., Gene 91: 151-158 (1990); Ballas et al., Nucleic Acids Research 17: 7891-7903 (1989); Joshi et al., Nucleic Acids Research 15: 9627-9639 (1987)], plant translation initiation consensus sequences [Joshi, Nucleic Acids Research 15: 6643-6653 (1987)], introns [Luehrsen and Walbot, Mol. Gen. Genet. 225: 81-93 (1991)] and the like, operably linked to the nucleotide sequence of the chimeric polymerase gene.

According to the invention, the first and second translation enhancing sequences which may be used are preferably derived from positive-stranded RNA viruses. Preferred translation enhancing sequences are derived from necroviruses, preferably from STNV or TNV strains, especially from STNV-2 or TNV-A sgRNA2.

A first translation enhancing sequence, derived from a 5' region of a viral RNA, predominantly contains sequences of the 5' UTR of that viral RNA and is comprised within the 5' region of the CIG; similarly, a second translation enhancing sequence, derived from a 3' region of a viral RNA, predominantly contains sequences of the 3' UTR of that RNA and is comprised within the 3' region of the CIG. For the purpose of the invention, suitable first and second translation enhancing sequences for use in an uncapped RNA of this invention are those combinations which, operably contained within such an uncapped RNA encoding a protein, allow the uncapped, non-polyadenylated RNA of this invention to be translated in plant protoplasts to a peak level (e·t½/ln 2; see the end of this section for the mathematical formula allowing estimation of the functional half-life of the RNA and the translation efficiency) of the mentioned protein of at least [...]%, preferably at least 25%, of the peak level resulting from in vivo translation of a similar capped, non-polyadenylated first reference RNA (i.e., a first reference RNA identical to the uncapped RNA but with a cap structure). The peak level resulting from in vivo translation of the capped, non-polyadenylated first reference RNA should be at least 10% of the peak level resulting from in vivo translation of a second reference RNA which is capped and polyadenylated and comprises the Ω leader of TMV [Gallie et al., Nucl. Acids Res. 15: 8693-8711 (1987)], a coding sequence encoding essentially the same protein as the first reference RNA, preferably the same protein as used in the first reference RNA, and a poly(A) tail comprising around 100 A-residues, such a second reference RNA being extremely efficiently translated. Schematic relative protein profiles are represented in Figures 1A and 1B (the percentages indicated are those obtained for RNAs comprising TNV sgRNA2 derived translation enhancing sequences).
For practical purposes, determination of peak levels can be substituted by determination of protein steady-state levels, the latter being determined a sufficiently long time (e.g., 5 hours for a cat RNA) after RNA introduction in the protoplasts.

Methods to generate capped and uncapped RNAs in vitro, for the introduction of such RNAs in plant protoplasts and to compare the translation efficiencies and functional half-lives of RNAs are described at the end of this section, as well as in Examples 2, 3 and 4.

The translation enhancing sequences are largely derived from sequences comprised in the leaders and trailers of genomic or subgenomic viral RNAs (e.g., Fig 2A). However, for optimal enhancing of cap-independent translation in vivo, it may be necessary to use a first translation enhancing sequence comprising nucleotide sequences extending immediately downstream of the initiation codon of the homologous protein (i.e., comprising nucleotides of the 5' end of the viral homologous coding sequence; e.g., Fig 2A), or to use a second translation enhancing sequence comprising nucleotide sequences extending immediately upstream of the stop codon of the homologous protein (i.e., comprising nucleotides of the 3' end of the viral homologous coding sequence; e.g., Fig 2A). On the other hand, in several instances, only parts of the natural 5'UTR or 3'UTR, or derivatives thereof (see below), are suitable to provide translational enhancement (e.g., Fig 2B). Figure 2A schematically summarizes the different possible positions of nucleotide sequences comprising translation enhancing sequences (indicated by the thin lines) with reference to the homologous coding sequence (CDS; indicated as a solid black bar) and the 5' and 3' untranslated regions (5'UTR and 3'UTR; indicated as open bars) of a viral genomic or subgenomic RNA. First translation enhancing sequences include those indicated by 1-4; second translation enhancing sequences include those indicated by 5-8.

Satellite tobacco necrosis virus (STNV) and tobacco necrosis virus (TNV) are plant viruses belonging to the necrovirus group. STNV is a satellite virus that relies upon the viral RNA replicase of the helper virus (TNV) for its replication, but codes for its own coat protein (CP). The genome consists of one single-stranded RNA strand with positive polarity, and the nucleotide sequence is known for several strains. Generally, the nucleotide sequence consists of a leader sequence or 5' untranslated region of 29-32 nucleotides (nt), a CP encoding region of 588-597 nt, and a trailer sequence or 3' UTR of 616-622 nt [Ysebaert et al., J. Mol. Biol. 143: 273-287 (1980); Danthinne et al., Virology 185: 605-614 (1991)].

The 5' UTRs of the STNV strains are nearly identical and can fold into a hairpin structure with a stem of 6 or 7 bp enclosing a loop of seven residues. The trailer sequences, which exhibit 64% sequence identity between the nucleotide sequences of STNV-1 and STNV-2, can fold into a secondary structure consisting of three (or four) pseudoknots flanked by two hairpins, ending with an extended double helix that spans the last 350 residues of the sequence and includes several internal loops, bulged-out nucleotides, and bifurcations [Danthinne et al. (1991), supra].

The STNV RNA does not contain an m7G cap structure, nor a covalently linked virus-encoded protein at the 5' end. Neither does it contain a poly(A) tail at the 3' end [Horst et al., Biochemistry 10: 4748-4752 (1971); Smith and Clark, Biochemistry 18: 1366-1371 (1976)]. Yet, STNV RNA is translated efficiently in vitro. Mutations and deletions in the STNV RNA, followed by in vitro translation of the mutant RNAs, identified a translation enhancing sequence (designated the translational enhancer domain or TED), comprising a conserved hairpin structure immediately downstream from the CP cistron (nucleotide 632 to nucleotide 749 for STNV-2) [Danthinne et al., Mol. Cell. Biol. 13: 3340-3349 (1993); Timmer et al., J. Biol. Chem. 268: 9504-9510 (1993)]. TED enhances in vitro translation when fused to a heterologous coding sequence (encoding β-glucuronidase), but the level of enhancement depends on the nature of the 5' UTR and is larger in combination with the STNV 5' terminally located 173 nucleotides [Danthinne et al. (1993), supra]. It has been found that including an additional 11 bp of the STNV-2 sequence located immediately downstream of the conserved hairpin (nucleotide 632 to nucleotide 760 for STNV-2) into a second translation enhancing sequence enhances two-fold the cap-independent translation in vitro of a heterologous coding sequence, as compared to cap-independent translation conferred by a second translation enhancing sequence comprising the hairpin plus an additional 4 nt of the STNV-2 sequence.

Preferred first translation enhancing sequences comprise the leader of STNV-2; especially preferred is a first translation enhancing sequence comprising the nucleotide sequence between nucleotide positions 1 and 32 of SEQ ID No. 2; particularly preferred is a first translation enhancing sequence comprising the nucleotide sequence between nucleotide positions 1 and 38 of SEQ ID No. 2, comprising an initiation codon and the second codon of the coat protein coding sequence.

Preferred second translation enhancing sequences comprise portions effective in enhancing translation of uncapped RNAs, derived from the trailer sequence of STNV-2, particularly the nucleotide sequence between nucleotide positions 632 and 753 of SEQ ID No.2, quite particularly the nucleotide sequence of SEQ ID No. 2 between nucleotide positions 632 and 760.

TNV is a small icosahedral plant virus, with a single genomic RNA of about 3.7 kb. The nucleotide sequence of different isolates has been published (except for some terminal nucleotides) [Meulewaeter et al., Virology 177: 699-709 (1990); Coutts et al., J. Gen. Virol. 72: 1521-1529 (1991)]. Upon infection of plant cells, six TNV specific RNAs are produced: the genomic RNA, two subgenomic (sg) RNAs of 1.5 kb (sgRNA1; starts at nt 2184 of TNV-A) and 1.2 kb (sgRNA2; starts at nt 2461) which are 3' coterminal, and the corresponding minus-strand RNAs. The RNA of TNV strain A (TNV-A) contains six major open reading frames (ORFs) and most likely serves as mRNA for the synthesis of a 23-kDa protein and an 82-kDa read-through protein, which are encoded by ORFs 1 and 2. In plants, the internal cistrons are most probably expressed from the two 3'-coterminal subgenomic RNAs. The 5' ends of the largest and smallest subgenomic RNAs are located upstream of ORFs 3 and 5, respectively [Meulewaeter et al., J. Virology 66: 6419-6428 (1992)]. A very similar genome organization was proposed for TNV-D and for the carmovirus melon necrotic spot virus [Riviere and Rochon, J. Gen. Virol. 71: 1887-1896 (1990)]. The smallest subgenomic RNA probably directs the synthesis of the viral coat protein [Meulewaeter et al., J. Virology 66: 6419-6428 (1992)]. It comprises a 5' UTR of 152 nt, with a G content of only 11.8%, that precedes the start codon of the coat protein gene. The coat protein gene is followed by a trailer sequence of 241 nucleotides.

In the context of the invention, the inventors have identified translation enhancing sequences derived from the TNV-A virus. Preferred first translation enhancing sequences comprise portions derived from the 5' regions of TNV-A sgRNA2, such as the nucleotide sequence of SEQ ID No. 1 between nucleotide positions 2461 and 2619, which still comprises 7 nucleotides of the coat protein coding sequence. Especially preferred is a first translation enhancing sequence comprising the nucleotide sequence between nucleotide positions 2461 and 2612 of SEQ ID No. 1, particularly the nucleotide sequence between nucleotide positions 2461 and 2603 of SEQ ID No. 1, more particularly the nucleotide sequence between nucleotide positions 2461 and 2598 of SEQ ID No. 1.

Preferred second translation enhancing sequences comprise portions effective in enhancing translation of uncapped RNAs, derived from the 3' region sequence of the TNV sgRNA2, particularly the nucleotide sequence between positions 3399 and 3684 of SEQ ID No. 1, which still comprises 41 nucleotides upstream of the stop codon of the coat protein coding sequence, preferably the nucleotide sequence between nucleotide positions 3429 and 3611 of SEQ ID No. 1, especially the nucleotide sequence between nucleotide positions 3472 and 3611 of SEQ ID No. 1.

The translation enhancing sequences as derived from the 5' regions or 3' regions of an RNA plant virus can be modified by small insertions, deletions or substitutions, so that their capacity to enhance cap-independent translation or their synergistic interaction is not negatively affected. Such variants are referred to herein as "derivatives", and their use as enhancers for cap-independent translation forms part of the invention.

Generally, it is preferred that such a derivative has at least 90% sequence identity to the natural translation enhancing sequence.

For the purpose of this invention, the sequence identity of two related nucleotide or amino acid sequences refers to the number of positions in the two optimally aligned sequences which have identical residues (×100) divided by the number of positions compared. A gap, i.e., a position in an alignment where a residue is present in one sequence but not in the other, is regarded as a position with non-identical residues.
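The identity measure defined above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been optimally aligned by some external tool and are supplied as equal-length strings in which `-` marks a gap; the function name `percent_identity` and the toy sequences are illustrative assumptions and do not come from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: the alignment itself (the "optimal alignment" of the text)
    was produced elsewhere; this function only counts positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Columns where both sequences show a gap contain no residue and are skipped.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no positions to compare")
    # A gap opposite a residue is counted as a non-identical position.
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


# Toy example (hypothetical sequences): one gap in ten compared positions.
print(percent_identity("ACGTACGTAC", "ACGTAC-TAC"))
```

Under this reading, a derivative with a single gap against a ten-position reference would just meet a 90% threshold.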

It is however preferred, for optimal translation enhancing effect, that the nucleotide stretches which allow interactions between a pair of first and second translation enhancing sequences or between one or both of the translation enhancing sequences and the 3' end of the 18S rRNA, are left unchanged. For example, when using as first translation enhancing sequence the nucleotide sequence of SEQ ID No. 1 between nucleotide positions 2461 and 2619 and as second translation enhancing sequence the nucleotide sequence of SEQ ID No. 1 between nucleotide positions 3399 and 3684, the sequences of SEQ ID No. 1 between nucleotide positions 2464 and 2479, between nucleotide positions 2563 and 2567, between nucleotide positions 2571 and 2574, between nucleotide positions 2576 and 2586, between nucleotide positions 3449 and 3463, between nucleotide positions 3465 and 3472, and between nucleotide positions 3475 and 3482 are left unchanged.

For the same reason, when using as first translation enhancing sequence the nucleotide sequence of SEQ ID No. 2 between nucleotide positions 1 and 38, and as second translation enhancing sequence the nucleotide sequence of SEQ ID No. 2 between nucleotide positions 632 and 753, it is preferred that sequences of SEQ ID No. 2 between nucleotide positions 9 and 19, between nucleotide positions 24 and [...], between nucleotide positions 33 and 37, between nucleotide positions 636 and 640, between nucleotide positions 646 and 652, and between nucleotide positions 692 and 698 are left unchanged. Nevertheless, if one of these regions is changed, it is important to make the corresponding mutations in the appropriate complementary region.

To the extent that these sequences are included in the indicated alternative translation enhancing sequences, it is preferred that they are left unchanged to obtain optimal cap-independent translation with these sequences.

It is clear that first and second translation enhancing sequences may be derived from a different RNA virus, or from different genomic or subgenomic RNAs from the same virus. However, due to the fact that the first and second translation enhancing sequences often interact in enhancing cap-independent translation (e.g., when derived from STNV or TNV strains), it is preferred that first and second translation enhancing sequences are derived from the same genomic or subgenomic viral RNA.

Different possible positions of the first and second translation enhancing sequences in the chimeric RNAs encoded by the cap-independently expressed chimeric genes, with respect to the heterologous coding sequence and untranslated regions (indicated i to iv), are schematically represented in Figure 2B. In this figure the heterologous coding sequence is indicated by a dotted bar. Translation enhancing sequences are indicated by the same bracketed arabic numbers as in Figure 2A, and the portions of 5'UTR and 3'UTR and/or homologous coding sequence are indicated using the same color code as in Figure 2B.

Thick black lines refer to unrelated sequences, such as the intervening sequences between a first or a second translation enhancing sequence and the heterologous coding sequence.

It is preferred that a first translation enhancing sequence is located in the 5' region of the chimeric RNA transcribed from the CIG, particularly in the 5' UTR of the chimeric RNA (e.g., Fig 2B i, ii and iii) or in a region surrounding the translation initiation codon of the heterologous sequence; in other words, the translation initiation codon may be comprised within the first translation enhancing sequence (e.g., Fig 2B iv). Likewise, it is preferred that a second translation enhancing sequence is located in the 3' region of the chimeric RNA transcribed from the CIG, particularly in the 3' UTR of the chimeric RNA (e.g., Fig 2B i, ii and iii) or in a region surrounding the translation stop codon of the heterologous sequence; in other words, the translation stop codon of the heterologous sequence may be comprised within the second translation enhancing sequence (e.g., Fig 2B iv).

The first translation enhancing sequence may be located immediately upstream of the initiation codon of the coding sequence, or it may be spaced therefrom by an intervening sequence of up to 100 nt, preferably up to 50 nt (see e.g., Fig 2B ii and iii). Similarly, the second translation enhancing sequence may be located immediately downstream of the stop codon of the coding sequence, or it may be spaced therefrom by an intervening sequence of up to 100 nt, preferably up to 50 nt (see e.g., Fig 2B ii and iii).

Moreover, for maximal translation enhancing effect, it may be necessary to make a translational fusion between a first translation enhancing sequence comprising nucleotide sequences extending immediately downstream of the initiation codon of the homologous coding sequences, and the coding sequence of interest (e.g., Fig 2B iv). Likewise, it may be necessary to make a translational fusion between a second translation enhancing sequence, including nucleotide sequences extending immediately upstream of the stop codon of the homologous coding sequences, and the coding sequence of interest (e.g., Fig 2B iv).

For the purpose of the invention the term "translational enhancing sequence" refers to a part of an RNA molecule or RNA sequence, but may also be used to refer to a DNA molecule encoding such part.

The DNA regions encoding the translational enhancers used in this invention may be directly derived from a cDNA copy of the RNA from positive-stranded RNA viruses, but may also be partly or completely synthesized chemically.

It should be noted, to avoid ambiguity, that whenever a sequence is referred to as being the sequence between the nucleotide at position x and the nucleotide at position y, the resulting sequence includes both the nucleotide at position x and the nucleotide at position y. Moreover, as leaders and trailers evidently are parts of RNA molecules, while the sequences in the sequence listing refer to DNA molecules, it is clear that when it is stated in the description or the claims that a leader or trailer or translation enhancing sequence in an RNA comprises a nucleotide sequence as in the sequence listing, the nucleotide sequence referred to is actually the non-transcribed strand of the double-stranded DNA molecule presented in the sequence listing, which can be transcribed into the mentioned leader or trailer RNA. In other words, the actual base-sequence of the leader or trailer RNA molecule is identical to the base-sequence of the DNA molecule represented in the SEQ ID No referred to, except that thymine is replaced by uracil.
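The coordinate convention just stated (positions x and y both included, DNA listing read as RNA by replacing thymine with uracil) can be sketched in Python as follows. The helper name `rna_between` and the ten-nucleotide toy string are assumptions made for illustration only; the toy string is not taken from SEQ ID No. 1 or 2.

```python
def rna_between(dna_listing: str, x: int, y: int) -> str:
    """Return the RNA sequence between positions x and y of a DNA listing.

    Positions are 1-based and both end points are included, as in the text.
    The DNA listing is assumed to show the non-transcribed strand, so the RNA
    has the same base sequence with thymine (T) replaced by uracil (U).
    """
    if not (1 <= x <= y <= len(dna_listing)):
        raise ValueError("positions must satisfy 1 <= x <= y <= sequence length")
    # 1-based inclusive [x, y] corresponds to the 0-based half-open slice [x-1, y).
    segment = dna_listing[x - 1:y].upper()
    return segment.replace("T", "U")


# Toy example with a hypothetical 10-nt listing: positions 2..5 give 4 nt.
toy_listing = "ATGCTTAGGC"
print(rna_between(toy_listing, 2, 5))
```

With this convention, a range such as "between nucleotide positions 632 and 760" spans 760 - 632 + 1 = 129 nucleotides.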

Further combinations of 5' regions and 3' regions derived from plant viruses, known in the art to stimulate translation of uncapped RNA in vitro, include a leader and trailer from barley yellow dwarf virus serotype PAV [Wang and Miller, J. Biol. Chem. 270: 13446-13452 (1995)]. Translation enhancing sequences derived from these 5' UTR and 3' UTR may also be used according to the invention.

The secondary structure prediction of the sequence of sgRNA2 from TNV-AC36 revealed that the conserved secondary structures between the trailer of TNV-A and TNV-AC36 correspond to the region comprising the second translation enhancing sequence of TNV-A. It is therefore expected that the 5' regions and 3' regions of the sgRNA2 from TNV-AC36 can be used according to the invention. Preferred first translation enhancing sequences of TNV-AC36 comprise the nucleotide sequence of SEQ ID No. 40, particularly the nucleotide sequence of SEQ ID No. 40 between nucleotide positions 1 and 90. Preferred second translation enhancing sequences comprise the nucleotide sequence of SEQ ID No. 41, particularly the nucleotide sequence of SEQ ID No. 41 between nucleotide positions 102 and 227.

CIGs of the invention encode an RNA comprising first and second translational enhancing sequences in their 5' and 3' regions, but these regions may include additional sequence elements. Whereas the presence of an intron in the 5'UTR, or a polyadenylation signal in the 3'UTR, is less suitable for the present invention, the region surrounding the initiation codon of the CIG may be adapted to include plant translation initiation consensus sequences [Joshi, Nucleic Acids Research 15: 6643-6653 (1987)].

It is clear that the CIGs of the invention can further comprise one or more functional elements that can increase expression of the CIG, particularly increase the transcription of the CIG. Such functional elements include DNA sequences which enhance the accessibility of the promoter of the CIG for the cognate polymerase, such as DNA sequences influencing the local chromatin structure (scaffold attachment regions, matrix attachment regions as described by Breyne et al. [The Plant Cell 4: 463-471 (1992)], Allen et al. [The Plant Cell 5: 603-613 (1993)] or in WO 94/07902).

The invention is especially useful for the efficient expression of AT-rich coding sequences, especially those encoding Bt ICPs, particularly native coding regions encoding Bt ICPs, integrated in the nuclear DNA of plants. Use of the methods and means of this invention avoids many problems associated with the RNA polymerase II dependent expression of such genes. However, this invention can be used for the efficient expression of any gene. In this regard, the use of first and second translation enhancing sequences derived from TNV sgRNA2 to increase the production of heterologous gene products in plant cells, when combined with the efficient production of predominantly uncapped, non-polyadenylated transcripts by a bacteriophage single subunit RNA polymerase, such as T3 or T7 RNA polymerase, is particularly important.

The present invention can therefore be used for the efficient production of any protein or polypeptide of interest by the use of a CIG comprising a suitable promoter such as a T3 or T7 promoter, a DNA encoding a first translation enhancing sequence derived from STNV-2 or TNV sgRNA2, a DNA region encoding a heterologous protein or polypeptide of interest, a DNA encoding a second translation enhancing sequence derived from STNV-2 or TNV sgRNA2, and a terminator recognized by the bacteriophage RNA polymerase used. Transcription of the CIG by a single subunit RNA polymerase, such as T3 or T7 RNA polymerase, yields predominantly uncapped RNA without a poly(A) tail that is efficiently translated due to the presence of the first and second translation enhancing sequences. Thus, a wide variety of peptides or proteins can be produced in plants using genes such as those coding for peptides or proteins with pharmaceutical interest, for seed proteins modified so as to enhance nutritional value or to include peptides of interest, for chaperonins, or for bactericidal or bacteriostatic peptides. Also contemplated are genes which upon expression lead to plants having an increased resistance to herbicides (e.g., phosphinothricin, glyphosate, triazines), plants that can better withstand adverse environmental factors (e.g., high salt concentrations in the soil, extreme temperatures, etc.), or plants that have enhanced phytopathogen resistance. The invention may also be used to express to a high level inhibitors of proteases, amylases or RNases (e.g., barnase-inhibiting barstar).
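The 5'-to-3' order of elements in such a CIG (promoter, first translation enhancing sequence, coding region, second translation enhancing sequence, terminator) can be sketched as a small Python data structure. This is only an illustration of the element order described above: the class and function names are hypothetical, optional intervening sequences are ignored, and the short strings are placeholders rather than real promoter, enhancer or terminator sequences.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Element:
    role: str       # which part of the CIG this element represents
    sequence: str   # DNA sequence of the element (placeholder strings below)


# Element order as described in the text, read 5' to 3'.
EXPECTED_ORDER = (
    "promoter",
    "first_tes",       # DNA encoding the first translation enhancing sequence
    "coding_region",   # DNA encoding the heterologous protein of interest
    "second_tes",      # DNA encoding the second translation enhancing sequence
    "terminator",
)


def assemble_cig(elements: Sequence[Element]) -> str:
    """Check the element order and return the concatenated DNA sequence."""
    roles = tuple(e.role for e in elements)
    if roles != EXPECTED_ORDER:
        raise ValueError(f"expected element order {EXPECTED_ORDER}, got {roles}")
    return "".join(e.sequence for e in elements)


# Placeholder sequences only; none of these strings is a real genetic element.
example = [
    Element("promoter", "AAAA"),
    Element("first_tes", "CC"),
    Element("coding_region", "ATGTAA"),
    Element("second_tes", "GG"),
    Element("terminator", "TTTT"),
]
print(assemble_cig(example))
```

A fuller model would also need to represent spacers and translational fusions, which this sketch deliberately leaves out.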

It goes without saying that to achieve the goal of this embodiment of the invention any viral single subunit polymerase and corresponding promoter can be used.

Preferably, the recombinant DNA comprising the CIGs also comprises a conventional chimeric marker gene. The chimeric marker gene can comprise a marker DNA that is under the control of, and operatively linked at its 5' end to, a promoter, preferably a constitutive plant-expressible promoter, such as a CaMV 35S promoter, or a light-inducible promoter such as the promoter of the gene encoding the small subunit of Rubisco; and operatively linked at its 3' end to suitable plant transcription termination and polyadenylation signals. The marker DNA preferably encodes an RNA, protein or polypeptide which, when expressed in the cells of a plant, allows such cells to be readily separated from those cells in which the marker DNA is not expressed. The choice of the marker DNA is not critical, and any suitable marker DNA can be selected in a well known manner. For example, a marker DNA can encode a protein that provides a distinguishable color to the transformed plant cell, such as the A1 gene [Meyer et al., Nature 330: 677 (1987)], can encode a fluorescent protein [Chalfie et al., Science 263: 802-805 (1994); Crameri et al., Nature Biotechnology 14: 315-319 (1996)], can encode a protein that provides herbicide resistance to the transformed plant cell, such as the bar gene, encoding PAT which provides resistance to phosphinothricin (EP 0242246), or can encode a protein that provides antibiotic resistance to the transformed cells, such as the aac(6') gene, encoding GAT which provides resistance to gentamycin (WO 94/01560).

In an alternative embodiment, the marker gene could be operably linked to similar expression controls, i.e., promoter, first and second translation enhancing sequences and terminator, as used for the CIG, thereby allowing direct selection for transgenic cell lines wherein cap-independent translation occurs very efficiently.

In transgenic plants the chimeric polymerase gene is preferably in the same genetic locus as the CIG so as to ensure their joint segregation.

This can be obtained by combining both chimeric genes on a single transforming DNA, such as a vector or as part of the same T-DNA.

However, joint segregation is not always desirable. Therefore both constructs can be present on separate transforming DNAs, so that transformation might result in the integration of the two constructs at different locations in the plant genome, or even in separate lines, which subsequently have to be crossed to yield a hybrid plant whereby the CIG and chimeric polymerase are joined in a single cell.

In accordance with the present invention, a plant expressing a chimeric gene in a cap-independent manner, can be obtained from a single plant cell by transforming the cell in a known manner, resulting in the stable incorporation of a cap-independently expressed chimeric gene of the invention into the nuclear genome.

A recombinant DNA of the invention, a recombinant DNA comprising a CIG, a chimeric polymerase gene and/or a chimeric marker gene, can be incorporated in the nuclear DNA of a cell of a plant, particularly a plant that is susceptible to Agrobacterium-mediated transformation. Gene transfer can be carried out with a vector that is a disarmed Ti-plasmid, comprising the recombinant DNA of the invention, and carried by Agrobacterium. This transformation can be carried out using the procedures described, for example, in EP 0116718. Ti-plasmid vector systems comprise the recombinant DNA of the invention between the T-DNA border sequences, or at least to the left of the right T-DNA border.

Alternatively, any other type of vector can be used to transform the plant cell, applying methods such as direct gene transfer (as described, for example, in EP 0233247), pollen-mediated transformation (as described, for example, in EP 0270356, WO 85/01856 and US 4,684,611), plant RNA virus-mediated transformation (as described, for example, in EP 0067553 and US 4,407,956), liposome-mediated transformation (as described, for example, in US 4,536,475), and the like.

Other methods, such as microprojectile bombardment as described, for example, by Fromm et al. [(1990), Bio/Technology 8: 833] and Gordon-Kamm et al. [(1990), The Plant Cell 2: 603], are suitable as well. Cells of monocotyledonous plants, such as the major cereals, can also be transformed using wounded or enzyme-degraded intact tissue (such as immature seedlings in corn) or the embryogenic callus obtained therefrom (such as type I callus of corn), as described in WO 92/09696. Corn protoplasts can be transformed using the methods of EP 0469273. The resulting transformed plant cell can then be used to regenerate a transformed plant in a conventional manner.

The obtained transformed plant can be used in a conventional breeding scheme to produce more transformed plants with the same characteristics or to introduce the cap-independently expressed chimeric gene or the chimeric polymerase gene of the invention, or both, in other varieties of the same or related plant species. Seeds obtained from the transformed plants contain the CIG of the invention as a stable genomic insert.

The transgenic plant according to the invention may be a dicotyledonous or a monocotyledonous plant. Preferred dicotyledonous plants are potato, tomato, cotton, selected Brassica species such as oilseed rape, tobacco, soybean. Preferred monocotyledonous plants are corn, wheat, rice and barley.

The following examples provide additional description of the identification of translation enhancing sequences derived from TNV sgRNA2, the use of such translation enhancing sequences derived from necroviruses to stimulate expression in vitro and in vivo of heterologous genes (comprising genes with native coding sequences coding for Bt ICPs), construction of plant transformation vectors comprising CIGs including DNA copies of said translation enhancing elements of necroviruses, further operably linked to a promoter region recognized by an RNA polymerase capable of producing predominantly uncapped, non-polyadenylated RNA, and the use of such vectors to obtain plant cells and plants comprising CIGs, further comprising an RNA polymerase capable of producing uncapped, non-polyadenylated RNA. These examples are not intended to unduly restrict the invention to the uses described therein.

Throughout these examples the following materials and methods were employed, unless stated otherwise:

In vitro transcription of uncapped and capped RNAs.

Uncapped RNAs were produced by in vitro transcription of linear DNA templates (either plasmids treated with restriction enzymes, or polymerase chain reaction (PCR) fragments) containing the appropriate promoter region, using T7 RNA polymerase (Pharmacia, Uppsala, Sweden) or T3 RNA polymerase (Pharmacia), essentially as described by Krieg and Melton, Nucl. Acids Res. 12: 7057-7070 (1984), modified in that after 90 min of incubation at 37°C extra NTPs (0.5 mM) and RNA polymerase (0.3 U/µl) were added, and the reaction was further incubated for 60 min at 37°C. After the reaction the DNA template was removed by adding 1.5 U/µl DNaseI (Pharmacia, Uppsala, Sweden) and incubating further for 10 min at 37°C.

Subsequently, the mixture was purified by phenol extraction, and passed through a Sephadex G-50 column (Pharmacia, Uppsala, Sweden). RNA was precipitated in 0.09 M K-acetate and 66% ethanol, and resuspended in RNase-free H2O. RNA concentration was determined by measuring OD260. The integrity of the transcripts was verified by formaldehyde-agarose gel electrophoresis. Capped RNAs were obtained by modifying the reaction conditions to include 0.5 mM m7GpppG and 0.05 mM GTP during the first 30 minutes of incubation.

In vitro translation of RNAs and computer aided data analysis.

Cell-free translation of in vitro synthesized RNA transcripts was performed in a wheat germ extract prepared according to Morch et al., Methods Enzymol. 118: 154-164 (1986), using final concentrations of 1 mM Mg2+ and 110 mM K+. Reactions were performed with 3 pmol of transcript, in a total volume of 75 µl in the presence of [35S] methionine. To determine protein accumulation profiles, aliquots were taken at 6 to 8 different time points, and reaction products were separated on 0.1% SDS-12.5% polyacrylamide gels as described by Laemmli, Nature 227: 680-685 (1970). After electrophoresis, gels were fixed overnight at 4°C in a methanol-7% acetic acid mixture, dried and autoradiographed.

Quantification of in vitro synthesized proteins was performed by slicing the appropriate band from the gel, and measuring the incorporated radioactivity by liquid scintillation counting. The obtained values were normalized to the number of methionine residues present in the synthesized protein, excluding the initiator methionine. RNA degradation (chemical half-life of RNA) was analyzed and quantified as described by Danthinne et al., Mol. Cell. Biol. 13: 3340-3349 (1993).
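The normalization step can be illustrated with a minimal Python sketch. The function names, the example sequence and the cpm value are hypothetical and are not taken from the source; the sketch only assumes that each methionine other than the initiator contributes equally to the measured signal.

```python
def labelled_methionines(protein_seq: str) -> int:
    """Count methionine residues, excluding the initiator methionine."""
    seq = protein_seq.upper()
    total = seq.count("M")
    # The initiator methionine is excluded from the normalization.
    if seq.startswith("M"):
        total -= 1
    return total


def normalized_signal(cpm: float, protein_seq: str) -> float:
    """Return the scintillation counts per counted methionine (cpm/met)."""
    n_met = labelled_methionines(protein_seq)
    if n_met == 0:
        raise ValueError("no methionine available for normalization")
    return cpm / n_met


# Hypothetical toy sequence: initiator M plus two internal M residues.
example_seq = "MEKKITGMTTVDM"
print(normalized_signal(1200.0, example_seq))
```

Normalizing in this way makes signals from proteins with different methionine content comparable on a per-molecule basis.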

Protein accumulation as a function of time was analyzed using the mathematical description

$$P(t) = A \cdot \frac{t_{1/2}}{\ln 2}\left(1 - e^{-\ln 2\,(t-T)/t_{1/2}}\right)$$

described by Danthinne et al. (1993; supra), in which $T$ corresponds to the time point at which the first translation product is completed, $A$ is the translation efficiency of the mRNA and $t_{1/2}$ is the functional half-life of the mRNA. From this formula, it can be deduced that $P(\infty) = A \cdot t_{1/2}/\ln 2$, showing that the protein peak level is proportional to both the translation efficiency and the functional half-life of the mRNA. The parameters $A$, $t_{1/2}$ and $T$ were estimated by nonlinear regression using the GraphPad Prism™ software version 1.02.
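As a hedged illustration of this accumulation model, the following Python sketch evaluates the protein level at a given time and the plateau P(∞) = A·t½/ln 2. It assumes that no product is present before the lag time T; the function names and the numerical inputs are hypothetical, and the sketch does not reproduce the nonlinear regression itself.

```python
import math


def protein_level(t: float, A: float, t_half: float, T: float) -> float:
    """Accumulated protein at time t for efficiency A, half-life t_half, lag T."""
    if t <= T:
        # Assumption: no completed product before the first one appears at T.
        return 0.0
    decay = math.exp(-math.log(2) * (t - T) / t_half)
    return A * t_half / math.log(2) * (1.0 - decay)


def plateau(A: float, t_half: float) -> float:
    """Peak level reached at infinite time, A * t_half / ln 2."""
    return A * t_half / math.log(2)


# Hypothetical parameters: A = 2 units/min, t_half = 20 min, T = 10 min.
for minutes in (10, 30, 50, 100):
    print(minutes, round(protein_level(minutes, 2.0, 20.0, 10.0), 2))
print("plateau", round(plateau(2.0, 20.0), 2))
```

One functional half-life after T, the model has reached exactly half of the plateau, which is a quick sanity check on any fitted parameter set.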

Introduction of RNA in tobacco protoplasts by electroporation.

Isolation of mesophyll protoplasts from leaves of Nicotiana tabacum cv Petit Havana SR1 was carried out as described by Denecke et al., Methods Mol. Cell. Biol. 1: 19-27 (1989) except that before electroporation, the protoplasts were washed once with TEX-buffer and three times with electroporation buffer. Introduction of RNA into the protoplasts was carried out by electroporation in the presence of 10-15 pmol of RNA per 10^6 protoplasts in 300 µl. Electroporation was performed immediately after the addition of the protoplasts to the RNA. For RNAs including STNV translation enhancing sequences and replication sequences 1 pmol of RNA was used and 0.2 pmol of TNV RNA was added. Electroporation was done using the following electrical parameters: capacitance 200 µF, initial field strength (E0) 630 V/cm. After electroporation, protoplasts were diluted 10-fold in TEX-buffer, floated by centrifugation, isolated and diluted with TEX-medium until a concentration of 0.5 x 10^6 protoplasts per ml was reached. Aliquots of an appropriate amount of protoplasts (5 x 10^6) were incubated at 25°C in the dark for different times before processing.

Analysis of the fate of the RNA after introduction in tobacco protoplasts, detection of the different in vivo translation products and computer-aided data analysis of the accumulation profiles.

RNA from protoplasts was prepared as described by Denecke et al (1993) supra. Quantitative Northern analysis was performed as described by Meulewaeter et al., supra (1992). Alternatively, RNA quantification was performed by densitometric scanning of the autoradiograph resulting from the Northern hybridization using a DT120 laser scanner and analysing the data with the Molecular Dynamics ImageQuant version 4.2 software.

Proteins were isolated from tobacco protoplasts by 10 seconds sonication (using a Soniprep 150, MSE Scientific Instruments, Crawley, England) in an extraction buffer consisting either of 50 mM Tris/HCl, 2 mM EDTA, 0.15 µg/µl DTT, 0.15 µg/µl BSA and 30 µg/µl PMSF (for protoplasts wherein PAT and chloramphenicol acetyltransferase (CAT) encoding transcripts were introduced) or of mM Tris/HCl, 5% glycerol, 100 mM KCl, 1 mM benzamidine HCl, mM ε-amino-n-caproic acid, 10 mM EDTA, 10 mM EGTA, 1 µg/ml antipain, 1 µg/ml leupeptin, 14 mM β-mercaptoethanol and 1 mM PMSF (for protoplasts wherein Bt ICP encoding transcripts were introduced). The lysate was centrifuged 5 min at 10000 g and the supernatants were recovered. Protein concentrations were determined according to Bradford (1976). PAT activities were determined with 10 µg of soluble protein, using the chromatography method of De Block et al., EMBO J. 6: 2513-2518 (1987).

Quantification was performed by densitometric scanning of the autoradiograph using a DT120 laser scanner and analysing the data with the Molecular Dynamics ImageQuant version 4.2 software.

CAT activity was determined by thin-layer chromatography CAT assays as described by Gorman et al., Mol. Cell. Biol. 2: 1044-1051 (1982) and quantified either by liquid-scintillation counting of excised spots or by densitometric scanning of the autoradiograph using a DT120 laser scanner and analysing the data with the Molecular Dynamics ImageQuant version 4.2 software. Absolute levels of CAT protein were calculated using a standard curve of purified CAT protein. Bt ICPs were detected by ELISA, as described by Clark et al., Meth. Enzymol. 118: 742-766 (1986).

The translational efficiency of a replicating RNA can be described by the mathematical function:

$$A \approx \frac{dP/dt}{dR/dt} \cdot \frac{\ln 2}{t_{1/2}}$$

in which $R$ represents the total RNA pool, $P$ corresponds to protein concentration and $t_{1/2}$ is the functional half-life of the RNA. $(dP/dt)/(dR/dt)$ can be estimated by non-linear regression using GraphPad Prism™ software version 1.02.
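A rough Python sketch of this calculation follows. It assumes that the function reads A ≈ [(dP/dt)/(dR/dt)]·(ln 2/t½), and it replaces the non-linear regression of the source with a plain least-squares slope of P against R as a simplified stand-in. All names and data points are hypothetical.

```python
import math


def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def translational_efficiency(rna_pool, protein, t_half):
    """Estimate A from paired R and P measurements (assumed formula)."""
    # dP/dR approximates (dP/dt)/(dR/dt) when both are sampled at the same times.
    dp_dr = least_squares_slope(rna_pool, protein)
    return dp_dr * math.log(2) / t_half


# Hypothetical paired measurements taken at the same time points.
R = [1.0, 2.0, 3.0, 4.0]
P = [3.0, 5.0, 7.0, 9.0]
print(translational_efficiency(R, P, t_half=10.0))
```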

Unless stated otherwise in the Examples, all recombinant DNA techniques are carried out according to standard protocols as described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, NY and in Volumes 1 and 2 of Ausubel et al. (1994) Current Protocols in Molecular Biology, Current Protocols, USA. Standard materials and methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by R.D.D. Croy, jointly published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific Publications, UK. These publications also include lists explaining the current abbreviations.

In the examples and in the description of the invention, reference is made to the following sequences of the Sequence Listing:

SEQ ID No.1: cDNA of TNV-A
SEQ ID No.2: cDNA of STNV-2
SEQ ID No.3: cat-gene
SEQ ID No.4: inserted DNA fragment in pXD324
SEQ ID No.5: native coding sequence of cry9C (truncated)
SEQ ID No.6: native coding sequence of cryIA(b) (truncated)
SEQ ID No.7: oligonucleotide FM10
SEQ ID No.8: oligonucleotide FM11
SEQ ID No.9: oligonucleotide FM8
SEQ ID No.10: oligonucleotide FM9
SEQ ID No.11: oligonucleotide FM12
SEQ ID No.12: oligonucleotide FM16
SEQ ID No.13: oligonucleotide FM17
SEQ ID No.14: oligonucleotide FM18
SEQ ID No.15: oligonucleotide FM19
SEQ ID No.16: oligonucleotide FM20
SEQ ID No.17: oligonucleotide FM21
SEQ ID No.18: oligonucleotide FM23
SEQ ID No.19: oligonucleotide FM24
SEQ ID No.20: oligonucleotide FM1
SEQ ID No.21: oligonucleotide FM13
SEQ ID No.22: oligonucleotide FM14
SEQ ID No.23: oligonucleotide FM15
SEQ ID No.24: T3 RNA polymerase terminator
SEQ ID No.25: oligonucleotide FM3
SEQ ID No.26: oligonucleotide FM4
SEQ ID No.27: oligonucleotide FM5
SEQ ID No.28: oligonucleotide FM7
SEQ ID No.29: oligonucleotide FM6
SEQ ID No.30: oligonucleotide FM22
SEQ ID No.31: oligonucleotide
SEQ ID No.32: oligonucleotide FM26
SEQ ID No.33: oligonucleotide FM2
SEQ ID No.34: synthetic DNA fragment encoding cry9C (truncated)
SEQ ID No.35: inserted DNA fragment of pFM409
SEQ ID No.36: nucleotide sequence preceding the T7 RNA polymerase in pFM410
SEQ ID No.37: nucleotide sequence of pTFM600 T-DNA
SEQ ID No.38: nptII coding region translationally fused to coat protein coding sequence and preceded by STNV-2 leader
SEQ ID No.39: nptII coding region flanked by suitable restriction sites
SEQ ID No.40: 5' UTR of TNV-AC36
SEQ ID No.41: 3' UTR of TNV-AC36

Example 1. Plasmid constructions used for in vitro transcription to generate the test RNAs used for the in vitro and in vivo translation experiments.

pFM20, pFM21, pFM23 and pFM24 are in vitro transcription plasmids containing original TNV-A cDNA fragments cloned in the SmaI site of pGEM®-3Z (Promega Biotec., Madison, Wisc.) as described by Meulewaeter et al., supra (1990). pFM20 contains the nucleotide sequence between nucleotide 1763 and 3660 of SEQ ID No.1; pFM21 contains a cDNA corresponding to the nucleotide sequence between nucleotide and 2619 of SEQ ID No.1; pFM23 contains a cDNA corresponding to the nucleotide sequence between nucleotide 2593 and 3510 of SEQ ID No.1; and pFM24 contains a cDNA corresponding to the nucleotide sequence between nucleotide 19 and 1632 of SEQ ID No.1.

pFM33 is a 3'-terminal TNV-A cDNA clone in the ScaI site of pAT153. The cDNA was synthesized on TNV dsRNA as described by Danthinne et al., supra (1991). The cDNA clone contains the nucleotide sequence between 3334 and 3684 of SEQ ID No.1, followed by three A residues. pAT153 is a derivative of pBR322 lacking the 0.62 kb HaeII B fragment [Twigg and Sherratt, Nature 283: 216-218 (1980)].

pFM136 [Meulewaeter et al., supra (1992)] contains the cat coding sequence of Tn9, flanked by additional nucleotides on a fragment having the sequence of SEQ ID No.3, cloned as an XbaI, filled-in ClaI fragment between the XbaI and trimmed KpnI sites of pGEM®-3Z.

pFM133 and pFM134 were made by insertion of the bar coding region as a filled-in BamHI fragment from pGEMBAR into the trimmed SacI site of pFM23 and pFM20, respectively, in such a way that upon transcription with T7 RNA polymerase an RNA encoding PAT is produced.

pGEMBAR is a clone of a modified BamHI fragment of pGSR1 (EP 242236), comprising the coding sequence of the bar gene, wherein the sequence around the initiation codon (CCATGA) has been changed into an NcoI restriction recognition sequence (CCATGG). This BamHI fragment has been cloned into the BamHI site of pGEM®-1.

Insertion of the 1426-bp blunt-ended EcoRI-PvuI fragment of pFM134 into the blunt-ended SacI fragment of pFM136 resulted in plasmid pFM140.

pFM139 was obtained by the insertion of the cat gene, as a PstI, blunt-ended SacI fragment from pFM136, between the PstI and blunt-ended MluI sites of pFM134.

A translational fusion between the TNV coat protein and the cat open reading frames was made by transfer of the 830-bp filled-in BamHI fragment from pFM21 into the trimmed SacI site of pFM136. A 1371-bp PstI-NsiI fragment from the resulting plasmid was inserted between the PstI and NsiI sites of pFM134 in such a way that both sites are restored, resulting in plasmid pFM138.

pXD324 contains downstream of the T7 promoter: the Ω-fragment of tobacco mosaic virus, the bar coding region, a poly(dA/dT) track of about 100 residues, and the SP6 promoter. This plasmid is composed of the following nucleotide sequence: from nucleotide 1 to 790 it contains the nucleotide sequence of SEQ ID No.4; from nucleotide 791 to 1221 it contains the sequence complementary to the sequence between nucleotides 2865 and 2435 of pGEM®-1 (Promega Biotec., Madison, Wisc.); from nucleotide 1222 to 3696 it contains the nucleotide sequence between the nucleotide at position 269 and the nucleotide at position 2743 of pGEM®-3Z.
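The coordinate description of pXD324 can be checked for internal consistency with a short Python sketch. The data structure and function names are illustrative only; the coordinates are those stated above, and the check assumes that all ranges are inclusive.

```python
def span(first: int, last: int) -> int:
    """Number of nucleotides in an inclusive range, in either orientation."""
    return abs(last - first) + 1


# (description, start in pXD324, end in pXD324, source range or None)
segments = [
    ("SEQ ID No.4", 1, 790, None),
    ("complement of pGEM-1 nt 2865-2435", 791, 1221, (2865, 2435)),
    ("pGEM-3Z nt 269-2743", 1222, 3696, (269, 2743)),
]


def check_layout(parts) -> int:
    """Verify that segments are contiguous and match their source lengths."""
    expected_start = 1
    for name, start, end, source in parts:
        if start != expected_start:
            raise ValueError(f"gap or overlap before segment: {name}")
        if source is not None and span(*source) != span(start, end):
            raise ValueError(f"length mismatch in segment: {name}")
        expected_start = end + 1
    return expected_start - 1  # total plasmid length


print(check_layout(segments))  # 3696
```

The three segments are contiguous and their lengths (790, 431 and 2475 nucleotides) match the quoted source ranges, giving a total of 3696 nucleotides.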

pFM108 is a pGEM®-3Z derivative that, by deletion of the sequence between the nucleotide at position 2 and the nucleotide at position 17, contains a KpnI site at the start of transcription of the T7 promoter [Danthinne et al., supra (1993)].

pXD535 is an in vitro transcription plasmid that contains a full-length STNV-2 cDNA clone except for the first nucleotide (sequence as in SEQ ID No.2 between the nucleotide at position 2 and the nucleotide at position 1245), downstream of the T7 promoter [Danthinne et al., supra (1993)]. The STNV-2 cDNA was cloned between the SmaI and trimmed KpnI sites of a plasmid obtained by cloning of the 515-bp long AatII-PstI fragment of pFM108 between the AatII and PstI sites of pAT153.

pGEM4N is a derivative of pGEM®-4 (Promega Biotec., Madison, Wisc.) obtained by digestion with HindIII, filling-in, and religating. In this way, an NheI site is created.

A KpnI-NheI fragment containing codons 44 to 666 of the cry9C coding region flanked by translation initiation and termination sites (nucleotide sequence between nucleotide 6 and 1892 of SEQ ID No.5) was cloned between the KpnI and NheI sites of pGEM4N, resulting in plasmid pGEM9C1.

pGEM9C2 is a similar plasmid containing a synthetic coding region for the codons 44 to 666 of cry9C flanked by translation initiation and termination sites. The cry9C encoding NcoI-NheI fragment of pGEM9C1 has been exchanged for the NcoI-NheI fragment comprising the synthetic coding region (which has the nucleotide sequence between nucleotide 8 and 1888 of SEQ ID No.34).

An NcoI-NheI fragment containing codons 29 to 616 of the cryIA(b) coding region flanked by translation initiation and termination sites (nucleotide sequence between nucleotide 8 and 1783 of SEQ ID No.6) was cloned between the NcoI and NheI sites of pGEM9C1, resulting in plasmid pGEM1Ab1.

Plasmid pAB02 was constructed as follows: a PCR fragment, obtained with primers FM10 and FM11 having the nucleotide sequences of SEQ ID No.7 and SEQ ID No.8, using plasmid pFM20 as template, was digested with BamHI (in the first primer) and BsmI and cloned between the BsmI and BamHI sites of pFM20, resulting in plasmid pFM187. This plasmid now contains a BsaI site at the 5' end of the TNV sgRNA2 sequence. The 5' end of the subgenomic RNA2 was fused to the T7 promoter by cloning the 1224-bp BsaI(filled-in)-PstI fragment of pFM187 between the KpnI (blunted) and PstI sites of pFM108, resulting in plasmid pFM187B. The 3' end of TNV sgRNA2 was reconstructed by PCR using primers FM8 and FM9 having the nucleotide sequences of SEQ ID No.9 and SEQ ID No.10 with pFM33 as template. The amplified fragment was digested with PstI and Bsu36I and cloned between the PstI and Bsu36I sites of pFM20 and pFM187B, resulting in plasmids pFM20C and pAB02, respectively.

pRD01 was created by restricting pAB02 with EcoRI, followed by filling-in the protruding termini with Klenow polymerase and religation. This creates a new stop codon at nucleotide 735 of the TNV-A CP mRNA (nucleotide 3195 of SEQ ID No.1). The RNA specified by this plasmid encodes a C-terminally truncated CP protein of 21 kDa.
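The effect of filling in an EcoRI site can be sketched in Python. EcoRI cuts G^AATTC and leaves a 4-nt 5' overhang, so filling in both ends and religating duplicates AATT and yields GAATTAATTC, which contains a TAA triplet. The toy sequence below is hypothetical and is not the TNV-A sequence; whether the TAA is in frame depends on the reading frame of the actual gene.

```python
def fill_in_and_religate(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> str:
    """Simulate digestion at a unique 5'-overhang site, fill-in, and religation."""
    pos = seq.find(site)
    if pos < 0:
        raise ValueError("recognition site not found")
    if seq.find(site, pos + 1) >= 0:
        raise ValueError("recognition site is not unique")
    overhang_len = len(site) - 2 * cut_offset
    cut = pos + cut_offset
    overhang = seq[cut:cut + overhang_len]
    # Both filled-in ends carry the overhang, so religation duplicates it.
    return seq[:cut + overhang_len] + overhang + seq[cut + overhang_len:]


def codons(seq: str):
    """Split a sequence into complete triplets, starting at position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


toy = "ATGGCGAATTCGGA"  # hypothetical reading frame with one EcoRI site
modified = fill_in_and_religate(toy)
print(modified)
print(codons(toy))
print(codons(modified))
```

In this toy frame the 4-bp duplication places TAA in frame directly after the third codon, which mirrors how such a fill-in can truncate an open reading frame.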

Plasmids pRD02, pRD06, pRD03, pRD04, and pRD05 were created as follows. pRD01 contains a unique BstBI site immediately downstream of the newly introduced stop codon. pRD01 was restricted by BstBI and respectively one of the following enzymes: Asp718, NheI, BsaAI, Bsu36I, and BamHI. The linearized DNA fragments were treated with Klenow polymerase and religated.

Plasmid pAB01 was constructed by cloning the 592-bp NdeI-BsmI fragment of pFM23 between the NdeI and BsmI sites of pAB02.

Plasmid pMA300 [Andriessen et al., Virology 212: 22-224 (1995)] was constructed in two steps starting with plasmid pFM24. The intact 5' end of the TNV-A sequence was reconstructed using complementary oligomers encoding the first 35 nucleotides of TNV-A (nucleotide sequence between nucleotide 1 and 35 of SEQ ID No.1) to create plasmid pFM39. A fragment from plasmid pFM21 containing TNV-A residues 311 to 2619 (nucleotide sequence of SEQ ID No.1 between the nucleotides at position 311 and 2619) was inserted in pFM39.

pTNV was constructed as follows: the 1636-bp NsiI-HindIII fragment of pFM20C was cloned between the NsiI and HindIII sites of pMA300, resulting in plasmid pTNV. pTNV contains the full-length TNV-A sequence under control of a T7 promoter. Upon digestion with BsaI, T7 RNA polymerase directs the synthesis of a transcript that differs from the natural RNA only by the addition at the 5' end of an extra G residue.

Plasmids to obtain chimeric TNV-cat RNAs were constructed as follows. A PCR fragment obtained with primers FM10 and FM12 having the nucleotide sequences of SEQ ID No.7 and SEQ ID No.11, using plasmid pFM140 as template, was digested with BamHI (present in the first primer) and BspEI (present in the cat gene) and cloned between the BspEI and BamHI sites of pFM140, resulting in plasmid pFM188. This plasmid contains a BsaI site at the 5' end of the TNV sgRNA2 leader sequence.

The 5' end of the TNV sgRNA2 was fused to the T7 promoter by cloning the 929-bp BsaI(filled-in)-PstI fragment of pFM188 between the KpnI (blunted) and PstI sites of pFM108. This resulted in plasmid pFM188B.

The 1006-bp NarI-NlaIV fragment of pFM188B was cloned between the BsaAI and NarI sites of pAB02, resulting in plasmid pFM188C.

The 1335-bp NsiI-XbaI fragment of pFM138 was ligated to the 5097-bp NsiI-NheI fragment of pTNV, resulting in plasmid pFM216.

The 1155-bp PvuI-PstI fragment of pFM216 was ligated to the 2830-bp PvuI (partially digested)-PstI fragment of pAB02, resulting in plasmid pFM188G.

The 891-bp NcoI-NdeI fragment of pFM188B was ligated to the 3072-bp NcoI-NdeI fragment of pFM216, resulting in plasmid pFM188H.

Similarly, the 768-bp NcoI-NdeI fragment of pFM136 was ligated to the 3072-bp NcoI-NdeI fragment of pFM216, resulting in plasmid pFM188I.

A PCR fragment was obtained with primers FM23 and FM24 having the nucleotide sequences of SEQ ID No.18 and SEQ ID No.19, using plasmid pFM188C as a template, digested with EcoRI and NdeI and cloned between the EcoRI and NdeI sites of pFM188C, resulting in plasmid pVE190. In this way the T7 promoter of pFM188C was exchanged for a T3 promoter.

Using pFM188C as template, DNA fragments were PCR-amplified with primers FM16 and FM17 having the nucleotide sequences of SEQ ID No.12 and SEQ ID No.13, and with primers FM18 and FM19 having the nucleotide sequences of SEQ ID No.14 and SEQ ID No.15. Both fragments were then used in an overlap extension PCR with primers FM16 and FM19, having the nucleotide sequences of SEQ ID No.12 and SEQ ID No.15, to amplify a DNA fragment containing an NheI site just downstream of the cat stop codon. The amplified fragment was digested with NcoI and BamHI and cloned between the NcoI and BamHI sites of pFM188C, resulting in plasmid pVE192.

Using pFM188C as template, DNA fragments were amplified with primers FM16 and FM21, having the nucleotide sequences of SEQ ID No.12 and of SEQ ID No.17, and with primers FM20 and FM19, having the nucleotide sequences of SEQ ID No.16 and SEQ ID No.15. Both fragments were then used in an overlap extension PCR with primers FM16 and FM19, having the nucleotide sequences of SEQ ID No.12 and SEQ ID No.15, to amplify a DNA fragment containing an NheI site at nucleotides 963-968 of TNV sgRNA2 (nucleotides 3423-3428 of SEQ ID No.1). The amplified fragment was digested with NcoI and BamHI and cloned between the NcoI and BamHI sites of pFM188C, resulting in plasmid pVE193.

The 1037-bp NdeI-NheI fragment of pVE192 was cloned between the NdeI and NheI sites of pVE193, resulting in plasmid pVE195. pVE192 was digested with NheI and Bsu36I, blunted, and religated, resulting in plasmid pVE196.

Plasmids to obtain chimeric STNV-cat RNAs were constructed in the following way. pFM175, which contains the first 889 nucleotides of the STNV-2 cDNA downstream of the T7 promoter, was made by insertion of the 1123-bp NdeI-NsiI fragment of pXD535 between the PstI and NdeI sites of a pGEM®-3Z derivative that lacks the sequence between the nucleotide at position 62 and the nucleotide at position 91, including the SP6 promoter.

A mutant STNV leader (designated STNV*) was cloned downstream of the T7 promoter by insertion of the annealed oligodeoxyribonucleotides FM14 and FM15, having the nucleotide sequences of SEQ ID No.22 and SEQ ID No.23, between the SmaI and trimmed KpnI sites of pFM108, resulting in plasmid pFM184A. The STNV* leader was subsequently fused to the cat coding region by insertion of the 520-bp NcoI(filled-in)-NdeI fragment of pFM184A between the NdeI and blunted BssHII sites of pFM139, resulting in plasmid pFM189.

In pFM191, the cat coding region was placed upstream of the TED of STNV-2 (TED2) by insertion of the 900-bp NarI-NlaIV fragment of pFM189 between the NarI and blunted NcoI sites of pFM175.

pFM169 was made by inserting the cat coding region, as a PstI-NruI fragment of pFM136, between the PstI and filled-in XbaI sites of pXD324.

Insertion of the 430-nt-long NcoI-SphI fragment of pFM191 between the NcoI and SphI sites of pFM169 yielded plasmid pFM191A. A derivative of pXD324, named pFM179, was made by religating blunt-ended HindIII-digested plasmid. Upon linearization of the resulting plasmid with NheI, RNA is synthesized which has GCUAG downstream of the poly(A) tail. The poly(dA:dT)-track of pFM179 was placed downstream of TED by inserting the 1100-nt-long SpeI-NdeI fragment of pFM191A between the XbaI and NdeI sites of pFM179. The resulting plasmid was named pFM209. The length of the poly(dA:dT) track of pFM191A and pFM209 was estimated by polyacrylamide gel electrophoresis to be about 100 bp.

pFM191B was made by inserting the 430-nt-long NcoI-SphI fragment of pFM191 between the NcoI and SphI sites of pFM136. To fuse the STNV-2 leader to the cat coding region, a fragment containing the T7 promoter fused to the first 38 nucleotides of the STNV-2 cDNA was amplified by PCR on pFM175 using primers FM1 and FM13, having the nucleotide sequences of SEQ ID No.20 and SEQ ID No.21.

After digestion with MluI and NdeI, this fragment was cloned between the BssHII and NdeI sites of pFM189 and pFM191, resulting in plasmids pFM189A and pFM191E, respectively.

Plasmid pFM207E was constructed by ligating the 726-bp PvuII-AflIII fragment from pFM191E and the 615-bp long PvuII-EcoRI fragment of pFM191 in the 2556-bp EcoRI-AflIII vector fragment from pFM191E.

Plasmids to obtain chimeric STNV-cry RNAs were obtained in several steps as outlined below. The 1496-bp long NdeI-HindII fragment of pXD535 was cloned between the NdeI and Eco47III sites of pXD324, resulting in plasmid pFM214. A PCR fragment obtained with primers FM1 and FM3 having the nucleotide sequences of SEQ ID No.20 and SEQ ID No.25, using plasmid pFM175 as a template, was digested with NcoI and NdeI and the resulting fragment was cloned between the NcoI and NdeI sites of pFM214, yielding plasmid pFM214C. A synthetic DNA fragment, consisting of the annealed oligodeoxyribonucleotides FM4 and FM5, having the nucleotide sequences of SEQ ID No.26 and SEQ ID No.27, was cloned between the BsaAI and NcoI sites of pFM214C, resulting in plasmid pFM214A.

pFM214A was used as template in a PCR reaction with the primers FM1 and FM7, having the nucleotide sequences of SEQ ID No.20 and SEQ ID No.28, and the resulting fragment was digested with NdeI and NcoI. This fragment was cloned, together with the 1880-bp NcoI-NheI fragment of pGEM9C1, between the NheI and NdeI sites of pFM214A. The resulting plasmid was designated as pRVL11. pRVL12 was obtained by the same strategy except that the NcoI-NheI fragment of pGEM9C2, comprising a synthetic coding region of cry9C, was used.

Example 2. STNV-2 5'UTR and TED2 cooperate in stimulating cap-independent translation of heterologous mRNAs in vivo.

The first set of experiments demonstrates that 5' information affecting translation is contained within the 5'-terminal 38 nt of STNV-2, comprising the full sequence complementarity with TED2. Translation of an RNA which has the STNV-2 leader plus the first two codons of the CP coding region (further named STNV-2 leader) translationally fused to the cat coding region was compared to that of an analogous RNA with a mutated leader (STNV* leader) which has a reduced complementarity with TED2. Translation of the RNA with the STNV-2 leader was not affected by the presence of a cap structure, whereas the RNA with the STNV* leader required the cap to maintain its functional stability (Table 1). These data show that the functional stability of the STNV-2 RNA in vitro depends on the combined presence of the 5'-terminal 38 nucleotides (nt) and TED2. Furthermore, they establish that the complementarity between leader and TED is important for the functional stability of the mRNA.

Table 1. The 5'-terminal 38 nt of STNV-2 cooperate with TED to maintain the functional stability of the mRNA in vitro.

Template DNA     Leader   cap   T.E. (cpm/met.min)   t½ (min)       Peak level (cpm/met)
pFM191(SpeI)     STNV*    -     48.2 ± 3.8           17.4 ± 1.9     1210
pFM191(SpeI)     STNV*    +     59.6 ± 1.1           31.7 ± 1.1     2726
pFM191E(SpeI)    STNV-2   -     46.2 ± 6.4           55.1 ± 21.2    3673
pFM191E(SpeI)    STNV-2   +     46.7 ± 6.7           47.7 ± 18.9    3214

It was demonstrated that inclusion of a second translation enhancing sequence comprising TED2 followed by the sequence between nt 753 and 760 of the STNV-2 trailer in the RNA further increased translation of uncapped RNAs in vitro. Template DNAs for in vitro transcription by T7 RNA polymerase were made by PCR using appropriate primers with plasmid pFM191B as template. The resulting RNAs contain a 19 nt leader derived from a polylinker sequence, the cat coding region, and varying parts of the STNV-2 trailer (see Table 1bis). The RNAs were translated in a wheat germ extract. CAT protein accumulation was quantified after 18, 32, 40, 50, 65, 80, and 100 min of incubation. Estimation of the translation efficiency and functional half-life of the mRNAs from these data (see Table 1bis) showed that translation of the RNA which has 7 additional STNV-2 nucleotides downstream of TED2 was about two-fold higher than translation of an RNA which has only TED2 as trailer.
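The tabulated peak levels can be cross-checked against the relation P(∞) = A·t½/ln 2 given in the methods. The following Python sketch uses the translation efficiencies, half-lives and peak levels reported in Tables 1 and 1bis; the variable names and row labels are illustrative only.

```python
import math

# (label, translation efficiency, functional half-life in min, reported peak)
rows = [
    ("Table 1, STNV* row 1", 48.2, 17.4, 1210),
    ("Table 1, STNV* row 2", 59.6, 31.7, 2726),
    ("Table 1, STNV-2 row 1", 46.2, 55.1, 3673),
    ("Table 1, STNV-2 row 2", 46.7, 47.7, 3214),
    ("Table 1bis, RNA 1", 90.1, 42.5, 5524),
    ("Table 1bis, RNA 2", 166.1, 37.8, 9058),
]


def peak_level(efficiency: float, t_half: float) -> float:
    """Plateau protein level predicted by A * t_half / ln 2."""
    return efficiency * t_half / math.log(2)


for label, efficiency, t_half, reported in rows:
    predicted = round(peak_level(efficiency, t_half))
    print(f"{label}: predicted {predicted}, reported {reported}")
```

All six reported peak levels agree with the values predicted from the listed efficiency and half-life, which supports reading the columns as efficiency, half-life and peak level.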

Table 1bis. STNV-2 sequences downstream of TED increase cap-independent translation of cat RNAs in vitro.

RNA   Relevant features                          T.E. (k.mol/min)   t½ (min)      Peak level (k.mol)
      5'UTR    3'UTR
1     19 nt    nt 632-753 of STNV-2 (=TED2)      90.1 ± 8.1         42.5 ± 8.1    5524
2     19 nt    nt 632-760 of STNV-2              166.1 ± 18.1       37.8 ± 8.1    9058

The effect of TED2 (second translation enhancing sequence from STNV-2), as defined in vitro, on translation of a series of chimeric cat RNAs was determined in tobacco protoplasts.

In vitro transcription by T7 RNA polymerase on the different templates (summarized in Table 2) was used to generate the RNAs introduced in tobacco protoplasts (45 pmol cat-comprising RNA per 3 x 10^6 tobacco protoplasts). The levels of generated CAT protein were determined hrs after RNA introduction. They are summarized in Table 2.

Table 2. TED 2 stimulation of uncapped and capped heterologous mRNAs in tobacco protoplasts

Template DNA     Relevant features        CAT level (pg/100 pg total protein)   Normalized translation stimulation by TED 2
                 5'UTR   3'UTR            uncapped    capped                    uncapped    capped
pFM169SalI       ΩTMV    control          13          283
pFM191ASpeI      ΩTMV    TED 2            90          1006                      7           3.6
pFM169HindIII    ΩTMV    control-A100     26          3450
pFM209NheI       ΩTMV    TED 2-A100       102         3418                      3.9

Control 3'UTR is a 120 nt plasmid derived sequence; translation stimulation has been normalized to the corresponding RNA construct without TED 2 for each case separately.

In the absence of both the cap and poly(A)-tail, TED 2 stimulates translation in vivo about 7-fold. When the RNA contained either a cap or a poly(A) tail, the stimulatory effect was about 4-fold. TED 2 did not increase translation of capped and polyadenylated cat RNA.
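The fold stimulations quoted here follow directly from the CAT levels in Table 2. The sketch below recomputes them; the function and variable names are illustrative and not part of the original description, and each RNA is normalized to the matching construct without TED 2, as stated in the table footnote.

```python
def fold_stimulation(with_ted, without_ted):
    """Ratio of CAT levels with and without TED 2 for one RNA configuration."""
    return with_ted / without_ted


# CAT levels (pg/100 pg total protein) read from Table 2: (uncapped, capped)
control = (13, 283)          # pFM169SalI, no poly(A) tail
ted = (90, 1006)             # pFM191ASpeI, no poly(A) tail
control_polya = (26, 3450)   # pFM169HindIII, poly(A) tail
ted_polya = (102, 3418)      # pFM209NheI, poly(A) tail

stimulation = {
    "no cap, no poly(A)": fold_stimulation(ted[0], control[0]),
    "cap only": fold_stimulation(ted[1], control[1]),
    "poly(A) only": fold_stimulation(ted_polya[0], control_polya[0]),
    "cap and poly(A)": fold_stimulation(ted_polya[1], control_polya[1]),
}

for label, value in stimulation.items():
    print(f"{label}: {value:.1f}-fold")
```

A value close to 1 for the capped and polyadenylated RNA corresponds to the statement that TED 2 did not increase its translation.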

In vitro the STNV-2 leader and TED 2 cooperate to stimulate cap-independent translation. The different T7 RNA polymerase generated RNA transcripts comprising cat (summarized in Table 3) were introduced by electroporation in tobacco protoplasts. Samples for protein extraction were taken 6 hrs after RNA introduction, and the levels of accumulated CAT protein were determined. RNA level determination revealed that min after electroporation the cat mRNA levels varied less than two-fold, indicating that the separately introduced RNAs were delivered with similar efficiency. After 256 min, the cat mRNA levels were 3- to 5-fold lower in all experiments, indicating similar chemical half-lives for the different mRNAs.

Table 3. Cooperation between TED 2 and STNV-2 in vivo

Template DNA     Relevant features     CAT level (pg/100 pg total protein)
                 5'UTR     3'UTR       uncapped    capped
pFM191SpeI       STNV*     TED 2       10          185
pFM191ESpeI      STNV-2    TED 2       57          145
pFM189SalI       STNV*     control     BB          ND
pFM189ASalI      STNV-2    control     BB          ND

ND: not determined; BB: below background level (which is 2 pg); control refers to a 120 nt unrelated plasmid derived sequence.

CAT accumulation from uncapped RNAs was about five-fold higher in tobacco protoplasts expressing the STNV-2 5'UTR than when a mutant of similar length was used (STNV*). (A similar enhancement was observed in other independent experiments.) Additionally, CAT protein accumulation profiles were determined in tobacco protoplasts electroporated with uncapped, TED 2 containing cat RNAs carrying the STNV* and the STNV-2 leaders (Table 4). The STNV-2 leader fusion RNA encoded a higher peak level than the STNV* fusion RNA. The main difference between the profiles was that the initial rate of CAT accumulation was much greater for the STNV-2 leader fusion RNA than for the STNV* fusion RNA. This implies that the STNV-2 leader confers a higher translation efficiency to the RNA than the STNV* leader. To understand to what extent the observed difference in translation efficiency is related to intrinsic differences in the performance of the leaders, the profiles of both RNAs were compared to those of the capped RNAs (Table 4). The addition of a cap had no effect on the functional half-lives of the RNAs but improved translation efficiency. Importantly, the addition of a 5' cap stimulated translation efficiency of the STNV-2 comprising RNA only 2.5-fold, as opposed to 23-fold for the STNV* leader fusion RNA (see Table 4). This implies that the combined presence of the STNV-2 leader and TED 2 elements allows cap-independent translation to a level that is practically useful.
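The cap dependence discussed above can be expressed as the ratio of the translation efficiency of the capped RNA to that of the uncapped RNA. The following sketch uses the T.E. values of Table 4; which row of each pair is the capped RNA is inferred here from the fold values quoted in the text, and the names are illustrative.

```python
def cap_stimulation(te_uncapped, te_capped):
    """Fold increase in translation efficiency caused by adding a 5' cap."""
    return te_capped / te_uncapped


# T.E. values (pg CAT/100 pg protein.min) from Table 4: (uncapped, capped).
# The cap assignment within each pair is inferred from the text.
TE_BY_LEADER = {
    "STNV*": (0.26, 6.13),
    "STNV-2": (1.76, 4.52),
}

cap_fold = {leader: cap_stimulation(*pair) for leader, pair in TE_BY_LEADER.items()}

for leader, fold in cap_fold.items():
    print(f"{leader} leader: cap stimulates T.E. {fold:.1f}-fold")
```

The smaller the ratio, the less the RNA depends on the cap; the STNV-2 leader combined with TED 2 gives a ratio roughly one order of magnitude below that of the mutated leader.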

Table 4. Cooperation between STNV-2 leader and TED 2 in supporting cap-independent translation in tobacco protoplasts

Template DNA     Relevant features     5' cap   T.E. (pg CAT/100 pg protein.min)   t1/2 (min)      Peak level (pg CAT/100 pg prot)
                 5'UTR     3'UTR
pFM191SpeI       STNV*     TED 2       -        0.26 ± 0.05                        52.1 ± 10.2     19.54
pFM191SpeI       STNV*     TED 2       +        6.13 ± 0.78                        26.0            229.94
pFM191ESpeI      STNV-2    TED 2       -        1.76 ± 0.88                        24.6 ± 12.9     62.46
pFM191ESpeI      STNV-2    TED 2       +        4.52 ± 0.85                        27.0            176.07

Example 3: Determination of the nucleotide sequences from TNV sgRNA2 leader and trailer that synergistically stimulate translation in vitro and in vivo.

As can be deduced from Table 5, TNV sgRNA2 contains translation enhancing sequences which allow uncapped TNV sgRNA2 to be translated in vitro to a coat protein peak level of 83 % of the level obtained after in vitro translation of capped TNV sgRNA2.

Table 5. Effect of cap on translation of TNV sgRNA2 in vitro

Template DNA (a)   cap   T.E. (cpm/min)   t1/2 (min)   peak level (cpm)
pAB02(BsaI)        -     318 ± 65         41 ± 17      18,800
pAB02(BsaI)        +     285 ± 45         55 ± 21      22,600

(a) RNAs were synthesized on the indicated plasmid DNA using T7 RNA polymerase. Samples were taken after 20, 30, 45, 60, 80, and 100 min of incubation.

The elements of the TNV sgRNA2 that are required for efficient translation were determined by comparison of translation of full-length TNV sgRNA2 with translation of deletion mutants in a wheat germ translation system.

RNAs were synthesized in vitro from the DNA templates summarized in Table 6, using T7 RNA polymerase. Translation of these RNAs, which differ in the presence or absence of the sgRNA2 5' UTR or 3' UTR sequences, was compared in a wheat germ translation system (Table 6). The indicated nucleotides remaining are the 3' nucleotides for the 5' UTR and the 5' nucleotides for the 3' UTR.

In the absence of the 5' UTR sequence, the 3' UTR increased the protein peak level only 1.5-fold, exclusively due to a longer functional half-life. The 5' UTR stimulated translation in the absence of the trailer about 3-fold. In the full-length sgRNA2, translation stimulation by the 5' UTR and 3' UTR (21- and 11-fold, respectively) is much higher than stimulation by the individual elements, indicating that the TNV sgRNA2 5' UTR and 3' UTR stimulate translation synergistically in vitro. The TNV sgRNA2 thus contains both a 5' and a 3' translational enhancing sequence.
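The fold values in this paragraph can be recomputed from the peak levels in Table 6. The sketch below also compares the combined effect of both UTRs with the product of their individual effects; using a multiplicative expectation as the reference for synergy is an assumption made here for illustration, not a definition taken from the description.

```python
# Peak levels (cpm) from Table 6, keyed by (has_5utr, has_3utr)
PEAK = {
    (False, False): 38,    # pAB01(PCR1, AflIII)
    (False, True): 58,     # pAB01(BsaI)
    (True, False): 115,    # pAB02(PCR1, AflIII)
    (True, True): 1218,    # pAB02(BsaI), full-length sgRNA2
}

# Stimulation by each element on its own
fold_3utr_alone = PEAK[(False, True)] / PEAK[(False, False)]
fold_5utr_alone = PEAK[(True, False)] / PEAK[(False, False)]

# Stimulation by each element in the presence of the other
fold_5utr_in_full = PEAK[(True, True)] / PEAK[(False, True)]
fold_3utr_in_full = PEAK[(True, True)] / PEAK[(True, False)]

# Combined effect versus the product of the individual effects
combined = PEAK[(True, True)] / PEAK[(False, False)]
synergy_factor = combined / (fold_5utr_alone * fold_3utr_alone)

print(f"3' UTR alone: {fold_3utr_alone:.1f}-fold, 5' UTR alone: {fold_5utr_alone:.1f}-fold")
print(f"5' UTR in full-length RNA: {fold_5utr_in_full:.0f}-fold")
print(f"3' UTR in full-length RNA: {fold_3utr_in_full:.0f}-fold")
print(f"synergy factor: {synergy_factor:.1f}")
```

A synergy factor well above 1 means the two elements together do more than independent, multiplicative action would predict.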

Table 6. Effect of leader and trailer on translation of TNV sgRNA2 in vitro

Template DNA            5' UTR      3' UTR    T.E. (cpm/min)   t1/2 (min)   peak level (cpm)
pAB01(PCR1, AflIII)     pl-19 nt    14 nt     1.9 ± 0.4        14 ± 3       38
pAB01(BsaI)             pl-19 nt    241 nt    1.6 ± 0.3        26 ± 11      58
pAB02(PCR1, AflIII)     152 nt      14 nt     4.0 ± 0.5        20 ± 4       115
pAB02(BsaI)             152 nt      241 nt    23.1 ± 2.4       37 ± 7       1218

pl refers to a 23 nucleotide long polylinker sequence.

The 3' border of the translation stimulating region in the trailer was determined by translation in a wheat germ extract of 3' deletion mutants of TNV sgRNA2 (Table 7). These mutant RNAs were synthesized in vitro using T7 RNA polymerase and pAB02 plasmid DNA that was linearized with different restriction enzymes. Translation of the RNA that lacks the 3'-terminal 73 nucleotides was comparable to that of the full-length sgRNA. Deletion of the next 49 nucleotides resulted in a two-fold decrease of translation. Further deletion of the trailer resulted in a further, gradual decrease in translation. These data lead to the conclusion that the 3' border of the second translation enhancing sequence lies between nucleotides 1102 and 1151 of sgRNA2.
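The mapping logic can be sketched as follows: each 3' deletion mutant retains a trailer of known length, the last retained nucleotide is converted to sgRNA2 coordinates, and the border is placed between the longest trailer that loses activity and the next longer one that retains it. The sketch assumes that sgRNA2 ends at nt 1224 (the 3' coordinate used for the full-length trailers elsewhere in this Example) and uses an arbitrary 75 % threshold; both are labeled assumptions, and the names are illustrative.

```python
FULL_TRAILER_NT = 241   # trailer length of full-length sgRNA2 (Table 7)
SGRNA2_END = 1224       # assumed 3'-terminal nucleotide of sgRNA2

# Relative peak level (%) by remaining trailer length (nt), from Table 7
RELATIVE_PEAK = {241: 100, 168: 112, 119: 54, 65: 22, 48: 17}


def last_nt(trailer_len):
    """sgRNA2 coordinate of the last nucleotide kept in a 3' deletion mutant."""
    return SGRNA2_END - (FULL_TRAILER_NT - trailer_len)


def border_interval(levels, threshold=75):
    """Return (low, high) coordinates bracketing the 3' border: `high` is the
    end of the shortest trailer still above `threshold`, `low` the end of the
    next shorter trailer, which falls below it."""
    ordered = sorted(levels.items(), reverse=True)  # longest trailer first
    for (longer, _), (shorter, rel) in zip(ordered, ordered[1:]):
        if rel < threshold:
            return last_nt(shorter), last_nt(longer)
    return None


print(border_interval(RELATIVE_PEAK))
```

With the Table 7 values the bracket is (1102, 1151), matching the interval stated in the text.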

Table 7. Determination of the 3' border of the 3' translation stimulating region of TNV sgRNA2.

Template DNA            5' UTR    3' UTR    T.E. (cpm/met/min)   t1/2 (min)   peak level (cpm/met)   Relative peak level
pAB02(BsaI)             152 nt    241 nt    98 ± 8               24 ± 3       3375                   100
pAB02(ApaLI)            152 nt    168 nt    124 ± 8              21 ± 2       3795                   112
pAB02(BspEI)            152 nt    119 nt    56 ± 7               23 ± 4       1824                   54
pAB02(BamHI)            152 nt    65 nt     19.4 ± 1.4           27 ± 3       747                    22
pAB02(BsmAI)            152 nt    48 nt     12.7 ± 1.2           32 ± 5       577                    17
pAB02(Bsu36I)           152 nt    31 nt     4.0 ± 0.2            44 ± 4       253
pAB02(PCR1, AflIII)     152 nt    14 nt     8.4 ± 1.0            16 ± 2                              5.6

To demonstrate that translation stimulation by the 3' stimulatory region is independent of its position relative to the translation stop codon, a new stop codon was created at nucleotide 735 of the TNV CP mRNA by filling-in and religating the EcoRI site of pAB02. The RNA specified by the resulting plasmid (pRD01) encodes a C-terminally truncated CP protein of 21 kDa. Translation of this RNA in the wheat germ extract was comparable to translation of the wild-type sgRNA2 (Table 8). This shows that the location of the translation termination site is not crucial for translation stimulation by the second translation enhancing sequence.

Table 8. Effect of the location of the translation termination codon on translation of TNV sgRNA2.

Template DNA     translation termination site   T.E. (cpm/met/min)   t1/2 (min)   peak level (cpm/met)   Relative peak level
pAB02(BsaI)      nt 981                         226 ± 57             22 ± 9       7107                   100
pRD01(BsaI)      nt 734                         210 ± 54             21 ± 8       6210                   87

The 5' border of the second translation enhancing sequence from TNV-A was determined by comparing the translation in vitro of the RNA comprising the newly introduced stop codon with translation of internal deletion mutants. RNAs were synthesized from the BsaI-linearized plasmids listed in Table 9, using T7 RNA polymerase, and translated in a wheat germ cell-free extract. The data, summarized in Table 9, demonstrated that nucleotides 738 to 1011 of sgRNA2 could be deleted without affecting translation of the mutant RNA in vitro. Extension of this deletion to nucleotide 1044 caused a sharp drop in translation (to a relative peak level of 9.8 in Table 9), resulting in the same level of translation as for an RNA lacking the 3' UTR. In conclusion, the 5' border of the second translation enhancing sequence is located between nucleotides 1011 and 1044 of sgRNA2.

Moreover, the data also prove that the 5' and 3' translation stimulating regions are distinct domains, with the second translation enhancing sequence located between nucleotides 1011 and 1151 of sgRNA2.

Table 9. Mapping of the 5' border of the 3' translation enhancing sequence of TNV sgRNA2

Template DNA            deletion (nt of sgRNA2)   T.E. (cpm/met/min)   t1/2 (min)   peak level (cpm/met)   Relative peak level
pRD01(BsaI)                                       194 ± 15             15 ± 2       4198                   100
pRD02(BsaI)             738-799                   64 ± 5               46 ± 7       4247                   102
pRD06(BsaI)             738-882                   118 ± 14             32 ± 6       5448                   130
pRD03(BsaI)             738-938                   139 ± 8              24 ± 2       4813                   115
pRD04(BsaI)             738-1011                  183 ± 17             19 ± 2       5016
                        738-1044                  14.3 ± 2.6           20 ± 5       413                    9.8
pRD01(PCR1, AflIII)     1030-1224                 14.4 ± 1.8           18 ± 3                              8.9

In vitro generated chimeric TNV-cat RNAs containing various parts of TNV 5' and 3' UTR flanking the cat coding region (Table 10) were introduced in tobacco protoplasts by electroporation to determine if the 5'- and 3'-UTR of TNV sgRNA2 specify efficient translation of heterologous mRNAs in vivo.

The cat RNA levels in the transfected protoplasts were determined by quantitative Northern blot analysis to estimate the efficiency of RNA introduction. The results, summarized in Table 10, revealed that the efficiency of introduction of the TNV-cat RNAs varied less than two-fold.

Determination of the CAT protein levels (Table 10) revealed that the RNA which comprised only TNV 3' UTR specified low levels of CAT. The RNAs with both 5' and 3' UTR sequences from TNV directed the synthesis of levels of CAT which were 25- to 35-fold higher than those from the RNA lacking TNV 5' UTR sequences. Similar levels of CAT protein resulted from the translation of the TNV-cat RNAs differing in the length of the 5' and 3' UTR sequence. The efficiency of uncapped RNA translation is only four-fold lower than the translation efficiency of capped RNA and only two-fold lower than that of a very efficiently translated mRNA (pFM169HindIII).

These data demonstrate that first and second translation enhancing sequences from TNV sg RNA2 allow efficient cap-independent translation in vivo.

Table 10. Translation of chimeric TNV-cat RNAs in tobacco protoplasts (a)

Template DNA      leader       trailer              cat RNA level   CAT protein level
pFM1881BsaI       us(19)       us(112)/883-1224     35              182 ± 4
pFM188HBsaI       1-138        us(112)/883-1224     28              4730 ± 540
pFM188CBsaI       1-138        us(22)/939-1224      20              6310
pFM188GBsaI       1-159        us(112)/883-1224     29              5300 ± 220
pFM188GBsaI       CAP-1-159    us(112)/883-1224     29              21200 ± 2500
pFM169HindIII     CAP-Ω        us(140)/A100         8               47800

(a) RNA was synthesized on the indicated plasmid DNAs using T7 RNA polymerase and introduced in tobacco protoplasts by electroporation. The composition of the leader and trailer sequences is given, using the nucleotide numbering of the TNV sgRNA2; us: unrelated sequence with the length indicated in nucleotides. Total RNA was isolated from the protoplasts 140 min after electroporation. The cat RNA levels are in amol/pg of total RNA. The CAT protein level (pg/mg of soluble protein) was determined 340 min after RNA introduction, in duplo.

RNA was synthesized using T3 RNA polymerase from BsaI- and ApaLI-digested pVE190, pVE195 and pVE196 and from Bsu36I-digested pVE190 and pVE195. These RNAs were introduced into tobacco protoplasts. CAT accumulation was monitored at least 5 hours after RNA introduction. This revealed that the minimal 3' TNV sequences required for efficient translation of an uncapped cat mRNA are located between nt 1012 and 1151 of TNV-A sgRNA2 (see Table 10bis).

Table 10bis. Translation of chimeric TNV-cat RNAs in tobacco protoplasts (a)

Template DNA      leader            trailer               cat RNA level   CAT protein level
pVE190 BsaI       1-138             us(22)/939-1224       2.23            39.5 ± 7.6
pVE190 Bsu36I     1-138             us(22)/939-1014       2.03            0
pVE195 BsaI       1-143/caaaacc     gctagc/969-1224       1.85            45.0 ± 5.3
pVE195 Bsu36I     1-143/caaaacc     gctagc/969-1014       2.21            0
pVE196 BsaI       1-143/caaaacc     gctagc/1012-1224      1.20            41.7 ± 3.3
pVE196 ApaLI      1-143/caaaacc     gctagc/1012-1151      0.86            37.5 ± 8.4

(a) RNA was synthesized on the indicated plasmid DNAs using T7 RNA polymerase and introduced in tobacco protoplasts by electroporation. The composition of the leader and trailer sequences is given, using the nucleotide numbering of TNV sgRNA2; us: unrelated sequence with the length indicated in nucleotides. Total RNA was isolated from the protoplasts 130 min after electroporation. The cat RNA levels are in amol/pg of total RNA. The CAT protein level (pg/40 pg of soluble protein) was determined 5 hours after RNA introduction, in duplo.

An infective TNV-A RNA wherein the CP coding region was replaced by the cat coding region was synthesized in vitro from BsaI-digested pFM216 DNA and introduced in tobacco protoplasts by electroporation. As a control, a cat RNA containing STNV-2 leader and trailer (generated by in vitro transcription of AvaI-linearized pFM207E) was introduced together with TNV RNA in tobacco protoplasts. Two days after infection, cat RNA and protein accumulation was monitored. As indicated in Table 11, the ratio protein/RNA was about 40 times higher for the TNV-cat RNA than for the STNV-cat RNA.

Table 11. Comparison of cap-independent translation of replicating RNAs

             RNA (fmol/pg tot RNA)   CAT protein (pg/mg sol. protein)   Ratio CAT/RNA   Relative ratio CAT/RNA
TNV-cat      1                       16                                 1.6             44
STNV-cat     66                      24                                 0.036           1

Example 4. Effect of codon sequence on in vivo translation in tobacco protoplasts.

In vitro generated RNA transcripts comprising first and second translation enhancing sequences from STNV-2, using as templates the DNA listed in Table 12, were introduced in tobacco protoplasts by electroporation (together with TNV RNA to supply the RNA-dependent RNA polymerase in trans). These transcripts contain either native or synthetic coding regions of a Bt ICP gene. After 48 hrs, the amount of synthesized protein and positive-strand RNA was determined. Table 12 summarizes the ratios of synthesized protein over synthesized RNA (normalized to the value obtained for native coding sequence).

Table 12. Protein/(+) RNA ratio obtained 48 hrs after RNA introduction in tobacco protoplasts.

Used template for in vitro RNA generation   Coding region    Protein/(+) RNA   Normalized protein/(+) RNA
pRVL11 (BsaI-linearized)                    [cry9Cnative]    0.27              1
pRVL12 (BsaI-linearized)                    [cry9Csynth]     0.1               0.37

The ratio of accumulated protein to accumulated RNA after 48 hrs was higher when native coding sequences were utilized than when synthetic coding regions, with codon preferences closer to that of plants, were used.

After introduction of the cry9C transcripts in tobacco protoplasts (both native and synthetic coding sequences), an in vivo RNA and protein accumulation profile was determined, which allows estimation of the ratio of the translation efficiency for both types of RNA (Table 13). Again, a higher translation enhancing activity was obtained for the native coding sequence.
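The translation efficiency ratio reported in Table 13 corresponds to the slope of the protein accumulation line divided by the slope of the RNA accumulation line. A minimal sketch of that calculation, using the linear fits given in Table 13 (R = 0.07t - 0.1 and P = 2.3t - 23 for the native sequence; R = 0.13t - 0.3 and P = 2.1t - 35 for the synthetic one), is shown below; the dictionary and function names are illustrative.

```python
# Slopes of the linear accumulation fits in Table 13:
# dR/dt in fmol/0.5 pg total RNA per hour, dP/dt in ng/mg soluble protein per hour
SLOPES = {
    "cry9C native": {"rna": 0.07, "protein": 2.3},
    "cry9C synthetic": {"rna": 0.13, "protein": 2.1},
}


def efficiency_ratio(slopes):
    """(dP/dt) / (dR/dt): protein made per unit of accumulating RNA."""
    return slopes["protein"] / slopes["rna"]


native = efficiency_ratio(SLOPES["cry9C native"])
synthetic = efficiency_ratio(SLOPES["cry9C synthetic"])
normalized_synthetic = synthetic / native

print(f"native: {native:.1f}, synthetic: {synthetic:.1f}, "
      f"normalized synthetic: {normalized_synthetic:.2f}")
```

Because only the slopes enter the ratio, the intercepts of the fitted lines do not affect the comparison.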

Table 13. CRY9C protein and uncapped RNA accumulation in tobacco protoplasts.

Used template for in vitro RNA generation   Coding region    uncapped RNA accumulation   Protein accumulation   (dP/dt)/(dR/dt)   Normalized translation efficiency
pRVL11 (BsaI-linearized)                    [cry9Cwt]        R=0.07t-0.1                 P=2.3t-23              32.9              1
pRVL12 (BsaI-linearized)                    [cry9Csynth]     R=0.13t-0.3                 P=2.1t-35              16.2              0.49

R = RNA (fmol/0.5 pg total RNA); P = protein (ng/mg soluble protein); t = time (hours)

Example 5. TED 2 stimulates autonomously the translation of dicistronic RNAs in vitro.

Efficient cap-independent translation of both cistrons of a dicistronic RNA by TED from STNV-2, as present in plasmids pFM203 and pFM203B, was ascertained as follows.

Construction of pFM203 and pFM203B was based on pMA442, which is an in vitro transcription plasmid containing the nptII coding region between the first 173 nucleotides and the trailer of the STNV-2 RNA. It consists of the following sequences: from nucleotide 1 to 1003 it has the nucleotide sequence of SEQ ID No. 38; from nucleotide 1004 to 1616 it has the nucleotide sequence between 633 and 1245 of SEQ ID No. 2; from nucleotide 1617 to 1633 it corresponds to nucleotide 24 to 40 of pGEM®-3Z; from nucleotide 1634 to 1698 it contains nucleotides 2499 to 2435 of pGEM®-1 (in counterclockwise orientation); and from nucleotide 1699 to 4173 it corresponds to nucleotide 269 to 2743 of pGEM®-3Z. pFM203 was obtained by cloning the 1246-bp long XhoI-NsiI fragment of pMA442 between the SalI and PstI sites of pFM189. To construct pFM203B, the NsiI-blunted-Asp718I 1077 bp fragment of pMA442 was first cloned between the PstI and blunted XbaI sites of pFM189, resulting in pFM211A.
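A plasmid description of this kind can be checked mechanically: the segments should follow one another without gaps, and each segment should have the same length as the source interval it is copied from. The sketch below applies this check to the pMA442 coordinates given above; the data layout and function names are illustrative, and no source interval is listed for the first segment because the text gives none.

```python
# (start, end, source, source interval) for pMA442 as described in the text.
# A reversed interval (2499 -> 2435) denotes the counterclockwise orientation.
PMA442_SEGMENTS = [
    (1, 1003, "SEQ ID No. 38", None),
    (1004, 1616, "SEQ ID No. 2", (633, 1245)),
    (1617, 1633, "pGEM-3Z", (24, 40)),
    (1634, 1698, "pGEM-1", (2499, 2435)),
    (1699, 4173, "pGEM-3Z", (269, 2743)),
]


def span(a, b):
    """Number of nucleotides in an inclusive interval, in either orientation."""
    return abs(b - a) + 1


def check_segments(segments):
    """Return a list of human-readable problems; empty if the map is consistent."""
    problems = []
    expected_start = 1
    for start, end, source, interval in segments:
        if start != expected_start:
            problems.append(f"gap or overlap before nt {start}")
        if interval is not None and span(start, end) != span(*interval):
            problems.append(f"length mismatch for {source} at nt {start}-{end}")
        expected_start = end + 1
    return problems


print(check_segments(PMA442_SEGMENTS) or "coordinates are consistent")
```

The same check can be applied to the other plasmids that are described segment by segment in this specification.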

Religation of blunted NcoI-EcoRI-digested pXD324 DNA resulted in pFM170D. To obtain pFM170, the nptII coding region was inserted as an EcoRI-BstBI fragment (SEQ ID No. 39 between the nucleotides at position 3 and 818) between the EcoRI and AccI sites of pFM170D. A 260-nt-long PstI-filled-in-BamHI fragment of pFM170 was inserted between the PstI and trimmed KpnI sites of pFM211A, resulting in plasmid pFM203B. In general the structure of the relevant features of pFM203 and pFM203B can be represented as follows:

pFM203: T7-STNV*leader-cat-STNV2(1-173)-nptII(transl.fusion)-TED
pFM203B: T7-STNV*leader-cat-TMVleader-nptII-TED

In vitro transcription with T7 RNA polymerase of BspHI- or SpeI-digested plasmid pFM203 or pFM203B DNA resulted in the synthesis of dicistronic RNAs lacking or including TED, respectively. Capped and uncapped RNA transcripts were translated in vitro in a wheat germ extract.

Protein accumulation profiles were determined and translation efficiencies as well as functional half-lives were deduced, allowing calculation of the peak levels.

The results summarized in Table 14 show that TED 2 stimulates cap-independent translation of both cistrons to the same extent. Translation of the second cistron is by internal initiation as it is hardly stimulated by a cap and not proportional to the level of translation of the first cistron.

Table 14. TED 2 stimulates autonomously the translation of dicistronic RNAs in vitro.

Plasmid           cap   CAT                                                   NPTII
                        T.E. (relative units)   t1/2 (min)     Peak level     T.E.           t1/2 (min)    Peak level
pFM203 BspHI      no    2.8 ± 0.3               19.8 ± 2.3     79.7           1.26 ± 0.18    31.4 ± 6.8    57.1
pFM203 SpeI       no    71.1 ± 9.2              6.1 ± 0.9      626            23.1 ± 1.7     13.0 ± 1.2    433
pFM203B BspHI     no    1.21 ± 0.16             10.5 ± 1.7     18.3           0.58 ± 0.10    20.5 ± 5.7    17.2
pFM203B BspHI     yes   12.2 ± 1.1              24.6 ± 3.6     433            1.00 ± 0.40    14.9 ± 7.7    21.5
pFM203B SpeI      no    19.6 ± 3.4              13.4 ± 2.9     379            6.35 ± 0.91    32.3 ± 9.4    296
pFM203B SpeI      yes   24.1 ± 2.6              43.2 ± 10.3    1502           5.26 ± 0.14    73.5 ± 6.9    558

Example 6. Construction of plant transformation vectors.

Below, the different steps to construct the interchangeable cassettes for the build-up of the plant transformation vectors are described. These cassettes, which are ultimately under the control of a T3 or T7 promoter, comprise: (i) a terminator sequence for T3 and T7 RNA polymerases, (ii) Bt ICP encoding genes, flanked by appropriate DNA regions encoding the first and second translation enhancing sequences of TNV-A or STNV-2, (iii) marker genes which are either under the control of a plant-expressible promoter, or are under control of T3 or T7 promoters and are further flanked by appropriate DNA regions encoding first and second translation enhancing sequences of TNV-A or STNV-2, and (iv) a T3 or T7 RNA polymerase encoding gene under control of a plant-expressible promoter, whereby the RNA polymerase is joined to a nuclear localisation signal of T-antigen.

Several combinations of these cassettes are made, yielding the plasmids of the pFM-series summarized in Table 15. Other combinations were made yielding the plasmids of the pVE-series summarized in Table. In these plasmids, the combined cassettes are flanked by unique restriction sites for the octacutters Sse8387I and SgfI, hence they can be excised as one fragment and introduced in the polylinker sequence between the T-DNA borders of the T-DNA vector pTFM600, to yield the plant transformation vectors of the pTFM-series summarized in Table. Alternatively, the combined cassettes flanked by unique restriction sites for the octacutters Sse8387I and SgfI were excised as one fragment and introduced in the polylinker sequence between the T-DNA borders of the T-DNA vector pGVS20 to yield the plant transformation vectors of the pTVE-series summarized in Table.

(i) Construction of DNA cassette comprising terminator sequences for T3 and T7 RNA polymerases.

A synthetic DNA fragment comprising the T3 terminator sequence, flanked by unique restriction sites (nucleotide sequence of SEQ ID No. 24), was cloned as a PstI-HindIII fragment downstream of the TNV trailer, between the PstI and HindIII sites of pVE190 (see Example 1), resulting in plasmid pVE198. The terminator fragment was then duplicated by ligating the terminator-containing EcoRI-XbaI and EcoRI-SpeI fragments of pVE198 or the terminator-containing NdeI-XbaI and NdeI-SpeI fragments, resulting in plasmid pVE199. The duplicated terminator fragment of pVE199 was fused to the ApaLI site of the TNV trailer by cloning of the 631-bp ApaLI(blunted)-EcoRI fragment of pVE195 (see Example 1) between the EcoRI and trimmed PstI sites of pVE199, yielding plasmid pFM500.

(ii) Construction of the DNA cassettes comprising Bt ICP encoding genes flanked by appropriate DNA regions complementary to the leader and (portions of the) trailer of STNV-2 or TNV-A.

a. Bt ICP encoding genes flanked by STNV-2 sequences.

A fragment was amplified by PCR on plasmid pRVL11 (see Example 1) with primers FM22 and FM25 having the nucleotide sequences of SEQ ID No. 30 and SEQ ID No. 31, digested with HindIII and NdeI, and cloned between the HindIII and NdeI sites of pRVL11, resulting in plasmid pRVL17. The cry9C-containing NdeI-SpeI fragment of pRVL17 was cloned between the NdeI and SpeI sites of pFM500, resulting in plasmid pFM407.

The cry1A(b)-containing NcoI-NheI fragment of pGEM1Ab1 (see Example 1) is fused to the 310-bp AatII-NcoI and the 2554-bp NheI-AatII fragments of pFM407, resulting in plasmid pFM408.

b. Bt ICP encoding genes flanked by TNV-A sequences.

A PCR fragment was amplified with primers FM22 and FM6 having the nucleotide sequence of SEQ ID No. 30 and SEQ ID No. 29 using plasmid pAB02 (see Example 1) as a template, digested with NheI and NdeI and cloned between the NheI and NdeI sites of pFM500, resulting in plasmid pFM401.

A PCR fragment was amplified with primers FM26 and FM6 having the nucleotide sequence of SEQ ID No. 32 and SEQ ID No. 29 using plasmid pVE190 (see Example 1) as a template, digested with NheI and NdeI and cloned between the NheI and NdeI sites of pFM500, resulting in plasmid pFM501.

The cry9C-containing NcoI-NheI fragment of pGEM9C1 (see Example 1) was cloned between the NcoI and NheI sites of pFM401, resulting in pFM402. pFM402 is then digested with NheI and Bsu36I, blunted and ligated, resulting in plasmid pFM403.

The cry1A(b)-containing fragment is cloned between the NcoI and NheI sites of pFM401, resulting in pFM404.

The cry-containing NcoI-EagI fragments of pFM402, pFM403, and pFM404 are then cloned between the NcoI and EagI sites of pFM501, resulting in plasmids pFM502, pFM503, and pFM504, respectively. Alternatively, plasmids pFM502 and pFM504 were constructed by cloning the NcoI-NheI fragment of pGEM9C1 and the NcoI-NheI fragment of pGEM1Ab1, respectively, in NcoI-NheI digested pFM501.

(iii) Marker gene cassettes.

As a source for the conventional marker gene (chimeric gene) we used plasmid pDE110. Plasmid pDE110 is a pUC-derivative containing the bar coding region under the control of the 35S promoter and the 3' end formation signal of Cauliflower mosaic virus. It comprises the following fragments: from nucleotide 1 to nucleotide 401 it equals nucleotide 1 to nucleotide 401 of pUC19 (Yanisch-Perron et al., 1985); from nucleotide 402 to nucleotide 1779 it comprises a promoter region of the Cauliflower mosaic virus 35S RNA (Odell et al., Nature 313, 810-812 (1985)); from nucleotide 1781 to nucleotide 2332 it comprises the coding region of the bialaphos resistance (bar) gene from Streptomyces hygroscopicus (Thompson et al., 1987); from nucleotide 2351 to nucleotide 2614 it comprises a fragment containing the 3'-end formation signal of the nopaline synthase gene from the T-DNA of pTiT37 (Depicker et al., 1982); and from nucleotide 2615 to nucleotide 4883 it equals nucleotide 418 to nucleotide 2686 of pUC19.

To obtain a DNA cassette comprising the bar gene flanked by DNA encoding the first and second translation enhancing sequences from TNV-A, under control of T3 or T7 promoters, the bar-gene containing NcoI-filled-in-MluI fragment of pFM133 (see Example 1) was cloned between the NcoI and filled-in NheI sites of pFM401 and pFM501, resulting in plasmids pFM405 (T7-promoter) and pFM505 (T3-promoter), respectively.

To obtain a DNA cassette comprising the bar gene flanked by DNA encoding the first and second translation enhancing sequences from STNV-2, under control of a T7 promoter, the bar-gene containing NheI-NcoI fragment of pFM405 is fused to the 310-bp AatII-NcoI fragment and the 2554-bp NheI-AatII fragment of pFM407, resulting in plasmid pFM406. Alternatively, plasmid pFM406 was obtained by fusing the bar-gene containing NheI-NcoI fragment of pFM405 to the 1.2 kb BglI-NcoI fragment and the 1.8 kb NheI-BglI fragment of pFM407.

(iv) Construction of DNA cassettes encoding T3 or T7 RNA polymerase under control of plant-expressible promoter.

The T7 RNA polymerase coding region is present on a DNA fragment which has the following sequence: from nucleotide 1 to 35: the nucleotide sequence as in SEQ ID No. 36 (comprising the coding sequence for the nuclear localisation signal of the SV40 large T-antigen); from nucleotide 36 to nucleotide 2684: the sequence of Genbank Accession No. V01146 (incorporated herein by reference) between the nucleotide at position 3174 and the nucleotide at position 5822, comprising the T7 RNA polymerase coding region; from nucleotide 2685 to nucleotide 2690: GCTAGC. The T3 RNA polymerase coding region is comprised within a similar DNA fragment in which the sequence between the nucleotide at position 36 and the nucleotide at position 2684 is replaced with the sequence of Genbank Accession No. X02981 (incorporated herein by reference) between the nucleotide at position 144 and the nucleotide at position 2795. Such fragments can be obtained by PCR using appropriate primers and plasmid pAR1173 (ATCC 39562) or the T7 genome; and plasmid pCM56 (ATCC 53202) or the T3 genome.

pFM409 is a pUC19-derivative containing four unique 8-base cutters (Sse8387I, AscI, NotI, SgfI), wherein between the Sse8387I and AscI sites a gene cassette is inserted which consists of: a CaMV35S promoter, the leader sequence of the cab22L gene from Petunia, the 5' region of the coding region and a 3'-end formation signal of CaMV. It has the following sequence: from nucleotide 1 to nucleotide 186 it equals the nucleotide sequence of pUC19 from nucleotide position 1 to nucleotide position 186; from nucleotide position 187 to nucleotide position 1220 it has the nucleotide sequence of SEQ ID No. 35; from nucleotide position 1221 to nucleotide position 3460 it has the nucleotide sequence of pUC19 between the nucleotides at position 447 and 2686 of pUC19.

The T7 RNA polymerase coding region is placed under the control of a 35S promoter of CaMV by cloning it as an NcoI-NheI fragment of the above mentioned DNA between the NcoI and NheI sites of pFM409, resulting in plasmid pFM410.

Similarly, the T3 RNA polymerase coding region is cloned as an NcoI-NheI fragment of the above mentioned DNA between the NcoI and NheI sites of pFM409, resulting in plasmid pFM510.

WO 97/49814 PCTIEP97/02832 63 Assembly of the plant transformation vectors.

The major plasmids used for the assembly of the plant transformation vectors have the following schematized structure:

pFM402: T7p-TNVleader-cry9C-TNVtrailer(1)-T3term(2x)
pFM403: T7p-TNVleader-cry9C-TNVtrailer(2)-T3term(2x)
pFM404: T7p-TNVleader-cry1Ab5-TNVtrailer(1)-T3term(2x)
pFM502: T3p-TNVleader-cry9C-TNVtrailer(1)-T3term(2x)
pFM503: T3p-TNVleader-cry9C-TNVtrailer(2)-T3term(2x)
pFM504: T3p-TNVleader-cry1Ab5-TNVtrailer(1)-T3term(2x)
pFM405: T7p-TNVleader-bar-TNVtrailer(1)-T3term(2x)
pFM505: T3p-TNVleader-bar-TNVtrailer( )-T3term(2x)
pFM406: T7p-STNVleader-bar-TED-T3term(2x)
pFM407: T7p-STNVleader-cry9C-TED-T3term(2x)
pFM408: T7p-STNVleader-cry1Ab5-TED-T3term(2x)
pFM410: P35S-cab22leader-T7pol-3'35S
pFM510: P35S-cab22leader-T3pol-3'35S
pDE110: P35S-bar-3'nos

The DNA encoding the translation enhancing sequence indicated as TNV trailer has the sequence of SEQ ID No.1 between the nucleotides at position 3429 and 3611; the one indicated as TNV trailer (2) has the sequence of SEQ ID No.1 between the nucleotides 3472 and 3611. TED refers to the DNA encoding a STNV second translation enhancing sequence corresponding to SEQ ID No.2 between nucleotides at position 632 and 753; P35S refers to a CaMV35S promoter; TNV leader refers to the DNA encoding a first translation enhancing sequence corresponding to the nucleotide sequence of SEQ ID No.1 between the nucleotides at positions 2461 and 2603; STNV leader refers to the DNA encoding a first translation enhancing sequence corresponding to SEQ ID No. 2 between nucleotides at position 1 and 38; cab22L leader refers to the DNA sequence encoding the leader sequence from the cab22L gene of Petunia, having the nucleotide sequence complementary to the nucleotide sequence of SEQ ID No. 35 between nucleotides at positions 370 and 429; T7p refers to the T7 promoter having the sequence of SEQ ID between nucleotides 22 and 39; T3p refers to the T3 promoter having the sequence of SEQ ID No.18 between nucleotides 14 and 32; 3' nos and 3' 35S refer to the 3' region of the nopaline synthase gene and the CaMV transcript (having the complementary nucleotide sequence of SEQ ID No. between nucleotide 27 and 249), respectively; T3 term refers to the terminator region of phage T3 having the nucleotide sequence of SEQ ID No.24; cry9C refers to the native nucleotide sequence encoding a truncated toxic fragment of CRY9C as indicated in SEQ ID No. 5 between nucleotide positions 6 and 1892; cry1A(b) refers to the native nucleotide sequence encoding a truncated toxic fragment of CRY1Ab5 as indicated in SEQ ID No. 6 between nucleotide positions 8 and 1783.
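The schematized structures given above follow a simple convention: elements are joined by hyphens in 5' to 3' order, and an optional parenthesized note marks a variant or copy number. As an illustration only, the following Python sketch splits such a string into its elements. The function name `parse_cassette` is hypothetical and is not part of the described method.

```python
import re

# One element is a name optionally followed by a parenthesized note,
# e.g. "TNVtrailer(1)" or "T3term(2x)".
_ELEMENT = re.compile(r"(?P<name>[^()]+)(?:\((?P<note>[^()]*)\))?")


def parse_cassette(schema):
    """Split a schematized plasmid structure into (name, note) pairs.

    Elements are assumed to be separated by hyphens, in 5' to 3' order.
    The note is None when the element carries no parenthesized suffix.
    """
    elements = []
    for token in schema.split("-"):
        match = _ELEMENT.fullmatch(token.strip())
        if match is None:
            raise ValueError(f"unrecognized element: {token!r}")
        note = match.group("note")
        elements.append((match.group("name"), note.strip() if note is not None else None))
    return elements


if __name__ == "__main__":
    for name, note in parse_cassette("T7p-TNVleader-cry9C-TNVtrailer(1)-T3term(2x)"):
        print(name, note)
```

Read this way, pFM402 is a T7 promoter, the TNV leader, the cry9C coding region, TNV trailer variant 1 and two copies of the T3 terminator.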

pTFM600 was derived from plasmid pGSC1700 [Cornelissen and Vandewiele (1989), Nucl. Acids Res. 17: 833] but differs from the latter in that it does not contain a beta-lactamase gene and that its T-DNA is characterized by the sequence of SEQ ID No.37.

was derived from pTFM600 by removal of the SphI site, followed by introduction of a DNA fragment derived from the nptI gene (Genbank Accession No. V00359 between nucleotides 787 and 2308, wherein nucleotides 1592 and 1593 were removed) in the vector part outside the T-DNA region, using standard recombinant DNA procedures.

The chimeric bar gene under control of a CaMV35S promoter is cloned as a StuI-XbaI fragment of pDE110 between the HpaI site and the XbaI site of pFM410 (containing the chimeric T7 RNA polymerase gene) and pFM510 (containing the chimeric T3 RNA polymerase gene), resulting in plasmids pFM411 and pFM511, respectively.

The chimeric bar gene under control of a T7 promoter is cloned as a BssHII-XbaI fragment of pFM405 (flanked by TNV-A sequences) or pFM406 (flanked by STNV-2 sequences) between the MluI and XbaI sites of pFM410, resulting in plasmids pFM412 and pFM413, respectively.

The chimeric bar gene under control of a T3 promoter is cloned as a BssHII-XbaI fragment of pFM505 (flanked by TNV-A sequences) between the MluI and XbaI sites of pFM510, resulting in plasmid pFM512.

The chimeric cry genes under control of a T7 promoter of pFM402, pFM403, pFM404, pFM407, or pFM408 are cloned as BssHII-EagI fragments between the AscI and NotI sites of pFM411, pFM412, or pFM413 to obtain the plasmids pFM414-pFM422 of Table 15.

The chimeric cry genes under control of a T3-specific promoter of pFM502, pFM503, and pFM504 are cloned as BssHII-EagI fragments between the AscI and NotI sites of pFM511 and pFM512.

Finally, the Sse8387I-SgfI fragments of pFM411 to pFM422, and of pFM511 to pFM520, are cloned between the Sse8387I and SgfI sites of the T-DNA vector pTFM600 to yield the T-DNA vectors of the pTFM series summarized in Table 15.

Using standard cloning procedures, the plasmids pVE220 (analogous to pFM414), pVE221 (analogous to pFM419), pVE223 (analogous to pFM514) and pVE224 (analogous to pFM519) were made.

pVE220 comprises the following nucleotide sequence:
from nucleotide 1 to 186: the sequence from the nucleotide at position 1 to the nucleotide at position 186 of pUC19;
from nucleotide 187 to 201: the sequence from the nucleotide at position 1 to the nucleotide at position of SEQ ID No. 35;
from nucleotide 202 to 207: CCGCTG;
from nucleotide 208 to 453: the sequence from the nucleotide at position 16 to the nucleotide at position 261 of SEQ ID No. 35, the complementary sequence of which comprises the 3' end formation signal of cauliflower mosaic virus;
from nucleotide 454 to 3102: the sequence complementary to Genbank Accession No. V01146 from the nucleotide at position 3174 to the nucleotide at position 5822, which comprises the T7 RNA polymerase coding region;
from nucleotide 3103 to 3137: the sequence complementary to the sequence from the nucleotide at position 35 to the nucleotide at position 1 of SEQ ID No. 36, which comprises the coding sequence for the nuclear localization signal of the SV40 large T-antigen;
from nucleotide 3138 to 3736: the sequence from the nucleotide at position 372 to the nucleotide at position 970 of SEQ ID No. 35, the complementary sequence of which comprises the cab22L leader sequence and a promoter of the cauliflower mosaic virus 35S RNA;
from nucleotide 3737 to 3738: AT;
from nucleotide 3739 to 3752: the sequence from the nucleotide at position 971 to the nucleotide at position 984 of SEQ ID No.;
from nucleotide 3753 to 3776: the sequence from the nucleotide at position 15 to the nucleotide at position 38 of SEQ ID No. 30, comprising the T7 RNA polymerase promoter;
from nucleotide 3777 to 3919: the sequence from the nucleotide at position 2461 to the nucleotide at position 2603 of SEQ ID No. 1, comprising a first translation enhancing sequence of TNV;
from nucleotide 3920 to 5811: the sequence from the nucleotide at position 6 to the nucleotide at position 1897 of SEQ ID No. 5, comprising the cry9C coding region;
from nucleotide 5812 to 5994: the sequence from the nucleotide at position 3429 to the nucleotide at position 3611 of SEQ ID No. 1, comprising a second translation enhancing sequence of TNV;
from nucleotide 5995 to 6109: the sequence from the nucleotide at position 6 to the nucleotide at position 120 of SEQ ID No. 24, comprising the T3 RNA polymerase terminator sequence;
from nucleotide 6110 to 6222: the sequence from the nucleotide at position 16 to the nucleotide at position 128 of SEQ ID No. 24, comprising the T3 RNA polymerase terminator sequence;
from nucleotide 6223 to 6244: the sequence from the nucleotide at position 988 to the nucleotide at position 1009 of SEQ ID No.;
from nucleotide 6245 to 7918: the sequence from the nucleotide at position 947 to the nucleotide at position 2620 of pDE110 (StuI-XbaI fragment), comprising the bar coding region under the control of a promoter and a 3' end formation signal of the cauliflower mosaic virus;
from nucleotide 7919 to 7931: the sequence from the nucleotide at position 1022 to the nucleotide at position 1034 of SEQ ID No. 35;
from nucleotide 7932 to 10171: the sequence from the nucleotide at position 447 to the nucleotide at position 2686 of pUC19.
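A coordinate description of this kind can be checked mechanically: each stretch of pVE220 must be exactly as long as the source stretch it is copied from, and consecutive stretches must follow one another without gaps or overlaps. The Python sketch below performs that check on the pVE220 coordinates as stated above. It is an illustration only: the names `PVE220` and `check_segments` are hypothetical, the segment at 187 to 201 is entered without a source range because its end position is missing in the text, and the second T3 terminator copy is read as ending at position 128 of SEQ ID No. 24.

```python
# Each entry: (first nt in pVE220, last nt in pVE220, source)
# source is a (first, last) position pair, a literal sequence, or None if unknown.
PVE220 = [
    (1, 186, (1, 186)),          # pUC19
    (187, 201, None),            # SEQ ID No. 35, end position missing in the text
    (202, 207, "CCGCTG"),
    (208, 453, (16, 261)),       # SEQ ID No. 35, CaMV 3' end formation signal
    (454, 3102, (3174, 5822)),   # V01146, T7 RNA polymerase coding region
    (3103, 3137, (35, 1)),       # SEQ ID No. 36, SV40 NLS (complementary)
    (3138, 3736, (372, 970)),    # SEQ ID No. 35, cab22L leader and 35S promoter
    (3737, 3738, "AT"),
    (3739, 3752, (971, 984)),
    (3753, 3776, (15, 38)),      # SEQ ID No. 30, T7 promoter
    (3777, 3919, (2461, 2603)),  # SEQ ID No. 1, TNV leader
    (3920, 5811, (6, 1897)),     # SEQ ID No. 5, cry9C coding region
    (5812, 5994, (3429, 3611)),  # SEQ ID No. 1, TNV trailer
    (5995, 6109, (6, 120)),      # SEQ ID No. 24, T3 terminator
    (6110, 6222, (16, 128)),     # SEQ ID No. 24, T3 terminator
    (6223, 6244, (988, 1009)),
    (6245, 7918, (947, 2620)),   # pDE110, chimeric bar gene
    (7919, 7931, (1022, 1034)),  # SEQ ID No. 35
    (7932, 10171, (447, 2686)),  # pUC19
]


def source_length(source):
    """Length of a source stretch, or None when the source range is unknown."""
    if source is None:
        return None
    if isinstance(source, str):
        return len(source)
    first, last = source
    return abs(last - first) + 1


def check_segments(segments):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    expected_start = segments[0][0]
    for start, end, source in segments:
        if start != expected_start:
            problems.append(f"segment {start}-{end} does not start at {expected_start}")
        length = source_length(source)
        if length is not None and length != end - start + 1:
            problems.append(f"segment {start}-{end} spans {end - start + 1} nt, source spans {length} nt")
        expected_start = end + 1
    return problems


if __name__ == "__main__":
    print(check_segments(PVE220) or "coordinates are internally consistent")
```

Under these assumptions the stated coordinates tile positions 1 to 10171 without gaps, and every stretch has the same length as its source.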

Plasmid pVE221 comprises the following nucleotide sequence:
from nucleotide 1 to 6244: the sequence from the nucleotide at position 1 to the nucleotide at position 6244 of pVE220;
from nucleotide 6245 to 6247: AAC;
from nucleotide 6248 to 6271: the sequence from the nucleotide at position 15 to the nucleotide at position 38 of SEQ ID No. 30, comprising the T7 RNA polymerase promoter;
from nucleotide 6272 to 6414: the sequence from the nucleotide at position 2461 to the nucleotide at position 2603 of SEQ ID No. 1, comprising a first translation enhancing sequence of TNV;
from nucleotide 6415 to 6421: the sequence from the nucleotide at position 6 to the nucleotide at position 12 of SEQ ID No. 5;
from nucleotide 6422 to 6982: the sequence from the nucleotide at position 1780 to the nucleotide at position 2340 of pDE110, comprising the bar coding region;
from nucleotide 6983 to 6987: CTAGC;
from nucleotide 6988 to 7170: the sequence from the nucleotide at position 3429 to the nucleotide at position 3611 of SEQ ID No. 1, comprising a second translation enhancing sequence of TNV;
from nucleotide 7171 to 7285: the sequence from the nucleotide at position 6 to the nucleotide at position 120 of SEQ ID No. 24, comprising the T3 RNA polymerase terminator sequence;
from nucleotide 7286 to 7389: the sequence from the nucleotide at position 16 to the nucleotide at position 119 of SEQ ID No. 24, comprising the T3 RNA polymerase terminator sequence;
from nucleotide 7390 to 9642: the sequence from the nucleotide at position 7919 to the nucleotide at position 10171 of pVE220.
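The end coordinates of the pVE221 stretches can be recomputed by summing the lengths of their sources in order. The Python sketch below does this as an illustration; the list name `SOURCE_LENGTHS` and the labels are hypothetical. Summing the source lengths places the T7 promoter stretch at nucleotides 6248 to 6271, immediately after the AAC trinucleotide, and reproduces the stated total length of 9642 nucleotides.

```python
from itertools import accumulate

# (label, length of the source stretch), in 5' to 3' order along pVE221
SOURCE_LENGTHS = [
    ("pVE220 1-6244", 6244 - 1 + 1),
    ("AAC", len("AAC")),
    ("SEQ ID No. 30 15-38, T7 promoter", 38 - 15 + 1),
    ("SEQ ID No. 1 2461-2603, TNV leader", 2603 - 2461 + 1),
    ("SEQ ID No. 5 6-12", 12 - 6 + 1),
    ("pDE110 1780-2340, bar coding region", 2340 - 1780 + 1),
    ("CTAGC", len("CTAGC")),
    ("SEQ ID No. 1 3429-3611, TNV trailer", 3611 - 3429 + 1),
    ("SEQ ID No. 24 6-120, T3 terminator", 120 - 6 + 1),
    ("SEQ ID No. 24 16-119, T3 terminator", 119 - 16 + 1),
    ("pVE220 7919-10171", 10171 - 7919 + 1),
]

# End coordinates as given in the description of pVE221
STATED_ENDS = [6244, 6247, 6271, 6414, 6421, 6982, 6987, 7170, 7285, 7389, 9642]

# Running total of source lengths gives the last nucleotide of each stretch.
computed_ends = list(accumulate(length for _, length in SOURCE_LENGTHS))

if __name__ == "__main__":
    for (label, _), end in zip(SOURCE_LENGTHS, computed_ends):
        print(f"{label}: ends at nucleotide {end}")
```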

Plasmid pVE223 comprises the following nucleotide sequence:
from nucleotide 1 to 453: the sequence from the nucleotide at position 1 to the nucleotide at position 453 of pVE220;
from nucleotide 454 to 3105: the sequence complementary to Genbank Accession No. X02981 from the nucleotide at position 144 to the nucleotide at position 2795, comprising the T3 RNA polymerase coding region;
from nucleotide 3106 to 3755: the sequence from the nucleotide at position 3103 to the nucleotide at position 3752 of pVE220;
from nucleotide 3756 to 3760: the sequence from the nucleotide at position 15 to the nucleotide at position 19 of SEQ ID No.;
from nucleotide 3761 to 3780: the sequence from the nucleotide at position 12 to the nucleotide at position 31 of SEQ ID No. 18, comprising the T3 RNA polymerase promoter;
from nucleotide 3781 to 10175: the sequence from the nucleotide at position 3777 to the nucleotide at position 10171 of pVE220.
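Because pVE223 is described largely by reference to pVE220, a position in one plasmid can be translated into the other with a simple offset per shared stretch. The Python sketch below illustrates this for the three stretches stated above; the names `SHARED_BLOCKS` and `to_pve220` are hypothetical.

```python
# (first nt in pVE223, last nt in pVE223, first nt of the same stretch in pVE220)
SHARED_BLOCKS = [
    (1, 453, 1),
    (3106, 3755, 3103),
    (3781, 10175, 3777),
]


def to_pve220(position):
    """Translate a pVE223 position into the corresponding pVE220 position.

    Returns None for positions in stretches that are not taken from pVE220
    (the T3 RNA polymerase coding region and the T3 promoter region).
    """
    for start, end, ref_start in SHARED_BLOCKS:
        if start <= position <= end:
            return ref_start + (position - start)
    return None


if __name__ == "__main__":
    # The cry9C coding region starts at nucleotide 3920 in pVE220,
    # which corresponds to nucleotide 3924 in pVE223.
    print(to_pve220(3924))
```

The offsets (0, 3 and 4 nucleotides) reflect that the T3 RNA polymerase stretch is 3 nucleotides longer than the T7 RNA polymerase stretch it replaces, and that the promoter region is 1 nucleotide longer.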

Plasmid pVE224 comprises the following nucleotide sequence:
from nucleotide 1 to 6226: the sequence from the nucleotide at position 1 to the nucleotide at position 6226 of pVE220;
from nucleotide 6227 to 6250: the sequence from the nucleotide at position 988 to the nucleotide at position 1011 of SEQ ID No. 35;
from nucleotide 6251 to 6256: the sequence from the nucleotide at position 14 to the nucleotide at position 19 of SEQ ID No.;
from nucleotide 6257 to 6276: the sequence from the nucleotide at position 12 to the nucleotide at position 31 of SEQ ID No. 18, comprising the T3 RNA polymerase promoter;
from nucleotide 6277 to 9647: the sequence from the nucleotide at position 6272 to the nucleotide at position 9642 of pVE221.

pVE236 is a plasmid analogous to pVE220 wherein the additional nucleotides of the T7 consensus promoter are incorporated. The plasmid has the sequence of pVE220, but for the insertion of the nucleotide sequence GGAG between nucleotide position 3377 and 3778 of pVE220.

Finally, the Sse8387I-SgfI fragments of pVE220, pVE221, pVE223 and pVE224 were cloned between the Sse8387I and SgfI sites of the T-DNA vector pGSV20 to yield the T-DNA vectors of the pTVE series summarized in Table 15.

Table 15. Summary of the plant transformation vectors.

Plasmid  T-DNA vector  Promoter  Leader      Coding region  Trailer  Terminator  RNA polymerase  Selectable marker
pFM411   pTFM411       -         -           -              -        -           T7 RNA Pol      P35S-bar
pFM412   pTFM412       -         -           -              -        -           T7 RNA Pol      T7-TNV-bar
pFM413   pTFM413       -         -           -              -        -           T7 RNA Pol      T7-STNV-bar
pFM414   pTFM414       T7        TNVsgRNA2   cry9C          TNV      T3          T7 RNA Pol
pVE220   pTVE228       T7        TNVsgRNA2   cry9C          TNV      T3          T7 RNA Pol
pFM415   pTFM415       T7        TNVsgRNA2   cry9C          TNV      T3          T7 RNA Pol
pFM416   pTFM416       T7        STNV        cry9C          TED      T3          T7 RNA Pol
pFM417   pTFM417       T7        TNVsgRNA2   cry1A(b)       TNV      T3          T7 RNA Pol
pFM418   pTFM418       T7        STNV        cry1A(b)       TED      T3          T7 RNA Pol
pFM419   pTFM419       T7        TNVsgRNA2   cry9C          TNV      T3          T7 RNA Pol      T7-TNV-bar
pVE221   pTVE229       T7        TNVsgRNA2   cry9C          TNV      T3          T7 RNA Pol      T7-TNV-bar
pFM420   pTFM420       T7        TNVsgRNA2   cry9C          TNV      T3          T7 RNA Pol      T7-TNV-bar
pFM421   pTFM421       T7        TNVsgRNA2   cry9C          TNV      T3          T7 RNA Pol      T7-STNV-bar
pFM422   pTFM422       T7        TNVsgRNA2   cry9C          TNV      T3          T7 RNA Pol      T7-STNV-bar
pFM511   pTFM511       -         -           -              -        -           T3 RNA Pol      P35S-bar
pFM512   pTFM512       -         -           -              -        -           T3 RNA Pol      T3-TNV-bar
pFM514   pTFM514       T3        TNVsgRNA2   cry9C          TNV      T3          T3 RNA Pol
pVE223   pTVE225       T3        TNVsgRNA2   cry9C          TNV      T3          T3 RNA Pol
pFM515   pTFM515       T3        TNVsgRNA2   cry9C          TNV      T3          T3 RNA Pol      P35S-bar
pFM517   pTFM517       T3        TNVsgRNA2   cry1A(b)       TNV      T3          T3 RNA Pol
pFM519   pTFM519       T3        TNVsgRNA2   cry9C          TNV      T3          T3 RNA Pol      T3-TNV-bar
pVE224   pTVE226       T3        TNVsgRNA2   cry9C          TNV      T3          T3 RNA Pol      T3-TNV-bar
pFM520   pTFM520       T3        TNVsgRNA2   cry9C          TNV      T3          T3 RNA Pol      T3-TNV-bar

Example 7. Plant transformation and analysis of regenerated plants.

To obtain transformation of corn, the plasmids of the pFM series of Example 5 (Table 15; preferably pFM414, pFM417, pFM514 and pFM517) and pVE236 are introduced into maize protoplasts [according to Wang et al. Plant Cell Tissue and Organ Culture 18: 33-46 (1989); Krens et al., Nature 296: 72-74 (1982)] for transient expression assays. Further, they are used for electroporation of wounded type I callus (WO 92/09696) or they are introduced into corn protoplasts (EP 0469273) to obtain transgenic corn plants.

The plant transformation vectors of the pTFM series (preferably pTFM414, pTFM417, pTFM514 and pTFM517) are each mobilized into the Agrobacterium tumefaciens strain C58C1RifR or LBA4011 carrying the avirulent Ti plasmid pGV2260 as described by Deblaere et al (1985). The respective Agrobacterium strains are used to transform oilseed rape using the method described by De Block et al (1989), while rice and corn are transformed according to WO 92/09696. Transformed calli are selected on medium containing phosphinothricin, and resistant calli are regenerated into plants. For each transformation experiment, about 10 individual transformants are regenerated and analyzed by Southern blotting and PCR to verify gene integration patterns. Northern analysis and reverse transcription PCR are employed to analyse mRNA levels. RNA from the chimeric cap-independently translated genes is found.

On the protein level, insect controlling amounts of Bt ICPs are found.

Expression of the chimeric marker gene, translated in a cap-independent manner, is sufficient to allow selection of transformed plant cells on media containing phosphinothricin.

Plasmids pTVE228, pTVE229, and pTVE225 were introduced into Agrobacterium tumefaciens Ach5C3 containing the helper Ti-plasmid pGV4000 by mobilization. The resulting transconjugant strains A3684 (comprising pTVE228), A3685 (comprising pTVE229) and A3681 (comprising pTVE225) were used for rice transformation according to WO 92/09696. The resulting transformed individual rice plants (110 from transformation with strain A3684; 22 from transformation with strain A3685; 101 from transformation with strain A3681) were tested for the expression of proteins reactive in a Cry9C ELISA assay.

The Cry9C ELISA assay was performed using the following procedure: Plant material was harvested, stored at -70°C and crushed. To extract soluble proteins, 2 volumes of PBS (0.8 g/l NaCl; 0.02 g/l KCl; 0.115 g/l Na2HPO4; KH2PO4; pH 7.3) were added to one volume of plant material, mixed and centrifuged for 15 minutes in the cold room. 50 µl of supernatant was applied per well in a microtiter plate (Costar "High binding" cat. Nr 3599) coated with immunoaffinity-purified rabbit antibodies against CRY9C. A sandwich ELISA was performed using purified goat antibodies against CRY9C. Quantification was done using rabbit anti-goat IgG peroxidase conjugate (SIGMA cat. Nr A-3450) and the TMB kit (Kirkegaard & Perry Laboratories cat. Nr. 50-65-00). A dilution series of purified CRY9C was reconstructed in each microtiter plate (120 to 0.94 ng CRY9C/ml untransformed plant protein extract). Untransformed plant protein extract was used as a blank.
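The end points of the CRY9C standard series (120 and 0.94 ng/ml) are consistent with an eight-point two-fold dilution, since 120 halved seven times gives 0.9375. The dilution factor and the number of points are not stated in the procedure, so the Python sketch below is an inference from those two end points only; the function name `twofold_series` is hypothetical.

```python
def twofold_series(top_concentration, points):
    """Concentrations of a two-fold serial dilution, highest first."""
    return [top_concentration / 2 ** step for step in range(points)]


if __name__ == "__main__":
    # Assumed: 8 wells, starting at 120 ng CRY9C/ml (inferred, not stated in the text)
    for concentration in twofold_series(120.0, 8):
        print(f"{concentration:.2f} ng CRY9C/ml")
```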

It is clear from the results summarized in Table 16 that proteins reactive in a CRY9C ELISA assay can be found in transformed rice plants harbouring cap-independently transcribed chimeric genes as described in the application. Moreover, as can be seen in the strain A3685 transformations (comprising pTVE229), a chimeric selectable gene comprising the bar coding region flanked by first and second translation enhancing sequences from TNV-A under control of a T7 promoter allowed selection of transformed plants based on PPT resistance. In addition, an ELISA assay to detect PAT protein allowed estimation of PAT levels in leaves of the transformed rice plants at between 40 and 270 ng PAT/ml plant protein extract (corresponding to 0.008 and 0.026 of total protein).

Plasmid pVE223 (Table 15) was used to transform corn protoplasts as described in EP 0469273. Leaves from 8 individual regenerated transgenic corn plants were assayed by CRY9C-specific ELISA as described above. Samples from 3 plants clearly reacted positively, allowing estimation of CRY9C protein levels of between 8 and 13 ng/ml plant protein extract.

Table 16. Results from the ELISA assay on transformed rice leaves.

All publications referred to in this application are hereby incorporated by reference.

SEQUENCE LISTING

GENERAL INFORMATION:

(i) APPLICANT:
NAME: Plant Genetic Systems N.V.
STREET: Jozef Plateaustraat 22
CITY: Gent
COUNTRY: Belgium
POSTAL CODE (ZIP): B-9000
TELEPHONE: 32 9 235 94 54

(ii) TITLE OF INVENTION: Gene expression in plants

(iii) NUMBER OF SEQUENCES: 42

(iv) COMPUTER READABLE FORM: MEDIUM TYPE: Floppy disk COMPUTER: IBM PC compatible OPERATING SYSTEM: PC-DOS/MS-DOS SOFTWAREf: Patentln Release Version #1.30 (EPO) INFOR4ATION FOR SEQ ID NO: 1: SEQUENCE CHARACTERISTICS: LENGTH: 3684 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vi) ORIGINAL SOURCE: ORGANISM: Tobacco necrosis virus STRAIN: TNV-A (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: ACTATTCATA CCAAGAATAC CAAATAGGTG CA.AGGCCTTA CTCAGCTAAA GAXGTCTAAAA TGGAGCTACC AAACCAACAC AAGCAAACGG CCGCCGAGGG TTTCGTATCT TTCCTAAACT 120 GGCTATGCAA CCCATGGAGA CGACAGCGAA CAGTCAACGC TGCAGTTGCG TTCCAAAAAG 180 ATCTTCTCGC CATTGAGGAT TCCGAGCATT TGGATGACAT CAATGAGTGT TTCGAGGAGT 240 CTGCTGGGGC ACAALTCTCAG CGAACTAAGG TTGTCGCCGA CGGAGCATAT GCCCCCGCAA 300 AATCCAACAG GACCCGCCGA GTTCGTAAGC AGA.AGAAGCA CAAGTTTGTA AAATATCTTG 360 TCAACGAAGC TCGTGCCGAG TTTGGATTGC CCAAACCAAC TGAGGCAAAC AQACTTATGG 420 TCCAACATTT CTTGCTCAGA GTGTGCAAGG ATTGGGGCGT TGTTACTGCC CACGTACACG 480 GCA&TGTTGC ACTAGCTTTG CCACTGGTGT TCATCCCAAC GGAAGATGAT CTGCTATCAC 540 GAGCATTGAT GAACACACAT GCTACTAGAG CCGCTGTACG AGGCATGGAC AATGTCCAAG 600 GGGAGGGGTG GTGGAACAAT AGGTTGGGGA TTGGGGGCCA GGTCGGACTG GCCTTCCGGT 660 CCAAATAGGG GTGCCTTGAA AGGAGGCCAG GATTCTCCAC GTCCGTTTCG CGTGGGGAAC 720 WO 97/49814 PCTIEP97/02832 ATCCTGATCT GGTGGTCATA CCATCAGGGC G-CCCTGAGAA ACAGCGTCAG ATAGTGGTAT AGGCG-GCCAT TTATTAATCG GCATCCACAA CAACTCTCTT GTAGGGGCTT CPATGGAAAGA GTATTCTATG TCGAGGGGCC CAATGGCTT

TTGTTACGCT

TCCAACCTGC

CAAGACGCCC

CTAAGCCCGT

ATAGTTGGCG

AACTG-ACTAT

ATGCGAAACT

CTGCTCCCAG

TCCGACATTA

TCTTCAAALGG

ATGTTAATCC

AAGCACTCrQA CAAGGGAGCT TTTCGAACCC TTGATAAGTT TCGTQATCTC TATACTAAAA TCATACCCCT GTAACTAGTG AACAATTCCT AATGAATTAC ACG.GGCAGA TTACAGAGAG GCGGTTGATA GTTTGTCGCA TCAACCCCTT AGCTCACGAG AAAGACATTC GTGAAGGCCC AAAAATTAAA TCTTTCTAAG AAGCCTGACC GGTCATCCAA CCTAGATCGC CTCGGTATAA CGTTTGTTTG GGCAGGTACC TGAGCATCAC GCGTTTAAAA CCATTG.CCAA GTGCTTTGGG GAAATCACGG GTTTACTCTG GAGCAACAAG GGGAAATCAT GCGCTCQAAG TGGAATAAAT

CGTCGCAGTC

GTATGAGCAT

GGACTCGACG

GAATTTTACC

CCAGTCGTTT

TCAGAGACTA

AATCGCTGCT AAAGCAGCAA TAAAATACAA GAAQAAGGGT GCATTCTAALT GTGCG.CCATG TTGCAAATAA TGGGATGAc TTGTGCAACG TAGGAACGGC TGTAGAATGA GCGGAGACAT GTCTACGGGT TGAAAGAACA TGCGTCATTG TCTGTGAGAA CQACCAACAC GTGTCTGTTG CCCAAATGAT AAACAGCTAA ATTCGCCAGT GACQOCATTA GAACACGAGT TTGGGCAACT CTTAAACATC AATTTGTCCC AGCGGATTTA AAGAAATTGA rGATGGAAGTG CGAAAAACCCG CAAGCAGCAT CrGAGCCATAT TTCAAGCAGT TTGr.ATTCAA

TGGATATATT

ACATCATGGT

GTCAAACGAA

TTAAcGrOTGG.

GCACAAAATT

ATATGCATAG

TCGGTATCAC

AGTTCGATGA

GATTAccAAAL TGAGCGCATA rCAATTTTGCC AAACCCAALCC ACGCAAACCT TCTGTCQ.TAA CATCTAAAGA AGCACAAT Ac ccAGAATGG.C TGCAAGCTGT GATTCCTGTC ATGCAGAATT CACCAAQ&CC GGCGAGTTCC AGTGGCCCGG GTTCCTTCGC ACCAGACCTC CAAGAAGCAT TGTTATCCCA ACTG&ATAcCT CTGATGTAAC GGAGACAALT

TCTACCAAAA

AGACGAACCC

CTGAAACCCC

TGGAGATCTT

ACCAAGTGTC

GTCCAAATAC

ACTGGCGTTA

AATGGAACTA

TTTTAAGTTA

TACTTTGCAA

TATCACCATT

TGTGTTCGAT GGATCCCAGT CGTCACTAGC CTTATCCCAT AGGTrQAGTGT GGCATGAGCA GCTCCAAACT GQCATCCGCC PTTGGGGTAT CACTCTAGAT TTTATCCTTC TATCTACCTT CTATGATACC CACAGGCTTG PACQAGAGCAT TTG.ATCAATG GCGGTCGGGC TAGGAGCGTT AGCGTCACCC GGTGAGCGAA rGACCAATAT AGCCCQAQAA TGGCTGCGTG TCGCTGTTGT TTCTC-ATCCT TATATTGGCA CTCCA6AGCAC TTACGAGTAC ILAAGAAGAAC AP.CAACAACG 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 QACC.GTAAGA AACACAATGG TTCGGGATTA ACATCTCAQA AATCACAGCIA AGGTACTGGC CAGACCATTA CCGTGACATA CAACTTTAAC QATACTTCAC CAGGTATTAC ACTATTCCCT ATACTTGTTG TAGGGACTCC CAATCAACAA AAGACTCAAC ACATTTCQAT CGCAAAATAG ACATGGCAGG GTCAGTATAT AATACTGCGT ACTCCAGAGC AAr-AGGTGGA GTCGTGCTCA AATGGGTCGC ATGAAGAAGG CTAGACAGCC GATAGACCAG CGCAACG.CCC CGTTCAGCGA TACTTACAGC WO 97/498 14 76 AACACGGGTT GCGAAACCCA TTGTCCGGTA G&GGGGGCTA CATACTCCCT CCCACCTCCG GGGGGGTTGT CACTCGACCC ATAGTGCCQA AATTCTCCAA CAGGGGAGAT TCCACTATAG TCCGTAACAC TGAGATTTTG AACAACCAAA TCTTAGCGGC GCTAGGCGCA TTCAATACAA PCTIEP97/02832 CAAACTCCGC ACTGATTGCA GCAGCACCAT CATGGCTGGC TAGCATCGCT GTAAATACAG ATGGCTCTCA TGTGAGATCA TCTACATTCC AAAATGCCCC GTCGATCAAT TCCATGGCT TTCACATACG ACAGAAATC.A CGCTGCACCCI CTCAGCTGTC ACAATCTTAC AAGGCCATCA ATTTTCCACC GTATGCCGGA CAGCATATTT QAATTCGAAC CALGGGAGCTG GGTCAGCCAT CCCCGTTCAA. 
C CCAAGTTGGA CAAGCCATGG TACCCCACTA TCTCCTCTGC CGGCTTCGGG G TCCTCGATCA GAACCAATTC TGCCCCGCGT CCCTTGTGGT CGCTACCGATG CTACTGCTAC TCCAGCAGGG GACCTTTTCA TCAAGTACGT GATTGAGTTC TCAACCCAAC AATGAACGTC TAGTTCTTTG TACTGTAACT TGCCTAATGC C GTCACACCAT TGGAGACGGA CACGGATCCT GGGAAACAGG CTTCACGGGC GCCCCCGACG ACGCATCACT CCGGATACC.A ATGGTACACC ACTATGGCAG G GGTCTTGTGC ACCAAGAACC CCTGGAAACG GGGGGGAGGG GGGTAGCACAT ATTGAGGCC CTI'TGCCCCA CCCC INFORMATION FOR SEQ ID NO: 2: SEQUENCE CHARACTERISTICS: LENGTH: 1245 base pairs TYPE., nucleic acid STRPNDEflNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE* ODNA (vi) ORIGINAL SOURCE: ORGANISM: Satellite tobacco necrosis virus STRAIN: STNV-2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: AGTAAAGACA GGAAACTTTA CCGACTATCA GAATGACAAA ACGTCAAAGC A ACCGCAAGAG CGTTGCATCA CAGGTGCGTA GTATTGTTGA GTCAATGGCT G CATTTGCTTT TCTTACGAAC ACCAACACAG TCACTACAGC AGGTACCGTG A! GCAACAACAT CGTGCAAGGA GATGACCTTG TTAATCGCAC CGGAGACCAG X~ TACACCAGAC TTTATTGACT CGGTGTACAG GAATTACCAA CAGCCAAAGC T' TCTGGTTTCG TGACAACACC AATAGGGGGA CTACACCGGC TGTGACTGAG G' GTGCTAGTAT AACATCCCAG TATAACCCCA CTACGTTCCA, GCAAAAGAGG T' TCCAAGATTT CATGTTGGAT ACCTCTATAG TTGGACGTGT GATTGTCCAT CC TTGATAAGAA ACGGCGTGCG ATATTTTACA ACGGTGCTGC TTCTGTAGCC GC

:&TCTTTACA

LCCACCACCA

LCCCCAAGGG

!ACGACGGAG

.TTGATGTTA

.CGCTCAGCG

4GGGGACCCG

,TTGAACCAA

TAAGGTCCA

GGGGGTGGT

GTCTGCCAA

ATCATCCAG

kACAATCAA

PLGCAGAAGC

rCAACCTGA

~TAAGACCA

TTCGGTTCA

LGTTAGACA

rCACTGTTT

=GACTGCCG

.GTCAAATG

2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3684 120 190 240 300 360 420 480 540 WO 97/49814 77 GCCCCGGTGC CACATTTGTA CTTGTCATTG GATCACATGC CACTGGACAG TATQATGTGA CAGCCQAGAT TGTTTATCTG GACATGTAGA. CCATGGTCAT GATGATGATA GTGAAGGACG PCTIEP97102832 CTGAAA&GkTG

ACGGTGGTAC

GACGGCTAAA

GGCTGCATQG

ATACTTGTGT

TCCATGTCAC

TGATCGTGAT

AACATGGCAT

GTCCGGTTAG

CGTAGCTACC CTCCTGGTGC ACTTCCTG;GT GCAAAGC-AGA ACCAAAGGGT GGCGGACAGT AGTCCTGAA.C TCCATTCCCA CTAGTGTATT GGTGGAAAAC CATGTGGTCG GCAACAATGC TGTTAATCAA AAGAATCAAG ATGCATGTCT TTTGCCCTGG GCACCTCGCG TAGGCAGGGA TAAGGTATAG CAGTGGGGTT GTGCGGAATG TAGTAAATCA GG&CCGGGAG AAAACCAGCT AGTGGAACGA GGCCCCGCGT GAKTTGGGGT CAGTCATTTC TCCTATGCXT TATTGTCTCA CGTAGCACTC AACATCACTT CAAAACCCCC GTGTTTAGCC OTATATATTT TGCATCCACT CGGTTGGTAC CCGCGGAGAC TCCCCACAGC TGACTAGACA AATGCGCGTG AAGCTGGAAA CAGCCTCAAC AAGGTATAGC TGCTGCATAG GTCTCATGAC TGCCC 720 780 840 900 960 1020 1080 1140 1200 1245 GAGATGTGAA CCTTTCAAAC TTGAATTCAA INFORMATION FOR SEQ ID NO: 3: Wi SEQUENCE CHAR.ACTERISTICS: LENGTH: 781 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY:.linear (ix) FEATURE: NAME/KEY: MOS LOCATION:5. .664 OTHER INFORMATION: /product= "chloramphenicol acetyltransferase' (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: ATCGATGGAG AAAAAAATCA CTGGATATAC CACCGTTGAT ATATCCCAAT GGCATCGTAA AGAACATTTT GAGGCATTTC GGATATTACG GCCTTTTTAA TATTCACATT CTTGCCCGCC CGGTGAGCTG GTGATATGGG TGAAACGTTT TCATCGCTCT ATATTCGCAA GATGTGGCGT TGAGAATATG TTTTTCGTCT CGTGGCCAAT ATGGACAACT AGGCGACAAG GTGCTGATGC CCATGTCGGC AGAATGCTTA GTAATTTTTT TAAGGCAGTT TGATAATAAG CGGATGAATG AGTCAGTTGC TCAATGTACC TATAACCAGA. 
CCGTTCAGCT AGACCGTAAA GAAAAATAAG CACAAGTTTT ATCCGGCCTT TGATGAALTGC TCATCCGGAA TTCCGTATGG CAATGAAAGA ATAGTGTTCA CCCTTGTTAC ACCGTTTTCC ATGAGCAAAC GGAGTGAATA CCACGACGAT TTCCGGCAGT TTCTAC.ACALT GTTACGGTGA AAACCTGGCC TATTTCCCTA AALGGGTTTAT CAGCCAATCC CTGGGTGAGT TTCACCACTT TTGATTTAA TCTTCGCCCC CGTTTTCACC ATGGGCAAAT ATTATACGCA CGCTGGCGAT TCAGGTTCAT CATGCCGTCT GTGATGGCTT ATGAATTACA ACAGTACTGC GATGAGTGGC AGGGCGGGGC ATTGGTGCCC TTAAACGCCT GGTTGCTACG CCTGAATAAG GCAGAAATTC GAAAGCAAAT TCGACCCATC GCGCGTCTAG WO 97/49814 PCT/EP97/02832 INFORMATION FOR SEQ ID NO: 4: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 790 bass pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: GGATCCGTAT TTTTACAACA ATTACCACAA CAAAACAAAC AACAAACAAC ATTACAATTT ACTATTCTAG AATTACCATG GGCCCAGAAC GACGCCCGGC CGACATCCGC CGTGCCACCG AGGCGGACAT GCCGGCGGTC ACTTCCGTAC CGAGCCGCAG AGCGCTATCC CTGGCTCGTC GCCCCTGGAA GGCACGCAAC CCCGccACCA cGGACGGGA AGGCACAGGG CTTCA&GAGC GCATGCACGA GGCGCTCGGA ACGGGAACTG GCATGACGTG CTCCGGTCCT GCCCGTCACC CGACCTGCAG GC-ATGCAAGC TGCACCATCG TCAACCACTA CATCGAGACA AGCACGGTCA GAACCGCAGG AGTGGACGGA CGKCCTCGTC CGTCTGCGGG QCCGAGGTGG ACGGCGAGGT CGCCGGCATC GCCTACGCGG GCCTACQACT GGACGGCCGA GTCGACCGTG TACGTCTCCC CTGGGCTCCA CGCTCTACAC CCACCTGCTG AAGTCCCTGG GTGGTCGCTG TCATCGGGCT GCCCAACGAC CCGAGCGTGC TATGCCCCCC GCGGCATGCT GCGGGCGGCC GGCTTCAAGC GGTTTCTGGC AGCTGGACTT CAGCCTGCCG QTACCGCCCC GAGATCTGAT CTCACGCGAA TTCCGGGGAT CCTCTAGAGT TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAGAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAGAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA

GCTTGTATTC

INFOPNATION FOR SEQ ID NO: Wi BEQUENCE CHAR.ACTERISTICS: LENGTH: 1897 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ix) FEATURE: NPME/KEY: CDS LOCATION:13. 1990 OTHER INFORMATION: /product- "CRY9C (truncated)" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: GGTACCAAAA CCATGGCTGA TTACTTACAA ATGACAGATG AGGACTACAC TGATTCTTAT ATAAATCCTA GTTTATCTAT TAGTGGTAGA GATGCAGTTC AGACTGCGCT TACTGTTGTT WO 97/49814 79 GGGAGATAC TCGGGGCTTT AGGTGTTCCG TTTCTGGAC AA.ATAGTGAG TTTTTATCAA TTCCTTTTAA ATACACTGTG GCCAGTTAAT GATACAGCTA TATGGGAAGC TTTCATQCGA CAGGTGGAGG AACTTGTCAA TCAACAAATA ACAGAATTTG CAAGAAATCA GGCACTTGCA AGATTGCAAG GATTAGGAGA CTCTThPAT GT~ATATCAAC GTTC!CCTTCA AAATTGGTTG GCTGATCGAA ATGATACACG AAATTTAAGT GTTGTTCGTG CTCAATTTAT AGCTTTAGAC CTTGATTC TTAATGCTAT TCCATTGTTT CACTAAATG GACAGCAGGT TCCATTACTG' TCAGTATATG CACAAGCTGT GAATTTACAT TTGTTATTAT TAAAAGATGC ATCTCTTTTT GGAGAAGGAT GGGGATTCAC ACAGGGGGMA ATTTCCACAT ATTATGACCG TCAATTGA CTAACCGCTA AGTACACTAA TTACTGTGAA ACTTGGTATA ATACAGGTTT AGATCGTTTA AGAGGAACAA. ATACTG&AAG TTGCTTALGA TATCATCAAT TCCGTAGAGA AATGACTTTA GTGGTATTAG ATGTTGTGGC GCTATTTCCA TATTATGATG TACGACTTTA TC!CAACGGGA TCAAACCCAC AGCT!TACACG TGAGGTATAT ALCAGATCCGA TTGTATTTAA TCCACCAGCT AATGTTGGAC TTTGCCCGACG TTGGGGTACT AATCCCTATA ATACTTTTTC TGAGCTCQAA AATGCCTTCA TTCGCCCACC ACATCTTTTT GATAGGCTGA ATAGCTTAAC AATCAGCAGT AATCGATTTC CAGTTTCATC TAATTTTATG GATTATTGGT CAGGACATAC GTTACGCCGT AGTTATCTrGA ACGATTCAGC AGTACAAGAA GATAGTTATG GCCTAATTAC AACCACAAGA GCAACAATTA ATCCCGGAGT TQATGGAACA AACCGC-ATAG AGTCAACGGC AGTAGATTTT CGTTCTGCAT TGATAGGTAT ATATGGCGTG AATAGAGCTT CTTTTGTCCC AGGAGGCTTG PCTIEP97/02832 190 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1897

TTTAATGGTA

GAATTACCAC

TTTAGCTTTC

GTTTGGACCC

TTACCATTGG

TTTACAGGAG

ACGGTTAATT

AATTTCAGTA

ACAATGAACA

ACTGGTCCGT

GCAGAAGGTG

CGACTTCTCC TGCTAATGGA GGATGTAQAG ATCTCTATGA TACAAATGAT CAGATGAAAG TACCGGAAGT TCAACCCATA GACTATCTCA TGTTrACCTTT AAACTAATC.A GCCTGGATCT ATAGCTAATG CAGGAAGTGT ACCTACTTAT GTCGTGATGT GGACCTTAAT AATACQATTA CCCCAAATAG AATTACACAA TAAAGGCATC TGCACCTGTT TCGGGTACTA CGGTCTTAAA AGGTCCAGGA GGGGTATACT CCGAAG&ACA ACTAATGGCA CATTTGGAAC GTTAAGAGTA CACCATTAAC ACAACAATAT CGCCTAAGAG TTCGTTTTGC CTCAACAGGA TAAGGQTACT CCGTGGAGGG GTTTCTATCG GTGATGTTAG ATTAGGGAGC GAGGGCAGGA ACTAXCTTAC GAATCCTTTT TCACAAGAGA GTTTACTACT TCAATCCGCC TTTTACATTT ACACAAGCTC AAGAGATTCT ANCAGTGAAT TTAGCACCGG TGGTGAATAT TATATAGATA GAATTGAAAT TGTCCCTGTG AATCCGGCAC GAGAAGCGGA AGAGGACTGA GGCTAGC INFORMATION F'OR SEQ ID NO: 6: SEQUJENCE CHARACTERISTICS: LENGTH: 1788 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear WO 97/49814 PCTIEP97/02832 (ix) FEATURE: NAME/KEY: CDS (B LOCATION:9. .1781 OTHER INFORMATION: /produact= "CRY1.Ab5 (truncated)" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: CCAAAACCAT GGCTATAGAA ACTGGTTACA CCCCAATCGA TATTTCCTTG. 
TCGCTAACGC AATTTCTTTT GAGTG.AATTT GTTCCCGGTG CTGTTTGT GTTAGCACTA GTTGATATAA TATOGGGAAT UTTTGGTCCC TTAACCAAAG AATAGAAGAA GCAATCTTTA TCAAMTTTAC CAGCATTAAG AGAAGACMATG CTATTCCTCT TTTTGCAGTT CTGCAAATTT ACATTTATCA TTGATGCCGC GACTATCA&AT CAGATCATGC TGTACGCTGG rGAQATTG*GAT AAGATATAAT TTTCTCTATT TCCGAACTAT CAAGAGAAAT TTATACAAAC TCTCAATGGG ACGCATTTCT TGTACAAATT GAACAGTTAA TTCGCTAGGA ACCAILGCCAT TTCTAGATTA GAAGGACTAA GCAGAATCTT TTAGAGMAGTG CCGAAGCAGAT CCTACTAATC CGTATTCAAT TCAATGACAT GAACAGTGCC CTTACAACCG CAAAATTATC AAGTTCCTCT TTTATCAGTA TATGTTCAArC QTTTTGAGAG ATGTTTCAQT AQTCGTTATA ATGATTTAAC TACAATACGG GATTAGAGCG CAATTTAGAA GAGAATTAAC GATAGTAGAA CGTATCCAAT CCAGTATTAG AAAATTTTGA GTTTGGACAA AGGTGGGGAT TAGGCTTATT GGCAACTATA TGTATGGGGA CCGGATTCTA ACTAACTGTA TTAGATATCG TCGAACAQTT TCCCAATTAA CTCAGGGCAT AGAACCGAAGT ATTAGGAGTC CCATCTATAC GGATGCTCAT AG~AGGAGAAT CTCCTGTAG.G GTTTTCGGGG CCAGAATTCA CAGCTCCACA ACAACGTATT GTTGCTCAAC CCACTTTATA TAGAAGACCT TTTAATATAG ACGGGACAGA ATTTGCTTAT GGAACCTCCT GCGGAPLCGGT AGATTCGCTG GATGAAATAC AAGG&TTTAG TCATCGATTA AGCCATGTTT GTGTAAGTAT AATAAGAGCT CCTATGTTCT ATATAATTCC TTC.ATCACAA ATTACACAAA CTGQAACTTC TGTCGTTAAA GGACCAGGAT CACCTGGCCA GATTTCAACC TTAAQAGTAA GQGTAAQAAT TCG.CTACGCT TCTACCACAA GACCTATTAA TCAGGGGAAT TTTTCAGCAA GAAGCTTTAG GACTGTA=CT TT2TACTACTC TTACGTTAAG TGCTCATGTC TTCAATTCAG

CACATTTGAT

ATTATTGGTC

CTTTTCCGCT

TAGGTCAGGG

GGATAAATAA

CAAATTTGCC

CGCCACAGAA

CAATGTTTCG

CTTGGATACA

TACCTTTAAC

TTACAGGACG

ATATTACTGC

ATTTACAATT

CTATGAGTAG

CGTTTAACTT

GCAATGAAGT

TCGTAGTTTT

GGATATACTT

AGGGCATCAA

ATATGGAACT

CGTGTATAGA

TCAACAACTA

ATCCGCTGTA

TAACAACGTG

TTCAGGCTTT

TCGTAGTGCT

AAAATCTACT

AGATATTCTT

ACCATTATCA

CCATACATCA

TGGGAGTAAT

TTCAAATGGA

TTATATAGAT

CGAGCCTCGG

AACAGTATAA

ATAATGGCTT

ATGGGAAATG

ACATTATCGT

TCTGTTCTTG

TACAGAAAAAt

CCACCTAGGC

AGTAATAGTA

C.AkTTTAATA

AATCTTGGCT

CGAAGANCTT

CAAAGALTATC

ATTGACGGA

TTACAGTCCG

TCAAG;TGTAT

CGAATTGAAT

TTGTTCCGGC AGAAGTAACC TTT&P&GGCAG AATATGATTG AGGCTAGC 1788

INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 42 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide EM1"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TAGCTCAGGG ATCCGGTCTC GATACTTCAC CAGGTATTAC AC 42

INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 19 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide EX11l"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GCTGCTGCAA TCAGTGCGG 19

INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 22 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide RM9"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GTACTGTAAC TTGGCTAATG CC 22

INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 36 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide EM9"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
ATGTAAACTG CAGGTCTCCG GGGTGGGGCA AAGGCC 36

INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM12"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TCCCATATCA CCAGCTCACC 20

INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 25 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM16"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CTTCGCCCCC GTTTTCACCA TGGGC 25

INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 41 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM17"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CTCAATCACA CCAATAACTG CCTTAGCTAG CTTACGCCCC G 41

INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 40 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM18"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
GCGATGAGTC GCAGGGCGGG GCGTAAGCTA GCTAAGGCAG 40

INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 25 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM19"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GCCTGTTTCC CAGGATCCGT CTCCG 25

INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 38 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM20"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GATTGAGTTC ATTGAACCAA TCGCTAGCAC AATGAACG 38

INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 40 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM21"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
GTACAAAGAA CTAGACGTTC ATTGTGCTAG CGATTGGTTC 40

INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 45 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide EM43"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CGGCCAGCAT ATGTTATTAA CCCTCACTAA AGATACTTCA CCAGG 45

INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 22 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM24"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
AAGAAGTTGT CCATATTGGC CA 22

INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 22 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM1"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
ACGGTCACAG CTTGTCTGTA AG 22

INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM13"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
CTTTACCGAC TATCAGAATG ACACGCGTAA TAC 33

INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM14"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
TAAAGACAGG AAACTTTACT GACTACCATG 30

INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM15"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CATGGTAGTC AGTAAAGTTT CCTGTCTTTA 30

INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 139 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear
(ix) FEATURE: (A) NAME/KEY: stem_loop (B) LOCATION: 67..106 (D) OTHER INFORMATION: /standard_name= "hairpin from T3 RNA polymerase terminator"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CTGCAGCGGA CCGACTAGTC CACCCTGAAA GCTCGTTGTG ATTGGGATAA CAATCTACTA 60
ATATGCAAAC CCCTTGGGTT CCCTCTTTGG GAGTCTGAGG GGTTTTTTGC TTTAACCCTC 120
TAGAGCTCGG CCGAAGCTT 139

INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 43 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide F143"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
GTATTACCAT GGTCATCACG TGTCATTCTG ATAGTCGGTA AAG 43

INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 45 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM41"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GTACCGGTTC GAAGCTTGAT ATCGGCCGCA TGCTGCAGCT AGCCC 45

INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 49 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide EMS"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
CATGGGGCTA GCTGCAGCAT GCGGCCGATA TCAAGCTTCG AACCGGTAC 49

INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 34 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide EM71"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CTATGTACCA TGGGTGTCAT TCTGATAGTC GGTA 34

INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 73 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide VME"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
GTACCTTAGG TTCGAAGCTA GCGGTCCGTT AACCATGGTT TTGGCGATCG AAATGTGTTG 60
AGTCTTGTAC TCG 73

INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 39 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM22"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
CGGCCAGCAT ATGCGCGCCT GTAATACGAC TCACTATAG 39

INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM25"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
AGTTCCTCCA CCTGTCGC 18

INFORMATION FOR SEQ ID NO: 32:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 40 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide FM26"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
CGGCCACCAT ATGCGCGCCT GTTATTAACC CTCACTAAAG 40

INFORMATION FOR SEQ ID NO: 33:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 28 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "oligonucleotide PIC2"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
GCCAAGTTAC ACGTACAAAG AACTAGAC 28

INFORMATION FOR SEQ ID NO: 34:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1893 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 9..1886 (D) OTHER INFORMATION: /product= "CRY9C (truncated)"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
CCAAAACCAT GGCTGACTAC CTGCAGATGA CCGACGAGGA CTACACCGAC AGCTACATCA 60
ACCCCAGCCT GAGCATCAGC GGTCGCGACG CCGTGCAGAC CGCTCTGACC GTGGTGGGTC 120
GCATCCTGGG TGCCCTGGGC GTGCCCTTCA GCGGTCAGAT CGTGAGCTTC TACCAGTTCC 180
TGCTGAACAC CCTGTGGCCA GTGAACGACA CCGCCATCTG GGAAGCTTTC ATGCGCCAGG 240
TGGAGGAGCT GGTGAACCAG CAGATCACCG AGTTCGCTCG CAACCAGGCC CTGGCTCGCC 300
TGCAGGGCCT GGGCGACAGC TTCAACGTGT ACCAGCGCAG CCTGCAGAAC TGGCTGGCCG 360
ACCGCAACGA CACCCGCAAC CTGAGCGTGG TGAGGGCCCA GTTCATCGCC CTGGACCTGG 420
ACTTCGTGAA CGCCATCCCC CTGTTCGCCG TGAACGCCCA GCAGGTGCCC CTGCTGAGCG 480
TGTACGCCCA GGCCGTGAAC CTGCACCTGC TGCTGCTGAA GGATGCATCC CTGTTCGGCG 540
AGGGCTGGGG CTTCACCCAG
GGCGAGATCA rCACCTACTA CQACCGCCAG CTCGAGCTGA 600 CCGCCAAGTA CACCA&CTAC TGCGAGACCT GGTACAACAC CGGTCTGGAC CGCCTGAGGG 660 GCACCAACAC CGAGAGCTGG CTGCGCTACC ACCAGTTCCG CAGGGAGATG ACCCTGGTGG 720 TGCTGGACGT GGTGGCCCTG TTCCCCTACT ACGACGTGCG CCTGTACCCC ACCGGCAGCA 780 WO 97/49814 89 ACCCCCAGCT GACACGTGAG GTGTACACCG ACCCCATCGT GTTCAACCCA CCAGCCAACG TGGGCCTGTG CCGCAGGTGG GGCACCAACC CCTACAACAC CTTCAGCGAG CTGGAGAACG PCT/EP97/02832 CCTTCATCAG GCCACCCCAC CTGTTCGACC GATTCCCCGT GAGCAGCAAC TTCATGGACT ACCTGAACGA CAGCGCCGTG CAGGAGGACA CCATCAACCC AGGCG1TGG&C GGCACCAACC GCGCTCTGAT CGGCATCTAC GGCGTGAACA ACGGCACCAC CAGCCCAGCC TGCCACCCGA CGAGAGCACC CCTTCCAGAC CAACCAGGCT GAccAGGAG

CCCTGGTGAA

CCGGTGGCGG

TGAATTCCCC

TC.AGCATCCG

TGAAcAGGG

GTCCCTTCA.A

AGGGCGTGAG

CAGCTCGCGA

GGACGTGGA6C

GGCCAGCGCT

TATACTGCGC

ACTGACCCAG

CGTGCTGAGG

CCAGGAGCTG

CCCACCCTTC

CACCGGTGGC

AACGGTGGCT

GGCAGCAGCA

GGCAGCATCG

CTGAACAACA

CCCGTGAGC!G

AGGACCACCA

CAGTACCGCC

GGTGGCGTGA

ACCTACGAGA

ACCTTCACCC

GAGTACTACA

GCCTGAACAG CCTGACCATC AGCAGCAATC ACTGGAGCGG TCACACCCTG CGCAGGAGCT GCTACGGCCT GATCACCACC ACCAGGGCCA GCATCGAGAG CACCGCTGTG GACTTCCGCA GGGCCAGCTT CGTGCCAGGT GGCCTGTTCA.

GCCGAGALTCT GTACGACACC AACGACGAGC CCCACCGCCT GAGCCACGTC ACCTTCTTCA CCAACGCTGG CAGCGTGCCC ACCTACGTGT CCATCACCCC CAACCGCATC ACCC.AGCTGC GCACCACCGT GCTGAAGGGT CCAGGCTTCA ACGGCACCTT CGGCACCCTG CGCGTGACCG TGCGCGTGCG CTTCGCCAGC ACCGGCAACT GCATCGGCGA CGTGCGCCTG GGCAGCACCA GCTTCTTCAC CCGCGAGTTC ACCACCACCG AGGCCCAGGA GATCCTGACC GTGAACGCCG TCGACCGCAT CGAGATCGTG CCCGTGAACC 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1900 1860 1893 GGCCGAGGAG GACTGAGGCT AGC INFORMA~TION FOR SEQ ID NO: Wi SEQUENCE CHARACTERISTICS: LENGTH: 1034 base pairs TYPE: nucleic acid STP.ANDEDNESS: double TOPOLOGY:, linear (ix) FEATURE: NAM'/KEY: 3'UT.

(B) LOCATION: complement (27..249) (D) OTHER INFORMATION: /function= "3' end formation signal of CaMV"
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: complement (262..363) (D) OTHER INFORMATION: /product= "CRYIA(b)5 (N-terminus)"
(ix) FEATURE: (A) NAME/KEY: (B) LOCATION: complement (370..429) (D) OTHER INFORMATION: /standard_name= "leader from cab22L gene from Petunia"
(ix) FEATURE: (A) NAME/KEY: promoter (B) LOCATION: complement (434..960)

WO 97/49814 PCT/EP97/02832 OTHER INFORM.LTION: /standard-nane= "CaMV35S promoter"o (xi) SEQUENCE DESCRIPTION: SEQ ID NO: CCTGCAGGCA ATTGGTACCA TGCATGATCT GGATTTTAGT ACTGGATTTT GGTTTTAGGA ATTAGAAATT TTATTGP.TAG AAGTATTTTA TATGCTCAAC ACATGAGCGA AACCCTATAG CACACATTAT TATGGAIAAA ATAGAGAGAG AGCGTGTCCA. AGCTTGCTAG CTAGTCCTAA CAAAAGAAAT TGCGTTAGCG

CATGGTTTTG

AAATGAGCTC

GAAGGATAGT

TGCTTTGAAG

CCATCTTTGG

TGATGGCATT

GCTGGGCAAT

GCCCTTTGGT

TCCACCATGT

TGTTCGCCAG

TATGGTGCAT

GTTTAATAAG

GAGTCCTCTC

GGGATTGTGC

ACGTGGTTGG

GACCACTGTC

TGTAGGTGCC

GGAATCCGAG

CTTCTrAGAC

TGACGAAGAT

TCTTCACGGC

GGCGCGCCAT

ACAAGG.AAAT

AAGAGAAAAG

CAAATGAAAT

GTCATCCCTT

AACGTCTTCT

GGCAGAGGC.A

ACCTTCCTTT

GAGGTTTCCC

TGTATCTTTG

TTTCTTCTTG

GAGTTCTGTT

ATGCCCGGGC

CAAATACAAA TACATACTAAL GGGTTTCTTA GAACCCTAAT TCCCTTATCT GGGAACTACT ATAGALTTTGT AGAGAGAGAC TGGTGATTTC CACAAATCCA GCACCGGGAA. CAAATTCACT ATCGATTGGG GTGTAALCCGG TCTCGATAGC AGTTCTTTTG TTATGGCTGA AGTAATAGAG CAACTTCCTT ATATAGAGGA AGGTCTTGC ACGTCAGTG;G AGATATCALCA TCAALTCCACT TTTTCCACGA TGCTCCTCGT GGGTGGGGGT TCTTGAACGA TAGCCTTTCC TTTATCGCAA TCTACTGTCC TTTTGATGAA GTGACAGATA GATATTACCC TTTGTTGAAA AGTCTCAATA ATATTCTTGG AGTAGACGAG AGTGTCGTGC TCATTGAGTC GTAAAAGACT CTGTATGAAC AGATCCTCGA TCTGAATTTT TGACTCCATG CCTGTACAGC GGCCGCGTTA ACGCGTATAC 120 190 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1034 TCTAGAGCGA TCGC INFORMATION FOR SEQ ID NO: 36: SEQUENCE CHARACTERISTICS: LENGTH: 35 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid DESCRIPTION: /desc "sequence preceding the RNA polymerase coding region in pEM410" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: CCAAAACCAT GGCTCCCAAG AAGAAGCGCAL AGGTT INFORMATION FOR SEQ ID NO: 37:- Wi SEQUENCE CHARACTERISTICS: LENGTH: 105 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY:, linear WO 97/49814 PCT/EP97/OZS32 91 (ix) FEATURE:

(A) NAME/KEY: - (B) LOCATION: 1..25 (D) OTHER INFORMATION: /label= RB /note= "right border sequence from the T-DNA of pTFM600"
(ix) FEATURE: (A) NAME/KEY: (B) LOCATION: 26..80 (D) OTHER INFORMATION: /label= MCS /note= "multiple cloning site"
(ix) FEATURE: (A) NAME/KEY: (B) LOCATION: 81..105 (D) OTHER INFORMATION: /label= LB /note= "left border sequence from the T-DNA of pTFM600"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
AATTACAACG GTATATATCC TGCCAGTACT CGGCCGTCGA CCTGCAGGAA TTCTAGATAC 60
GTAGCGATCG CCATGGAGCC ATTTACAATT GAATATATCC TGCCG 105

INFORMATION FOR SEQ ID NO: 38:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1003 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ix) FEATURE: (A) NAME/KEY: (B) LOCATION: 18..49 (D) OTHER INFORMATION: /standard_name= "STNV-2 leader"
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 50..985 (D) OTHER INFORMATION: /product= "fusion between CP(N-terminus) and NPTII"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
GAGCTCTAGA GGTCTCGAGT AAAGACAGGA AACTTTACCG ACTATCAGAA TGACAAAACG 60
TCAAAGCAAA CAATCAAACC GCAAGAGCGT TGCATCACAG GTGCGTAGTA TTGTTGAGTC 120
AATGGCTGAG CAGAAGCGAT TTGCTTTTCT TACGAACACC AACACAGTCA CTACAGCAGG 180
TACCGTGATC CGGCCAACCT TGGATGGATT GCACCCAGGT TCTCCGGCCG CTTGGGTGGA 240
GAGGCTATTC GGCTATGACT GGGCACAACA GACAATCGGC TGCTCTGATG CCGCCGTGTT 300
CCGGCTGTCA GCGCAGGGGC GCCCGGTTCT TTTTGTCAAG ACCGACCTGT CCGGTGCCCT 360
GAATGAACTG CAGGACGAGG CAGCGCGGCT ATCGTGGCTG GCCACGACGG GCGTTCCTTG 420
CGCAGCTGTG CTCGACGTTG TCACTGAAGC GGGAAGGGAC TGGCTGCTAT TGGGCGAAGT 480
GCCGGGGCAG GATCTCCTGT CATCTCACCT TGCTCCTGCC GAGAAAGTAT CCATCATGGC 540
TGATGCAATG CGGCGGCTGC ATACGCTTGA TCCGGCTACC TGCCCATTCG ACCACCAAGC 600
GAAACATCGC ATCGAGCGAG CACGTACTCG GATGGAAGCC GGTCTTGTCG ATCAGGATGA 660
TCTGGACGAA GAGCATCAGG GGCTCGCGCC AGCCGAACTG TTCGCCAGGC TCAAGGCGCG 720
CATGCCCGAC GGCGAGGATC TCGTCGTGAC CCATGGCGAT GCCTGCTTGC CGAATATCAT 780
GGTGGAAAAT GGCCGCTTTT CTGGATTCAT CGACTGTGGC CGGCTGGGTG TGGCGGACCG 840
CTATCAGGAC ATAGCGTTGG CTACCCGTGA TATTGCTGAA GAGCTTGGCG GCGAATGGGC 900
TGACCGCTTC CTCGTGCTTT ACGGTATCGC CGCTCCCGAT TCGCAGCGCA TCGCCTTCTA 960
TCGCCTTCTT GACGAGTTCT TCTGAGCGGG ACTCTGGGGT TCG 1003

INFORMATION FOR SEQ ID NO: 39:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 818 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 1..798 (D) OTHER INFORMATION: /gene= "nptII"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
ATGAATTCCA GCTTGGATGG ATTGCACCCA GGTTCTCCGG CCGCTTGGGT GGAGAGGCTA 60
TTCGGCTATG ACTGGGCACA ACAGACAATC GGCTGCTCTG ATGCCGCCGT GTTCCGGCTG 120
TCAGCGCAGG GGCGCCCGGT TCTTTTTGTC AAGACCGACC TGTCCGGTGC CCTGAATGAA 180
CTGCAGGACG AGGCAGCGCG GCTATCGTGG CTGGCCACGA CGGGCGTTCC TTGCGCAGCT 240
GTGCTCGACG TTGTCACTGA AGCGGGAAGG GACTGGCTGC TATTGGGCGA AGTGCCGGGG 300
CAGGATCTCC TGTCATCTCA CCTTGCTCCT GCCGAGAAAG TATCCATCAT GGCTGATGCA 360
ATGCGGCGGC TGCATACGCT TGATCCGGCT ACCTGCCCAT TCGACCACCA AGCGAAACAT 420
CGCATCGAGC GAGCACGTAC TCGGATGGAA GCCGGTCTTG TCGATCAGGA TGATCTGGAC 480
GAAGAGCATC AGGGGCTCGC GCCAGCCGAA CTGTTCGCCA GGCTCAAGGC GCGCATGCCC 540
GACGGCGAGG ATCTCGTCGT GACCCATGGC GATGCCTGCT TGCCGAATAT CATGGTGGAA 600
AATGGCCGCT TTTCTGGATT CATCGACTGT GGCCGGCTGG GTGTGGCGGA CCGCTATCAG 660
GACATAGCGT TGGCTACCCG TGATATTGCT GAAGAGCTTG GCGGCGAATG GGCTGACCGC 720
TTCCTCGTGC TTTACGGTAT CGCCGCTCCC GATTCGCAGC GCATCGCCTT CTATCGCCTT 780
CTTGACGAGT TCTTCTGAGC GGGACTCTGG GGTTCGAA 818

INFORMATION FOR SEQ ID NO: 40:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 98 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE: (A) ORGANISM: Tobacco necrosis virus (B) STRAIN: TNV-AC36
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
GACCTTACCA AACTTTCAAA GAAGATAATT CTAAGATACA GTACATTACA ATCGGCGGAG 60
CACTACTACA AAAGTGTCAA CAAATTAATA ATGCCTAA 98

INFORMATION FOR SEQ ID NO: 41:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 306 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE: (A) ORGANISM: Tobacco necrosis virus (B) STRAIN: TNV-AC36
(ix) FEATURE: (A) NAME/KEY: (B) LOCATION: 19..49 (D) OTHER INFORMATION: /note= "pseudoknot 11"
(ix) FEATURE: (A) NAME/KEY: (B) LOCATION: 63..92 (D) OTHER INFORMATION: /note= "hairpin 1"

(ix) FEATURE: (A) NAME/KEY: (B) LOCATION: 102..227 (D) OTHER INFORMATION: /note= "hairpin 2"
(ix) FEATURE: (A) NAME/KEY: (B) LOCATION: 230..272 (D) OTHER INFORMATION: /note= "hairpin 3"
(ix) FEATURE: (A) NAME/KEY: (B) LOCATION: 288..303 (D) OTHER INFORMATION: /note= "hairpin 4"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
TAGTCGCTTT CATArGATCCG TCTTCCCAGA GACGTTAAGA AGAAACTGGA G~AAAAATATT 60
AGTTTAGGAA CTTGGGCTTG ACAAACCCAk GTGGCATCTC TTACGTQGTT AATC&CACTG 120
CATGTTGACG AATAGGATGG ATCCTGGGAA ACAGGTTTAA CGGGCTCTCT GTGGTGGAGG 180
CCQACGCAT CACCTATTTG TGCTCCAGCA GTGGTTGTCA TCACGTGTCC TGACATGGCT 240
CCATGCGACA GCATGGGGGG GTCCAGAGTC AGTCCCCTCT TTATTTACCT AGGTTTTCCT 300
AGGAACCC 308
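Some oligonucleotides in the listing appear to be designed as annealing pairs. As an illustrative check only, the sketch below tests whether SEQ ID NO: 22 and SEQ ID NO: 23 are exact reverse complements of one another. The helper names (`reverse_complement`, `clean`) are hypothetical and not part of the specification, and the two sequences are transcribed from the listing with one stray OCR period in SEQ ID NO: 22 left out, which is an assumption about the original text.

```python
# Sketch: check whether two oligonucleotides from the sequence listing
# are exact reverse complements of each other (so they could anneal).

# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def clean(listing_row: str) -> str:
    """Remove the spaces that separate the 10-base groups in the listing."""
    return "".join(listing_row.split())


# Transcribed from the listing; the OCR period in SEQ ID NO: 22 is omitted.
SEQ_22 = clean("TAAAGACAGG AAACTTTACT GACTACCATG")
SEQ_23 = clean("CATGGTAGTC AGTAAAGTTT CCTGTCTTTA")

if __name__ == "__main__":
    print("SEQ 22 length:", len(SEQ_22))
    print("reverse complements:", reverse_complement(SEQ_22) == SEQ_23)
```

The same check can be applied to any other pair in the listing; a mismatch may simply point to an OCR error in one of the rows rather than to a real difference in the oligonucleotides.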

Claims

1. A chimeric gene which comprises: a) a first promoter selected from the group consisting of a promoter recognized by DNA dependent RNA polymerase I, a promoter recognized by DNA dependent RNA polymerase III, or a promoter recognized by a bacteriophage single subunit RNA polymerase; b) a DNA region encoding a chimeric RNA which comprises a 5' UTR, an AU-rich heterologous coding sequence and a 3' UTR; and optionally, c) a terminator region recognized by said RNA polymerase; wherein said chimeric RNA, produced by said RNA polymerase, is uncapped and comprises: i) a first translation enhancing sequence derived from the region upstream of the coat protein encoding cistron of genomic or subgenomic RNA of TNV or STNV, or a derivative thereof which has at least 90% sequence identity, located in the 5' region of said chimeric RNA; ii) a second translation enhancing sequence derived from the region downstream of the coat protein encoding cistron of genomic or subgenomic RNA of TNV or STNV, or a derivative thereof which has at least 90% sequence identity, located in the 3' region of said chimeric RNA; and which is capable of being translated in the cytoplasm of a plant cell, to produce a protein or polypeptide of interest encoded by said AU-rich heterologous coding sequence.

2. The chimeric gene of claim 1, wherein said first translation enhancing sequence is located in the 5' UTR of said uncapped RNA species, and wherein said second translation enhancing sequence is located in the 3' UTR of said uncapped RNA species.

3. The chimeric gene of claim 2, wherein said first translation enhancing sequence is located in a region surrounding the initiation codon of the heterologous coding sequence.

4. The chimeric gene of claim 2, wherein said second translation enhancing sequence is located in a region surrounding the stop codon of the heterologous coding sequence.

5. The chimeric gene of claim 4, wherein said first and second translation enhancing sequences are derived from the genomic RNA of STNV-2.

6. The chimeric gene of claim 5, wherein the first translation enhancing sequence is encoded by a DNA comprising the sequence of SEQ ID NO. 2 between the nucleotides at positions 1 and 38, and wherein the second translation enhancing sequence is encoded by a DNA comprising the sequence of SEQ ID NO. 2 between the nucleotides at positions 632 and 753.

7. The chimeric gene of claim 6, wherein the first and second translation enhancing sequences are derived from the subgenomic RNA 2 of TNV-A.

8. The chimeric gene of claim 7, wherein the first translation enhancing sequence is encoded by a DNA sequence selected from the group of DNA sequences consisting of: the DNA sequence of SEQ ID NO. 1 between the nucleotides at position 2461 and 2619, the DNA sequence of SEQ ID NO. 1 between the nucleotides at position 2461 and 2612, the DNA sequence of SEQ ID NO. 1 between the nucleotides at position 2461 and 2603, and the DNA sequence of SEQ ID NO. 1 between the nucleotides at position 2461 and 2598.

9. The chimeric gene of claim 8, wherein the second translation enhancing sequence is encoded by a DNA sequence selected from the group of DNA sequences consisting of: the DNA sequence of SEQ ID NO. 1 between the nucleotides at position 3399 and 3684, the DNA sequence of SEQ ID NO. 1 between the nucleotides at position 3429 and 3611, and the DNA sequence of SEQ ID NO. 1 between the nucleotides at position 3472 and 3611.

10. The chimeric gene of any one of claims 1 to 9, wherein said first promoter is an RNA polymerase I specific promoter.

11. The chimeric gene of any one of claims 1 to 9, wherein said first promoter is an RNA polymerase III specific promoter.

12. The chimeric gene of any one of claims 1 to 9, wherein said first promoter is recognized by a bacteriophage single subunit RNA polymerase.

13. The chimeric gene of any one of claims 1 to 9, wherein said first promoter is a T3 or T7 promoter.

14. The chimeric gene of claim 1, wherein said transcribed region comprises two or more cistrons.

15. The chimeric gene of any one of claims 1 to 14, wherein said heterologous coding sequence comprises a continuous nucleotide sequence of at least 400 nucleotides with an AU-content of at least 57.5%.

16. The chimeric gene of claim 15, wherein the continuous stretch of at least 400 nucleotides is encoded by a Bt ICP gene.

17. The chimeric gene of claim 16, wherein the heterologous coding sequence comprises a sequence encoding at least a fragment of a Bt ICP with insecticidal activity.

18. The chimeric gene of claim 16, wherein the Bt ICP gene is selected from the group consisting of: cry1Ab5, cry9C, crylla, cry3C, cry3A, cry1Da and cry1Ea.

19. A plant cell comprising the chimeric gene of any one of claims 1 to 18, integrated in its nuclear DNA.

20. The plant cell of claim 19, which produces said RNA polymerase.

21. The plant cell of claim 20, wherein said first promoter is a T3 promoter and wherein said plant cell further comprises a chimeric polymerase gene which comprises: a) a second plant-expressible promoter; b) a DNA sequence encoding a T3 RNA polymerase, operably linked to a nuclear localization signal; wherein said second promoter and said sequence are operably linked so that upon expression of the chimeric polymerase gene a functional and properly located RNA polymerase is produced.

22. The plant cell of claim 21, wherein said first promoter is a T7 promoter and wherein said plant cell further comprises a chimeric polymerase gene which comprises: a) a second plant-expressible promoter; b) a DNA sequence encoding a T7 RNA polymerase, operably linked to a nuclear localization signal; wherein said second promoter and said sequence are operably linked so that upon expression of the chimeric polymerase gene a functional and properly located RNA polymerase is produced.

23. The plant cell of claim 21 or claim 22, wherein said second promoter is a promoter.

24. The plant cell of any one of claims 19 to 23, wherein the plant cell is derived from a plant selected from the group consisting of potato, tomato, cotton, a Brassica species such as B. napus, tobacco, soybean, corn, wheat, rice and barley.

25. The plant cell of claim 24, which is derived from a corn plant.

26. A plant comprising the plant cell of any one of claims 19 to

27. The plant of claim 26, which is a corn plant.

28. A process for producing a plant expressing a protein or polypeptide encoded by a heterologous gene, which comprises the steps of: a) transforming the nuclear genome of a plant cell with the chimeric gene of any one of claims 1 to 18; and b) regenerating a transformed plant from said transformed cell.

29. A process for producing a plant expressing an insecticidal amount of Bt ICP which comprises the steps of: a) transforming the nuclear genome of a plant cell with the chimeric gene of any one of claims 16 to 18; and b) regenerating a transformed plant from said transformed cell.

30. A process for producing a plant expressing a protein or polypeptide encoded by a heterologous gene, which comprises the step of regenerating the transformed plant cell of claim 21 or 22.

31. A process for producing a protein in cells of a plant, which comprises the step of expressing the chimeric genes of any one of claims 1 to 18.

32. A process for producing a protein in cells of a plant, which comprises the step of sowing or planting seeds or plants transformed with chimeric genes of any one of claims 1 to 18.

33. The use of a chimeric gene of any one of claims 1 to 18, to obtain high expression of a protein or polypeptide.

34. The use of a chimeric gene of any one of claims 16 to 18, to obtain an insecticidal amount of Bt ICP in a plant cell.

DATED this 4th day of August 2000. PLANT GENETIC SYSTEMS, N.V. By their Patent Attorneys CULLEN & CO.
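Claim 15 states AU-richness numerically: a continuous stretch of at least 400 nucleotides with an AU-content of at least 57.5%. The sketch below shows one way such a criterion might be screened for, and it is not a method described in the specification. The function names and the per-mille threshold parameter are hypothetical. The code only tests windows of exactly the given length, which is a sufficient but not a necessary condition for the claim wording, and it counts T as U so that DNA coding sequences can be checked directly.

```python
# Sketch: screen a coding sequence for an AU-rich stretch as worded in claim 15.
# Assumption: T in a DNA sequence is counted as U of the encoded RNA.

AU_BASES = "ATU"


def au_fraction(seq: str) -> float:
    """Fraction of A/U (or A/T) bases in a sequence; 0.0 for an empty one."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in AU_BASES for base in seq) / len(seq)


def has_au_rich_window(seq: str, window: int = 400,
                       min_au_permille: int = 575) -> bool:
    """True if some run of exactly `window` bases has AU-content >= threshold.

    The threshold is given in per mille (575 means 57.5%) so that the
    comparison uses integer arithmetic and avoids floating-point rounding.
    """
    seq = seq.upper()
    if window <= 0 or len(seq) < window:
        return False
    is_au = [base in AU_BASES for base in seq]
    count = sum(is_au[:window])          # AU count in the first window
    needed = min_au_permille * window    # compared against count * 1000
    if count * 1000 >= needed:
        return True
    for end in range(window, len(seq)):
        # Slide the window one base to the right.
        count += is_au[end] - is_au[end - window]
        if count * 1000 >= needed:
            return True
    return False


if __name__ == "__main__":
    example = "G" * 100 + "AU" * 200 + "G" * 100  # synthetic test sequence
    print(has_au_rich_window(example))
```

A sliding count keeps the scan linear in the sequence length, which is convenient when many candidate genes are screened.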