WO2006034368A2 - MicroRNAs (miRNAs) for plant growth and development - Google Patents

MicroRNAs (miRNAs) for plant growth and development

Info

Publication number
WO2006034368A2
Authority
WO
WIPO (PCT)
Prior art keywords
gene
promoter
mirna
vector
plant
Prior art date
Application number
PCT/US2005/033879
Other languages
French (fr)
Other versions
WO2006034368A3 (en)
Inventor
Vincent Lee Chuan Chiang
Shanfa Lu
Ying-Hsuan Sun
Laigeng Li
Original Assignee
North Carolina State University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North Carolina State University
Publication of WO2006034368A2
Publication of WO2006034368A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8218 Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • C12N15/8243 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine
    • C12N15/8245 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine involving modified carbohydrate or sugar alcohol metabolism, e.g. starch biosynthesis
    • C12N15/8246 Non-starch polysaccharides, e.g. cellulose, fructans, levans
    • C12N15/8261 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
    • C12N15/8291 Hormone-influenced development
    • C12N15/8294 Auxins
    • C12N15/8295 Cytokinins
    • C12N15/8297 Gibberellins; GA3

Definitions

  • MIRNAS MICRORNAS
  • the presently disclosed subject matter relates, in general, to methods and compositions for modulating gene expression in a plant. More particularly, the presently disclosed subject matter relates to a method of using a microRNA (miRNA) to modulate the expression level of a gene in a plant, and to compositions comprising miRNAs.
  • miRNA microRNA
  • Trees are a major natural resource of the biosphere and have shown outstanding ecological and economic importance.
  • a key physiological process of tree development is the formation of wood, which is composed of a variety of cell types.
  • Wood is made up of plant cell wall lignins, which occur exclusively in higher plants and represent the second most abundant organic compound on the earth's surface after cellulose, accounting for about 25% of plant biomass.
  • Cell wall lignification involves the deposition of phenolic polymers (lignins) on the extracellular polysaccharide matrix. The polymers arise from the oxidative coupling of three cinnamyl alcohols. The main functions of lignins are to strengthen the plant vascular body, provide mechanical support for stems and leaf blades, and to provide resistance to diseases, insects, cold temperatures, and other biotic and abiotic stresses.
  • Although lignins play many important roles in vascular plants, their resistance to degradation greatly complicates various agricultural and industrial uses of plants. For example, animals lack the enzymes necessary for degrading the polysaccharides in plant cell walls, and thus must depend on microbial fermentation to break down plant fibers. High lignin concentration and methoxyl content reduce the digestibility of forage crops (for example, alfalfa), with cattle (for example) able to digest only 40-50% of legume fibers and 60-70% of grass fibers. Thus, lignins have been implicated in limiting forage digestibility, possibly by interfering with microbial degradation of fiber polysaccharides. Small decreases in lignin content of plants, however, can have a significant positive impact on forage digestibility.
  • Lignin content is also problematic in the wood products industries, which are an important component of both the United States' and global economies. Up to thirty-six percent of the dry weight of wood is lignin. During pulp and papermaking, lignin must be separated from cellulose. This process consumes large amounts of energy and imposes a high environmental cost due to the requirement for using chemicals such as chlorine bleach. The availability of wood with reduced lignin content or with a modified lignin that is more amenable to extraction would increase the efficiency of pulp and papermaking processes and would decrease chemical consumption and disposal. Thus, both the digestibility of forage crops and the pulping properties of trees can be adversely affected by high lignin content.
  • Genetic engineering has great promise for agriculture because it can accelerate traditional breeding programs, cross reproductive barriers, and introduce specific desired traits. Genetic engineering can be particularly advantageous to forestry because traditional methods are hampered by the long generation times of trees. Yet, the manipulation of a plant's genome can have undesirable effects.
  • the presently disclosed subject matter provides methods for stably modulating expression of a plant gene.
  • the method comprises (a) providing a vector encoding a microRNA (miRNA) targeted to the plant gene; and (b) transforming a plant cell with the vector, whereby stable expression of the miRNA in the plant cell is provided.
  • the method comprises (a) transforming a plurality of plant cells with a vector comprising a nucleic acid sequence encoding a microRNA (miRNA) operatively linked to a promoter and a transcription termination sequence; (b) growing the plant cells under conditions sufficient to select for a plurality of transformed plant cells that have integrated the vector into their genomes; (c) screening the plurality of transformed plant cells for expression of the miRNA encoded by the vector; (d) selecting a transformed plant cell that expresses the miRNA; and (e) regenerating the plant from the transformed plant cell that expresses the miRNA, whereby expression of the plant gene is stably modulated.
  • the modulating expression of a plant gene is inhibiting expression of the plant gene.
  • a method of stably inhibiting the expression of a gene in a plant cell comprises stably transforming the plant cell with a vector encoding a microRNA (miRNA) molecule, wherein the miRNA molecule comprises a nucleotide sequence at least 70% identical to a contiguous 17-24 nucleotide subsequence of the gene.
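The identity criterion above (at least 70% identity to a contiguous 17-24 nucleotide subsequence of the gene) can be illustrated with a minimal Python sketch. It assumes a simple ungapped, position-by-position comparison against every same-length window of the gene; the source does not specify how identity is computed, and the function names and example sequences are hypothetical.

```python
def best_window_identity(mirna, gene):
    """Return (best fractional identity, 0-based start) of the miRNA against
    every same-length contiguous window of the gene.

    Assumption: ungapped comparison; both inputs are normalized to DNA form.
    """
    mirna = mirna.upper().replace("U", "T")
    gene = gene.upper().replace("U", "T")
    n = len(mirna)
    if not 17 <= n <= 24:
        raise ValueError("expected a 17-24 nt candidate sequence")
    if len(gene) < n:
        raise ValueError("gene is shorter than the candidate sequence")

    best, best_start = -1.0, -1
    for start in range(len(gene) - n + 1):
        window = gene[start:start + n]
        # Count positions where the candidate and the window agree.
        matches = sum(a == b for a, b in zip(mirna, window))
        identity = matches / n
        if identity > best:
            best, best_start = identity, start
    return best, best_start


def meets_threshold(mirna, gene, threshold=0.70):
    """True if some window of the gene reaches the identity threshold."""
    identity, _ = best_window_identity(mirna, gene)
    return identity >= threshold


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    gene = "GGGGG" + "ACGTTGCAAGCTTGACCTAG" + "CCCCC"
    candidate = "ACGUUGCAAGCUUGACCUAG"  # RNA form of the embedded 20-mer
    print(best_window_identity(candidate, gene))
    print(meets_threshold(candidate, gene))
```

A real screen would more likely rely on an alignment tool that handles gaps and strand orientation; the sliding window here only makes the arithmetic of the threshold concrete.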
  • the vector is an Agrobacterium binary vector.
  • the vector comprises (a) a promoter operatively linked to a nucleic acid molecule encoding the miRNA molecule; and (b) a transcription termination sequence.
  • the promoter is a DNA-dependent RNA polymerase III promoter.
  • the promoter is selected from the group consisting of an RNA polymerase III H1 promoter, an Arabidopsis thaliana 7SL RNA promoter, an RNA polymerase III 5S promoter, an RNA polymerase III U6 promoter, an adenovirus VA1 promoter, a Vault promoter, a telomerase RNA promoter, a tRNA gene promoter, and functional derivatives thereof.
  • RNA gene promoter comprises the sequence presented in SEQ ID NO: 164.
  • promoters are chosen that direct tissue-, cell-type-, or stage-specific expression of the miRNAs.
  • the stable expression of the microRNA (miRNA) in the plant occurs in a location or tissue selected from the group consisting of epidermis, root, vascular tissue, xylem, meristem, cambium, cortex, pith, leaf, flower, seed, and combinations thereof.
  • an miRNA is used to modulate the expression of a target gene.
  • the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a sense region, an antisense region, and a loop region, positioned in relation to each other such that upon transcription, a resulting RNA transcript is capable of forming a hairpin structure via intramolecular hybridization of the sense strand and the antisense strand.
  • the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 1-
  • the plant is a dicot. In some embodiments, the plant is a monocot. In some embodiments, the plant is a tree. In some embodiments, the tree is an angiosperm. In some embodiments, the tree is a gymnosperm. In some embodiments, the tree is a member of the genus Populus. In some embodiments, the tree is a Populus trichocarpa tree. In some embodiments, the tree is a member of the genus Pinus. In some embodiments, the tree is a Pinus taeda tree.
  • the methods and compositions of the presently disclosed subject matter can be used to modulate the expression of any gene in a plant.
  • the plant gene has a nucleotide sequence comprising one of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837, or a nucleotide sequence at least 80% identical to any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837.
  • the gene is selected from the group consisting of coniferaldehyde-5-hydroxylase (Cald5H), a lignin-related gene, a cellulose-related gene, a hemicellulose-related gene, a hormone-related gene, a stress-related gene, a disease-related gene, a growth-related gene, and a transcription factor gene.
  • Cald5H coniferaldehyde-5-hydroxylase
  • the lignin-related gene is selected from the group consisting of sinapyl alcohol dehydrogenase (SAD), cinnamyl alcohol dehydrogenase (CAD), 4-coumarate:coenzyme A (CoA) ligase (4CL), cinnamoyl CoA O-methyltransferase (CCoAOMT), caffeate O-methyltransferase (COMT), ferulate-5-hydroxylase (F5H), cinnamate-4-hydroxylase (C4H), p-coumarate-3-hydroxylase (C3H), and phenylalanine ammonia lyase (PAL).
  • SAD sinapyl alcohol dehydrogenase
  • CAD cinnamyl alcohol dehydrogenase
  • CoA coenzyme A
  • CCoAOMT cinnamoyl CoA O-methyltransferase
  • COMT caffeate O-methyltransferase
  • F5H ferulate-5-hydroxylase
  • the cellulose-related gene is selected from the group consisting of cellulose synthase, cellulose synthase-like, glucosidase, glucan synthase, and sucrose synthase.
  • the hormone-related gene is selected from the group consisting of isopentyl transferase (ipt), gibberellic acid (GA) oxidase, auxin (AUX), and a rooting locus (ROL) gene.
  • ipt isopentyl transferase
  • GA gibberellic acid
  • AUX auxin
  • ROL rooting locus
  • the vector for stably expressing a microRNA (miRNA) molecule in a plant comprises (a) a promoter operatively linked to a nucleic acid molecule encoding the miRNA molecule; and (b) a transcription termination sequence.
  • the vector is an Agrobacterium binary vector.
  • the Agrobacterium binary vector comprises a nucleic acid encoding a selectable marker operatively linked to a promoter.
  • kits comprising the disclosed vectors and at least one reagent for introducing the disclosed vectors into a plant cell.
  • the kit further comprises instructions for introducing the vector into a plant cell.
  • the presently disclosed subject matter also provides plant cells, transgenic plants, transgenic seed, and transgenic progeny comprising the disclosed vectors.
  • the plant cell is from a plant selected from the group consisting of poplar, pine, eucalyptus, sweetgum, other tree species, tobacco, Arabidopsis, rice, corn, wheat, cotton, potato, and cucumber.
  • the presently disclosed subject matter also provides a method for stably inhibiting the expression of a gene in a plant cell.
  • the method comprises stably transforming the plant cell with a vector encoding a microRNA (miRNA) molecule comprising a nucleotide sequence at least 70% identical to a contiguous 17-24 nucleotide subsequence of the gene.
  • the presently disclosed subject matter also provides a method for enhancing the expression of a gene in a plant cell.
  • the method comprises introducing into the plant cell a vector encoding a short interfering RNA (siRNA) molecule comprising a sequence that hybridizes under physiological conditions to a loop region or a stem region of a pre-microRNA that comprises a microRNA (miRNA) that modulates expression of the gene, thereby resulting in downregulation of expression of the miRNA and enhanced expression of the gene.
  • siRNA short interfering RNA
  • the microRNA comprises a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712 and nucleotide sequences at least 70% identical to any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
  • an expression vector comprises a nucleic acid sequence encoding a microRNA
  • the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712 and nucleotide sequences at least 70% identical to SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
  • the miRNA is at least 70% identical to about 17-24 contiguous nucleotides of a ribonucleic acid (RNA) transcribed from a gene selected from the group consisting of a lignin-related gene, a cellulose-related gene, a hemicellulose-related gene, a hormone-related gene, a stress-related gene, a disease-related gene, a growth-related gene, and a transcription factor gene.
  • the vector comprises a promoter for expressing the miRNA, a transcription termination sequence, and a cloning site between the promoter and the transcription termination sequence into which a nucleic acid molecule encoding the miRNA can be cloned.
  • the vector is a plasmid vector.
  • the vector further comprises a selectable marker.
  • the cloning site comprises a recognition sequence for at least one restriction enzyme that is not present elsewhere in the plasmid vector.
  • the nucleic acid sequence encoding the microRNA comprises (a) a sense region; (b) an antisense region; and (c) a loop region, wherein the sense, antisense, and loop regions are positioned in relation to each other such that upon transcription, the resulting RNA molecule is capable of forming a hairpin structure via intramolecular hybridization of the sense strand and the antisense strand.
  • Figure 1 depicts a general structure for an siRNA molecule of the presently disclosed subject matter, wherein N is any nucleotide, provided that in the loop structure identified as N5-9, all 5-9 nucleotides remain in a single-stranded conformation.
  • N1-8 can be any sequence of 1-8 nucleotides or modified nucleotides, provided that the nucleotides remain in a single-stranded conformation in the siRNA molecule.
  • Figures 2A and 2B depict potential hairpin configurations for exemplary miRNA precursors.
  • Figure 2A depicts an miRNA precursor derived from the PtMIR 115a gene (SEQ ID NO: 95) comprising the nucleotide sequence of miRNA PtmiR 115 (SEQ ID NO: 24).
  • Figure 2B depicts an miRNA precursor derived from the PtMIR 61a gene (SEQ ID NO: 71) comprising the nucleotide sequence of miRNA PtmiR 61 (SEQ ID NO: 10).
  • Figures 3A-3C depict potential hairpin configurations for a transcript of an exemplary miRNA precursor gene, PtMIR 156-1a (SEQ ID NO: 132).
  • Figure 3A depicts a hairpin configuration where the PtmiR 156-1 sequence (SEQ ID NO: 47 in RNA form) is present in the 5' arm of the hairpin.
  • Figures 3B and 3C depict two hairpin configurations where the PtmiR 156-1 sequence (SEQ ID NO: 47 in RNA form) is present in the 3' arm of the hairpin.
  • Figure 3B depicts a shorter stem-loop structure.
  • Figure 3C depicts a longer stem-loop structure.
  • Figure 3C also shows the position of a 19-nucleotide side stem-loop, the nucleotides of which are not depicted for clarity.
  • the sequence of PtmiR 156-1 (SEQ ID NO: 47 in RNA form) is underlined.
  • Figure 4 depicts Northern analysis of the expression of exemplary miRNAs in leaf (L), phloem (Ph), and developing xylem (X), tension wood (XTW), and opposite wood (XOW) stem xylems.
  • 5S rRNA is included as an RNA quantity loading control.
  • Figure 5A depicts GUS staining of cross-sections of the stems, of the leaves, and of the roots of one-month-old siRNA-transgenic (GT1 and GT2) and GUS-expressing control (C) tobacco plants.
  • Figure 5B is a graph of GUS protein activity (Jefferson et al., 1987) in the leaves of control plants and of ten GT2 transgenic plants. Mean values were calculated from three independent measurements per line.
  • Figure 5C depicts a loading control for gel blot analysis of RNA transcript level using a 25S ribosomal RNA probe.
  • Figure 5D depicts the same gel blot as shown in Figure 5C, but is used to characterize the level of GUS mRNA using a GUS cDNA probe.
  • Figure 5E depicts gel blot detection of siRNAs of about 21 nucleotides (nt) (position indicated) using a GUS cDNA probe as described in Hutvagner et al., 2000. RNA was isolated from a portion of the leaves used for the GUS protein activity assay depicted in Figure 5B.
  • Figure 6 depicts a schematic representation of plasmid pUCSL.1.
  • the plasmid contains a promoter fragment (289 basepairs; P7SL-RNA) containing
  • MCS multiple cloning site
  • the promoter:MCS:3'-NTS cassette can be excised from pUCSL.1 using Eco RI and Hind III sites that are present at the 5' and 3' ends of the cassette, respectively.
  • Figure 7 depicts a schematic representation of plasmid pSIT.
  • the plasmid contains the promoter:MCS:3'-NTS cassette from pUCSL.1 in the opposite transcriptional orientation and downstream of a selectable marker cassette, the latter consisting of a promoter, selectable marker gene, and terminator sequence.
  • pSIT represents a binary vector transformation system mediated by Agrobacterium.
  • Figure 8 depicts a representation of the multiple cloning site (MCS) of pSIT. Between the Sma I and Xba I sites of the MCS is cloned a sequence comprising 17-26 nt from the sense strand of the gene of interest, followed by a 9 nt spacer, and then the reverse complement of the 17-26 nt sequence.
  • A 19 nt GUS gene-specific sequence (GT1 represented nucleotide positions 80-98 and GT2 89-107), separated by a 9 nt spacer from the reverse complement of the same sequence and followed by a termination signal of five thymidines, was cloned into pSUPER (available from OligoEngine, Inc., Seattle, Washington, United States of America) downstream of the H1 promoter (H1-P).
  • the H1-P::GT expression construct was then excised and cloned into the binary vector pGPTV-HPT (Becker et al., 1992) to replace the pAnos-uidA fragment.
  • the resulting vector, pGPH1-HPT, which contained a hygromycin phosphotransferase selectable marker gene (hpt), was then mobilized into Agrobacterium tumefaciens C58 for transforming tobacco.
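The insert layout described above (a gene-specific sense sequence, a 9 nt spacer, the reverse complement of the sense sequence, and a run of five thymidines) can be sketched in Python as follows. This is an illustrative sketch only: the function names, the spacer sequence, and the example target are hypothetical placeholders and are not taken from the Sequence Listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def subsequence(seq, start, end):
    """Extract 1-based, inclusive positions start..end (e.g. 80-98 gives 19 nt)."""
    if start < 1 or end < start or end > len(seq):
        raise ValueError("positions out of range")
    return seq[start - 1:end]


def build_hairpin_insert(sense, spacer, terminator="TTTTT"):
    """Assemble sense + spacer + reverse complement + terminator (DNA form).

    Assumptions: 17-26 nt sense region and a 9 nt spacer, as in the text;
    the terminator defaults to five thymidines.
    """
    sense = sense.upper()
    spacer = spacer.upper()
    if not 17 <= len(sense) <= 26:
        raise ValueError("sense region should be 17-26 nt")
    if len(spacer) != 9:
        raise ValueError("spacer should be 9 nt")
    return sense + spacer + reverse_complement(sense) + terminator


if __name__ == "__main__":
    # Placeholder target: a made-up 120 nt sequence, not the GUS gene.
    target = ("ACGTTGCAAGCTTGACCTAG" * 6)
    sense = subsequence(target, 80, 98)        # 19 nt, 1-based inclusive
    insert = build_hairpin_insert(sense, "TTCAAGAGA")  # spacer is arbitrary
    print(len(sense), len(insert))
    print(insert)
```

Because the antisense arm is generated as the exact reverse complement of the sense arm, the transcript can fold back on itself with the spacer forming the loop; mismatches or overhangs, if wanted, would have to be introduced separately.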
  • Figure 9 shows the sequences of GT1 and GT2 that form the hairpin as follows.
  • For GT1, the hairpin is produced by the intramolecular hybridization of SEQ ID NO: 174 and SEQ ID NO: 175, with a 9 nt spacer between.
  • For GT2, the hairpin is produced by the intramolecular hybridization of SEQ ID NO: 176 and SEQ ID NO: 177, with a 9 nt spacer between.
  • Figure 9 depicts these hairpins with the "top" strand in the 5' to 3' direction, and thus the "bottom" strand is depicted in the 3' to 5' direction.
  • The Sequence Listing discloses, inter alia, the sequences of various miRNAs, genes encoding miRNA precursors, and sequences derived from the genomes of Populus sp. and Pinus sp. that are targets for the disclosed miRNAs. While the sequences are presented in the form of DNA (i.e., with thymidine present instead of uracil), it is understood that the sequences are also intended to correspond to the RNA transcripts of these DNA sequences.
  • SEQ ID NOs: 1-59 and 1247-1295 are the nucleic acid sequences of various miRNAs from Populus trichocarpa.
  • SEQ ID NOs: 60-156 and 1296-1375 are the nucleic acid sequences of various miRNA precursor genes. The relationships between the sequences disclosed as SEQ ID NOs: 1-59 and 1247-1295 and those disclosed as 60-156 and 1296-1375 are presented in Table 1 below.
  • SEQ ID NO: 155 is the nucleic acid sequence of a 5'-phosphorylated-3'-adaptor oligonucleotide used to clone a population of small RNAs predicted to include miRNAs.
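The DNA-form convention noted above (thymidine written in place of uracil) amounts to a one-character substitution when reading a listed sequence as its RNA transcript. A minimal Python sketch, with a hypothetical function name and a made-up example sequence:

```python
def dna_form_to_rna(seq):
    """Convert a sequence written in DNA form (T) to RNA form (U).

    Assumption: the input uses only A, C, G, and T; anything else is rejected.
    """
    seq = seq.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.replace("T", "U")


if __name__ == "__main__":
    # Arbitrary 20-mer used only to show the substitution.
    print(dna_form_to_rna("ACGTTGCAAGCTTGACCTAG"))
```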
  • SEQ ID NO: 156 is the nucleic acid sequence of a second adaptor molecule used during the isolation and cloning of small RNAs.
  • SEQ ID NOs: 160 and 161 are primer sequences used to PCR-amplify a region of the Arabidopsis At7SL4 promoter.
  • SEQ ID NO: 162 is the nucleic acid sequence of the product of a PCR reaction using the primers identified in SEQ ID NOs: 160 and 161.
  • SEQ ID NOs: 163 and 164 are primers used to amplify the 3'-NTS of the At7SL4 gene.
  • SEQ ID NO: 165 is the nucleic acid sequence of the product of a PCR reaction using the primers identified in SEQ ID NOs: 163 and 164.
  • SEQ ID NOs: 166-171 are the sequences of complementary oligonucleotides that were used to generate siRNAs targeted to the GUS gene. Three different regions of the GUS gene were targeted. For the production of pGSGT1, SEQ ID NOs: 166 and 167 were hybridized to each other. For the production of pGSGT2, SEQ ID NOs: 168 and 169 were hybridized to each other. For the production of pGSGT3, SEQ ID NOs: 170 and 171 were hybridized to each other. SEQ ID NOs: 172-175 are presented in Figure 9, and correspond to the sense and antisense sequences for representative siRNA-like molecules targeting the GUS gene.
  • SEQ ID NO: 172 is a nucleic acid sequence that corresponds to bases 80-98 of GENBANK ® Accession No. AY100472, and is a sense strand sequence.
  • SEQ ID NO: 173 is a nucleic acid sequence that hybridizes to SEQ ID NO: 174 and includes a one-nucleotide 3' overhang.
  • SEQ ID NO: 174 is a nucleic acid sequence that corresponds to bases 89-107 of GENBANK ® Accession No. AY100472, and is a sense strand sequence.
  • SEQ ID NO: 175 is a nucleic acid sequence that hybridizes to SEQ ID NO: 174 and includes a two-nucleotide 3' overhang (UU).
  • SEQ ID NOs: 176-781 and 1376-1553 are the nucleotide sequences of various genes and/or RNA transcripts (disclosed in "DNA form", i.e., with T instead of U) identified in Populus spp. as targets for one or more of the miRNAs disclosed in SEQ ID NOs: 1-59 and 1247-1295.
  • SEQ ID NOs: 782-1246 are the amino acid sequences encoded by the nucleotide sequences disclosed in SEQ ID NOs: 176-781. Given that some of the nucleotide sequences disclosed in SEQ ID NOs: 176-781 encode the same amino acid sequence, there are fewer SEQ ID NOs. assigned to amino acid sequences than to nucleotide sequences.
  • the relationships between the sequences disclosed as SEQ ID NOs: 176-1246 and 1376-1661 are presented in Table 3 below.
  • SEQ ID NOs: 1662-1712 are the nucleic acid sequences of various miRNAs from Pinus taeda.
  • SEQ ID NOs: 1713-1748 are the nucleic acid sequences of various miRNA precursor genes. The relationships between the sequences disclosed as SEQ ID NOs: 1662-1712 and 1713-1748 are presented in Table 4 below.
  • SEQ ID NOs: 1749-1837 are the nucleotide sequences of various genes and/or RNA transcripts (disclosed in "DNA form", i.e., with T instead of U) identified in Pinus sp. as targets for one or more of the miRNAs disclosed in SEQ ID NOs: 1662-1712.
  • SEQ ID NOs: 1838-1907 are the amino acid sequences encoded by the nucleotide sequences disclosed in SEQ ID NOs: 1749-1837. Given that some of the nucleotide sequences disclosed in SEQ ID NOs: 1749-1837 encode the same amino acid sequence, there are fewer SEQ ID NOs. assigned to amino acid sequences than to nucleotide sequences.
  • The relationships between the sequences disclosed as SEQ ID NOs: 1749-1837 and 1838-1907 are presented in Table 5 below.
  • microRNAs small, non-coding regulatory RNAs
  • RNA polymerase II or RNA polymerase III to the primary miRNA stem-loop transcript, called pri-miRNA (Lee, N. S., et al., 2002).
  • the pri-miRNA is cleaved by the Drosha RNase III endonuclease at both stem strands near the stem-loop base, releasing an miRNA precursor (pre-miRNA) as an about 60-70 nt stem-loop RNA molecule (Lee, Y., et al., 2002; Zeng & Cullen, 2003).
  • the pre-miRNA is then transported into the cytoplasm where it is cleaved at both stem strands by Dicer, also an RNase III endonuclease, liberating the loop portion of the pre-miRNA and the stem portion of the duplex that comprises the mature miRNA of about 22 nt and the similar size miRNA* fragment derived from the opposing arm of the pre-miRNA (Lau et al., 2001; Lagos-Quintana et al., 2002; Aravin et al., 2003; Lim et al., 2003b).
  • the nuclear cleavage of the pri-miRNA is mediated by a Dicer-like protein, DCL1, having a similar functionality as mammalian Drosha (Reinhart et al., 2002; Lim et al., 2003b; Lee, Y., et al., 2002; Lee, Y., et al., 2003).
  • DCL1 Dicer-like protein
  • the resulting plant pre-miRNA stem-loop transcripts are, however, generally more variable in size, ranging from about 60 to about 300 nt (Bartel & Bartel, 2003; Bartel, 2004; Lim et al., 2003b).
  • DCL1 performs a second cut in the nucleus on the pre-miRNA to liberate the miRNA:miRNA* duplex (Reinhart et al., 2002; Lim et al., 2003b; Lee, Y., et al., 2002; Lee, Y., et al., 2003).
  • the miRNA pathway in plants and mammals appears to be quite similar, both involving helicase-like protein-mediated unwinding of the duplex to release the single-stranded mature miRNA (Bartel & Bartel, 2003; Bartel, 2004; Rhoades et al., 2002).
  • the mature miRNA then recruits a ribonucleoprotein complex known as the RNA-induced silencing complex (RISC), while the miRNA* appears to be degraded.
  • RISC RNA-induced silencing complex
  • the miRNA guides the RISC to identify target messages based on perfect or near perfect complementarity between the miRNA and the target mRNA.
  • an endonuclease within the RISC cleaves the mRNA at a site near the middle of the miRNA complementarity, resulting in gene silencing (Hutvagner et al., 2000; Elbashir et al., 2001a; Elbashir et al., 2001b; Llave et al., 2002;
  • the miRNA in RISC will direct cleavage of the target mRNA if the complementarity between the target mRNA and the miRNA is sufficiently high. If such complementarity is not sufficiently high, however, the miRNA will direct the repression of protein translation rather than target mRNA cleavage (Bartel & Bartel, 2003; Bartel, 2004).
  • RNA-guided gene silencing pathway is highly similar to the key steps of siRNA-mediated gene silencing known as posttranscriptional gene silencing (PTGS) in plants and RNA interference (RNAi) in animals
  • siRNAs, which can be exogenous sequences (for example, transgenes), mediate the silencing of the same genes from which they are derived. miRNAs, on the other hand, are typically endogenous and encoded by their own genes, and target different genes, setting up the gene regulation circuitry. miRNAs have been cloned from various animals, including Drosophila melanogaster (Lagos-Quintana et al., 2001; Aravin et al., 2003), C.
  • Rhoades et al. led Rhoades et al. to successfully identify annotated Arabidopsis mRNAs having perfect or near perfect complementarity to the cloned Arabidopsis miRNAs (Rhoades et al., 2002). Seventy-four Arabidopsis target genes were identified, representing 61 unique mRNAs (Reinhart et al., 2002; Rhoades et al., 2002; Bartel & Bartel, 2003).
  • miRNA:mRNA pairings were conserved between Arabidopsis and rice (Reinhart et al., 2002; Rhoades et al., 2002; Bartel & Bartel, 2003; Wang et al., 2004).
  • the most striking discovery was that, in the 61 predicted targets, 40 are known or putative transcription factors. Most of these transcription factors are known to regulate or are associated with development, suggesting that miRNAs might help coordinate a wide range of cell division and differentiation associated activities throughout the plant (Bartel & Bartel, 2003; Bartel, 2004).
  • miRNAs microRNAs
  • the approach to gene function characterization through the use of microRNAs (miRNAs) offers the potential for agriculture and tree crop improvement.
  • the ability to modulate the expression of genes involved in important biochemical pathways allows for the manipulation of the plant genome to produce plants with advantageous characteristics (for example, lower lignin content).
  • miRNAs provide a general approach to modulating gene expression in plants that can potentially be applied to any plant gene.
  • the presently disclosed subject matter provides methods and compositions for modulating gene expression (for example, genes involved in lignin and/or cellulose synthesis) in plants (for example, trees, including but not limited to Populus trichocarpa and Pinus taeda).
  • the term "about", when referring to a value or to an amount of mass, weight, time, volume, concentration, or percentage is meant to encompass variations of in some embodiments ±20% or ±10%, in some embodiments ±5%, in some embodiments ±1%, in some embodiments ±0.5%, and in some embodiments ±0.1% from the specified amount, as such variations are appropriate to practice the presently disclosed subject matter.
  • all numbers expressing quantities of ingredients, reaction conditions, and so forth used in the specification and claims are to be understood as being modified in all instances by the term “about”. Accordingly, unless indicated to the contrary, the numerical parameters set forth in this specification and attached claims are approximations that can vary depending upon the desired properties sought to be obtained by the presently disclosed subject matter.
  • amino acid and “amino acid residue” are used interchangeably and refer to any of the twenty naturally occurring amino acids, as well as analogs, derivatives, and congeners thereof; amino acid analogs having variant side chains; and all stereoisomers of any of the foregoing.
  • amino acid is intended to embrace all molecules, whether natural or synthetic, which include both an amino functionality and an acid functionality and are capable of being included in a polymer of naturally occurring amino acids.
  • amino acid is formed upon chemical digestion (hydrolysis) of a polypeptide at its peptide linkages.
  • the amino acid residues described herein are in some embodiments in the "L" isomeric form. However, residues in the "D" isomeric form can be substituted for any L-amino acid residue, as long as the desired functional property is retained by the polypeptide.
  • NH2 refers to the free amino group present at the amino terminus of a polypeptide.
  • COOH refers to the free carboxy group present at the carboxy terminus of a polypeptide.
  • amino acid residue sequences represented herein by formulae have a left-to-right orientation in the conventional direction of amino terminus to carboxy terminus.
  • amino acid residues are broadly defined to include modified and unusual amino acids.
  • a dash at the beginning or end of an amino acid residue sequence indicates a peptide bond to a further sequence of one or more amino acid residues or a covalent bond to an amino-terminal group such as NH2 or acetyl or to a carboxy-terminal group such as COOH.
  • the term "cell” is used in its usual biological sense.
  • the cell is present in an organism, for example, a plant including, but not limited to poplar, pine, eucalyptus, sweetgum, and other tree species; tobacco; Arabidopsis; rice; corn; wheat; cotton; potato; and cucumber.
  • the cell can be eukaryotic (e.g., a plant cell, such as a tobacco cell or a cell from a tree) or prokaryotic (e.g. a bacterium).
  • the cell can be of somatic or germ line origin, totipotent, pluripotent, or differentiated to any degree, dividing or non-dividing.
  • the cell can also be derived from or can comprise a gamete or embryo, a stem cell, or a fully differentiated cell.
  • the terms "host cells” and “recombinant host cells” are used interchangeably and refer to cells (for example, plant cells) into which the compositions of the presently disclosed subject matter (for example, an expression vector) can be introduced.
  • the terms refer not only to the particular plant cell into which an expression construct is initially introduced, but also to the progeny or potential progeny of such a cell. Because certain modifications can occur in succeeding generations due to either mutation or environmental influences, such progeny might not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
  • the term “gene” refers to a nucleic acid that encodes an RNA, for example, nucleic acid sequences including, but not limited to, structural genes encoding a polypeptide.
  • the term “gene” also refers broadly to any segment of DNA associated with a biological function.
  • the term “gene” encompasses sequences including but not limited to a coding sequence, a promoter region, a transcriptional regulatory sequence, a non-expressed DNA segment that is a specific recognition sequence for regulatory proteins, a non-expressed DNA segment that contributes to gene expression, a DNA segment designed to have desired parameters, or combinations thereof.
  • a gene can be obtained by a variety of methods, including cloning from a biological sample, synthesis based on known or predicted sequence information, and recombinant derivation from one or more existing sequences.
  • a gene typically comprises a coding strand and a non-coding strand.
  • coding strand and “sense strand” are used interchangeably, and refer to a nucleic acid sequence that has the same sequence of nucleotides as an mRNA from which the gene product is translated.
  • the coding/sense strand includes thymidine residues instead of the uridine residues found in the corresponding mRNA.
  • the coding/sense strand can also include additional elements not found in the mRNA including, but not limited to promoters, enhancers, and introns.
  • the terms “template strand” and “antisense strand” are used interchangeably and refer to a nucleic acid sequence that is complementary to the coding/sense strand. It should be noted, however, that for those genes that do not encode polypeptide products (for example, an miRNA gene), the term “coding strand” is used to refer to the strand comprising the miRNA. In this usage, the strand comprising the miRNA is a sense strand with respect to the miRNA precursor, but it would be antisense with respect to its target RNA (i.e. the miRNA hybridizes to the target RNA because it comprises a sequence that is antisense to the target RNA).
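The strand conventions above (sequences listed in "DNA form" with T instead of U, and the antisense strand as the complement of the coding/sense strand) can be illustrated with a minimal Python sketch. The helper names and the example sequence are hypothetical and chosen only for illustration; they do not come from the sequence listing.

```python
# Hypothetical helpers illustrating the sense/antisense and DNA-form/RNA-form
# conventions. Only the four unambiguous bases A, C, G, T are handled.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def dna_form_to_rna(seq: str) -> str:
    """Convert a sequence written in 'DNA form' (T) to its RNA form (U)."""
    return seq.upper().replace("T", "U")


def reverse_complement_dna(seq: str) -> str:
    """Return the complementary (antisense/template) strand, written 5'->3'."""
    return seq.upper().translate(DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    coding_strand = "ATGGCTTCAGGT"  # made-up coding/sense strand, DNA form
    print("sense (RNA form):", dna_form_to_rna(coding_strand))
    print("antisense (DNA) :", reverse_complement_dna(coding_strand))
```

Applying `reverse_complement_dna` twice returns the original strand, which is a quick sanity check that the two strands are mutual complements.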
  • the terms “complementarity” and “complementary” refer to a nucleic acid that can form one or more hydrogen bonds with another nucleic acid sequence by either traditional Watson-Crick or other non-traditional types of interactions.
  • the binding free energy for a nucleic acid molecule with its complementary sequence is sufficient to allow the relevant function of the nucleic acid to proceed, in some embodiments, ribonuclease activity.
  • the degree of complementarity between the sense and antisense strands of an miRNA precursor can be the same or different from the degree of complementarity between the miRNA-containing strand of an miRNA precursor and the target nucleic acid sequence. Determination of binding free energies for nucleic acid molecules is well known in the art. See e.g., Freier et al., 1986; Turner et al., 1987.
  • percent complementarity refers to the percentage of contiguous residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, 10 out of 10 being 50%, 60%, 70%, 80%, 90%, and 100% complementary).
  • the terms “100% complementary”, “fully complementary”, and “perfectly complementary” indicate that all of the contiguous residues of a nucleic acid sequence can hydrogen bond with the same number of contiguous residues in a second nucleic acid sequence.
  • Because miRNAs are about 17-24 nt, and up to 5 mismatches (e.g., 1, 2, 3, 4, or 5 mismatches) are tolerated during miRNA-directed modulation of gene expression, a percent complementarity of at least about 70% between a target RNA and an miRNA should be sufficient for the miRNA to modulate the expression of the gene from which the target RNA was derived.
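The arithmetic behind the "at least about 70%" figure can be checked with a short Python sketch. It assumes, for simplicity, that only Watson-Crick pairs count toward complementarity (the definition above also allows non-traditional interactions) and that the miRNA and target site have equal length with no bulges. The function names are hypothetical.

```python
# Minimal sketch: percent complementarity between an miRNA and a target site.
# Assumption: only A:U and G:C pairs are counted; G:U wobble pairs are not.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(mirna: str, target_site: str) -> float:
    """Both sequences are written 5'->3'; the duplex is antiparallel."""
    mirna = mirna.upper().replace("T", "U")
    target = target_site.upper().replace("T", "U")
    if len(mirna) != len(target):
        raise ValueError("sequences must have the same length")
    # Pair miRNA position i with the target read from its 3' end.
    paired = sum((m, t) in WATSON_CRICK for m, t in zip(mirna, reversed(target)))
    return 100.0 * paired / len(mirna)


def worst_case_percent(length: int, mismatches: int) -> float:
    """Percent complementarity when `mismatches` positions fail to pair."""
    return 100.0 * (length - mismatches) / length


if __name__ == "__main__":
    for length in (17, 24):
        print(length, "nt, 5 mismatches:", round(worst_case_percent(length, 5), 1), "%")
```

With 5 mismatches, a 17 nt miRNA retains 12/17 (about 70.6%) complementarity and a 24 nt miRNA retains 19/24 (about 79.2%), consistent with the threshold of about 70% stated above.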
  • gene expression generally refers to the cellular processes by which a biologically active polypeptide is produced from a DNA sequence and exhibits a biological activity in a cell.
  • gene expression involves the processes of transcription and translation, but also involves post- transcriptional and post-translational processes that can influence a biological activity of a gene or gene product. These processes include, but are not limited to RNA synthesis, processing, and transport, as well as polypeptide synthesis, transport, and post-translational modification of polypeptides. Additionally, processes that affect protein-protein interactions within the cell can also affect gene expression as defined herein.
  • gene expression refers to the processes by which a precursor miRNA is produced from the gene.
  • transcription although unlike the transcription directed by RNA polymerase II for protein-coding genes, the transcription products of an miRNA gene are not translated to produce a protein. Nonetheless, the production of a mature miRNA from an miRNA gene is encompassed by the term "gene expression" as that term is used herein.
  • isolated refers to a molecule substantially free of other nucleic acids, proteins, lipids, carbohydrates, and/or other materials with which it is normally associated, such association being either in cellular material or in a synthesis medium.
  • isolated nucleic acid refers to a ribonucleic acid molecule or a deoxyribonucleic acid molecule (for example, a genomic DNA, cDNA, mRNA, miRNA, etc.) of natural or synthetic origin or some combination thereof, which (1) is not associated with the cell in which the "isolated nucleic acid" is found in nature, or (2) is operatively linked to a polynucleotide to which it is not linked in nature.
  • isolated polypeptide refers to a polypeptide, in some embodiments prepared from recombinant DNA or RNA, or of synthetic origin, or some combination thereof, which (1) is not associated with proteins that it is normally found with in nature, (2) is isolated from the cell in which it normally occurs, (3) is isolated free of other proteins from the same cellular source, (4) is expressed by a cell from a different species, or (5) does not occur in nature.
  • isolated when used in the context of an “isolated cell”, refers to a cell that has been removed from its natural environment, for example, as a part of an organ, tissue, or organism.
  • label and “labeled” refer to the attachment of a moiety, capable of detection by spectroscopic, radiologic, or other methods, to a probe molecule.
  • label or “labeled” refer to incorporation or attachment, optionally covalently or non-covalently, of a detectable marker into a molecule, such as a polypeptide.
  • Various methods of labeling polypeptides are known in the art and can be used.
  • labels for polypeptides include, but are not limited to, the following: radioisotopes, fluorescent labels, heavy atoms, enzymatic labels or reporter genes, chemiluminescent groups, biotinyl groups, predetermined polypeptide epitopes recognized by a secondary reporter (e.g., leucine zipper pair sequences, binding sites for antibodies, metal binding domains, epitope tags).
  • labels are attached by spacer arms of various lengths to reduce potential steric hindrance.
  • the term “modulate” refers to an increase, decrease, or other alteration of any, or all, chemical and biological activities or properties of a biochemical entity, e.g., a wild-type or mutant nucleic acid molecule.
  • the term “modulate” can refer to a change in the expression level of a gene or a level of an RNA molecule or equivalent RNA molecules encoding one or more proteins or protein subunits; or to an activity of one or more proteins or protein subunits that is upregulated or down regulated, such that expression, level, or activity is greater than or less than that observed in the absence of the modulator.
  • the term “modulate” can mean “inhibit” or "suppress", but the use of the word
  • As used herein, the terms "inhibit", "suppress", "down regulate", and grammatical variants thereof are used interchangeably and refer to an activity whereby gene expression or a level of an RNA encoding one or more gene products is reduced below that observed in the absence of a nucleic acid molecule of the presently disclosed subject matter. In some embodiments, inhibition with an miRNA molecule results in a decrease in the steady state expression level of a target RNA. In some embodiments, inhibition with an miRNA molecule results in an expression level of a target gene that is below that level observed in the presence of an inactive or attenuated molecule that is unable to downregulate the expression level of the target.
  • inhibition of gene expression with an miRNA molecule of the presently disclosed subject matter is greater in the presence of the miRNA molecule than in its absence.
  • inhibition of gene expression is associated with an enhanced rate of degradation of the mRNA encoded by the gene (for example, by miRNA-mediated inhibition of gene expression).
  • modulation refers to both upregulation (i.e., activation or stimulation) and downregulation (i.e., inhibition or suppression) of a response.
  • modulation when used in reference to a functional property or biological activity or process (e.g., enzyme activity or receptor binding), refers to the capacity to upregulate (e.g., activate or stimulate), downregulate (e.g., inhibit or suppress), or otherwise change a quality of such property, activity, or process.
  • upregulate e.g., activate or stimulate
  • downregulate e.g., inhibit or suppress
  • regulation can be contingent on the occurrence of a specific event, such as activation of a signal transduction pathway, and/or can be manifest only in particular cell types.
  • modulator refers to a polypeptide, nucleic acid, macromolecule, complex, molecule, small molecule, compound, species, or the like (naturally occurring or non-naturally occurring), or an extract made from biological materials such as bacteria, plants, fungi, or animal cells or tissues, that can be capable of causing modulation.
  • Modulators can be evaluated for potential activity as inhibitors or activators (directly or indirectly) of a functional property, biological activity or process, or a combination thereof (e.g., agonist, partial antagonist, partial agonist, inverse agonist, antagonist, anti-microbial agents, inhibitors of microbial infection or proliferation, and the like), by inclusion in assays. In such assays, many modulators can be screened at one time. The activity of a modulator can be known, unknown, or partially known.
  • Modulators can be either selective or non-selective.
  • selective when used in the context of a modulator (e.g. an inhibitor) refers to a measurable or otherwise biologically relevant difference in the way the modulator interacts with one molecule (e.g. a target RNA of interest) versus another similar but not identical molecule (e.g. an RNA derived from a member of the same gene family as the target RNA of interest).
  • For a modulator to be considered a selective modulator, the nature of its interaction with a target need not entirely exclude its interaction with other molecules related to the target (e.g. transcripts from family members other than the target itself).
  • the term selective modulator is not intended to be limited to those molecules that only bind to mRNA transcripts from a gene of interest and not to those of related family members.
  • the term is also intended to include modulators that can interact with transcripts from genes of interest and from related family members, but for which it is possible to design conditions under which the differential interactions with the targets versus the family members has a biologically relevant outcome.
  • Such conditions can include, but are not limited to differences in the degree of sequence identity between the modulator and the family members, and the use of the modulator in a specific tissue or cell type that expresses some but not all family members.
  • a modulator might be considered selective to a given target in a given tissue if it interacts with that target to cause a biologically relevant effect despite the fact that in another tissue that expresses additional family members the modulator and the target would not interact to cause a biological effect at all because the modulator would be "soaked out" of the tissue by the presence of other family members.
  • When a selective modulator is identified, the modulator binds to one molecule (for example an mRNA transcript of a gene of interest) in a manner that is different (for example, stronger) from the way it binds to another molecule (for example, an mRNA transcript of a gene related to the gene of interest).
  • the modulator is said to display "selective binding" or “preferential binding” to the molecule to which it binds more strongly as compared to some other possible molecule to which the modulator might bind.
  • mutation carries its traditional connotation and refers to a change, inherited, naturally occurring, or introduced, in a nucleic acid or polypeptide sequence, and is used in its sense as generally known to those of skill in the art.
  • naturally occurring refers to the fact that an object can be found in nature.
  • a polypeptide or polynucleotide sequence that is present in an organism (including bacteria) that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally occurring. It must be understood, however, that any manipulation by the hand of man can render a "naturally occurring" object an “isolated” object as that term is used herein.
  • nucleic acid and “nucleic acid molecule” refer to any of deoxyribonucleic acid (DNA), ribonucleic acid (RNA), oligonucleotides, fragments generated by the polymerase chain reaction
  • Nucleic acids can be composed of monomers that are naturally occurring nucleotides (such as deoxyribonucleotides and ribonucleotides), or analogs of naturally occurring nucleotides (e.g., α-enantiomeric forms of naturally occurring nucleotides), or a combination of both.
  • Modified nucleotides can have modifications in sugar moieties and/or in pyrimidine or purine base moieties.
  • Sugar modifications include, for example, replacement of one or more hydroxyl groups with halogens, alkyl groups, amines, and azido groups, or sugars can be functionalized as ethers or esters. Moreover, the entire sugar moiety can be replaced with sterically and electronically similar structures, such as aza-sugars and carbocyclic sugar analogs. Examples of modifications in a base moiety include alkylated purines and pyrimidines, acylated purines or pyrimidines, or other well-known heterocyclic substitutes. Nucleic acid monomers can be linked by phosphodiester bonds or analogs of such linkages.
  • Analogs of phosphodiester linkages include phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoranilidate, phosphoramidate, and the like.
  • nucleic acid also includes so-called “peptide nucleic acids”, which comprise naturally occurring or modified nucleic acid bases attached to a polyamide backbone. Nucleic acids can be either single stranded or double stranded.
  • operatively linked when describing the relationship between two nucleic acid regions, refers to a juxtaposition wherein the regions are in a relationship permitting them to function in their intended manner.
  • a control sequence "operatively linked" to a coding sequence can be ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences, such as when the appropriate molecules (e.g., inducers and polymerases) are bound to the control or regulatory sequence(s).
  • the phrase "operatively linked” refers to a promoter connected to a coding sequence in such a way that the transcription of that coding sequence is controlled and regulated by that promoter.
  • Techniques for operatively linking a promoter to a coding sequence are well known in the art; the precise orientation and location relative to a coding sequence of interest is dependent, inter alia, upon the specific nature of the promoter.
  • operatively linked can refer to a promoter region that is connected to a nucleotide sequence in such a way that the transcription of that nucleotide sequence is controlled and regulated by that promoter region.
  • a nucleotide sequence is said to be under the "transcriptional control" of a promoter to which it is operatively linked.
  • operatively linked can also refer to a transcription termination sequence that is connected to a nucleotide sequence in such a way that termination of transcription of that nucleotide sequence is controlled by that transcription termination sequence.
  • a transcription termination sequence comprises a sequence that causes transcription by an RNA polymerase III to terminate at the third or fourth T in the terminator sequence, TTTTTTT. Therefore the nascent small transcript has 3 or 4 U's at the 3' terminus.
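The termination behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the construct is given as its coding/sense strand in DNA form, transcription starts at the first base supplied, and the caller chooses whether the polymerase stops at the third or the fourth T. The function name and the example sequence are hypothetical.

```python
# Simplified model of RNA polymerase III termination at a TTTTTTT terminator.
# Assumption: transcription begins at position 0 of the supplied sequence.
def pol3_transcript(coding_dna: str, terminator: str = "TTTTTTT", stop_at_t: int = 4) -> str:
    """Return the RNA transcript ending at the 3rd or 4th T of the terminator."""
    if stop_at_t not in (3, 4):
        raise ValueError("the text above describes termination at the 3rd or 4th T")
    seq = coding_dna.upper()
    start = seq.find(terminator)
    if start == -1:
        raise ValueError("no terminator sequence found")
    # Keep everything up to and including the chosen T, then write it as RNA.
    return seq[: start + stop_at_t].replace("T", "U")


if __name__ == "__main__":
    construct = "GACGTTTTTTTCC"  # made-up insert followed by the terminator
    print(pol3_transcript(construct, stop_at_t=3))
    print(pol3_transcript(construct, stop_at_t=4))
```

The two calls differ only in the number of U residues left at the 3' terminus of the transcript (3 or 4), matching the description above.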
  • percent identity and percent identical in the context of two nucleic acid or protein sequences, refer to two or more sequences or subsequences that have in some embodiments at least 60%, in some embodiments at least 70%, in some embodiments at least 80%, in some embodiments at least 85%, in some embodiments at least 90%, in some embodiments at least 95%, in some embodiments at least 98%, and in some embodiments at least 99% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection.
  • the percent identity exists in some embodiments over a region of the sequences that is at least about 50 residues in length, in some embodiments over a region of at least about 100 residues, and in some embodiments the percent identity exists over at least about 150 residues. In some embodiments, the percent identity exists over the entire length of a given region, such as a coding region.
  • sequence comparison typically one sequence acts as a reference sequence to which test sequences are compared.
  • test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated.
  • sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
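As a rough illustration of the final step, the following Python sketch computes percent identity from two sequences that have already been aligned. It is not the algorithm of any particular program: the gap character "-" and the choice of denominator (all alignment columns, including gapped ones) are assumptions, and real tools differ in how they count gaps.

```python
# Minimal sketch: percent identity of a test sequence relative to a reference,
# given a precomputed pairwise alignment in which "-" marks a gap.
def percent_identity(aligned_test: str, aligned_ref: str) -> float:
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_test.upper(), aligned_ref.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```

In the example call, four of the six alignment columns match, giving roughly 66.7% identity under this counting convention.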
  • Optimal alignment of sequences for comparison can be conducted, for example, by the local homology algorithm described in Smith &
  • This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence.
  • T is referred to as the neighborhood word score threshold (Altschul et al., 1990).
  • HSPs high scoring sequence pairs
  • M reward score for a pair of matching residues; always > 0
  • N penalty score for mismatching residues; always < 0
  • a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction are halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached.
  • the BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
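The extension rule described above (reward M for a match, penalty N for a mismatch, stop when the score drops by X from its maximum, reaches zero or below, or a sequence ends) can be sketched as follows. This is an illustrative ungapped, rightward-only extension and not the BLAST implementation; the function name and the parameter values in the usage example are arbitrary assumptions.

```python
# Illustrative ungapped extension of a word hit to the right.
# Assumption: the seed word of length `word_len` matches exactly, so the
# starting score is M * word_len. Extension to the left would be symmetric.
def extend_right(query, subject, q_start, s_start, word_len, M=1, N=-2, X=3):
    """Return (best_score, number_of_residues_added_at_best_score)."""
    score = M * word_len
    best_score, best_len = score, 0
    q_pos, s_pos = q_start + word_len, s_start + word_len
    added = 0
    while q_pos < len(query) and s_pos < len(subject):  # stop at either end
        score += M if query[q_pos] == subject[s_pos] else N
        added += 1
        if score > best_score:
            best_score, best_len = score, added
        # Stop on an X-drop from the maximum or a non-positive cumulative score.
        if best_score - score >= X or score <= 0:
            break
        q_pos += 1
        s_pos += 1
    return best_score, best_len


if __name__ == "__main__":
    print(extend_right("ACGTACGTAAAA", "ACGTACGTCCCC", 0, 0, 4))
```

A larger X lets the extension tolerate longer runs of mismatches before stopping, which increases sensitivity at the cost of speed.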
  • a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence is in some embodiments less than about 0.1, in some embodiments less than about 0.01, and in some embodiments less than about 0.001.
  • nucleotide sequences refers to two or more sequences or subsequences that have in some embodiments at least about 70% nucleotide identity, in some embodiments at least about 75% nucleotide identity, in some embodiments at least about 80% nucleotide identity, in some embodiments at least about 85% nucleotide identity, in some embodiments at least about 90% nucleotide identity, in some embodiments at least about 95% nucleotide identity, in some embodiments at least about 97% nucleotide identity, and in some embodiments at least about 99% nucleotide identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection.
  • the substantial identity exists in nucleotide sequences of at least 17 residues, in some embodiments in nucleotide sequence of at least about 18 residues, in some embodiments in nucleotide sequence of at least about
  • nucleotide sequence of at least about 23 residues in some embodiments in nucleotide sequence of at least about
  • polymorphic sequences can be substantially identical sequences.
  • the term "polymorphic" refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population. An allelic difference can be as small as one base pair. Nonetheless, one of ordinary skill in the art would recognize that the polymorphic sequences correspond to the same gene.
  • nucleic acid sequences are substantially identical in that the two molecules specifically or substantially hybridize to each other under stringent conditions.
  • two nucleic acid sequences being compared can be designated a "probe sequence” and a "test sequence".
  • a "probe sequence” is a reference nucleic acid molecule
  • a "test sequence" is a test nucleic acid molecule, often found within a heterogeneous population of nucleic acid molecules.
  • An exemplary nucleotide sequence employed for hybridization studies or assays includes probe sequences that are complementary to or mimic in some embodiments at least an about 14 to 40 nucleotide sequence of a nucleic acid molecule of the presently disclosed subject matter.
  • probes comprise 14 to 20 nucleotides, or even longer where desired, such as 30, 40, 50, 60, 100, 200, 300, or 500 nucleotides or up to the full length of a given gene.
  • Such fragments can be readily prepared by, for example, directly synthesizing the fragment by chemical synthesis, by application of nucleic acid amplification technology, or by introducing selected sequences into recombinant vectors for recombinant production.
  • hybridizing specifically to refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex nucleic acid mixture (e.g., total cellular DNA or RNA).
  • hybridization can be carried out in 5x SSC, 4x SSC, 3x SSC, 2x SSC, 1x SSC, or 0.2x SSC for at least about 1 hour, 2 hours, 5 hours, 12 hours, or 24 hours (see Sambrook & Russell,
  • the temperature of the hybridization can be increased to adjust the stringency of the reaction, for example, from about 25°C (room temperature), to about 45°C, 50°C, 55°C, 60°C, or 65°C.
  • the hybridization reaction can also include another agent affecting the stringency; for example, hybridization conducted in the presence of 50% formamide increases the stringency of hybridization at a defined temperature.
  • the hybridization reaction can be followed by a single wash step, or two or more wash steps, which can be at the same or a different salinity and temperature.
  • the temperature of the wash can be increased to adjust the stringency from about 25°C (room temperature), to about 45°C, 50°C, 55°C, 60°C, 65°C, or higher.
  • the wash step can be conducted in the presence of a detergent, e.g., SDS.
  • hybridization can be followed by two wash steps at 65°C each for about 20 minutes in 2x SSC, 0.1 % SDS, and optionally two additional wash steps at 65°C each for about
  • a probe nucleotide sequence hybridizes in one example to a target nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM ethylenediaminetetraacetic acid (EDTA) at 50°C followed by washing in 2X SSC, 0.1% SDS at 50°C; in some embodiments, a probe and test sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C followed by washing in 1X SSC, 0.1%
  • a probe and test sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C followed by washing in 0.5X SSC, 0.1% SDS at 50°C; in some embodiments, a probe and test sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C followed by washing in 0.1X SSC, 0.1% SDS at 50°C; in yet another example, a probe and test sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C followed by washing in 0.1X SSC, 0.1% SDS at 65°C.
  • Additional exemplary stringent hybridization conditions include overnight hybridization at 42°C in a solution comprising or consisting of 50% formamide, 10x Denhardt's (0.2% Ficoll, 0.2% polyvinylpyrrolidone, 0.2% bovine serum albumin) and 200 mg/ml of denatured carrier DNA, e.g., sheared salmon sperm DNA, followed by two wash steps at 65°C each for about 20 minutes in 2x SSC, 0.1% SDS, and two wash steps at 65°C each for about 20 minutes in 0.2x SSC, 0.1% SDS.
  • Hybridization can include hybridizing two nucleic acids in solution, or a nucleic acid in solution to a nucleic acid attached to a solid support, e.g., a filter.
  • a prehybridization step can be conducted prior to hybridization.
  • Prehybridization can be carried out for at least about 1 hour, 3 hours, or 10 hours in the same solution and at the same temperature as the hybridization (but without the complementary polynucleotide strand).
  • stringency conditions are known to those skilled in the art or can be determined experimentally by the skilled artisan.
  • hybridizing substantially to refers to complementary hybridization between a probe nucleic acid molecule and a target nucleic acid molecule and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired hybridization.
  • phenotype refers to the entire physical, biochemical, and physiological makeup of a cell or an organism, e.g., having any one trait or any group of traits. As such, phenotypes result from the expression of genes within a cell or an organism, and relate to traits that are potentially observable or assayable.
  • As used herein, the terms "polypeptide", "protein", and "peptide", which are used interchangeably herein, refer to a polymer of the 20 protein amino acids, or amino acid analogs, regardless of its size or function.
  • polypeptide refers to peptides, polypeptides and proteins, unless otherwise noted.
  • polypeptide encompasses proteins of all functions, including enzymes.
  • exemplary polypeptides include gene products, naturally occurring proteins, homologs, orthologs, paralogs, fragments, and other equivalents, variants and analogs of the foregoing.
  • polypeptide fragment when used in reference to a reference polypeptide, refers to a polypeptide in which amino acid residues are deleted as compared to the reference polypeptide itself, but where the remaining amino acid sequence is usually identical to the corresponding positions in the reference polypeptide. Such deletions can occur at the amino-terminus or carboxy-terminus of the reference polypeptide, or alternatively both. Fragments typically are at least 5, 6, 8 or 10 amino acids long, at least 14 amino acids long, at least 20, 30, 40 or 50 amino acids long, at least 75 amino acids long, or at least 100, 150, 200, 300, 500 or more amino acids long. A fragment can retain one or more of the biological activities of the reference polypeptide. Further, fragments can include a sub-fragment of a specific region, which sub-fragment retains a function of the region from which it is derived.
  • the term "primer” refers to a sequence comprising in some embodiments two or more deoxyribonucleotides or ribonucleotides, in some embodiments more than three, in some embodiments more than eight, and in some embodiments at least about 20 nucleotides of an exonic or intronic region. Such oligonucleotides are in some embodiments between ten and thirty bases in length.
  • purified refers to an object species that is the predominant species present (i.e., on a molar basis it is more abundant than any other individual species in the composition).
  • a “purified fraction” is a composition wherein the object species comprises at least about 50 percent (on a molar basis) of all species present.
  • the solvent or matrix in which the species is dissolved or dispersed is usually not included in such determination; instead, only the species (including the one of interest) dissolved or dispersed are taken into account.
  • a purified composition will have one species that comprises more than about 80 percent of all species present in the composition, more than about 85%, 90%, 95%, 99% or more of all species present.
  • the object species can be purified to essential homogeneity (contaminant species cannot be detected in the composition by conventional detection methods) wherein the composition consists essentially of a single species.
  • a skilled artisan can purify a polypeptide of the presently disclosed subject matter using standard techniques for protein purification in light of the teachings herein. Purity of a polypeptide can be determined by a number of methods known to those of skill in the art, including for example, amino-terminal amino acid sequence analysis, gel electrophoresis, and mass-spectrometry analysis.
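The molar-basis thresholds in the "purified" definitions above can be sketched as a small calculation. This is a minimal illustration only: the function name `classify_purity`, the dictionary layout, and the sample amounts are assumptions made for the example, and the "about 50 percent" and "about 80 percent" figures are treated as hard cut-offs for simplicity. As in the definition, the solvent or matrix is simply left out of the input.

```python
from typing import Dict


def classify_purity(amounts: Dict[str, float], species: str) -> Dict[str, bool]:
    """Classify an object species using molar amounts of all dissolved species.

    `amounts` maps each species name to its molar amount; the solvent or
    matrix is not included. The thresholds follow the definitions in the
    text, with "about" treated as an exact cut-off (an assumption).
    """
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total molar amount must be positive")

    fraction = amounts[species] / total
    others = [v for name, v in amounts.items() if name != species]

    return {
        # "purified": more abundant than any other individual species
        "predominant": all(amounts[species] > v for v in others),
        # "purified fraction": at least about 50 percent of all species
        "purified_fraction": fraction >= 0.50,
        # purified composition: more than about 80 percent of all species
        "purified_composition": fraction > 0.80,
    }
```

A species can be predominant without reaching the 50 percent mark, for example when it makes up 40 percent of a mixture whose other two components are 30 percent each.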
  • a “reference sequence” is a defined sequence used as a basis for a sequence comparison.
  • a reference sequence can be a subset of a larger sequence, for example, as a segment of a full-length nucleotide or amino acid sequence, or can comprise a complete sequence.
  • a reference sequence is at least 200, 300 or 400 nucleotides in length, frequently at least 600 nucleotides in length, and often at least 800 nucleotides in length.
  • two proteins can each (1) comprise a sequence (i.e., a portion of the complete protein sequence) that is similar between the two proteins, and (2) can further comprise a sequence that is divergent between the two proteins
  • sequence comparisons between two (or more) proteins are typically performed by comparing sequences of the two proteins over a "comparison window" (defined hereinabove) to identify and compare local regions of sequence similarity.
  • regulatory sequence is a generic term used throughout the specification to refer to polynucleotide sequences, such as initiation signals, enhancers, regulators, promoters, and termination sequences, which are necessary or desirable to affect the expression of coding and non-coding sequences to which they are operatively linked.
  • Exemplary regulatory sequences are described in Goeddel, 1990, and include, for example, the early and late promoters of simian virus 40 (SV40), adenovirus or cytomegalovirus immediate early promoter, the lac system, the trp system, the TAC or TRC system, T7 promoter whose expression is directed by T7 RNA polymerase, the major operator and promoter regions of phage lambda, the control regions for fd coat protein, the promoter for 3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid phosphatase, e.g., Pho5, the promoters of the yeast α-mating factors, the polyhedron promoter of the baculovirus system and other sequences known to control the expression of genes of prokaryotic or eukaryotic cells or their viruses, and various combinations thereof.
  • regulatory sequences can differ depending upon the host organism.
  • such regulatory sequences generally include promoter, ribosomal binding site, and transcription termination sequences.
  • the term "regulatory sequence” is intended to include, at a minimum, components the presence of which can influence expression, and can also include additional components the presence of which is advantageous, for example, leader sequences and fusion partner sequences.
  • transcription of a polynucleotide sequence is under the control of a promoter sequence (or other regulatory sequence) that controls the expression of the polynucleotide in a cell-type in which expression is intended. It will also be understood that the polynucleotide can be under the control of regulatory sequences that are the same or different from those sequences which control expression of the naturally occurring form of the polynucleotide.
  • a promoter sequence is a DNA-dependent RNA polymerase III promoter (e.g., a promoter for an H1, 5S, or U6 gene, or an Arabidopsis thaliana At7SL4 gene promoter, such as that disclosed as SEQ ID NO: 162).
  • a promoter sequence is selected from the group consisting of an adenovirus VA1 promoter sequence, a Vault promoter sequence, a telomerase RNA promoter sequence, and a tRNA gene promoter sequence. It is understood that the entire promoter identified for any promoter (for example, the promoters listed herein) need not be employed, and that a functional derivative thereof can be used.
  • the phrase "functional derivative” refers to a nucleic acid sequence that comprises sufficient sequence to direct transcription of another operatively linked nucleic acid molecule. As such, a “functional derivative" can function as a minimal promoter, as that term is defined herein.
  • Termination of transcription of a polynucleotide sequence is typically regulated by an operatively linked transcription termination sequence (for example, an RNA polymerase III termination sequence).
  • transcriptional terminators are also responsible for correct mRNA polyadenylation.
  • the 3' non-transcribed regulatory DNA sequence includes, in some embodiments, from about 50 to about 1,000, and in some embodiments from about 100 to about 1,000, nucleotide base pairs and contains plant transcriptional and translational termination sequences.
  • Appropriate transcriptional terminators and those that are known to function in plants include the cauliflower mosaic virus (CaMV) 35S terminator, the tml terminator, the nopaline synthase terminator, the pea rbcS E9 terminator, the terminator for the T7 transcript from the octopine synthase gene of
  • an RNA polymerase III termination sequence comprises the nucleotide sequence TTTTTTT.
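As an illustration of the RNA polymerase III termination sequence mentioned above, the following minimal sketch scans a DNA string for the TTTTTTT run. The function name `find_pol3_terminators` and the example sequences are assumptions made for illustration, and the sketch only locates the motif in the given strand; it does not predict whether termination would actually occur in a plant cell.

```python
from typing import List


def find_pol3_terminators(dna: str, signal: str = "TTTTTTT") -> List[int]:
    """Return the 0-based start positions of every occurrence of `signal`.

    Overlapping occurrences are reported, so a run of eight T residues
    yields two start positions.
    """
    dna = dna.upper()
    positions = []
    start = dna.find(signal)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping matches are also found.
        start = dna.find(signal, start + 1)
    return positions
```

For a hypothetical cassette such as `"ACGTTTTTTTGCA"`, the single run of seven T residues begins at position 3.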
  • reporter gene refers to a nucleic acid comprising a nucleotide sequence encoding a protein that is readily detectable either by its presence or activity, including, but not limited to, luciferase, fluorescent protein (e.g., green fluorescent protein), chloramphenicol acetyl transferase, β-galactosidase, secreted placental alkaline phosphatase, β-lactamase, human growth hormone, and other secreted enzyme reporters.
  • a reporter gene encodes a polypeptide not otherwise produced by the host cell, which is detectable by analysis of the cell(s), e.g., by the direct fluorometric, radioisotopic or spectrophotometric analysis of the cell(s) and typically without the need to kill the cells for signal analysis.
  • a reporter gene encodes an enzyme, which produces a change in fluorometric properties of the host cell, which is detectable by qualitative, quantitative, or semiquantitative function or transcriptional activation.
  • Exemplary enzymes include esterases, β-lactamase, phosphatases, peroxidases, proteases (tissue plasminogen activator or urokinase), and other enzymes whose function can be detected by appropriate chromogenic or fluorogenic substrates known to those skilled in the art or developed in the future.
  • sequencing refers to determining the ordered linear sequence of nucleic acids or amino acids of a DNA, RNA, or protein target sample, using conventional manual or automated laboratory techniques.
  • the term "substantially pure" means that the polynucleotide or polypeptide is substantially free of the sequences and molecules with which it is associated in its natural state, and those molecules used in the isolation procedure.
  • the term "substantially free" means that the sample is in some embodiments at least 50%, in some embodiments at least 70%, in some embodiments 80% and in some embodiments 90% free of the materials and compounds with which it is associated in nature.
  • target cell refers to a cell, into which it is desired to insert a nucleic acid sequence or polypeptide, or to otherwise effect a modification from conditions known to be standard in the unmodified cell.
  • a nucleic acid sequence introduced into a target cell can be of variable length. Additionally, a nucleic acid sequence can enter a target cell as a component of a plasmid or other vector or as a naked sequence.
  • target gene refers to a gene expressed in a cell the expression of which is targeted for modulation using the methods and compositions of the presently disclosed subject matter.
  • a target gene therefore, comprises a nucleic acid sequence the expression level of which is downregulated by an miRNA.
  • target RNA or “target mRNA” refers to the transcript of a target gene to which the miRNA is intended to bind, leading to modulation of the expression of the target gene.
  • the target gene can be a gene derived from a cell, an endogenous gene, a transgene, or exogenous genes such as genes of a pathogen, for example a virus, which is present in the cell after infection thereof.
  • the cell containing the target gene can be derived from or contained in any organism, for example a plant, animal, protozoan, virus, bacterium, or fungus.
  • transcription refers to a cellular process involving the interaction of an RNA polymerase with a gene that directs the expression as RNA of the structural information present in the coding sequences of the gene. The process includes, but is not limited to, the following steps: (a) the transcription initiation; (b) transcript elongation; (c) transcript splicing; (d) transcript capping; (e) transcript termination; (f) transcript polyadenylation; (g) nuclear export of the transcript; (h) transcript editing; and (i) stabilizing the transcript.
  • transcription factor refers to a cytoplasmic or nuclear protein which binds to a gene, or binds to an RNA transcript of a gene, or binds to another protein which binds to a gene or an RNA transcript or another protein which in turn binds to a gene or an RNA transcript, so as to thereby modulate expression of the gene. Such modulation can additionally be achieved by other mechanisms; the essence of a "transcription factor for a gene” pertains to a factor that alters the level of transcription of the gene in some way.
  • transfection refers to the introduction of a nucleic acid, e.g., an expression vector, into a recipient cell, which in certain instances involves nucleic acid-mediated gene transfer.
  • transformation refers to a process in which a cell's genotype is changed as a result of the cellular uptake of exogenous nucleic acid.
  • a transformed cell can express a recombinant form of a polypeptide of the presently disclosed subject matter.
  • the transformation of a cell with an exogenous nucleic acid can be characterized as transient or stable.
  • stable refers to a state of persistence that is of a longer duration than that which would be understood in the art as "transient”. These terms can be used both in the context of the transformation of cells (for example, a stable transformation), or for the expression of a transgene
  • a stable transformation results in the incorporation of the exogenous nucleic acid molecule (for example, an expression vector) into the genome of the transformed cell.
  • the vector DNA is replicated along with the plant genome so that progeny cells also contain the exogenous DNA in their genomes.
  • stable expression relates to expression of a nucleic acid molecule (for example, a vector-encoded miRNA) over time.
  • stable expression requires that the cell into which the exogenous DNA is introduced express the encoded nucleic acid at a consistent level over time. Additionally, stable expression can occur over the course of generations. When the expressing cell divides, at least a fraction of the resulting daughter cells can also express the encoded nucleic acid, and at about the same level. It should be understood that it is not necessary that every cell derived from the cell into which the vector was originally introduced express the nucleic acid molecule of interest.
  • stable expression requires only that the nucleic acid molecule of interest be stably expressed in tissue(s) and/or location(s) of the plant in which expression is desired.
  • stable expression of an exogenous nucleic acid is achieved by the integration of the nucleic acid into the genome of the host cell.
  • vector refers to a nucleic acid capable of transporting another nucleic acid to which it has been linked.
  • Agrobacterium binary vector i.e., a nucleic acid capable of integrating the nucleic acid sequence of interest into the host cell (for example, a plant cell) genome.
  • Other vectors include those capable of autonomous replication and expression of nucleic acids to which they are linked.
  • Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as "expression vectors”.
  • expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
  • "plasmid" and "vector" are used interchangeably as the plasmid is the most commonly used form of vector.
  • the presently disclosed subject matter is intended to include such other forms of expression vectors which serve equivalent functions and which become known in the art subsequently hereto.
  • expression vector refers to a DNA sequence capable of directing expression of a particular nucleotide sequence in an appropriate host cell, comprising a promoter operatively linked to the nucleotide sequence of interest which is operatively linked to transcription termination sequences. It also typically comprises sequences required for proper translation of the nucleotide sequence.
  • the construct comprising the nucleotide sequence of interest can be chimeric. The construct can also be one that is naturally occurring but has been obtained in a recombinant form useful for heterologous expression.
  • the nucleotide sequence of interest including any additional sequences designed to effect proper expression of the nucleotide sequences, can also be referred to as an "expression cassette".
  • heterologous gene refers to a sequence that originates from a source foreign to an intended host cell or, if from the same source, is modified from its original form.
  • a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified, for example by mutagenesis or by isolation from native transcriptional regulatory sequences.
  • the terms also include non-naturally occurring multiple copies of a naturally occurring nucleotide sequence.
  • promoter refers to a nucleotide sequence within a gene that is positioned 5' to a coding sequence and functions to direct transcription of the coding sequence.
  • the promoter region comprises a transcriptional start site, and can additionally include one or more transcriptional regulatory elements.
  • a method of the presently disclosed subject matter employs an RNA polymerase III promoter.
  • a “minimal promoter” is a nucleotide sequence that has the minimal elements required to enable basal level transcription to occur. As such, minimal promoters are not complete promoters but rather are subsequences of promoters that are capable of directing a basal level of transcription of a reporter construct in an experimental system.
  • Minimal promoters include but are not limited to the cytomegalovirus (CMV) minimal promoter, the herpes simplex virus thymidine kinase (HSV-tk) minimal promoter, the simian virus 40 (SV40) minimal promoter, the human β-actin minimal promoter, the human EF2 minimal promoter, the adenovirus E1B minimal promoter, and the heat shock protein (hsp) 70 minimal promoter.
  • Minimal promoters are often augmented with one or more transcriptional regulatory elements to influence the transcription of an operatively linked gene.
  • minimal promoter also encompasses a functional derivative of a promoter disclosed herein, including, but not limited to an RNA polymerase III promoter (for example, an H1, 7SL, 5S, or U6 promoter), an adenovirus VA1 promoter, a Vault promoter, a telomerase RNA promoter, and a tRNA gene promoter.
  • promoters have different combinations of transcriptional regulatory elements. Whether or not a gene is expressed in a cell is dependent on a combination of the particular transcriptional regulatory elements that make up the gene's promoter and the different transcription factors that are present within the nucleus of the cell. As such, promoters are often classified as “constitutive”, “tissue-specific”, “cell-type-specific”, or “inducible”, depending on their functional activities in vivo or in vitro. For example, a constitutive promoter is one that is capable of directing transcription of a gene in a variety of cell types (in some embodiments, in all cell types) of an organism.
  • Exemplary constitutive promoters include the promoters for the following genes which encode certain constitutive or "housekeeping" functions: hypoxanthine phosphoribosyl transferase (HPRT), dihydrofolate reductase (DHFR; Scharfmann et al., 1991), adenosine deaminase, phosphoglycerate kinase (PGK), pyruvate kinase, phosphoglycerate mutase, the β-actin promoter (see e.g., Williams et al., 1993), and other constitutive promoters known to those of skill in the art.
  • "tissue-specific" or "cell-type-specific" promoters direct transcription in some tissues or cell types of an organism but are inactive in some or all other tissues or cell types.
  • tissue-specific promoters include those promoters described in more detail hereinbelow, as well as other tissue-specific and cell-type specific promoters known to those of skill in the art.
  • transcriptional regulatory sequence or “transcriptional regulatory element”, as used herein, each refers to a nucleotide sequence within the promoter region that enables responsiveness to a regulatory transcription factor. Responsiveness can encompass a decrease or an increase in transcriptional output and is mediated by binding of the transcription factor to the DNA molecule comprising the transcriptional regulatory element.
  • a transcriptional regulatory sequence is a transcription termination sequence, alternatively referred to herein as a transcription termination signal.
  • transcription factor generally refers to a protein that modulates gene expression by interaction with the transcriptional regulatory element and cellular components for transcription, including RNA
  • TAFs Transcription Associated Factors
  • RNA refers to an RNA molecule
  • target site refers to a sequence within a target RNA that is “targeted” for cleavage mediated by an miRNA or siRNA construct that contains sequences within its antisense strand that are complementary to the target site.
  • target cell refers to a cell that expresses a target RNA and into which an miRNA is intended to be introduced.
  • a target cell is in some embodiments a cell in a plant.
  • a target cell can comprise a target RNA expressed in a plant.
  • an miRNA or an siRNA is "targeted to" an RNA molecule if it has sufficient nucleotide similarity to the RNA molecule that it would be expected to modulate the expression of the RNA molecule under conditions sufficient for the miRNA/siRNA and the RNA molecule to interact.
  • the interaction occurs within a plant cell.
  • the interaction occurs under physiological conditions.
  • physiological conditions refers to in vivo conditions within a plant cell, whether that plant cell is part of a plant or a plant tissue, or that plant cell is being grown in vitro.
  • physiological conditions refers to the conditions within a plant cell under any conditions that the plant cell can be exposed to, either as part of a plant or when grown in vitro.
  • the phrase "detectable level of cleavage” refers to a degree of cleavage of target RNA (and formation of cleaved product RNAs) that is sufficient to allow detection of cleavage products above the background of RNAs produced by random degradation of the target RNA. Production of miRNA-mediated cleavage products from at least 1-5% of the target RNA is sufficient to allow detection above background for most detection methods.
  • microRNA and "miRNA” are used interchangeably and refer to a nucleic acid molecule of about 17-24 nt that is produced from a pri- miRNA, a pre-miRNA, or a functional equivalent.
  • miRNAs are to be contrasted with siRNAs described hereinbelow, although in the context of exogenously supplied miRNAs and siRNAs, this distinction might be somewhat artificial.
  • an miRNA is necessarily the product of nuclease activity on a hairpin molecule such as has been described herein, and an siRNA can be generated from a fully double-stranded RNA molecule or a hairpin molecule.
  • an miRNA is designed to hybridize to an mRNA derived from a gene of interest and an siRNA is designed to hybridize to an miRNA precursor such as a pri-miRNA or a pre-miRNA.
  • miRNAs isolated from P. trichocarpa as disclosed herein are named using the general formula "PtmiR X", where X is a number. This is in contrast to P. trichocarpa genes encoding miRNAs, which are named using the general formula "PtMIR X", wherein X is a number sometimes followed by a lowercase letter.
  • miRNA names and miRNA-encoding gene names have the "MI" in lowercase and uppercase, respectively.
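The naming convention described above (a "PtmiR X" form for miRNAs and a "PtMIR X" form, optionally with a lowercase letter suffix, for the genes encoding them) can be sketched as a small classifier. The function name `classify_name`, the regular expressions, and the example identifiers are assumptions made for illustration; in particular, the optional space before the number is a guess about how the names are written.

```python
import re
from typing import Optional

# miRNA: "PtmiR" followed by a number (optional single space, an assumption).
MIRNA_NAME = re.compile(r"PtmiR\s?(\d+)")
# miRNA-encoding gene: "PtMIR", a number, and an optional lowercase letter.
GENE_NAME = re.compile(r"PtMIR\s?(\d+)([a-z]?)")


def classify_name(name: str) -> Optional[str]:
    """Return "miRNA", "miRNA gene", or None if the name fits neither form."""
    if MIRNA_NAME.fullmatch(name):
        return "miRNA"
    if GENE_NAME.fullmatch(name):
        return "miRNA gene"
    return None
```

With made-up identifiers, `classify_name("PtmiR 12")` falls in the miRNA form and `classify_name("PtMIR 12a")` in the gene form, while a mixed form such as `"PtmiR 12a"` matches neither pattern.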
  • the terms "small interfering RNA", "short interfering RNA", and "siRNA" are used interchangeably and refer to a ribonucleic acid or a modified ribonucleic acid that is designed to hybridize to a single-stranded loop region of an miRNA precursor.
  • miRNA precursor refers to any ribonucleic acid derived from a DNA sequence encoding an miRNA.
  • exemplary miRNA precursors include pri-miRNAs and pre-miRNAs, although the term is not limited to only these species.
  • the siRNA comprises a single stranded polynucleotide having self-complementary sense and antisense regions, wherein either the sense or the antisense region comprises a sequence complementary to a loop region of a pri-miRNA or a pre-miRNA.
  • the siRNA comprises a single stranded polynucleotide having one or more loop structures and a stem comprising self complementary sense and antisense regions, wherein the antisense region comprises a sequence complementary to a loop region of a pri-miRNA or a pre-miRNA, and wherein the polynucleotide can be processed either in vivo or in vitro to generate an active siRNA capable of mediating cleavage of the miRNA precursor.
  • the methods of the presently disclosed subject matter can employ siRNA molecules of the general structure shown in Figure 1, wherein N is any nucleotide, provided that in the loop structure identified as N5-9 above, all 5-9 nucleotides remain in a single-stranded conformation.
  • N1-8 can be any sequence of 1-8 nucleotides or modified nucleotides, provided that the nucleotides remain in a single-stranded conformation in the siRNA molecule.
  • the duplex represented in Figure 1 as "17-30 bases of an miRNA precursor" can be formed using any contiguous 17-30 base sequence of a transcription product of an miRNA-encoding nucleic acid sequence.
  • a contiguous 17-30 base sequence of a transcription product of an miRNA-encoding nucleic acid sequence comprises a subsequence that is predicted to hybridize to a single-stranded region of an miRNA precursor
  • this 17-30 base sequence is followed (in a 5' to 3' direction) by 5-9 random nucleotides (N5-9 above), the reverse-complement of the 17-30 base sequence, and finally 1-8 random nucleotides (N1-8 above).
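The 5' to 3' layout described above (a 17-30 base sequence, a 5-9 nucleotide loop, the reverse complement of the first segment, and a 1-8 nucleotide tail) can be sketched as a simple string assembly. This is a minimal illustration under stated assumptions: the function names and example sequences are hypothetical, the loop and tail are passed in by the caller instead of being drawn at random, and the sketch does not check that the loop and tail actually stay single-stranded, which the text requires.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string made of A, U, G, C."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def build_hairpin_sirna(segment: str, loop: str, tail: str) -> str:
    """Assemble segment + loop + reverse complement of segment + tail.

    Length limits follow the text: 17-30 bases for the segment,
    5-9 nucleotides for the loop, and 1-8 nucleotides for the tail.
    """
    if not 17 <= len(segment) <= 30:
        raise ValueError("segment must be 17-30 bases long")
    if not 5 <= len(loop) <= 9:
        raise ValueError("loop must be 5-9 nucleotides long")
    if not 1 <= len(tail) <= 8:
        raise ValueError("tail must be 1-8 nucleotides long")
    segment = segment.upper()
    return segment + loop.upper() + reverse_complement(segment) + tail.upper()
```

For a made-up 18-base segment `"AUGGCUAGCUAGGCUAAC"`, a 9-nucleotide loop, and a 2-nucleotide tail, the assembled molecule is 47 nucleotides long, and its second arm is `"GUUAGCCUAGCUAGCCAU"`.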
  • RNA refers to a molecule comprising at least one ribonucleotide residue.
  • by "ribonucleotide" is meant a nucleotide with a hydroxyl group at the 2' position of a β-D-ribofuranose moiety.
  • the terms encompass double stranded RNA, single stranded RNA, RNAs with both double stranded and single stranded regions, isolated RNA such as partially purified RNA, essentially pure RNA, synthetic RNA, and recombinantly produced RNA.
  • RNAs include, but are not limited to mRNA transcripts, miRNAs and miRNA precursors, and siRNAs.
  • RNA is also intended to encompass altered RNA, or analog RNA, which are RNAs that differ from naturally occurring RNA by the addition, deletion, substitution, and/or alteration of one or more nucleotides. Such alterations can include addition of non-nucleotide material, such as to the end(s) of the RNA or internally, for example at one or more nucleotides of the RNA. Nucleotides in the RNA molecules of the presently disclosed subject matter can also comprise non-standard nucleotides, such as non-naturally occurring nucleotides or chemically synthesized nucleotides or deoxynucleotides. These altered RNAs can be referred to as analogs or analogs of a naturally occurring RNA.
  • double stranded RNA refers to an RNA molecule at least a part of which is in Watson-Crick base pairing forming a duplex.
  • the term is to be understood to encompass an RNA molecule that is either fully or only partially double stranded.
  • Exemplary double stranded RNAs include, but are not limited to molecules comprising at least two distinct RNA strands that are either partially or fully duplexed by intermolecular hybridization.
  • the term is intended to include a single RNA molecule that by intramolecular hybridization can form a double stranded region (for example, a hairpin).
  • the phrases “intermolecular hybridization” and “intramolecular hybridization” refer to double stranded molecules for which the nucleotides involved in the duplex formation are present on different molecules or the same molecule, respectively.
  • double stranded region refers to any region of a nucleic acid molecule that is in a double stranded conformation via hydrogen bonding between the nucleotides including, but not limited to hydrogen bonding between cytosine and guanosine, adenosine and thymidine, adenosine and uracil, and any other nucleic acid duplex as would be understood by one of ordinary skill in the art.
  • the length of the double stranded region can vary from about 15 consecutive basepairs to several thousand basepairs.
  • the double stranded region is at least 15 basepairs, in some embodiments between 15 and 300 basepairs, and in some embodiments between 15 and about 60 basepairs.
  • the formation of the double stranded region results from the hybridization of complementary RNA strands (for example, a sense strand and an antisense strand), either via an intermolecular hybridization (i.e., involving 2 or more distinct RNA molecules) or via an intramolecular hybridization, the latter of which can occur when a single RNA molecule contains self-complementary regions that are capable of hybridizing to each other on the same RNA molecule.
  • These self-complementary regions are typically separated by a short stretch of nucleotides (for example, about 5-10 nucleotides) such that the intramolecular hybridization event forms what is referred to in the art as a "hairpin” or a "stem-loop structure".
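To make the hairpin definition concrete, the sketch below scans a single RNA strand for two self-complementary regions separated by a short loop and reports the longest perfectly paired stem it finds. It assumes strict Watson-Crick pairing only (no G-U wobble, no bulges or mismatches), which is a simplification; the function name and the example sequence are hypothetical.

```python
# Base pairs accepted by this sketch: strict Watson-Crick only.
_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string written 5' to 3'."""
    return rna.upper().translate(_RNA_COMPLEMENT)[::-1]


def longest_hairpin(rna: str, min_loop: int = 5, max_loop: int = 10):
    """Return (stem_length, loop_start, loop_length) for the longest stem.

    For every candidate loop position and length, the stem is grown outward
    from the loop for as long as the flanking bases pair perfectly.
    """
    rna = rna.upper()
    n = len(rna)
    best = (0, None, None)
    for loop_len in range(min_loop, max_loop + 1):
        for loop_start in range(1, n - loop_len):
            left = loop_start - 1          # last base 5' of the loop
            right = loop_start + loop_len  # first base 3' of the loop
            stem = 0
            while left >= 0 and right < n and (rna[left], rna[right]) in _PAIRS:
                stem += 1
                left -= 1
                right += 1
            if stem > best[0]:
                best = (stem, loop_start, loop_len)
    return best


if __name__ == "__main__":
    arm = "GAUUCCGAUAGGCAUCCG"  # hypothetical 18-nt arm
    candidate = arm + "AAAAAAA" + reverse_complement(arm)
    stem, loop_start, loop_len = longest_hairpin(candidate)
    # A stem of at least 15 bp meets the minimum duplex length given in the text.
    print(stem, loop_start, loop_len, stem >= 15)
```

A dedicated RNA folding program would be needed to account for wobble pairs, bulges, and folding energy; the loop bounds of 5 to 10 nucleotides follow the "about 5-10 nucleotides" figure in the text.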
  • the presently disclosed subject matter provides in some embodiments methods for modulating gene expression in a plant.
  • the presently disclosed subject matter provides a method for stably modulating expression of a plant gene comprising (a) providing a vector encoding a microRNA (miRNA) targeted to the plant gene; and (b) transforming a plant cell with the vector, whereby stable expression of the miRNA in the plant cell is provided.
  • the presently disclosed subject matter concerns stably transforming a plant cell
  • an miRNA precursor is produced via the activity of the promoter in the plant cell, which is then processed using endogenous miRNA pathways to generate an miRNA target in the plant cell.
  • This promoter can be capable of binding any RNA polymerase, including, for example, an RNA polymerase II and an RNA polymerase III.
  • an RNA polymerase III H1 promoter, an Arabidopsis thaliana 7SL RNA promoter, an RNA polymerase III 5S promoter, an RNA polymerase III U6 promoter, an adenovirus VA1 promoter, a Vault promoter, a telomerase RNA promoter, a tRNA gene promoter, and functional derivatives thereof.
  • a method for stably modulating expression of a plant gene comprises (a) transforming a plurality of plant cells with a vector comprising a nucleic acid sequence encoding a microRNA (miRNA) operatively linked to a promoter and a transcription termination sequence;
  • the presently disclosed subject matter also provides methods for enhancing the expression of a gene in a plant cell.
  • the method comprises introducing into the plant cell a vector encoding a short interfering RNA (siRNA) molecule comprising a sequence that hybridizes to a loop region, stem region, or antisense sequence of an miRNA of a pre-microRNA that comprises a microRNA (miRNA) that modulates expression of the gene, thereby resulting in downregulation of expression of the miRNA and enhanced expression of the gene.
  • the disclosed methods are employed to modulate the expression of a gene in a tree cell.
  • Representative, non-limiting tree species for which the disclosed methods can be employed include trees of the genus Populus and of the genus Pinus, including, but not limited to Populus trichocarpa and Pinus taeda.
IV. Target Genes
  • the presently disclosed subject matter provides methods for stably modulating expression of plant genes using miRNAs.
  • the methods are applicable to any gene expressed in the plant.
  • the methods are used to modulate the expression of genes in trees.
  • the methods are used to modulate the expression of genes in members of the genus Populus, including, but not limited to Populus trichocarpa.
  • the methods are used to modulate the expression of genes in members of the genus Pinus, including, but not limited to Pinus taeda.
  • Representative P. trichocarpa miRNAs are presented in SEQ ID NOs: 1-59 and 1247-1295. These miRNAs were identified using the techniques disclosed in Examples 1-6, and are summarized in Table 1. Additionally, using the techniques disclosed in the Examples, miRNA precursor sequences present in a representative plant, P. trichocarpa, were identified, and these sequences (SEQ ID NOs: 60-156 and 1296-1375) are also summarized in Table 1. Further analysis of the P. trichocarpa genome revealed target genes that the miRNAs of SEQ ID NOs: 1-59 and 1247-1295 modulate, which are summarized in Table 2. Representative Pinus taeda miRNAs are presented in SEQ ID NOs:
  • miRNA precursor sequences present in a second representative plant, Pinus taeda were identified, and these sequences (SEQ ID NOs: 1713-1748) are also summarized in Table 4. Further analysis of the P. taeda genome revealed target genes that the miRNAs of SEQ ID NOs: 1662-1712 can modulate, which are also summarized in Table 2.
  • plant gene sequences for example, gene sequences from Populus sp. including, but not limited to Populus trichocarpa
  • plant gene sequences that can be targeted by the miRNAs of SEQ ID NOs: 1-59 and 1247-1295 can be identified.
  • miRNAs for example, gene sequences from Populus sp. including, but not limited to Populus trichocarpa
  • numerous particular target gene sequences were identified. These target gene sequences are presented in SEQ ID NOs: 176-781 and 1376-1553, and are summarized in Table 3.
  • plant gene sequences for example, gene sequences from Pinus sp. including, but not limited to Pinus taeda
  • plant gene sequences that can be targeted by the miRNAs of SEQ ID NOs: 1662-1712 can be identified.
  • target gene sequences for example, gene sequences from Pinus sp. including, but not limited to Pinus taeda
  • numerous particular target gene sequences were identified. These target gene sequences are presented in SEQ ID NOs: 1749-1837, and are summarized in Table 5.
  • PtMIR 133 AtMIR 172 APETALA2-like protein
  • PtMIR 104 AtMIR 162 DEAD/DEAH box helicase carpel factory / CAF identical to RNA helicase/RNase III CAF protein
  • PtMIR 56 AtMIR 168 ARGONAUTE
  • PtMIR 6 (UVR8) UVB-resistance protein
  • PtMIR 13 (ERD4) early-responsive to dehydration protein-related
  • TIR-NBS-LRR class PtMIR73 disease resistance protein
  • PtMIR 139 putative sulfate transporter PtMIR 160 disease resistance protein TIR-NBS-LRR class
  • PtMIR 181 putative bifunctional aspartate kinase/homoserine dehydrogenase
  • PtMIR 172 CAD cinnamyl-alcohol dehydrogenase disease resistance protein-related LIM domain-containing protein
  • SPRY SPla/RYanodine receptor domain-containing protein
  • PtMIR 245 isoflavone reductase family protein trehalose-6-phosphate phosphatase
  • PtMIR 252 AthMIR 398 selenium-binding protein, putative PtMIR 255 SEC14 cytosolic factor family protein PtMIR 257 GCN5-related N-acetyltransferase gibberellin regulatory protein (RGL1 ) homeodomain transcription factor (KNAT7)
  • PtMIR 274 AthMIR 166 homeobox-leucine zipper family protein no apical meristem (NAM) family protein
  • SAM2 S-adenosylmethionine synthetase 2
  • SWAP SudAP (Suppressor-of-White-APricot)/surp domain-containing protein
  • PtMIR282 homeobox protein knotted-1 like 1 (KNAT1 ) ribosomal protein L1 family protein two-component responsive regulator family protein
  • PtMIR283 indigoidine synthase A family protein pectate lyase family protein eukaryotic release factor 1 family protein
  • PtMIR287 ankyrin repeat family protein beta-fructosidase disease resistance protein leucine-rich repeat family protein oxidoreductase, 2OG-Fe(II) oxygenase family protein
  • PtMIR 315 BAG domain-containing protein leucine-rich repeat family protein LpMIR 100 AMP-dependent synthetase elongation factor Tu, putative / EF-Tu expressed protein contains 3 transmembrane domains
  • peroxidase family protein similar to cationic peroxidase
  • F-box family protein (FBX1) E3 ubiquitin ligase
  • LpMIR 178 AthMIR 156 actin aspartyl protease family protein cellulose synthase endo-(1 ,3)-alpha-glucanase homeobox-leucine zipper protein 13 (HB-13) lateral organ boundaries domain protein 4 (LBD4) nitrate reductase 2 (NR2) peptidyl-tRNA hydrolase protein kinase family protein
  • LpMIR 27 3-deoxy-D-manno-octulosonic acid transferase chlorophyll A-B binding family protein hydrolase, alpha/beta fold family protein
  • nodulin MtN3 family protein thioredoxin family protein zinc finger (CCCH-type/C3HC4-type RING finger) family protein
  • LpMIR 28 60S ribosomal protein L24, putative abscisic acid-responsive HVA22 family protein aspartyl protease family protein lipase class 3 family protein microtubule organization 1 protein (MOR1)
  • LpMIR 89 sterol isomerase LpMIR 9 AthMIR 160 auxin-responsive AUX/IAA family protein transcriptional factor B3 family protein
  • a plant gene that is targeted for modulation has a nucleic acid sequence comprising any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837, and encodes a polypeptide having an amino acid sequence comprising any of SEQ ID NOs: 782-1246, 1554-1661, and 1838-1907.
  • a plant gene that is targeted for modulation comprises a nucleic acid sequence at least about 70% identical to any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837, and encodes a polypeptide comprising an amino acid sequence having 5 or fewer (e.g., 5, 4,
  • Examples 1-6 additional plant genes can be selected and miRNAs designed to modulate the expression of the genes in any desired plant. Additionally, the basic methodology disclosed in these Examples can be used to isolate miRNAs from any desired plant and to identify genes that can be targeted using the methods disclosed herein.
  • Examples 1-6 were employed to identify genes from Pinus taeda and to design miRNAs to modulate the expression of genes in Pinus sp. These sequences are summarized in Table 4.
  • genes associated with lignin biosynthesis are targeted for modulation.
  • Lignin is a major component of wood, and the regulation of its biosynthesis can have a major impact on paper and pulping processes.
  • Several genes have been identified that are involved in the biosynthesis of lignin including, but not limited to sinapyl alcohol dehydrogenase (SAD), cinnamyl alcohol dehydrogenase (CAD), 4-coumarate:CoA ligase (4CL), cinnamoyl CoA O-methyltransferase (CCoAOMT; also referred to as CCOMT), caffeate O-methyltransferase (COMT), ferulate-5-hydroxylase (F5H), cinnamate-4-hydroxylase (C4H), p-coumarate-3-hydroxylase (C3H), and phenylalanine ammonia lyase (PAL).
  • genes associated with cellulose biosynthesis are targeted for modulation.
  • Representative, non-limiting genes that have been identified that are associated with cellulose biosynthesis include cellulose synthase (CeS; also referred to as CESA in some plants), cellulose synthase-like (CSL), glucosidase, glucan synthase, Korrigan endocellulase, callose synthase, and sucrose synthase.
  • other plant genes are targeted for modulation using miRNAs.
  • gene families that can be targeted include hormone-related genes, including but not limited to isopentenyl transferase (ipt), gibberellic acid (GA) oxidase, auxin (AUX), auxin-responsive and auxin-induced genes, and members of the rooting locus (ROL) gene family; hemicellulose-related genes, disease-related genes, stress-related genes, growth-related genes and transcription factors.
  • nucleic acid molecules employed in accordance with the presently disclosed subject matter include any nucleic acid molecule encoding a plant gene product, as well as the nucleic acid molecules that are used in accordance with the presently disclosed subject matter to modulate the expression of a plant gene.
  • the nucleic acid molecules employed in accordance with the presently disclosed subject matter include, but are not limited to, the nucleic acid molecules described herein (for example, SEQ ID NOs: 1-1907); sequences substantially identical to those described herein (for example, sequences at least 70% identical to any of SEQ ID NOs: 1-1907); and subsequences and elongated sequences thereof.
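As a rough illustration of the "at least 70% identical" criterion, the sketch below computes percent identity between two sequences that have already been aligned to the same length. The text does not say which alignment method or identity definition is intended, so the column-by-column comparison, the treatment of gap characters as mismatches, and the function names are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must already be aligned to equal length; a gap character such
    as '-' simply counts as a mismatch in this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 70.0) -> bool:
    """Apply an identity cutoff; 70% mirrors the figure quoted in the text."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Two made-up 10-nt sequences differing at the last two positions.
    print(percent_identity("ACGTACGTAC", "ACGTACGTTT"))
    print(is_substantially_identical("ACGTACGTAC", "ACGTACGTTT"))
```

In practice the identity value depends on how the alignment is produced and on how gaps and end overhangs are scored, so a figure from this function is only comparable to figures computed the same way.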
  • the presently disclosed subject matter also encompasses genes, cDNAs, chimeric genes, and vectors comprising the disclosed nucleic acid sequences.
  • An exemplary nucleotide sequence employed in the methods disclosed herein comprises sequences that are complementary to each other, the complementary regions being capable of forming a duplex of, in some embodiments, at least about 15 to 300 basepairs, and in some embodiments, at least about 15-24 basepairs.
  • One strand of the duplex comprises a nucleic acid sequence of at least 15 contiguous bases having a nucleic acid sequence of a nucleic acid molecule of the presently disclosed subject matter.
  • one strand of the duplex comprises a nucleic acid sequence comprising 15, 16, 17, or 18 nucleotides, or even longer where desired, such as 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or
  • fragments can be readily prepared by, for example, directly synthesizing the fragment by chemical synthesis, by application of nucleic acid amplification technology, or by introducing selected sequences into recombinant vectors for recombinant production.
  • hybridizing specifically to refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex nucleic acid mixture (e.g., total cellular DNA or RNA).
  • subsequence refers to a sequence of a nucleic acid molecule or amino acid molecule that comprises a part of a longer nucleic acid or amino acid sequence.
  • An exemplary subsequence is a sequence that comprises part of a duplexed region of a pri-miRNA or a pre-miRNA including, but not limited to the nucleotides that become the mature miRNA after nuclease action or a single-stranded region in an miRNA precursor.
  • elongated sequence refers to an addition of nucleotides (or other analogous molecules) incorporated into the nucleic acid.
  • a polymerase e.g., a DNA polymerase
  • the nucleotide sequence can be combined with other DNA sequences, such as promoters, promoter regions, enhancers, polyadenylation signals, intronic sequences, additional restriction enzyme sites, multiple cloning sites, and other coding segments.
  • Nucleic acids of the presently disclosed subject matter can be cloned, synthesized, recombinantly altered, mutagenized, or subjected to combinations of these techniques.
  • miRNA precursor molecules are expressed from transcription units inserted into nucleic acid vectors (alternatively referred to generally as “recombinant vectors” or “expression vectors”).
  • a vector is used to deliver a nucleic acid molecule encoding an miRNA into a plant cell to target a specific plant gene.
  • the recombinant vectors can be, for example, DNA plasmids or viral vectors.
  • Various expression vectors are known in the art. The selection of the appropriate expression vector can be made on the basis of several factors including, but not limited to the cell type wherein expression is desired.
  • Agrobacterium-based expression vectors can be used to express the nucleic acids of the presently disclosed subject matter when stable expression of the vector insert is sought in a plant cell.
  • a vector is also used to deliver a nucleic acid molecule encoding an siRNA into a plant cell to target a specific miRNA precursor.
  • the expression of the nucleotide sequence in the expression cassette can be under the control of a constitutive promoter or an inducible promoter that initiates transcription only when the host cell is exposed to some particular external stimulus.
  • exemplary promoters include Simian virus 40 early promoter, a long terminal repeat promoter from retrovirus, an actin promoter, a heat shock promoter, and a metallothionein promoter.
  • exemplary constitutive promoters are derived from the CaMV 35S, rice actin, and maize ubiquitin genes, each described herein below.
  • Exemplary inducible promoters for this purpose include the chemically inducible PR-1a promoter and a wound-inducible promoter, also described herein below.
  • Selected promoters can direct expression in specific cell types (such as leaf epidermal cells, mesophyll cells, root cortex cells) or in specific tissues or organs (roots, leaves or flowers, for example).
  • tissue-specific promoters include well-characterized root-, pith-, and leaf-specific promoters, each described herein below.
  • promoter selection can be based on expression profile and expression level.
  • the following are non-limiting examples of promoters that can be used in the expression cassettes.
  • the CaMV 35S promoter can be used to drive constitutive gene expression.
  • Construction of the plasmid pCGN1761 is described in the published patent application EP 0 392 225, which is hereby incorporated by reference.
  • pCGN1761 contains the "double" CaMV 35S promoter and the tml transcriptional terminator with a unique EcoRI site between the promoter and the terminator and has a pUC-type backbone.
  • a derivative of pCGN1761 is constructed which has a modified polylinker that includes NotI and XhoI sites in addition to the existing EcoRI site. This derivative is designated pCGN1761ENX.
  • pCGN1761ENX is useful for the cloning of cDNA sequences or gene sequences (including microbial open reading frame (ORF) sequences) within its polylinker for the purpose of their expression under the control of the 35S promoter in transgenic plants.
  • the entire 35S promoter-gene sequence-tml terminator cassette of such a construction can be excised by HindIII, SphI, SalI, and XbaI sites 5' to the promoter and XbaI, BamHI and BglI sites 3' to the terminator for transfer to transformation vectors such as those described below.
  • the double 35S promoter fragment can be removed by 5' excision with HindIII, SphI, SalI, XbaI, or PstI, and 3' excision with any of the polylinker restriction sites (EcoRI, NotI or XhoI) for replacement with another promoter.
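Excising a cassette or swapping a promoter with the enzymes named above only works cleanly if the fragment being moved has no internal recognition site for the chosen enzyme. The sketch below checks an insert for such sites. The recognition sequences are the standard ones for these enzymes; the function names and the example insert are hypothetical, and only palindromic sites are handled, so scanning one strand is sufficient.

```python
# Standard recognition sequences (5' to 3') for enzymes mentioned in the text.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "NotI": "GCGGCCGC",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "XbaI": "TCTAGA",
}


def find_sites(dna: str, site: str) -> list:
    """Return the 0-based start position of every occurrence of a site."""
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)  # step by 1 to catch overlaps
    return positions


def internal_cutters(insert: str, enzymes: list) -> list:
    """List the enzymes (in the given order) that would cut inside the insert."""
    return [name for name in enzymes
            if find_sites(insert, RECOGNITION_SITES[name])]


if __name__ == "__main__":
    # Made-up insert carrying one EcoRI site and one XhoI site.
    insert = "ATGGCCGAATTCTTAGCTCGAGAA"
    print(internal_cutters(insert, ["EcoRI", "NotI", "XhoI"]))
```

If an enzyme is reported here, another site from the polylinker would normally be chosen for that cloning step.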
  • Actin Promoter. Several isoforms of actin are known to be expressed in most cell types, and consequently the actin promoter is a good choice for a constitutive promoter.
  • the promoter from the rice Act1 gene has been cloned and characterized (McElroy et al., 1990). A 1.3 kb fragment of the promoter was found to contain all the regulatory elements required for expression in rice protoplasts.
  • the promoter expression cassettes described by McElroy et al., 1991 can be easily modified for gene expression and are particularly suitable for use in monocotyledonous hosts. For example, promoter-containing fragments are removed from the McElroy constructions and used to replace the double 35S promoter in pCGN1761ENX, which is then available for the insertion of specific gene sequences. The fusion genes thus constructed can then be transferred to appropriate transformation vectors.
  • the rice Act1 promoter with its first intron has also been found to direct high expression in cultured barley cells (Chibbar et al., 1993).
  • Ubiquitin Promoter. Ubiquitin is another gene product known to accumulate in many cell types and its promoter has been cloned from several species for use in transgenic plants (e.g. sunflower by Binet et al., 1991 and maize by Christensen et al., 1989).
  • the maize ubiquitin promoter has been developed in transgenic monocot systems and its sequence and vectors constructed for monocot transformation are disclosed in the patent publication EP 0 342 926 which is herein incorporated by reference.
  • Taylor et al., 1993 describe a vector (pAHC25) that comprises the maize ubiquitin promoter and first intron and its high activity in cell suspensions of numerous monocotyledons when introduced via microprojectile bombardment.
  • the ubiquitin promoter is suitable for gene expression in transgenic plants, especially monocotyledons.
  • Suitable vectors are derivatives of pAHC25 or any of the transformation vectors described in this application, modified by the introduction of the appropriate ubiquitin promoter and/or intron sequences.
  • the double 35S promoter in pCGN1761ENX can be replaced with any other promoter of choice that will result in suitably high expression levels.
  • one of the chemically regulatable promoters described in U.S. Patent No. 5,614,395 can replace the double 35S promoter.
  • the promoter of choice is preferably excised from its source by restriction enzymes, but can alternatively be PCR-amplified using primers that carry appropriate terminal restriction sites. Should PCR-amplification be undertaken, then the promoter should be re-sequenced to check for amplification errors after the cloning of the amplified promoter in the target vector.
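The PCR route described above relies on primers whose 5' ends carry the restriction sites needed for cloning. A minimal sketch of how such primers could be composed is shown below. The 20-nt annealing length, the two-base 5' clamp, the choice of HindIII and EcoRI sites, and the template sequence are all assumptions made for illustration; real primer design would also consider melting temperature and secondary structure.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def design_primers(template: str, fwd_site: str, rev_site: str,
                   anneal_len: int = 20, clamp: str = "GC"):
    """Return (forward, reverse) primers carrying terminal restriction sites.

    Each primer is: 5' clamp + restriction site + annealing region.
    The reverse primer anneals to the 3' end of the template, so its
    annealing region is the reverse complement of the last bases.
    """
    template = template.upper()
    if len(template) < anneal_len:
        raise ValueError("template is shorter than the annealing length")
    forward = clamp + fwd_site + template[:anneal_len]
    reverse = clamp + rev_site + reverse_complement(template[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    promoter = "ATGC" * 10  # placeholder 40-nt "promoter", not a real sequence
    fwd, rev = design_primers(promoter, "AAGCTT", "GAATTC")  # HindIII, EcoRI
    print(fwd)
    print(rev)
```

As the text notes, a promoter amplified this way should be re-sequenced after cloning, since the sketch says nothing about polymerase fidelity.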
  • the chemical/pathogen regulated tobacco PR-1a promoter is cleaved from plasmid pCIB1004 (for construction, see EP 0 332 104, which is hereby incorporated by reference) and transferred to plasmid pCGN1761ENX (Uknes et al., 1992).
  • pCIB1004 is cleaved with NcoI and the resultant 3' overhang of the linearized fragment is rendered blunt by treatment with T4 DNA polymerase.
  • the fragment is then cleaved with HindIII and the resultant PR-1a promoter-containing fragment is gel purified and cloned into pCGN1761ENX from which the double 35S promoter has been removed.
  • Wound-Inducible Promoters. Wound-inducible promoters can also be suitable for gene expression. Numerous such promoters have been described (e.g. Xu et al., 1993; Logemann et al., 1989; Rohrmeier & Lehle, 1993; Firek et al., 1993; Warner et al., 1993) and all are suitable for use with the presently disclosed subject matter. Logemann et al., 1989 describe the 5' upstream sequences of the dicotyledonous potato wun1 gene. Xu et al.,
  • Root Promoter. Another pattern of gene expression is root expression.
  • a suitable root promoter is described by de Framond, 1991 and also in the published patent application EP 0 452 269, which is herein incorporated by reference. This promoter is transferred to a suitable vector such as pCGN1761ENX for the insertion of a selected gene and subsequent transfer of the entire promoter-gene-terminator cassette to a transformation vector of interest.
  • Pith Promoter. PCT International Publication No. WO 93/07278, which is herein incorporated by reference, describes the isolation of the maize trpA gene, which is preferentially expressed in pith cells.
  • the gene sequence and promoter extending up to -1726 basepairs (bp) from the start of transcription are presented.
  • this promoter, or parts thereof can be transferred to a vector such as pCGN1761 where it can replace the 35S promoter and be used to drive the expression of a foreign gene in a pith-preferred manner.
  • fragments containing the pith-preferred promoter or parts thereof can be transferred to any vector and modified for utility in transgenic plants.
  • Leaf Promoter. A maize gene encoding phosphoenolpyruvate carboxylase (PEPC) has been described by Hudspeth & Grula, 1989. Using standard molecular biological techniques the promoter for this gene can be used to drive the expression of any gene in a leaf-specific manner in transgenic plants.
  • transcriptional terminators are available for use in expression cassettes. These are responsible for the termination of transcription beyond the transgene and its correct polyadenylation. Appropriate transcriptional terminators are those that are known to function in plants and include the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator, and the pea rbcS E9 terminator. With regard to RNA polymerase III terminators, these terminators typically comprise a run of 5 or more consecutive thymidine residues. In some embodiments, an RNA polymerase III terminator comprises the sequence TTTTTTT. These can be used in both monocotyledons and dicotyledons.
VI.C. Sequences for the Enhancement or Regulation of Expression
  • Numerous sequences have been found to enhance the expression of an operatively linked nucleic acid sequence, and these sequences can be used in conjunction with the nucleic acids of the presently disclosed subject matter to increase their expression in transgenic plants.
  • intron sequences have been shown to enhance expression, particularly in monocotyledonous cells.
  • the introns of the maize Adh1 gene have been found to significantly enhance the expression of the wild-type gene under its cognate promoter when introduced into maize cells.
  • Intron 1 was found to be particularly effective and enhanced expression in fusion constructs with the chloramphenicol acetyltransferase gene (Callis et al., 1987).
  • the intron from the maize bronze1 gene had a similar effect in enhancing expression.
  • Intron sequences have been routinely incorporated into plant transformation vectors, typically within the non-translated leader.
  • leader sequences derived from viruses are also known to enhance expression, and these are particularly effective in dicotyledonous cells.
  • TMV Tobacco Mosaic Virus
  • MCMV Maize Chlorotic Mottle Virus
  • AMV Alfalfa Mosaic Virus
  • Suitable expression vectors include, but are not limited to, the following vectors or their derivatives: yeast vectors, bacteriophage vectors (e.g., lambda phage), and plasmid and cosmid DNA vectors.
  • vectors available for plant transformation can be prepared and employed in the present methods.
  • Exemplary vectors include pCIB200, pCIB2001, pCIB10, pCIB3064, pSOG19, pSOG35, and pSIT, each described herein.
  • the selection of vector can depend upon the chosen transformation technique and the target species for transformation.
VII.A. Agrobacterium Transformation Vectors
  • vectors are available for transformation using Agrobacterium tumefaciens. These typically carry at least one T-DNA border sequence and include vectors such as pBIN19 (Bevan, 1984) and pXYZ. Below, the construction of two typical vectors suitable for Agrobacterium transformation is described.
  • pCIB200 and pCIB2001 are used for the construction of recombinant vectors for use with Agrobacterium and are constructed in the following manner.
  • pTJS75kan is created by NarI digestion of pTJS75 (Schmidhauser & Helinski, 1985) allowing excision of the tetracycline-resistance gene, followed by insertion of an AccI fragment from pUC4K carrying an NPTII (Messing & Vierra, 1982; Bevan et al., 1983; McBride et al., 1990).
  • XhoI linkers are ligated to the EcoRV fragment of pCIB7 which contains the left and right T-DNA borders, a plant selectable nos/nptII chimeric gene and the pUC polylinker (Rothstein et al., 1987), and the XhoI-digested fragment is cloned into SalI-digested pTJS75kan to create pCIB200 (see also EP 0 332 104, herein incorporated by reference).
  • pCIB200 contains the following unique polylinker restriction sites: EcoRI, SstI, KpnI, BglII, XbaI, and SalI.
  • pCIB2001 is a derivative of pCIB200 created by the insertion into the polylinker of additional restriction sites.
  • Unique restriction sites in the polylinker of pCIB2001 are EcoRI, SstI, KpnI, BglII, XbaI, SalI, MluI, BclI, AvrII, ApaI, HpaI, and StuI.
  • pCIB2001 in addition to containing these unique restriction sites also has plant and bacterial kanamycin selection, left and right T-DNA borders for
  • the pCIB2001 polylinker is suitable for the cloning of plant expression cassettes containing their own regulatory signals.
  • the binary vector pCIB10 contains a gene encoding kanamycin resistance for selection in plants and T-DNA right and left border sequences and incorporates sequences from the wide host-range plasmid pRK252 allowing it to replicate in both E. coli and Agrobacterium. Its construction is described by Rothstein et al., 1987.
  • Various derivatives of pCIB10 are constructed which incorporate the gene for hygromycin B phosphotransferase described by Gritz et al., 1983. These derivatives enable selection of transgenic plant cells on hygromycin only (pCIB743), or hygromycin and kanamycin
  • pSIT is an Agrobacterium binary vector that can be used to stably express exogenous nucleic acids (for example, miRNAs and/or siRNAs) in plants.
  • pSIT encodes two transcription units. The first is a transcription unit encoding a selectable marker under control of a promoter-transcription terminator pair that functions in plant cells.
  • the second transcription unit encodes the gene of interest (for example, an miRNA and/or siRNA) under the control of a second promoter-transcription terminator pair, which specifically directs the transcription to generate a functional miRNA and/or siRNA in plant cells and which can be the same as or different from the one operatively linked to the selectable marker.
  • an miRNA and/or siRNA is operatively linked to an RNA polymerase III promoter (for example, the At7SL4 promoter) and the RNA polymerase III-recognized transcription terminator (for example, TTTTTTT).
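The second transcription unit can be pictured as a simple concatenation: promoter, miRNA- or siRNA-encoding insert, then the TTTTTTT terminator. Since RNA polymerase III terminators are described as runs of 5 or more consecutive thymidines, an insert that itself contains such a run might end transcription early. The sketch below assembles a cassette and rejects inserts with an internal T run. That check is an inference added for illustration rather than a step stated in the text, and the function names and placeholder sequences are hypothetical.

```python
import re

# Terminator sequence given in the text for RNA polymerase III.
POL3_TERMINATOR = "TTTTTTT"


def premature_terminators(insert: str, min_run: int = 5) -> list:
    """Return 0-based start positions of T runs of at least min_run bases."""
    pattern = "T{%d,}" % min_run
    return [m.start() for m in re.finditer(pattern, insert.upper())]


def assemble_pol3_cassette(promoter: str, insert: str,
                           terminator: str = POL3_TERMINATOR) -> str:
    """Concatenate promoter + insert + terminator for a pol III unit.

    Raises ValueError if the insert contains a T run that could act as an
    early terminator (an assumption of this sketch, see the lead-in).
    """
    hits = premature_terminators(insert)
    if hits:
        raise ValueError("insert contains T runs at positions %s" % hits)
    return promoter.upper() + insert.upper() + terminator


if __name__ == "__main__":
    # "GGCC" stands in for a real promoter such as At7SL4; it is a placeholder.
    print(assemble_pol3_cassette("GGCC", "ACGTACGT"))
```

An insert flagged by the check would need its T run broken up, or a different promoter and terminator pair, before the cassette is built.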
  • the integration of the miRNAs and/or siRNA cassette is guaranteed if the transformants survived through the antibiotic selection process due to the expression of the selection marker gene incorporated in the binary vector.
  • the hpt (hygromycin phosphotransferase) selection marker gene is under the control of a Pnos promoter and Nos terminator pair. Other promoter and terminator pairs that can drive selection marker gene expression are also suitable for this purpose.
  • Transformation without the use of Agrobacterium tumefaciens circumvents the requirement for T-DNA sequences in the chosen transformation vector and consequently vectors lacking these sequences can be utilized in addition to vectors such as the ones described above which contain T-DNA sequences. Transformation techniques that do not rely on Agrobacterium include transformation via particle bombardment, protoplast uptake (e.g. polyethylene glycol (PEG) and electroporation), and microinjection. The choice of vector can depend on the technique chosen for the species being transformed. Below, the construction of typical vectors suitable for non-Agrobacterium transformation is described. PCIB3064.
  • pCIB3064 is a pUC-derived vector suitable for direct gene transfer techniques in combination with selection by the herbicide BASTA® (or phosphinothricin).
  • the plasmid pCIB246 comprises the CaMV 35S promoter in operational fusion to the E. coli β-glucuronidase (GUS) gene and the CaMV 35S transcriptional terminator and is described in PCT International Publication No. WO 93/07278.
  • the 35S promoter of this vector contains two ATG sequences 5' of the start site. These sites are mutated using standard PCR techniques in such a way as to remove the ATGs and generate the restriction sites SspI and PvuII.
  • the new restriction sites are 96 and 37 bp away from the unique SalI site and 101 and 42 bp away from the actual start site.
  • the resultant derivative of pCIB246 is designated
  • the GUS gene is then excised from pCIB3025 by digestion with SalI and SacI, the termini rendered blunt and religated to generate plasmid pCIB3060.
  • the plasmid pJIT82 is obtained from the John Innes Centre (Norwich, United Kingdom), and a 400 bp SmaI fragment containing the bar gene from Streptomyces viridochromogenes is excised and inserted into the HpaI site of pCIB3060 (Thompson et al., 1987).
  • This generated pCIB3064, which comprises the bar gene under the control of the CaMV 35S promoter and terminator for herbicide selection, a gene for ampicillin resistance (for selection in E. coli), and a polylinker with the unique sites SphI, PstI, HindIII, and BamHI.
  • This vector is suitable for the cloning of plant expression cassettes containing their own regulatory signals.
  • pSOG35 is a transformation vector that utilizes the E. coli gene dihydrofolate reductase (DHFR) as a selectable marker conferring resistance to methotrexate.
  • PCR is used to amplify the 35S promoter (~800 bp), intron 6 from the maize Adh1 gene (~550 bp), and 18 bp of the GUS untranslated leader sequence from pSOG10.
  • a 250-bp fragment encoding the E. coli dihydrofolate reductase type II gene is also amplified by PCR and these two PCR fragments are assembled with a SacI-PstI fragment from pBI221 (Clontech, Palo Alto, California, United States of America) that comprises the pUC19 vector backbone and the nopaline synthase terminator.
  • pSOG19 which contains the 35S promoter in fusion with the intron 6 sequence, the GUS leader, the DHFR gene and the nopaline synthase terminator.
  • Replacement of the GUS leader in pSOG19 with the leader sequence from Maize Chlorotic Mottle Virus (MCMV) generates the vector pSOG35.
  • pSOG19 and pSOG35 carry a β-lactamase gene from the pUC vector for ampicillin resistance and have HindIII, SphI, PstI, and EcoRI sites available for the cloning of foreign substances.
  • VII.C. Selectable Markers
  • selection markers used routinely in transformation include the nptll gene, which confers resistance to kanamycin and related antibiotics (Messing & Vierra, 1982; Bevan et al., 1983), the bar gene, which confers resistance to the herbicide phosphinothricin (White et al., 1990; Spencer et al., 1990), the hph gene, which confers resistance to the antibiotic hygromycin (Blochlinger & Diggelmann, 1984), the dhfr gene, which confers resistance to methotrexate (Bourouis & Jarry, 1983), and the
  • ESP 5-enolpyruvylshikimate-3-phosphate
  • nucleic acid sequence of the presently disclosed subject matter is transformed into a plant cell.
  • the receptor and target expression cassettes of the presently disclosed subject matter can be introduced into the plant cell in a number of art-recognized ways. Methods for regeneration of plants are also well known in the art. For example, Ti plasmid vectors have been utilized for the delivery of foreign DNA, as have direct DNA uptake, liposomes, electroporation, microinjection, and microprojectiles. In addition, bacteria from the genus Agrobacterium can be utilized to transform plant cells.
  • the presently disclosed subject matter also provides a method for stably modulating expression of a gene in a plant. In some embodiments, the method comprises (a) transforming a plurality of plant cells with a vector comprising a nucleic acid sequence encoding a microRNA (miRNA) operatively linked to a promoter and a transcription termination sequence;
  • the method comprises (a) transforming a plurality of plant cells with an Agrobacterium tumefaciens binary vector comprising (i) a nucleic acid sequence encoding a selectable marker; and (ii) a nucleic acid sequence encoding a microRNA (miRNA) operatively linked to a promoter and a transcription termination sequence; (b) treating the plant cells with a drug under conditions sufficient to kill those plant cells that did not receive the binary vector, wherein the selectable marker provides resistance to the drug, to create a first plurality of transformed plant cells; (c) growing the first plurality of transformed plant cells under conditions sufficient to select for a second plurality of transformed plant cells that have integrated the binary vector into their genomes; (d) screening the second plurality of transformed plant cells for expression of the miRNA encoded by the expression vector; (e) selecting a transformed plant cell that expresses the miRNA; and (f) regenerating the plant from the transformed plant cell that expresses the miRNA, whereby expression of the miRNA
  • the presently disclosed subject matter is based on the stable and heritable introduction of miRNAs and/or siRNAs into plant cells to specifically manipulate a gene of interest. As disclosed herein, this concept has been demonstrated through Agrobacterium transformation, but would also be applicable to other approaches for transformation, such as bombardment. Thus, it should be understood that the mechanism of transformation of a plant cell is not limited to the Agrobacterium-mediated techniques disclosed in certain embodiments herein. Any transformation technique that results in stable expression of a nucleic acid (for example, an miRNA and/or siRNA) of the presently disclosed subject matter can be employed with the methods disclosed herein. Below are descriptions of representative techniques for transforming both dicotyledonous and monocotyledonous plants, as well as a representative plastid transformation technique.
  • VIII.A. Transformation of Dicotyledons
  • Transformation techniques for dicotyledons are well known in the art and include Agrobacterium-based techniques and techniques that do not require Agrobacterium.
  • Non-Agrobacterium techniques involve the uptake of exogenous genetic material directly by protoplasts or cells. This can be accomplished by PEG or electroporation-mediated uptake, particle bombardment-mediated delivery, or microinjection. Examples of these techniques are disclosed in Paszkowski et al., 1984; Potrykus et al., 1985;
  • Agrobacterium-mediated transformation is a useful technique for transformation of dicotyledons because of its high efficiency of transformation and its broad utility with many different species.
  • Agrobacterium transformation typically involves the transfer of the binary vector carrying the foreign DNA of interest (e.g. pSIT) to an appropriate Agrobacterium strain that can depend on the complement of vir genes carried by the host Agrobacterium strain either on a co-resident Ti plasmid or chromosomally (e.g. strain C58 or strains pCIB542 for pCIB200 and pCIB2001; Uknes et al., 1993).
  • the transfer of the recombinant binary vector to Agrobacterium is accomplished by a triparental mating procedure using E. coli carrying the recombinant binary vector, a helper E.
  • the recombinant binary vector can be transferred to Agrobacterium by DNA transformation (Höfgen & Willmitzer, 1988). Transformation of the target plant species by recombinant Agrobacterium usually involves co-cultivation of the Agrobacterium with explants from the plant and follows protocols well known in the art. Transformed tissue is regenerated on selectable medium carrying the antibiotic or herbicide resistance marker present between the binary plasmid
  • Transformation of most monocotyledon species has now also become routine.
  • Exemplary techniques include direct gene transfer into protoplasts using PEG or electroporation, and particle bombardment into callus tissue. Transformations can be undertaken with a single DNA species or multiple DNA species (i.e., co-transformation), and both these techniques are suitable for use with the presently disclosed subject matter.
  • Co-transformation can have the advantage of avoiding complete vector construction and of generating transgenic plants with unlinked loci for the gene of interest and a selectable marker, enabling the removal of the selectable marker in subsequent generations, should this be regarded as desirable.
  • a disadvantage of the use of co-transformation is the less than 100% frequency with which separate DNA species are integrated into the genome (Schocher et al., 1986).
  • Patent Applications EP 0 292 435, EP 0 392 225, and WO 93/07278 describe techniques for the preparation of callus and protoplasts from an elite inbred line of maize, transformation of protoplasts using PEG or electroporation, and the regeneration of maize plants from transformed protoplasts.
  • Gordon-Kamm et al., 1990 and Fromm et al., 1990 have published techniques for transformation of A188-derived maize line using particle bombardment.
  • WO 93/07278 and Koziel et al., 1993 describe techniques for the transformation of elite inbred lines of maize by particle bombardment.
  • This technique utilizes immature maize embryos of 1.5-2.5 mm length excised from a maize ear 14-15 days after pollination and a PDS-1000He biolistic particle delivery device (DuPont Biotechnology, Wilmington, Delaware, United States of America) for bombardment.
  • Transformation of rice can also be undertaken by direct gene transfer techniques utilizing protoplasts or particle bombardment.
  • Protoplast-mediated transformation has been disclosed for Japonica-types and Indica-types (Zhang et al., 1988; Shimamoto et al., 1989; Datta et al., 1990). Both types are also routinely transformable using particle bombardment (Christou et al., 1991).
  • WO 93/21335 describes techniques for the transformation of rice via electroporation.
  • Patent Application EP 0 332 581 describes techniques for the generation, transformation, and regeneration of Pooideae protoplasts. These techniques allow the transformation of Dactylis and wheat.
  • a representative technique for wheat transformation involves the transformation of wheat by particle bombardment of immature embryos and includes either a high sucrose or a high maltose step prior to gene delivery.
  • Prior to bombardment, embryos (0.75-1 mm in length) are plated onto MS medium with 3% sucrose (Murashige & Skoog, 1962) and 3 mg/l 2,4-dichlorophenoxyacetic acid (2,4-D) for induction of somatic embryos, which is allowed to proceed in the dark.
  • the embryos are allowed to plasmolyze for 2-3 hours and are then bombarded. Twenty embryos per target plate are typical, although not critical.
  • An appropriate gene-carrying plasmid (such as pCIB3064 or pSOG35) is precipitated onto micrometer-size gold particles using standard procedures.
  • Each plate of embryos is shot with the DuPont biolistics helium device using a burst pressure of about 1000 pounds per square inch (psi) using a standard 80 mesh screen.
  • the embryos are placed back into the dark to recover for about 24 hours (still on osmoticum).
  • the embryos are removed from the osmoticum and placed back onto induction medium where they stay for about a month before regeneration. Approximately one month later the embryo explants with developing embryogenic callus are transferred to regeneration medium (MS + 1 mg/liter naphthaleneacetic acid
  • NAA 5 mg/liter GA
  • GA 5 mg/liter GA
  • appropriate selection agent 10 mg/l BASTA ® in the case of pCIB3064 and 2 mg/l methotrexate in the case of pSOG35.
  • GA7s sterile containers which contain half-strength MS, 2% sucrose, and the same concentration of selection agent.
  • Nicotiana tabacum c.v. 'Xanthi nc' are germinated seven per plate in a 1" circular array on T agar medium and bombarded 12-14 days after sowing with 1 µm tungsten particles (M10, Biorad, Hercules, California,
  • the presently disclosed subject matter also provides plants comprising the disclosed compositions.
  • the plant is characterized by a modification of a phenotype or measurable characteristic of the plant, the modification being attributable to the presence of an expression cassette comprising a nucleic acid molecule of the presently disclosed subject matter.
  • the modification involves, for example, nutritional enhancement, increased nutrient uptake efficiency, enhanced production of endogenous compounds, or production of heterologous compounds.
  • the modification includes having increased or decreased resistance to an herbicide, environmental stress, or a pathogen.
  • the modification includes having enhanced or diminished requirement for light, water, nitrogen, or trace elements.
  • the modification includes being enriched for an essential amino acid as a proportion of a polypeptide fraction of the plant.
  • the polypeptide fraction can be, for example, total seed polypeptide, soluble polypeptide, insoluble polypeptide, water-extractable polypeptide, and lipid-associated polypeptide.
  • the modification includes overexpression, underexpression, antisense modulation, sense suppression, inducible expression, inducible repression, or inducible modulation of a gene.
  • the modifications can include decreased or increased lignin content, lignin composition and/or structure changes, decreased or increased cellulose content, crystallinity and degree of polymerization (DP) changes, fiber property and morphology modifications, and/or increased resistance to pathogens, common diseases, and environment stresses in a tree.
  • IX.B Breeding
  • the plants obtained via transformation with a nucleic acid sequence of the presently disclosed subject matter can be any of a wide variety of plant species, including monocots and dicots, and angiosperms and gymnosperms; however, the plants used in the method for the presently disclosed subject matter are selected in some embodiments from the list of agronomically important target crops set forth hereinabove.
  • As the growing crop is vulnerable to attack and damage caused by insects or infections as well as to competition by weed plants, measures are undertaken to control weeds, plant diseases, insects, nematodes, and other adverse conditions to improve yield. These include mechanical measures such as tillage of the soil or removal of weeds and infected plants, as well as the application of agrochemicals such as herbicides, fungicides, gametocides, nematicides, growth regulants, ripening agents, and insecticides.
  • Use of the advantageous genetic properties of the transgenic plants and seeds according to the presently disclosed subject matter can further be made in plant breeding, which aims at the development of plants with improved properties such as tolerance of pests, herbicides, or abiotic stress, improved nutritional value, increased yield, or improved structure causing less loss from lodging or shattering.
  • the various breeding steps are characterized by well-defined human intervention such as selecting the lines to be crossed, directing pollination of the parental lines, or selecting appropriate progeny plants.
  • different breeding measures are taken.
  • the relevant techniques are well known in the art and include, but are not limited to, hybridization, inbreeding, backcross breeding, multi-line breeding, variety blend, interspecific hybridization, aneuploid techniques, etc.
  • Hybridization techniques can also include the sterilization of plants to yield male or female sterile plants by mechanical, chemical, or biochemical means.
  • Cross-pollination of a male sterile plant with pollen of a different line assures that the genome of the male sterile but female fertile plant will uniformly obtain properties of both parental lines.
  • the transgenic seeds and plants according to the presently disclosed subject matter can be used for the breeding of improved plant lines that, for example, increase the effectiveness of conventional methods such as herbicide or pesticide treatment or allow one to dispense with said methods due to their modified genetic properties.
  • new crops with improved stress tolerance can be obtained, which, due to their optimized genetic "equipment", yield harvested product of better quality than products that were not able to tolerate comparable adverse developmental conditions (for example, drought).
  • Embodiments of the presently disclosed subject matter also provide seed from plants modified using the disclosed methods.
  • In seed production, germination quality and uniformity of seeds are essential product characteristics. As it is difficult to keep a crop free from other crop and weed seeds, to control seedborne diseases, and to produce seed with good germination, fairly extensive and well-defined seed production practices have been developed by seed producers who are experienced in the art of growing, conditioning, and marketing of pure seed. Thus, it is common practice for the farmer to buy certified seed meeting specific quality standards instead of using seed harvested from his own crop. Propagation material to be used as seeds is customarily treated with a protectant coating comprising herbicides, insecticides, fungicides, bactericides, nematicides, molluscicides, or mixtures thereof.
  • Customarily used protectant coatings comprise compounds such as captan, carboxin, thiram (tetramethylthiuram disulfide; TMTD®; available from R. T. Vanderbilt Company, Inc., Norwalk, Connecticut, United States of America), metalaxyl (APRON XL®; available from Syngenta Corp., Wilmington, Delaware, United States of America), and pirimiphos-methyl (ACTELLIC®; available from Agriliance, LLC, St. Paul, Minnesota, United States of America).
  • these compounds are formulated together with further carriers, surfactants, and/or application-promoting adjuvants customarily employed in the art of formulation to provide protection against damage caused by bacterial, fungal, or animal pests.
  • the protectant coatings can be applied by impregnating propagation material with a liquid formulation or by coating with a combined wet or dry formulation. Other methods of application are also possible such as treatment directed at the buds or the fruit.
  • transgenic plant is one that has been genetically modified to contain and express an miRNA and/or an siRNA.
  • a transgenic plant can be genetically modified to contain and express at least one homologous or heterologous DNA sequence operatively linked to and under the regulatory control of transcriptional control sequences which function in plant cells or tissue or in whole plants.
  • a transgenic plant also refers to progeny of the initial transgenic plant where those progeny contain and are capable of expressing the homologous or heterologous coding sequence under the regulatory control of the plant-expressible transcription control sequences described herein. Seeds containing transgenic embryos are encompassed within this definition as are cuttings and other plant materials for vegetative propagation of a transgenic plant.
  • coding sequence is operatively linked in the sense orientation to a suitable promoter and advantageously under the regulatory control of DNA sequences which quantitatively regulate transcription of a downstream sequence in plant cells or tissue or in planta, in the same orientation as the promoter, so that a sense (i.e., functional for translational expression) mRNA is produced.
  • a transcription termination signal, for example a polyadenylation signal, functional in a plant cell is advantageously placed downstream of an miRNA- and/or siRNA-encoding sequence, and a selectable marker that can be expressed in a plant can be covalently linked to the inducible expression unit so that after this DNA molecule is introduced into a plant cell or tissue, its presence can be selected and plant cells or tissue not so transformed will be killed or prevented from growing.
  • tissue specific expression of the plant-expressible miRNA and/or siRNA coding sequence is desired, the skilled artisan can choose from a number of well-known sequences to mediate that form of gene expression as disclosed herein.
  • Environmentally regulated promoters are also well known in the art and are disclosed herein, and the skilled artisan can choose from well-known transcription regulatory sequences to achieve the desired result.
  • the presently disclosed subject matter can be employed, among other applications, to perform the following:
  • Total RNA was isolated from developing xylem tissue of P. trichocarpa or P. taeda, from pooled tension- and compression-stressed developing xylem of P. trichocarpa stems (bent for 4 days), from P. trichocarpa in vitro plants, or from pooled P. trichocarpa in vitro plants with or without exposure to cold (4°C for 24 hours), heat (37°C for 24 hours), dehydration (drought for 14 hours), salinity (300 mM NaCl for 14 hours), or water (plants covered with water for 14 hours), using the cetyl trimethyl ammonium bromide (CTAB) method as described in Chang et al., 1993.
  • the recovered RNA was dephosphorylated with alkaline phosphatase, and a 5'-phosphorylated 3'-adaptor oligonucleotide with the sequence 5'-CTGTAGGCACCATTCATCAC-3' (SEQ ID NO: 155) with a 5'-phosphate and a 3'-amino-modifier C-7 (i.e. a seven-carbon spacer with a primary amino group) was then ligated to the dephosphorylated RNA.
  • the ligated products were separated from non-ligated RNA and the adaptor oligonucleotide on a 12% denaturing polyacrylamide gel. A band corresponding to the ligation product was excised from the gel, and the ligated RNA was recovered.
  • RNA was phosphorylated at the 5' end and a new 5' adaptor oligonucleotide (5'-ATGTCGTGaggcacctgaaa-3'; SEQ ID NO: 156; the sequence in uppercase is a DNA strand and in lowercase is an RNA strand) containing hydroxyl groups at both the 5' and 3' ends was ligated to the 5'-phosphorylated ligation product from the previous step.
  • the new ligation product was gel purified and eluted from the gel slice.
  • Reverse transcription was performed using an RT primer (5'-GATGAATGGTGCCTAC-3'; SEQ ID NO: 157), followed by PCR using a 5' primer (5'-GTCGTGAGGCACCTGAAA-3'; SEQ ID NO: 158) and a 3' primer (5'-GATGAATGGTGCCTACAG-3'; SEQ ID NO: 159).
  • the PCR product was then digested with Ban I and concatamerized using T4 DNA ligase.
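  • The relationship between the adaptors, the RT primer, and the PCR primers described above can be checked with a short script. The sketch below uses the sequences of SEQ ID NOs: 155-159 as given in the text and the known Ban I recognition site GGYRCC; the function names are hypothetical and the RNA portion of the 5' adaptor is written in DNA letters for simplicity.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Sequences as given in the text (SEQ ID NOs: 155-159).
ADAPTOR_3 = "CTGTAGGCACCATTCATCAC"
ADAPTOR_5 = "ATGTCGTGAGGCACCTGAAA"  # RNA portion written in DNA letters
RT_PRIMER = "GATGAATGGTGCCTAC"
PCR_PRIMER_5 = "GTCGTGAGGCACCTGAAA"
PCR_PRIMER_3 = "GATGAATGGTGCCTACAG"

# Ban I recognizes GGYRCC (Y = C or T, R = A or G).
BAN_I_SITE = re.compile(r"GG[CT][AG]CC")


def anneals_to(primer: str, template: str) -> bool:
    """True if the primer is the reverse complement of part of the template."""
    return reverse_complement(primer) in template.upper()


def has_ban_i_site(seq: str) -> bool:
    """True if the sequence contains a Ban I recognition site."""
    return BAN_I_SITE.search(seq.upper()) is not None


if __name__ == "__main__":
    print("RT primer anneals to 3' adaptor:", anneals_to(RT_PRIMER, ADAPTOR_3))
    print("3' PCR primer anneals to 3' adaptor:", anneals_to(PCR_PRIMER_3, ADAPTOR_3))
    print("5' PCR primer lies within 5' adaptor:", PCR_PRIMER_5 in ADAPTOR_5)
    print("Ban I site in 5' primer:", has_ban_i_site(PCR_PRIMER_5))
    print("Ban I site in 3' primer:", has_ban_i_site(PCR_PRIMER_3))
```

  • Both PCR primers carry a Ban I site, which is consistent with digesting the PCR product with Ban I before concatamerization.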
  • the products of the ligation reaction were separated on an agarose gel, and a gel slice corresponding to concatamers of a size range of larger than 500 basepairs (bp) was isolated and the nucleic acids recovered from the gel slice.
  • the single-stranded regions of the ends of the concatamers were filled in by incubation with Taq polymerase, and the DNA product was directly ligated into the pCR2.1-TOPO ® vector using the TOPO TA CLONING ® kit (Invitrogen Corp., Carlsbad, California, United States of America).
  • inserts were sequenced from P. trichocarpa. After excluding sequences corresponding to rRNA, tRNA, snRNA, retrotransposons/transposons, and small RNAs with 2 nt or more mismatches with the P. trichocarpa genome, the remaining small RNA sequences and their surrounding sequences from the P. trichocarpa genome were used to predict the secondary structures of these small RNAs using the mfold program (Zuker, 2003). Fifty-two miRNA families were identified (Table 1) based on their authentic pre-miRNA stem-loop structures (see Figure 2, showing two examples) or their significant homology to miRNAs identified in other species.
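  • The mismatch filter described above (discarding small RNAs with 2 nt or more mismatches to the genome) can be illustrated with a naive scan. This is a sketch under stated assumptions, not the procedure actually used: the function names are hypothetical, only the forward strand is searched, gaps are not considered, and the brute-force scan is only practical for short test sequences.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_mismatches(small_rna: str, genome: str) -> int:
    """Smallest mismatch count over all ungapped forward-strand alignments."""
    query, target = small_rna.upper(), genome.upper()
    n = len(query)
    if n == 0 or n > len(target):
        raise ValueError("query must be non-empty and no longer than the genome")
    return min(hamming(query, target[i:i + n]) for i in range(len(target) - n + 1))


def keep_small_rna(small_rna: str, genome: str, max_mismatches: int = 1) -> bool:
    """Keep a read only if it matches the genome with fewer than 2 mismatches."""
    return min_mismatches(small_rna, genome) <= max_mismatches


if __name__ == "__main__":
    toy_genome = "AAACGTACGTTT"  # toy sequence for illustration only
    for read in ("ACGTACG", "ACGAACG", "ACCAACG"):
        print(read, min_mismatches(read, toy_genome), keep_small_rna(read, toy_genome))
```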
  • inserts were sequenced from P. taeda. After excluding sequences corresponding to rRNA, tRNA, snRNA, and retrotransposons/transposons, the remaining small RNA sequences and their surrounding sequences from the P. taeda expressed sequence tags (ESTs) deposited in dbEST of the GENBANK® database were used to predict the secondary structures of these small RNAs using the mfold program (Zuker, 2003). Fifteen miRNA families were identified (Table 4; LpMIR1, LpMIR2, LpMIR7, LpMIR9, LpMIR178, LpMIR26, LpMIR27, LpMIR28, LpMIR77, LpMIR82, LpMIR89, LpMIR95, LpMIR100, LpMIR119, and LpMIR176) based on their authentic pre-miRNA stem-loop structures or their significant homology to miRNAs identified in other species.
  • one locus had a sequence showing a 1 nt mismatch to both PtmiR 71 and PtmiR 142, and the other two loci each had a sequence showing a 1 nt mismatch to PtmiR 71 and 2 nt mismatch to PtmiR 142.
  • PtMIR 156-1 one locus harboring an miRNA with two mismatches to PtmiR 156 was able to form stable stem-loop structures with the miRNA sequences present in either the 5' or the 3' arm, and two stem- loop structures (one is shorter and another is longer) were found when the miRNA was present in the 3' arm (see Figure 3).
  • 71 genes had a sequence showing a 1 nt mismatch to PtmiR 142.
  • target genes for the isolated Populus trichocarpa miRNAs were identified by searching the genome and predicted transcripts of P. trichocarpa with the program PATSCAN (Dsouza & Larsen, 1997), which can be used to identify mRNAs capable of base pairing with one of the miRNAs with a score of 3.0 or less (see Jones-Rhoades et al., 2004 for a detailed description of the scoring method).
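  • A complementarity score with a 3.0 cutoff, as referred to above, might be computed along the following lines. This sketch is illustrative only: the penalty weights (1.0 per mismatch, 0.5 per G:U wobble pair) are assumptions chosen for the example and are not stated in this disclosure, bulges are not modeled, and the function names are hypothetical. It does not reproduce PATSCAN.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pairing_score(mirna: str, target_site: str,
                  mismatch_penalty: float = 1.0,
                  wobble_penalty: float = 0.5) -> float:
    """Score an ungapped miRNA:target duplex; lower means better pairing.

    Both sequences are written 5'->3'; the target site is read in reverse
    so that position i of the miRNA faces its antiparallel partner.
    Penalty weights are assumed values for illustration.
    """
    m = mirna.upper().replace("T", "U")
    t = target_site.upper().replace("T", "U")
    if len(m) != len(t):
        raise ValueError("miRNA and target site must have equal length")
    score = 0.0
    for pair in zip(m, reversed(t)):
        if pair in WATSON_CRICK:
            continue
        score += wobble_penalty if pair in WOBBLE else mismatch_penalty
    return score


def is_candidate_target(mirna: str, target_site: str, cutoff: float = 3.0) -> bool:
    """Apply the score cutoff of 3.0 or less mentioned in the text."""
    return pairing_score(mirna, target_site) <= cutoff


if __name__ == "__main__":
    # Toy 6-nt sequences for illustration only.
    print(pairing_score("UGACAG", "CUGUCA"))  # fully complementary site
    print(pairing_score("UGACAG", "CUGUUA"))  # one G:U wobble pair
    print(pairing_score("UGACAG", "CUGUCC"))  # one mismatch
```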
  • the same method was used to identify potential target genes for miRNAs isolated from Pinus taeda by searching through the Pine Gene Index Release 6.0 produced by The Institute for Genomic Research (TIGR; available at the website of TIGR).
  • the predicted targets comprise, in general, regulatory and defense-related genes. While some of the targets are associated with development and/or with cellulose biosynthesis, many of them are implicated in the lignin biosynthesis network. For example, LpMIR 178 was found to target a cellulose synthase, an enzyme involved in the synthesis of the backbone of the cell wall.
  • the predicted target of PtmiR 6 encodes a UVR8 protein, which positively regulates phenylpropanoid metabolism associated with cinnamate 4-hydroxylase (C4H) in response to UV-B induction (Hu et al., 1998; Jin et al., 2000; Kliebenstein et al., 2002).
  • PtmiR 241 and PtmiR 13 each target genes that encode laccases and a mononuclear blue copper protein family member. These two protein families were suggested to be involved in lignin formation (Nersissian et al., 1999).
  • A common target of PtmiR 29, 71, and 142 encodes MYB factor proteins, which are transcription factors known to bind promoters of a variety of lignin biosynthetic pathway genes encoding, for example, PAL, C4H, 4-coumaroyl-CoA ligase (4CL), 5-hydroxyconiferaldehyde O-methyltransferase (COMT), and cinnamyl alcohol dehydrogenase (CAD; Tamagnone et al., 1998; Borevitz et al., 2000). Down- or up-regulating these genes results in drastic lignin reduction or augmentation, respectively (Tamagnone et al., 1998; Borevitz et al., 2000). Suppression of a LIM protein, a predicted target of PtmiR 172, also inhibited
  • EXAMPLE 7 Expression of PtmiR Nucleic Acids in P. trichocarpa Tissues
  • the expression of some of the PtmiRs in various P. trichocarpa tissues was characterized by Northern analysis ( Figure 4). This included xylem tissues suffering from tension stress from tension wood (TW) and from compression stress from stem wood opposite to TW, called opposite wood (OW). TW and OW can be easily created by bending the tree stem.
  • the tested PtmiRs are all expressed at some level in woody tissues (for example, phloem, secondary growth, tension wood, and opposite wood).
  • RNA was denatured for 10 minutes at 65-70°C, separated on a 12% polyacrylamide/8 M urea gel (Amersham Biosciences, Piscataway, New Jersey, United States of America) in a
  • PROTEAN II apparatus Bio-Rad Laboratories, Inc., Hercules, California, United States of America
  • electro-blotted onto a HYBOND™-N+ membrane Amersham
  • Trans-Blot SD Semi-Dry Electrophoretic Transfer Cell Bio-Rad
  • Based on the expression patterns of these PtmiRs showing high levels of transcripts in wood-forming tissues, xylem in particular, and on the predicted target mRNAs (see Table 2), the disclosed PtmiRs might play significant roles in regulating wood development in plants.
  • the expression patterns and predicted target mRNA functions also point to critical roles for these PtmiRs in regulating lignin, cellulose, and hemicellulose biosynthesis.
  • the strong expression of PtmiR 73 in leaf together with its target gene function associated with disease resistance is direct evidence for the involvement of PtmiR 73 in the regulation of disease and stress tolerance.
  • RNA target of interest such as a plant mRNA transcript
  • sequence of a gene or RNA gene transcript derived from a database such as the GENBANK® database or any other database containing nucleotide sequence data (for example, a database containing sequence data from plants, such as Arabidopsis, P. trichocarpa, rice, etc.) is used to generate siRNA targets having complementarity to the target.
  • sequences can be obtained from a database, or can be determined experimentally as disclosed herein and/or known in the art.
  • Target sites that are known include, for example, those target sites determined to be effective target sites based on studies with other nucleic acid molecules, for example ribozymes or antisense, or those targets known to be associated with a disease or condition such as those sites containing mutations or deletions, can be used to design siRNA molecules targeting those sites as well.
  • Target sites can include single-stranded regions of miRNA precursors.
  • miRNA precursors adopt a stem-loop structure consisting of double-stranded and single- stranded regions.
  • siRNA molecules are designed that hybridize to the double-stranded or single stranded regions of an miRNA precursor or to the miRNA sequence, thus causing aberrant processing of the precursor and inhibiting miRNA production.
  • Various parameters can be used to determine which sites are the most suitable target sites within the target RNA sequence. These parameters include, but are not limited to secondary or tertiary RNA structure, the nucleotide base composition of the target sequence, the degree of homology between various regions of the target sequence, and the relative position of the target sequence within the RNA transcript.
  • any number of target sites within the RNA transcript can be chosen to screen siRNA molecules for efficacy, for example by using in vitro RNA cleavage assays, cell culture, or animal models.
  • anywhere from 1 to 1000 target sites are chosen within the transcript based on the size of the siRNA construct to be used.
  • High throughput screening assays can be developed for screening siRNA molecules using methods known in the art, such as with multi-well or multi-plate assays to determine efficient reduction in target gene expression.
  • siRNA templates comprised the 19 nt fragment linked via a 9 nt spacer to the reverse complement of the same 19 nt sequence.
  • Each template was cloned into a vector comprising a human H1 RNA transcription unit under the control of its cognate gene promoter (Figure 9).
  • the resulting transcript was predicted to adopt an inverted hairpin RNA structure containing one (for GT1) or two (for GT2) 3' overhanging uridines, giving rise to siRNA-like transcripts containing GT1 or GT2 sequences (Figure 9).
  • GT1 produces an siRNA-like transcript comprising SEQ ID NO: 172 - 9 nt spacer - SEQ ID NO: 173 (bottom left), and GT2 produces a transcript comprising SEQ ID NO: 174 - 9 nt spacer - SEQ ID NO: 175.
  • RNA Silencing with Human H1 Promoter-Containing Constructs. Agrobacterium tumefaciens C58 cells were transformed with the GT1 and GT2 vectors and used to transform a transgenic tobacco line expressing a GUS transgene (Hu et al., 1998). To transfer to tobacco, GUS-containing tobacco leaf disks were infected with the Agrobacterium C58 strain harboring the siRNA construct. Transformants were selected on MS104 containing 25 mg/L hygromycin and 300 mg/L claforan.
  • the hygromycin-resistant shoots were placed on hormone-free MSO agar medium containing 25 mg/L hygromycin and 300 mg/L claforan for root regeneration, and transgenic tobacco seedlings were planted in soil and grown in a greenhouse.
  • the gene silencing efficiency appeared to be independent of the GUS mRNA target sites and of the number of uridine residues (1 vs. 2) in the engineered siRNA transcripts. Furthermore, the silencing effect remained in about 90% of the T1 plants analyzed.
  • primers are SLpF (5'-GGAATTCTGCGTTTGAAGAAGAGTGTTTGA-3'; SEQ ID NO: 160) as the forward primer (with the addition of an Eco RI site at the 5' end) and SLpR (5'-GCCCGGGAAGATCGGTTCGTGTAATATAT-3'; SEQ ID NO: 161) as the reverse primer (with addition of a Sma I site at the 5' end). These two primers flank the At7SL4 gene promoter at both ends and were used for PCR amplification of the promoter fragment from Arabidopsis thaliana (Columbia ecotype) genomic DNA.
  • PCR product amplified from Arabidopsis genomic DNA using primers SLpF and SLpR was cloned into the pCR®2.1-TOPO® system
  • At7SL4 promoter clone was named pCRSLp7, and contained the following At7SL4 promoter sequence: GGAATTCTGCGTTTGAAGAAGAGTGTTTGA TGTTCTCAAGTAAGTGAGTCTTATTGGGAATAATATTAACTCATGTTCTT
  • SEQ ID NO: 164 was used as the reverse primer (adds a Hind III site to the 3' end of the 3'-NTS).
  • PCR was employed to amplify a nucleic acid molecule comprising the 3'-NTS using these two primers and Arabidopsis thaliana (Columbia ecotype) genomic DNA.
  • the amplified nucleic acid molecule was cloned into the pCR®2.1-TOPO® system (Invitrogen Corp.) and sequenced
  • At7SL4-3'-NTS nucleotide sequence was determined to be: GTCTAGATTTTGATTTT GTTTTCCAAAACTTTCTACGCTTTTTGTTTTTGGGTTTAATGCTTTAAGAG GGAACAAAAACAAAGCTGTGAAAACTGAAAGCAAACTTTGAACAAAGCA AGAGACTTAAGAGTTGTATTTACAGCTTTTGTTCGATGTATGGAAATGTA
  • the At7SL4-3'-NTS sequence was released from pCRSLt2 by digestion with Xba I and Hind III.
  • the At7SL4-3'-NTS sequence was thereafter ligated into the Xba I and Hind III cloning sites of pUCSLp7-1 to produce a construct named pUCSL1.
  • This construct contained the siRNA delivery cassette in a pUC19 backbone vector.
  • the siRNA expression cassette contains the At7SL4 promoter sequence and the At7SL4-3'-NTS sequence. Between these two elements is a multiple cloning site (MCS) including sites for Sma I, Bam HI, and Xba I for insertion of target sequences (see Figure 6).
  • MCS multiple cloning site
  • DNA-dependent RNA polymerase III 7SL RNA genes from Arabidopsis thaliana were employed, because the transcription of these small genes is controlled exclusively by their upstream external regulatory sequence elements (USE and TATA) and terminates at a run of five to seven thymidines. These features allowed for the incorporation of these sequences into expression vectors to efficiently produce siRNA duplexes that contained three to four 3' overhanging uridines.
  • USE and TATA upstream external regulatory sequence elements
  • siRNA templates corresponding to GT1, GT2, and GT3 were cloned into the pSIT expression vector (see Figure 7), which was then mobilized into A. tumefaciens C58 cells for transforming the transgenic GUS tobacco line described hereinabove (see also Hu et al., 1998). A total of 89 plants were produced containing one of these three expression constructs.
  • pSIT small interfering RNA transformation system
  • the insert structure is in some embodiments a 19 to 26-nucleotide sequence corresponding to the sense strand of a target gene followed by the complementary antisense sequence.
  • the sense and antisense sequences are separated by a 9-nucleotide spacer (5'-TTCAGATGA-3'; see Figure 8).
  • a string of several thymidines (in some embodiments, a string of 7) was added to signal termination of transcription from the promoter.
  • siRNA-based gene modification system can be used for modulating gene expression in plants (for example, trees).
  • Representative, non-limiting genes the expression of which can be modulated include genes encoding the miRNAs disclosed as SEQ ID NOs: 1-59, 1247-1295, and 1662-1712
  • the system is particularly useful for the manipulation of the miRNA genes that modulate multiple family members. Only a short sequence of the target gene is needed in the siRNA system, allowing the design of an siRNA target sequence to be highly specific and discernable from the other miRNA family member genes or other unknown genes which share a high sequence homology with the target member.
  • the nucleotide sequence of a loop region is determined.
  • An siRNA is synthesized that hybridizes to this loop region, and an siRNA delivery cassette is generated.
  • the siRNA delivery cassette is cloned into pSIT using the techniques described herein, and the vector is transformed into a plant cell.
  • the transformed plant cell is used to regenerate a plant, and the expression of the plant gene targeted by the miRNA is determined in the regenerated plant and compared to the expression of the same plant gene in a wild type plant (i.e., a plant that has not been transformed with the pSIT construct).
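
The insert layout described in the items above (a sense sequence, the 9-nucleotide spacer, the antisense sequence, and a run of seven thymidines) can be sketched in Python as follows. This is only an illustrative sketch: the function names and the example sense sequence are hypothetical and are not taken from the disclosure, and the 19 to 26 nt length check follows the range quoted above.

```python
# Illustrative sketch: assemble a hairpin-encoding insert as
# sense - spacer - antisense (reverse complement) - terminator.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SPACER = "TTCAGATGA"   # 9-nt spacer quoted in the text
TERMINATOR = "T" * 7   # run of seven thymidines quoted in the text


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def build_hairpin_insert(sense: str) -> str:
    """Join sense, spacer, antisense, and terminator into one insert."""
    sense = sense.upper()
    if not 19 <= len(sense) <= 26:
        raise ValueError("sense sequence must be 19 to 26 nt long")
    if set(sense) - set("ACGT"):
        raise ValueError("sense sequence must contain only A, C, G, T")
    return sense + SPACER + reverse_complement(sense) + TERMINATOR


# Hypothetical 19-nt sense sequence, used for illustration only.
example_sense = "GATCCGTACGTTAGCCATG"
insert = build_hairpin_insert(example_sense)
print(insert)
```

Because the antisense region is the reverse complement of the sense region, the transcript of such an insert can fold back on itself, with the spacer forming the loop.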

Abstract

The presently disclosed subject matter provides methods and compositions for modulating gene expression in plants. Also provided are plants and cells comprising the compositions of the presently disclosed subject matter.

Description

MICRORNAS (MIRNAS) FOR PLANT GROWTH AND DEVELOPMENT
GRANT STATEMENT
This work was supported by grant DE-FG02-03ER15442 from the United States Department of Energy. Thus, the U.S. government has certain rights in the presently disclosed subject matter.
CROSS REFERENCE TO RELATED APPLICATIONS
This application is based on and claims priority to United States Provisional Application Serial Number 60/611,290, filed September 20, 2004, the disclosure of which is herein incorporated by reference in its entirety.
TECHNICAL FIELD
The presently disclosed subject matter relates, in general, to methods and compositions for modulating gene expression in a plant. More particularly, the presently disclosed subject matter relates to a method of using a microRNA (miRNA) to modulate the expression level of a gene in a plant, and to compositions comprising miRNAs.
BACKGROUND
Trees are a major natural resource of the biosphere and have shown outstanding ecological and economic importance. A key physiological process of tree development is the formation of wood, which is composed of a variety of cell types.
Wood is made up of plant cell wall lignins, which occur exclusively in higher plants and represent the second most abundant organic compound on the earth's surface after cellulose, accounting for about 25% of plant biomass. Cell wall lignification involves the deposition of phenolic polymers (lignins) on the extracellular polysaccharide matrix. The polymers arise from the oxidative coupling of three cinnamyl alcohols. The main functions of lignins are to strengthen the plant vascular body, provide mechanical support for stems and leaf blades, and to provide resistance to diseases, insects, cold temperatures, and other biotic and abiotic stresses.
Although lignins play many important roles in vascular plants, their resistance to degradation greatly complicates various agricultural and industrial uses of plants. For example, animals lack the enzymes necessary for degrading the polysaccharides in plant cell walls, and thus must depend on microbial fermentation to break down plant fibers. High lignin concentration and methoxyl content reduce the digestibility of forage crops (for example, alfalfa), with cattle (for example) able to digest only 40-50% of legume fibers and 60-70% of grass fibers. Thus, lignins have been implicated in limiting forage digestibility, possibly by interfering with microbial degradation of fiber polysaccharides. Small decreases in lignin content of plants, however, can have a significant positive impact on forage digestibility.
High lignin content also is problematic in the wood products industries, which are an important component of both the United States' and global economies. Up to thirty-six percent of the dry weight of wood is lignin. During pulp and papermaking, lignin must be separated from cellulose. This process consumes large amounts of energy and imposes a high environmental cost due to the requirement for using chemicals such as chlorine bleach. The availability of wood with reduced lignin content or with a modified lignin that is more amenable to extraction would increase the efficiency of pulp and papermaking processes and would decrease chemical consumption and disposal. Thus, both the digestibility of forage crops and the pulping properties of trees can be adversely affected by high lignin content.
Genetic engineering has great promise for agriculture because it can accelerate traditional breeding programs, cross reproductive barriers, and introduce specific desired traits. Genetic engineering can be particularly advantageous to forestry because traditional methods are hampered by the long generation times of trees. Yet, the manipulation of a plant's genome can have undesirable effects.
Thus, there is a long-felt and continuing need in the art for new methods for identifying genes that specifically regulate important developmental pathways of plants. Also needed are new methods for genetically modifying cultivated vascular plants to manipulate the expression of genes of interest. Such methods would improve the ability of vascular plants to be used in agriculture, in the pulp and paper industry, and in other industries. The presently disclosed subject matter addresses this and other needs in the art.
SUMMARY
This Summary lists several embodiments of the presently disclosed subject matter, and in many cases lists variations and permutations of these embodiments. This Summary is merely exemplary of the numerous and varied embodiments. Mention of one or more representative features of a given embodiment is likewise exemplary. Such an embodiment can typically exist with or without the feature(s) mentioned; likewise, those features can be applied to other embodiments of the presently disclosed subject matter, whether listed in this Summary or not. To avoid excessive repetition, this Summary does not list or suggest all possible combinations of such features.
The presently disclosed subject matter provides methods for stably modulating expression of a plant gene. In some embodiments, the method comprises (a) providing a vector encoding a microRNA (miRNA) targeted to the plant gene; and (b) transforming a plant cell with the vector, whereby stable expression of the miRNA in the plant cell is provided. In some embodiments, the method comprises (a) transforming a plurality of plant cells with a vector comprising a nucleic acid sequence encoding a microRNA (miRNA) operatively linked to a promoter and a transcription termination sequence; (b) growing the plant cells under conditions sufficient to select for a plurality of transformed plant cells that have integrated the vector into their genomes; (c) screening the plurality of transformed plant cells for expression of the miRNA encoded by the vector; (d) selecting a transformed plant cell that expresses the miRNA; and (e) regenerating the plant from the transformed plant cell that expresses the miRNA, whereby expression of the plant gene is stably modulated.
In some embodiments of the disclosed methods, the modulating expression of a plant gene is inhibiting expression of the plant gene. In some embodiments, a method of stably inhibiting the expression of a gene in a plant cell comprises stably transforming the plant cell with a vector encoding a microRNA (miRNA) molecule, wherein the miRNA molecule comprises a nucleotide sequence at least 70% identical to a contiguous 17-24 nucleotide subsequence of the gene.
Any expression vector that can be used to express nucleic acids encoding miRNAs and/or siRNAs in plants can be used in conjunction with the presently disclosed subject matter. In some embodiments, the vector is an Agrobacterium binary vector. In some embodiments, the vector comprises (a) a promoter operatively linked to a nucleic acid molecule encoding the miRNA molecule; and (b) a transcription termination sequence.
The nucleic acids of the presently disclosed subject matter can be expressed from any promoter that shows activity in plants. In some embodiments, the promoter is a DNA-dependent RNA polymerase III promoter. In some embodiments, the promoter is selected from the group consisting of an RNA polymerase III H1 promoter, an Arabidopsis thaliana 7SL RNA promoter, an RNA polymerase III 5S promoter, an RNA polymerase III U6 promoter, an adenovirus VA1 promoter, a Vault promoter, a telomerase RNA promoter, a tRNA gene promoter, and functional derivatives thereof. In some embodiments, the Arabidopsis thaliana 7SL RNA gene promoter comprises the sequence presented in SEQ ID NO: 164.
In some embodiments, promoters are chosen that direct tissue-, cell-type-, or stage-specific expression of the miRNAs. In some embodiments, the stable expression of the microRNA (miRNA) in the plant occurs in a location or tissue selected from the group consisting of epidermis, root, vascular tissue, xylem, meristem, cambium, cortex, pith, leaf, flower, seed, and combinations thereof. In some embodiments of the disclosed methods, an miRNA is used to modulate the expression of a target gene. In some embodiments, the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a sense region, an antisense region, and a loop region, positioned in relation to each other such that upon transcription, a resulting RNA transcript is capable of forming a hairpin structure via intramolecular hybridization of the sense strand and the antisense strand. In some embodiments, the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712, and nucleotide sequences at least 70% identical to SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
The methods and compositions of the presently disclosed subject matter can be used to modulate the expression of a gene in any plant. In some embodiments, the plant is a dicot. In some embodiments, the plant is a monocot. In some embodiments, the plant is a tree. In some embodiments, the tree is an angiosperm. In some embodiments, the tree is a gymnosperm. In some embodiments, the tree is a member of the genus Populus. In some embodiments, the tree is a Populus trichocarpa tree. In some embodiments, the tree is a member of the genus Pinus. In some embodiments, the tree is a Pinus taeda tree.
The methods and compositions of the presently disclosed subject matter can be used to modulate the expression of any gene in a plant. In some embodiments, the plant gene has a nucleotide sequence comprising one of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837, or a nucleotide sequence at least 80% identical to any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837. In some embodiments, the gene is selected from the group consisting of coniferaldehyde-5-hydroxylase (Cald5H), a lignin-related gene, a cellulose-related gene, a hemicellulose-related gene, a hormone-related gene, a stress-related gene, a disease-related gene, a growth-related gene, and a transcription factor gene. In some embodiments, the lignin-related gene is selected from the group consisting of sinapyl alcohol dehydrogenase (SAD), cinnamyl alcohol dehydrogenase (CAD), 4-coumarate:coenzyme A (CoA) ligase (4CL), cinnamoyl CoA O-methyltransferase (CCoAOMT), caffeate O-methyltransferase (COMT), ferulate-5-hydroxylase (F5H), cinnamate-4-hydroxylase (C4H), p-coumarate-3-hydroxylase (C3H), and phenylalanine ammonia lyase (PAL). In some embodiments, the cellulose-related gene is selected from the group consisting of cellulose synthase, cellulose synthase-like, glucosidase, glucan synthase, and sucrose synthase. In some embodiments, the hormone-related gene is selected from the group consisting of isopentenyl transferase (ipt), gibberellic acid (GA) oxidase, auxin (AUX), and a rooting locus (ROL) gene.
The presently disclosed subject matter also provides vectors that can be used for performing the disclosed methods. In some embodiments, the vector for stably expressing a microRNA (miRNA) molecule in a plant comprises (a) a promoter operatively linked to a nucleic acid molecule encoding the miRNA molecule; and (b) a transcription termination sequence. In some embodiments, the vector is an Agrobacterium binary vector. In some embodiments, the Agrobacterium binary vector comprises a nucleic acid encoding a selectable marker operatively linked to a promoter.
The presently disclosed subject matter also provides kits comprising the disclosed vectors and at least one reagent for introducing the disclosed vectors into a plant cell. In some embodiments, the kit further comprises instructions for introducing the vector into a plant cell.
The presently disclosed subject matter also provides plant cells, transgenic plants, transgenic seed, and transgenic progeny comprising the disclosed vectors. In some embodiments, the plant cell is from a plant selected from the group consisting of poplar, pine, eucalyptus, sweetgum, other tree species, tobacco, Arabidopsis, rice, corn, wheat, cotton, potato, and cucumber.
The presently disclosed subject matter also provides a method for stably inhibiting the expression of a gene in a plant cell. In some embodiments, the method comprises stably transforming the plant cell with a vector encoding a microRNA (miRNA) molecule comprising a nucleotide sequence at least 70% identical to a contiguous 17-24 nucleotide subsequence of the gene.
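
As a rough illustration of the "at least 70% identical to a contiguous 17-24 nucleotide subsequence" criterion, the Python sketch below slides an miRNA-length window along a gene sequence and reports the best ungapped match. The disclosure does not specify here how identity is computed, so position-by-position comparison without gaps is an assumption, and the function names and sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_ungapped_match(mirna: str, gene: str) -> tuple[float, int]:
    """Return (best percent identity, 0-based start) over all gene windows
    of the same length as the miRNA. Gaps are not considered (assumption)."""
    mirna = mirna.upper().replace("U", "T")  # compare in DNA form
    gene = gene.upper().replace("U", "T")
    n = len(mirna)
    if not 17 <= n <= 24:
        raise ValueError("miRNA length must be 17 to 24 nt")
    if len(gene) < n:
        raise ValueError("gene is shorter than the miRNA")
    best = (-1.0, -1)
    for start in range(len(gene) - n + 1):
        identity = percent_identity(mirna, gene[start:start + n])
        if identity > best[0]:
            best = (identity, start)
    return best


# Hypothetical sequences, for illustration only.
gene = "T" * 10 + "ACGGCATCGATCCGTAGCAAG" + "T" * 10
mirna = "ACGGCAUCGAUCCGUAGCAAG"
identity, start = best_ungapped_match(mirna, gene)
print(identity >= 70.0, start)
```

A real screen would more likely use an established alignment tool; the sliding window is used here only because it makes the identity threshold easy to see.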
The presently disclosed subject matter also provides a method for enhancing the expression of a gene in a plant cell. In some embodiments, the method comprises introducing into the plant cell a vector encoding a short interfering RNA (siRNA) molecule comprising a sequence that hybridizes under physiological conditions to a loop region or a stem region of a pre-microRNA that comprises a microRNA (miRNA) that modulates expression of the gene, thereby resulting in downregulation of expression of the miRNA and enhanced expression of the gene. In some embodiments, the microRNA (miRNA) comprises a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712 and nucleotide sequences at least 70% identical to any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
The presently disclosed subject matter also provides expression vectors for use with the disclosed methods. In some embodiments, an expression vector comprises a nucleic acid sequence encoding a microRNA (miRNA) molecule that stably downregulates expression of a plant gene. In some embodiments of the disclosed expression vectors, the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712 and nucleotide sequences at least 70% identical to SEQ ID NOs: 1-59, 1247-1295, and 1662-1712. In some embodiments, the miRNA is at least 70% identical to about 17-24 contiguous nucleotides of a ribonucleic acid (RNA) transcribed from a gene selected from the group consisting of a lignin-related gene, a cellulose-related gene, a hemicellulose-related gene, a hormone-related gene, a stress-related gene, a disease-related gene, a growth-related gene, and a transcription factor gene. In some embodiments, the vector comprises a promoter for expressing the miRNA, a transcription termination sequence, and a cloning site between the promoter and the transcription termination sequence into which a nucleic acid molecule encoding the miRNA can be cloned. In some embodiments, the vector is a plasmid vector. In some embodiments, the vector further comprises a selectable marker. In some embodiments, the cloning site comprises a recognition sequence for at least one restriction enzyme that is not present elsewhere in the plasmid vector. In some embodiments of the presently disclosed subject matter, the nucleic acid sequence encoding the microRNA (miRNA) comprises (a) a sense region; (b) an antisense region; and (c) a loop region, wherein the sense, antisense, and loop regions are positioned in relation to each other such that upon transcription, the resulting RNA molecule is capable of forming a hairpin structure via intramolecular hybridization of the sense strand and the antisense strand.
Accordingly, it is an object of the presently disclosed subject matter to provide a method for manipulating gene expression in plants using an miRNA-mediated approach. This object is achieved in whole or in part by the presently disclosed subject matter.
An object of the presently disclosed subject matter having been stated above, other objects and advantages will become apparent to those of ordinary skill in the art after a study of the following description of the presently disclosed subject matter and non-limiting EXAMPLES.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 depicts a general structure for an siRNA molecule of the presently disclosed subject matter, wherein N is any nucleotide, provided that in the loop structure identified as N5-9, all 5-9 nucleotides remain in a single-stranded conformation. Similarly, N1-8 can be any sequence of 1-8 nucleotides or modified nucleotides, provided that the nucleotides remain in a single-stranded conformation in the siRNA molecule.
Figures 2A and 2B depict potential hairpin configurations for exemplary miRNA precursors. Figure 2A depicts a miRNA precursor derived from the PtMIR 115a gene (SEQ ID NO: 95) comprising the nucleotide sequence of miRNA PtmiR 115 (SEQ ID NO: 24). Figure 2B depicts an miRNA precursor derived from the PtMIR 61a gene (SEQ ID NO: 71) comprising the nucleotide sequence of miRNA PtmiR 61 (SEQ ID NO: 10). In each Figure, the miRNA sequence is underlined.
Figures 3A-3C depict potential hairpin configurations for a transcript of an exemplary miRNA precursor gene, PtMIR 156-1a (SEQ ID NO: 132). Figure 3A depicts a hairpin configuration where the PtmiR 156-1 sequence (SEQ ID NO: 47 in RNA form) is present in the 5' arm of the hairpin. Figures 3B and 3C depict two hairpin configurations where the PtmiR 156-1 sequence (SEQ ID NO: 47 in RNA form) is present in the 3' arm of the hairpin. Figure 3B depicts a shorter stem-loop structure, and Figure 3C depicts a longer stem-loop structure. Figure 3C also shows the position of a 19-nucleotide side stem-loop, the nucleotides of which are not depicted for clarity. For each of Figures 3A-3C, the sequence of PtmiR 156-1 (SEQ ID NO: 47 in RNA form) is underlined.
Figure 4 depicts Northern analysis of the expression of exemplary miRNAs in leaf (L), phloem (Ph), and developing xylem (X), tension wood (XTW), and opposite wood (XOW) stem xylems. 5S rRNA is included as an RNA quantity loading control.
Figures 5A-5E depict human H1 promoter-mediated siRNA silencing of GUS gene expression in transgenic tobacco. Figure 5A depicts GUS staining of cross-sections of the stems, of the leaves, and of the roots of one-month-old siRNA-transgenic (GT1 and GT2) and GUS-expressing control (C) tobacco plants. Figure 5B is a graph of GUS protein activity (Jefferson et al., 1987) in the leaves of control plants and of ten GT2 transgenic plants. Mean values were calculated from three independent measurements per line. Figure 5C depicts a loading control for gel blot analysis of RNA transcript level using a 25S ribosomal RNA probe. Figure 5D depicts the same gel blot as shown in Figure 5C, but is used to characterize the level of GUS mRNA using a GUS cDNA probe. Figure 5E depicts gel blot detection of siRNAs of about 21 nucleotides (nt) (position indicated) using a GUS cDNA probe as described in Hutvagner et al., 2000. RNA was isolated from a portion of the leaves used for the GUS protein activity assay depicted in Figure 5B.
Figure 6 depicts a schematic representation of plasmid pUCSL1. The plasmid contains a promoter fragment (289 basepairs; P7SL-RNA) containing USE and TATA elements and a 3'-non-transcribed sequence (3'-NTS) fragment (267 basepairs) from the Arabidopsis thaliana At7SL4 gene, cloned into pUC19. Between the promoter and 3'-NTS sequences is a multiple cloning site (MCS) containing recognition sequences for Sma I, Bam HI, and Xba I, which can be used to clone siRNA sequences. The promoter:MCS:3'-NTS cassette can be excised from pUCSL1 using Eco RI and Hind III sites that are present at the 5' and 3' ends of the cassette, respectively.
Figure 7 depicts a schematic representation of plasmid pSIT. The plasmid contains the promoter:MCS:3'-NTS cassette from pUCSL1 in the opposite transcriptional orientation and downstream of a selectable marker cassette, the latter consisting of a promoter, selectable marker gene, and terminator sequence. pSIT represents a binary vector transformation system mediated by Agrobacterium.
Figure 8 depicts a representation of the multiple cloning site (MCS) of pSIT. Between the Sma I and Xba I sites of the MCS is cloned a sequence comprising 17-26 nt from the sense strand of the gene of interest, followed by a 9 nt spacer, and then the reverse complement of the 17-26 nt sequence (i.e., the antisense sequence cloned in the opposite direction). Downstream of the antisense sequence is the sequence TTTTTTT, which serves to terminate transcription from the promoter for siRNA transcription present in pSIT (see Figure 7).
Figure 9 depicts the preparation of siRNA expression constructs. The 19 nucleotide (nt) GUS gene-specific sequence (GT1 represented nucleotide positions 80-98 and GT2 89-107) separated by a 9 nt spacer from the reverse complement of the same sequence followed by a termination signal of five thymidines was cloned into pSUPER (available from OligoEngine, Inc., Seattle, Washington, United States of America) downstream of the H1 promoter (H1-P). The H1-P::GT expression construct was then excised and cloned into the binary vector pGPTV-HPT (Becker et al., 1992) to replace the pAnos-uidA fragment. The resulting vector, pGPH1-HPT, which contained a hygromycin phosphotransferase selectable marker gene (hpt), was then mobilized into Agrobacterium tumefaciens C58 for transforming tobacco.
The predicted secondary siRNA structures of GT1 and GT2 are depicted at the bottom of the Figure. Considered in the 5' to 3' direction, Figure 9 shows the sequences of GT1 and GT2 that form the hairpin as follows. For GT1, the hairpin is produced by the intramolecular hybridization of SEQ ID NO: 172 and SEQ ID NO: 173, with a 9 nt spacer between. For GT2, the hairpin is produced by the intramolecular hybridization of SEQ ID NO: 174 and SEQ ID NO: 175, with a 9 nt spacer between. Figure 9 depicts these hairpins with the "top" strand in the 5' to 3' direction, and thus the "bottom" strand is depicted in the 3' to 5' direction.
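
The target coordinates quoted for Figure 9 (positions 80-98 for GT1 and 89-107 for GT2) are 1-based and inclusive, so each spans 19 nt. A short Python sketch of that arithmetic follows; the helper names are hypothetical, and the placeholder string stands in for the GUS sequence, which is not reproduced here.

```python
def extract_region(seq: str, start: int, end: int) -> str:
    """Return seq[start..end] using 1-based, inclusive coordinates."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def overlap_length(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of positions shared by two 1-based inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


GT1 = (80, 98)   # positions quoted in the text
GT2 = (89, 107)  # positions quoted in the text

# Placeholder sequence for illustration only; NOT the GUS sequence.
placeholder = "ACGT" * 30

gt1_target = extract_region(placeholder, *GT1)
gt2_target = extract_region(placeholder, *GT2)
print(len(gt1_target), len(gt2_target), overlap_length(GT1, GT2))
```

On these coordinates the two target windows share ten positions, so GT1 and GT2 address overlapping but distinct sites.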
BRIEF DESCRIPTION OF THE SEQUENCE LISTING
The Sequence Listing discloses, inter alia, the sequences of various miRNAs, genes encoding miRNA precursors, and sequences derived from the genomes of Populus sp. and Pinus sp. that are targets for the disclosed miRNAs. While the sequences are presented in the form of DNA (i.e., with thymidine present instead of uracil), it is understood that the sequences are also intended to correspond to the RNA transcripts of these DNA sequences (i.e., with each T replaced by a U).
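
The DNA-form to RNA-form correspondence described above amounts to replacing each T with a U. A minimal Python sketch follows; the function name is hypothetical, and the example input is the 9-nucleotide spacer quoted elsewhere in this document.

```python
def dna_to_rna(dna: str) -> str:
    """Convert a DNA-form sequence to RNA form by replacing T with U."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    return dna.replace("T", "U")


print(dna_to_rna("TTCAGATGA"))  # prints UUCAGAUGA
```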
SEQ ID NOs: 1-59 and 1247-1295 are the nucleic acid sequences of various miRNAs from Populus trichocarpa.
SEQ ID NOs: 60-156 and 1296-1375 are the nucleic acid sequences of various miRNA precursor genes. The relationships between the sequences disclosed as SEQ ID NOs: 1-59 and 1247-1295 and those disclosed as 60-156 and 1296-1375 are presented in Table 1 below.
SEQ ID NO: 155 is the nucleic acid sequence of a 5'-phosphorylated-3'-adaptor oligonucleotide used to clone a population of small RNAs predicted to include miRNAs.
SEQ ID NO: 156 is the nucleic acid sequence of a second adaptor molecule used during the isolation and cloning of small RNAs.
SEQ ID NOs: 157-159 are the nucleotide sequences of oligonucleotide primers used during the reverse transcription and amplification by PCR of the small RNAs to which the adaptors of SEQ ID NOs: 155 and 156 had been added.
SEQ ID NOs: 160 and 161 are primer sequences used to PCR-amplify a region of the Arabidopsis At7SL4 promoter.
SEQ ID NO: 162 is the nucleic acid sequence of the product of a PCR reaction using the primers identified in SEQ ID NOs: 160 and 161.
SEQ ID NOs: 163 and 164 are primers used to amplify the 3'-NTS of the At7SL4 gene. SEQ ID NO: 165 is the nucleic acid sequence of the product of a PCR reaction using the primers identified in SEQ ID NOs: 163 and 164.
SEQ ID NOs: 166-171 are the sequences of complementary oligonucleotides that were used to generate siRNAs targeted to the GUS gene. Three different regions of the GUS gene were targeted. For the production of pGSGT1, SEQ ID NOs: 166 and 167 were hybridized to each other. For the production of pGSGT2, SEQ ID NOs: 168 and 169 were hybridized to each other. For the production of pGSGT3, SEQ ID NOs: 170 and 171 were hybridized to each other.
SEQ ID NOs: 172-175 are presented in Figure 9, and correspond to the sense and antisense sequences for representative siRNA-like molecules targeting the GUS gene. SEQ ID NO: 172 is a nucleic acid sequence that corresponds to bases 80-98 of GENBANK® Accession No. AY100472, and is a sense strand sequence. SEQ ID NO: 173 is a nucleic acid sequence that hybridizes to SEQ ID NO: 172 and includes a one-nucleotide 3' overhang (U). SEQ ID NO: 174 is a nucleic acid sequence that corresponds to bases 89-107 of GENBANK® Accession No. AY100472, and is a sense strand sequence. SEQ ID NO: 175 is a nucleic acid sequence that hybridizes to SEQ ID NO: 174 and includes a two-nucleotide 3' overhang (UU).
SEQ ID NOs: 176-781 and 1376-1553 are the nucleotide sequences of various genes and/or RNA transcripts (disclosed in "DNA form", i.e., with T instead of U) identified in Populus spp. as targets for one or more of the miRNAs disclosed in SEQ ID NOs: 1-59 and 1247-1295.
SEQ ID NOs: 782-1246 are the amino acid sequences encoded by the nucleotide sequences disclosed in SEQ ID NOs: 176-781. Given that some of the nucleotide sequences disclosed in SEQ ID NOs: 176-781 encode the same amino acid sequence, there are fewer SEQ ID NOs. assigned to amino acid sequences than to nucleotide sequences. The relationships between the sequences disclosed as SEQ ID NOs: 176-1246 and 1376-1661 are presented Table 3 below.
SEQ ID NOs: 1662-1712 are the nucleic acid sequences of various miRNAs from Pinus taeda. SEQ ID NOs: 1713-1748 are the nucleic acid sequences of various miRNA precursor genes. The relationships between the sequences disclosed as SEQ ID NOs: 1662-1712 and 1713-1748 are presented Table 4 below.
SEQ ID NOs: 1749-1837 are the nucleotide sequences of various genes and/or RNA transcripts (disclosed in "DNA form"' i.e. with T instead of U) identified in Pinus sp. as targets for one or more of the miRNAs disclosed in SEQ ID NOs: 1662-1712.
SEQ ID NOs: 1838-1907 are the amino acid sequences encoded by the nucleotide sequences disclosed in SEQ ID NOs: 1749-1837. Given that some of the nucleotide sequences disclosed in SEQ ID NOs: 1749-1837 encode the same amino acid sequence, there are fewer SEQ ID NOs. assigned to amino acid sequences than to nucleotide sequences. The relationships between the sequences disclosed as SEQ ID NOs: 1749-1837 and 1838-1907 are presented Table 5 below.
DETAILED DESCRIPTION
I. General Considerations
In studies of C. elegans development it was found that the lin-4 gene produced small RNAs of about 22 nucleotides (nt), instead of protein. It was further discovered that these small RNAs imperfectly paired to multiple sites in the 3'-untranslated region (3'-UTR) of the lin-14 gene, mediating the translational repression of the lin-14 message as part of the regulatory network that triggers the transition of developmental stages in the nematode (Lee RC et al., 1993; Wightman et al., 1993). These studies have led to the discovery of a new class of small, non-coding regulatory RNAs, termed microRNAs (miRNAs), and, thus, of a new paradigm of gene expression regulation in eukaryotes (Lagos-Quintana et al., 2001; Lau et al., 2001; Lee & Ambros, 2001).
In a recent review, Bartel summarized the current knowledge of the biogenesis and functions of miRNAs in eukaryotes (Bartel, 2004). Briefly, the miRNA gene is presumably transcribed by RNA polymerase II or RNA polymerase III into the primary miRNA stem-loop transcript, called pri-miRNA (Lee, N. S., et al., 2002). In mammals, the pri-miRNA is cleaved by the Drosha RNase III endonuclease at both stem strands near the stem-loop base, releasing an miRNA precursor (pre-miRNA) as an about 60-70 nt stem-loop RNA molecule (Lee, Y., et al., 2002; Zeng & Cullen, 2003). The pre-miRNA is then transported into the cytoplasm where it is cleaved at both stem strands by Dicer, also an RNase III endonuclease, liberating the loop portion of the pre-miRNA and the stem portion of the duplex that comprises the mature miRNA of about 22 nt and the similar size miRNA* fragment derived from the opposing arm of the pre-miRNA (Lau et al., 2001; Lagos-Quintana et al., 2002; Aravin et al., 2003; Lim et al., 2003b). In plants, the nuclear cleavage of the pri-miRNA is mediated by a Dicer-like protein, DCL1, which functions similarly to mammalian Drosha (Reinhart et al., 2002; Lim et al., 2003b; Lee, Y., et al., 2002; Lee, Y., et al., 2003). The resulting plant pre-miRNA stem-loop transcripts are, however, generally more variable in size, ranging from about 60 to about 300 nt (Bartel & Bartel, 2003; Bartel, 2004; Lim et al., 2003b). It is believed that in plants, DCL1 performs a second cut in the nucleus on the pre-miRNA to liberate the miRNA:miRNA* duplex (Reinhart et al., 2002; Lim et al., 2003b; Lee, Y., et al., 2002; Lee, Y., et al., 2003).
After the export of the miRNA:miRNA* duplex to the cytoplasm, the miRNA pathway in plants and mammals appears to be quite similar, both involving helicase-like protein-mediated unwinding of the duplex to release the single-stranded mature miRNA (Bartel & Bartel, 2003; Bartel, 2004; Rhoades et al., 2002). The mature miRNA then recruits a ribonucleoprotein complex known as the RNA-induced silencing complex (RISC), while the miRNA* appears to be degraded. The miRNA guides the RISC to identify target messages based on perfect or near perfect complementarity between the miRNA and the target mRNA. Once such an mRNA is found, an endonuclease within the RISC cleaves the mRNA at a site near the middle of the miRNA complementarity, resulting in gene silencing (Hutvagner et al., 2000; Elbashir et al., 2001a; Elbashir et al., 2001b; Llave et al., 2002; Kasschau et al., 2003). In general, the miRNA in RISC will direct cleavage of the target mRNA if the complementarity between the target mRNA and the miRNA is sufficiently high. If such complementarity is not sufficiently high, however, the miRNA will direct the repression of protein translation rather than target mRNA cleavage (Bartel & Bartel, 2003; Bartel, 2004).
This miRNA-guided gene silencing pathway is highly similar to the key steps of siRNA-mediated gene silencing known as posttranscriptional gene silencing (PTGS) in plants and RNA interference (RNAi) in animals (Hamilton & Baulcombe, 1999; Hutvagner & Zamore, 2002). There is a distinction between miRNA and siRNA, however. siRNAs, which can be exogenous sequences (for example, transgenes), mediate the silencing of the same genes from which they are derived. miRNAs, on the other hand, are typically endogenous and encoded by their own genes, and target different genes, setting up the gene regulation circuitry. miRNAs have been cloned from various animals, including Drosophila melanogaster (Lagos-Quintana et al., 2001; Aravin et al., 2003), C. elegans (Lee & Ambros, 2001; Lim et al., 2003b; Ambros et al., 2003), fish (Lim et al., 2003a), mouse (Dostie et al., 2003; Houbaviy et al., 2003; Lagos-Quintana et al., 2003; Michael et al., 2003), and human (Lagos-Quintana et al., 2001; Mourelatos et al., 2002; Lagos-Quintana et al., 2003). Thus far, plant miRNAs have been isolated only from two non-woody plant species. The isolation is straightforward but the multitude of other small RNAs often complicates the initial classification (Llave et al., 2002; Park et al., 2002; Reinhart et al., 2002; Rhoades et al., 2002; Elbashir et al., 2001a; Ambros et al., 2003). Of the more than 300 small RNAs isolated from Arabidopsis, only about 20 unique sequences have been reliably identified as miRNAs (Reinhart et al., 2002; Rhoades et al., 2002; Bartel & Bartel, 2003). In rice, 20 unique miRNAs that met the relevant criteria were identified from over 200 small RNAs (Wang et al., 2004).
The more challenging task, however, is to identify targets of miRNAs in order to determine the functions of the miRNAs. The observation that Arabidopsis miR171 has perfect antisense complementarity to three mRNAs encoding SCARECROW-like transcription factors (Llave et al., 2002; Reinhart et al., 2002) led Rhoades et al. to successfully identify annotated Arabidopsis mRNAs having perfect or near perfect complementarity to the cloned Arabidopsis miRNAs (Rhoades et al., 2002). Seventy-four Arabidopsis target genes were identified, representing 61 unique mRNAs (Reinhart et al., 2002; Rhoades et al., 2002; Bartel & Bartel, 2003). When the same computational analysis was applied to animals, animal miRNAs had significantly lower mRNA hits, suggesting that perfect or near perfect miRNA:mRNA pairing might be specific to plants and, thus, that mRNA cleavage is the prevalent mechanism for miRNA-guided gene silencing in plants.
Furthermore, miRNA:mRNA pairings were conserved between Arabidopsis and rice (Reinhart et al., 2002; Rhoades et al., 2002; Bartel & Bartel, 2003; Wang et al., 2004). The most striking discovery was that, in the 61 predicted targets, 40 are known or putative transcription factors. Most of these transcription factors are known to regulate or are associated with development, suggesting that miRNAs might help coordinate a wide range of cell division and differentiation associated activities throughout the plant (Bartel & Bartel, 2003; Bartel, 2004).
The approach to gene function characterization through the use of microRNAs (miRNAs) offers the potential for agriculture and tree crop improvement. The ability to modulate the expression of genes involved in important biochemical pathways (for example, lignin synthesis) allows for the manipulation of the plant genome to produce plants with advantageous characteristics (for example, lower lignin content). miRNAs provide a general approach to modulating gene expression in plants that can potentially be applied to any plant gene. Thus, some embodiments of the presently disclosed subject matter provide methods and compositions for modulating gene expression (for example, genes involved in lignin and/or cellulose synthesis) in plants (for example, trees, including but not limited to Populus trichocarpa and Pinus taeda).
II. Definitions
For convenience, certain terms employed in the specification, examples, and appended claims are collected here. While the following terms are believed to be well understood by one of ordinary skill in the art, the following definitions are set forth to facilitate explanation of the presently disclosed subject matter.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood to one of ordinary skill in the art to which the presently disclosed subject matter belongs.
Although any methods, devices, and materials similar or equivalent to those described herein can be used in the practice or testing of the presently disclosed subject matter, representative methods, devices, and materials are now described. Following long-standing patent law convention, the terms "a", "an", and "the" refer to "one or more" when used in this application, including the claims. Thus, the articles "a", "an", and "the" are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, "an element" refers to one element or more than one element.
As used herein, the term "about", when referring to a value or to an amount of mass, weight, time, volume, concentration, or percentage is meant to encompass variations of in some embodiments ±20% or ±10%, in some embodiments ±5%, in some embodiments ±1%, in some embodiments ±0.5%, and in some embodiments ±0.1% from the specified amount, as such variations are appropriate to practice the presently disclosed subject matter. Unless otherwise indicated, all numbers expressing quantities of ingredients, reaction conditions, and so forth used in the specification and claims are to be understood as being modified in all instances by the term "about". Accordingly, unless indicated to the contrary, the numerical parameters set forth in this specification and attached claims are approximations that can vary depending upon the desired properties sought to be obtained by the presently disclosed subject matter.
As used herein, the terms "amino acid" and "amino acid residue" are used interchangeably and refer to any of the twenty naturally occurring amino acids, as well as analogs, derivatives, and congeners thereof; amino acid analogs having variant side chains; and all stereoisomers of any of the foregoing. Thus, the term "amino acid" is intended to embrace all molecules, whether natural or synthetic, which include both an amino functionality and an acid functionality and are capable of being included in a polymer of naturally occurring amino acids.
An amino acid is formed upon chemical digestion (hydrolysis) of a polypeptide at its peptide linkages. The amino acid residues described herein are in some embodiments in the "L" isomeric form. However, residues in the "D" isomeric form can be substituted for any L-amino acid residue, as long as the desired functional property is retained by the polypeptide. NH2 refers to the free amino group present at the amino terminus of a polypeptide. COOH refers to the free carboxy group present at the carboxy terminus of a polypeptide. In keeping with standard polypeptide nomenclature, abbreviations for amino acid residues are shown in tabular form presented hereinabove.
It is noted that all amino acid residue sequences represented herein by formulae have a left-to-right orientation in the conventional direction of amino terminus to carboxy terminus. In addition, the phrases "amino acid" and "amino acid residue" are broadly defined to include modified and unusual amino acids.
Furthermore, it is noted that a dash at the beginning or end of an amino acid residue sequence indicates a peptide bond to a further sequence of one or more amino acid residues or a covalent bond to an amino-terminal group such as NH2 or acetyl or to a carboxy-terminal group such as COOH.
As used herein, the term "cell" is used in its usual biological sense. In some embodiments, the cell is present in an organism, for example, a plant including, but not limited to poplar, pine, eucalyptus, sweetgum, and other tree species; tobacco; Arabidopsis; rice; corn; wheat; cotton; potato; and cucumber. The cell can be eukaryotic (e.g., a plant cell, such as a tobacco cell or a cell from a tree) or prokaryotic (e.g. a bacterium). The cell can be of somatic or germ line origin, totipotent, pluripotent, or differentiated to any degree, dividing or non-dividing. The cell can also be derived from or can comprise a gamete or embryo, a stem cell, or a fully differentiated cell.
As used herein, the terms "host cells" and "recombinant host cells" are used interchangeably and refer to cells (for example, plant cells) into which the compositions of the presently disclosed subject matter (for example, an expression vector) can be introduced. Furthermore, the terms refer not only to the particular plant cell into which an expression construct is initially introduced, but also to the progeny or potential progeny of such a cell. Because certain modifications can occur in succeeding generations due to either mutation or environmental influences, such progeny might not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
As used herein, the term "gene" refers to a nucleic acid that encodes an RNA, for example, nucleic acid sequences including, but not limited to, structural genes encoding a polypeptide. The term "gene" also refers broadly to any segment of DNA associated with a biological function. As such, the term "gene" encompasses sequences including but not limited to a coding sequence, a promoter region, a transcriptional regulatory sequence, a non-expressed DNA segment that is a specific recognition sequence for regulatory proteins, a non-expressed DNA segment that contributes to gene expression, a DNA segment designed to have desired parameters, or combinations thereof. A gene can be obtained by a variety of methods, including cloning from a biological sample, synthesis based on known or predicted sequence information, and recombinant derivation from one or more existing sequences.
As is understood in the art, a gene typically comprises a coding strand and a non-coding strand. As used herein, the terms "coding strand" and "sense strand" are used interchangeably, and refer to a nucleic acid sequence that has the same sequence of nucleotides as an mRNA from which the gene product is translated. As is also understood in the art, when the coding strand and/or sense strand is used to refer to a DNA molecule, the coding/sense strand includes thymidine residues instead of the uridine residues found in the corresponding mRNA. Additionally, when used to refer to a DNA molecule, the coding/sense strand can also include additional elements not found in the mRNA including, but not limited to promoters, enhancers, and introns. Similarly, the terms "template strand" and "antisense strand" are used interchangeably and refer to a nucleic acid sequence that is complementary to the coding/sense strand. It should be noted, however, that for those genes that do not encode polypeptide products (for example, an miRNA gene), the term "coding strand" is used to refer to the strand comprising the miRNA. In this usage, the strand comprising the miRNA is a sense strand with respect to the miRNA precursor, but it would be antisense with respect to its target RNA (i.e. the miRNA hybridizes to the target RNA because it comprises a sequence that is antisense to the target RNA).
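The strand conventions defined above can be illustrated with a short sketch. It is not part of the disclosed methods; the function names and the 21-nt sequence are hypothetical and were invented for illustration only. The sketch writes an RNA sequence in "DNA form" (T instead of U) and derives the sequence to which a given miRNA would be perfectly antisense.

```python
# Complement table for the RNA alphabet (Watson-Crick partners only).
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def rna_to_dna_form(rna):
    """Write an RNA sequence in "DNA form", i.e., with T instead of U."""
    return rna.upper().replace("U", "T")


def reverse_complement_rna(rna):
    """Return the RNA sequence (5'->3') that pairs perfectly with `rna`.

    Base pairing is antiparallel, so the complement is reversed.
    """
    return rna.upper().translate(RNA_COMPLEMENT)[::-1]


# Hypothetical miRNA sequence, invented for illustration only.
mirna = "AUGGCUACGUUAGCCGAUCGA"

# The miRNA is antisense to its target, so a perfectly matched target site
# is the reverse complement of the miRNA.
target_site_rna = reverse_complement_rna(mirna)

# The same target site as it would appear in a "DNA form" sequence listing.
target_site_dna_form = rna_to_dna_form(target_site_rna)

print(target_site_rna)
print(target_site_dna_form)
```

Taking the reverse complement twice returns the original sequence, which gives a simple consistency check on the helper.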
As used herein, the terms "complementarity" and "complementary" refer to a nucleic acid that can form one or more hydrogen bonds with another nucleic acid sequence by either traditional Watson-Crick or other non-traditional types of interactions. In reference to the nucleic acid molecules of the presently disclosed subject matter, the binding free energy for a nucleic acid molecule with its complementary sequence is sufficient to allow the relevant function of the nucleic acid to proceed, in some embodiments, ribonuclease activity. For example, the degree of complementarity between the sense and antisense strands of an miRNA precursor can be the same or different from the degree of complementarity between the miRNA-containing strand of an miRNA precursor and the target nucleic acid sequence. Determination of binding free energies for nucleic acid molecules is well known in the art. See e.g., Freier et al., 1986; Turner et al., 1987.
As used herein, the phrase "percent complementarity" refers to the percentage of contiguous residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, 10 out of 10 being 50%, 60%, 70%, 80%, 90%, and 100% complementary). The terms "100% complementary", "fully complementary", and "perfectly complementary" indicate that all of the contiguous residues of a nucleic acid sequence can hydrogen bond with the same number of contiguous residues in a second nucleic acid sequence. As miRNAs are about 17-24 nt, and up to 5 mismatches (e.g., 1, 2, 3, 4, or 5 mismatches) are tolerated during miRNA-directed modulation of gene expression, a percent complementarity of at least about 70% between a target RNA and an miRNA should be sufficient for the miRNA to modulate the expression of the gene from which the target RNA was derived.
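The arithmetic behind the "at least about 70%" figure can be sketched as follows. This is a minimal illustration under stated assumptions, not a disclosed method: the function names are hypothetical, the two sequences are assumed to be of equal length and ungapped, and only Watson-Crick pairs are counted (G:U wobble pairs are treated as mismatches, which is a simplification chosen for this sketch).

```python
# Watson-Crick pairs only; G:U wobble pairs are treated as mismatches here,
# which is an assumption of this sketch rather than a rule taken from the text.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(mirna, target_site):
    """Percent of miRNA residues that pair with an equal-length target site.

    Both sequences are written 5'->3'. The target may be given as RNA or in
    "DNA form" (T instead of U). Pairing is antiparallel, so the miRNA is
    compared with the reversed target site.
    """
    m = mirna.upper().replace("T", "U")
    t = target_site.upper().replace("T", "U")
    if not m or len(m) != len(t):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((a, b) in WATSON_CRICK for a, b in zip(m, reversed(t)))
    return 100.0 * paired / len(m)


def minimum_percent(length, mismatches):
    """Percent complementarity remaining after a given number of mismatches."""
    return 100.0 * (length - mismatches) / length


# Hypothetical short sequences, invented for illustration only.
print(percent_complementarity("AUGC", "GCAT"))  # every position pairs
print(round(minimum_percent(17, 5), 1))         # 17-nt miRNA, 5 mismatches
print(round(minimum_percent(24, 5), 1))         # 24-nt miRNA, 5 mismatches
```

For the shortest miRNA length mentioned above (17 nt), five mismatches leave 12 of 17 residues paired, or about 70.6%, which is consistent with the stated threshold of at least about 70%; longer miRNAs with the same number of mismatches give higher values.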
The term "gene expression" generally refers to the cellular processes by which a biologically active polypeptide is produced from a DNA sequence and exhibits a biological activity in a cell. As such, gene expression involves the processes of transcription and translation, but also involves post- transcriptional and post-translational processes that can influence a biological activity of a gene or gene product. These processes include, but are not limited to RNA synthesis, processing, and transport, as well as polypeptide synthesis, transport, and post-translational modification of polypeptides. Additionally, processes that affect protein-protein interactions within the cell can also affect gene expression as defined herein.
However, in the case of genes that do not encode protein products, for example miRNA genes, the term "gene expression" refers to the processes by which a precursor miRNA is produced from the gene. Typically, this process is referred to as transcription, although unlike the transcription directed by RNA polymerase II for protein-coding genes, the transcription products of an miRNA gene are not translated to produce a protein. Nonetheless, the production of a mature miRNA from an miRNA gene is encompassed by the term "gene expression" as that term is used herein.
As used herein, the term "isolated" refers to a molecule substantially free of other nucleic acids, proteins, lipids, carbohydrates, and/or other materials with which it is normally associated, such association being either in cellular material or in a synthesis medium. Thus, the term "isolated nucleic acid" refers to a ribonucleic acid molecule or a deoxyribonucleic acid molecule (for example, a genomic DNA, cDNA, mRNA, miRNA, etc.) of natural or synthetic origin or some combination thereof, which (1) is not associated with the cell in which the "isolated nucleic acid" is found in nature, or (2) is operatively linked to a polynucleotide to which it is not linked in nature. Similarly, the term "isolated polypeptide" refers to a polypeptide, in some embodiments prepared from recombinant DNA or RNA, or of synthetic origin, or some combination thereof, which (1) is not associated with proteins that it is normally found with in nature, (2) is isolated from the cell in which it normally occurs, (3) is isolated free of other proteins from the same cellular source, (4) is expressed by a cell from a different species, or (5) does not occur in nature. The term "isolated", when used in the context of an "isolated cell", refers to a cell that has been removed from its natural environment, for example, as a part of an organ, tissue, or organism.
As used herein, the terms "label" and "labeled" refer to the attachment of a moiety, capable of detection by spectroscopic, radiologic, or other methods, to a probe molecule. Thus, the terms "label" or "labeled" refer to incorporation or attachment, optionally covalently or non-covalently, of a detectable marker into a molecule, such as a polypeptide. Various methods of labeling polypeptides are known in the art and can be used. Examples of labels for polypeptides include, but are not limited to, the following: radioisotopes, fluorescent labels, heavy atoms, enzymatic labels or reporter genes, chemiluminescent groups, biotinyl groups, predetermined polypeptide epitopes recognized by a secondary reporter (e.g., leucine zipper pair sequences, binding sites for antibodies, metal binding domains, epitope tags). In some embodiments, labels are attached by spacer arms of various lengths to reduce potential steric hindrance.
As used herein, the term "modulate" refers to an increase, decrease, or other alteration of any, or all, chemical and biological activities or properties of a biochemical entity, e.g., a wild-type or mutant nucleic acid molecule. For example, the term "modulate" can refer to a change in the expression level of a gene or a level of an RNA molecule or equivalent RNA molecules encoding one or more proteins or protein subunits; or to an activity of one or more proteins or protein subunits that is upregulated or down regulated, such that expression, level, or activity is greater than or less than that observed in the absence of the modulator. For example, the term "modulate" can mean "inhibit" or "suppress", but the use of the word "modulate" is not limited to this definition.
As used herein, the terms "inhibit", "suppress", "down regulate", and grammatical variants thereof are used interchangeably and refer to an activity whereby gene expression or a level of an RNA encoding one or more gene products is reduced below that observed in the absence of a nucleic acid molecule of the presently disclosed subject matter. In some embodiments, inhibition with an miRNA molecule results in a decrease in the steady state expression level of a target RNA. In some embodiments, inhibition with an miRNA molecule results in an expression level of a target gene that is below that level observed in the presence of an inactive or attenuated molecule that is unable to downregulate the expression level of the target. In some embodiments, inhibition of gene expression with an miRNA molecule of the presently disclosed subject matter is greater in the presence of the miRNA molecule than in its absence. In some embodiments, inhibition of gene expression is associated with an enhanced rate of degradation of the mRNA encoded by the gene (for example, by miRNA-mediated inhibition of gene expression). The term "modulation" as used herein refers to both upregulation (i.e., activation or stimulation) and downregulation (i.e., inhibition or suppression) of a response. Thus, the term "modulation", when used in reference to a functional property or biological activity or process (e.g., enzyme activity or receptor binding), refers to the capacity to upregulate (e.g., activate or stimulate), downregulate (e.g., inhibit or suppress), or otherwise change a quality of such property, activity, or process. In certain instances, such regulation can be contingent on the occurrence of a specific event, such as activation of a signal transduction pathway, and/or can be manifest only in particular cell types. 
The term "modulator" refers to a polypeptide, nucleic acid, macromolecule, complex, molecule, small molecule, compound, species, or the like (naturally occurring or non-naturally occurring), or an extract made from biological materials such as bacteria, plants, fungi, or animal cells or tissues, that can be capable of causing modulation. Modulators can be evaluated for potential activity as inhibitors or activators (directly or indirectly) of a functional property, biological activity or process, or a combination thereof (e.g., agonist, partial antagonist, partial agonist, inverse agonist, antagonist, anti-microbial agents, inhibitors of microbial infection or proliferation, and the like), by inclusion in assays. In such assays, many modulators can be screened at one time. The activity of a modulator can be known, unknown, or partially known.
Modulators can be either selective or non-selective. As used herein, the term "selective" when used in the context of a modulator (e.g. an inhibitor) refers to a measurable or otherwise biologically relevant difference in the way the modulator interacts with one molecule (e.g. a target RNA of interest) versus another similar but not identical molecule (e.g. an RNA derived from a member of the same gene family as the target RNA of interest).
It must be understood that for a modulator to be considered a selective modulator, the nature of its interaction with a target need not entirely exclude its interaction with other molecules related to the target (e.g. transcripts from family members other than the target itself). Stated another way, the term selective modulator is not intended to be limited to those molecules that only bind to mRNA transcripts from a gene of interest and not to those of related family members. The term is also intended to include modulators that can interact with transcripts from genes of interest and from related family members, but for which it is possible to design conditions under which the differential interactions with the targets versus the family members have a biologically relevant outcome. Such conditions can include, but are not limited to differences in the degree of sequence identity between the modulator and the family members, and the use of the modulator in a specific tissue or cell type that expresses some but not all family members. Under the latter set of conditions, a modulator might be considered selective to a given target in a given tissue if it interacts with that target to cause a biologically relevant effect despite the fact that in another tissue that expresses additional family members the modulator and the target would not interact to cause a biological effect at all because the modulator would be "soaked out" of the tissue by the presence of other family members.
When a selective modulator is identified, the modulator binds to one molecule (for example an mRNA transcript of a gene of interest) in a manner that is different (for example, stronger) from the way it binds to another molecule (for example, an mRNA transcript of a gene related to the gene of interest). As used herein, the modulator is said to display "selective binding" or "preferential binding" to the molecule to which it binds more strongly as compared to some other possible molecule to which the modulator might bind.
As used herein, the term "mutation" carries its traditional connotation and refers to a change, inherited, naturally occurring, or introduced, in a nucleic acid or polypeptide sequence, and is used in its sense as generally known to those of skill in the art. The term "naturally occurring", as applied to an object, refers to the fact that an object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including bacteria) that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally occurring. It must be understood, however, that any manipulation by the hand of man can render a "naturally occurring" object an "isolated" object as that term is used herein.
As used herein, the terms "nucleic acid" and "nucleic acid molecule" refer to any of deoxyribonucleic acid (DNA), ribonucleic acid (RNA), oligonucleotides, fragments generated by the polymerase chain reaction (PCR), and fragments generated by any of ligation, scission, endonuclease action, and exonuclease action. Nucleic acids can be composed of monomers that are naturally occurring nucleotides (such as deoxyribonucleotides and ribonucleotides), or analogs of naturally occurring nucleotides (e.g., α-enantiomeric forms of naturally occurring nucleotides), or a combination of both. Modified nucleotides can have modifications in sugar moieties and/or in pyrimidine or purine base moieties. Sugar modifications include, for example, replacement of one or more hydroxyl groups with halogens, alkyl groups, amines, and azido groups, or sugars can be functionalized as ethers or esters. Moreover, the entire sugar moiety can be replaced with sterically and electronically similar structures, such as aza-sugars and carbocyclic sugar analogs. Examples of modifications in a base moiety include alkylated purines and pyrimidines, acylated purines or pyrimidines, or other well-known heterocyclic substitutes. Nucleic acid monomers can be linked by phosphodiester bonds or analogs of such linkages. Analogs of phosphodiester linkages include phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoranilidate, phosphoramidate, and the like.
The term "nucleic acid" also includes so-called "peptide nucleic acids", which comprise naturally occurring or modified nucleic acid bases attached to a polyamide backbone. Nucleic acids can be either single stranded or double stranded. The term "operatively linked", when describing the relationship between two nucleic acid regions, refers to a juxtaposition wherein the regions are in a relationship permitting them to function in their intended manner. For example, a control sequence "operatively linked" to a coding sequence can be ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences, such as when the appropriate molecules (e.g., inducers and polymerases) are bound to the control or regulatory sequence(s). Thus, in some embodiments, the phrase "operatively linked" refers to a promoter connected to a coding sequence in such a way that the transcription of that coding sequence is controlled and regulated by that promoter. Techniques for operatively linking a promoter to a coding sequence are well known in the art; the precise orientation and location relative to a coding sequence of interest is dependent, inter alia, upon the specific nature of the promoter.
Thus, the term "operatively linked" can refer to a promoter region that is connected to a nucleotide sequence in such a way that the transcription of that nucleotide sequence is controlled and regulated by that promoter region. Similarly, a nucleotide sequence is said to be under the "transcriptional control" of a promoter to which it is operatively linked. Techniques for operatively linking a promoter region to a nucleotide sequence are known in the art.
The term "operatively linked" can also refer to a transcription termination sequence that is connected to a nucleotide sequence in such a way that termination of transcription of that nucleotide sequence is controlled by that transcription termination sequence. In some embodiments, a transcription termination sequence comprises a sequence that causes transcription by an RNA polymerase III to terminate at the third or fourth T in the terminator sequence, TTTTTTT. Therefore the nascent small transcript has 3 or 4 U's at the 3' terminus.
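The termination behavior described above can be illustrated with a minimal sketch. It assumes the transcribed region is supplied as sense-strand DNA immediately upstream of the terminator; the function name `pol3_transcript` and its parameters are hypothetical and are not part of the disclosure.

```python
def pol3_transcript(coding_dna: str, terminator: str = "TTTTTTT", stop_at: int = 4) -> str:
    """Return the RNA made when transcription stops at a given T of the terminator.

    coding_dna: sense-strand DNA upstream of the terminator (hypothetical input).
    stop_at: 3 or 4, i.e. the third or fourth T of the terminator run.
    """
    if stop_at not in (3, 4):
        raise ValueError("stop_at must be 3 or 4")
    if set(terminator) != {"T"} or len(terminator) < stop_at:
        raise ValueError("terminator must be a run of T's at least stop_at long")
    # The transcript covers the upstream sequence plus the first stop_at T's.
    transcribed = coding_dna.upper() + terminator[:stop_at]
    # Sense-strand DNA to RNA: replace T with U.
    return transcribed.replace("T", "U")


if __name__ == "__main__":
    print(pol3_transcript("GATTACA", stop_at=3))
    print(pol3_transcript("GATTACA", stop_at=4))
```

With `stop_at=3` the transcript ends in three U's and with `stop_at=4` in four, which corresponds to the 3 or 4 U's at the 3' terminus noted in the text.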
The phrases "percent identity" and "percent identical," in the context of two nucleic acid or protein sequences, refer to two or more sequences or subsequences that have in some embodiments at least 60%, in some embodiments at least 70%, in some embodiments at least 80%, in some embodiments at least 85%, in some embodiments at least 90%, in some embodiments at least 95%, in some embodiments at least 98%, and in some embodiments at least 99% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. The percent identity exists in some embodiments over a region of the sequences that is at least about 50 residues in length, in some embodiments over a region of at least about 100 residues, and in some embodiments the percent identity exists over at least about 150 residues. In some embodiments, the percent identity exists over the entire length of a given region, such as a coding region.
For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
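As a minimal sketch of the final step, percent identity can be computed from two sequences that have already been aligned. The sketch assumes that gaps are written as "-" and that identity is reported over all alignment columns (one of several conventions in use); the function name `percent_identity` is hypothetical, and the alignment itself would come from one of the algorithms named in this specification.

```python
def percent_identity(aligned_ref: str, aligned_test: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: the denominator is the number of alignment columns,
    excluding columns where both sequences have a gap.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_ref.upper(), aligned_test.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 7 of 8 columns match
    print(percent_identity("ACG-T", "ACGTT"))        # 4 of 5 columns match
```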
Optimal alignment of sequences for comparison can be conducted, for example, by the local homology algorithm described in Smith & Waterman, 1981, by the homology alignment algorithm described in Needleman & Wunsch, 1970, by the search for similarity method described in Pearson & Lipman, 1988, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG® WISCONSIN PACKAGE®, available from Accelrys, Inc., San Diego, California, United States of America), or by visual inspection. See generally, Ausubel et al., 1989. One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., 1990. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information via the World Wide Web. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., 1990). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached.
The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M = 5, N = -4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. See Henikoff & Henikoff, 1992. In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences. See e.g., Karlin & Altschul 1993. One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence is in some embodiments less than about 0.1, in some embodiments less than about 0.01, and in some embodiments less than about 0.001.
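The extension of a word hit and its three stopping conditions can be sketched as follows. This is a simplified, ungapped, rightward-only illustration and not the BLAST implementation itself: the function name `extend_right`, the assumption that the seed is an exact match, and the drop-off value `x_drop=20` are all hypothetical, while the reward M = 5 and penalty N = -4 follow the nucleotide values given in the text.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 word_len: int, match: int = 5, mismatch: int = -4,
                 x_drop: int = 20) -> tuple[int, int]:
    """Extend an exact seed word hit to the right without gaps.

    Returns (best_score, best_length), where best_length is the number of
    aligned positions, counted from the start of the seed, at the best score.
    """
    # Assumption: the seed is an exact match of word_len residues.
    score = best = match * word_len
    best_len = word_len
    i = word_len
    # Stop condition 3: the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += match if query[q_start + i] == subject[s_start + i] else mismatch
        i += 1
        if score > best:
            best, best_len = score, i
        # Stop condition 1: score has fallen off by X from its maximum.
        # Stop condition 2: cumulative score is zero or below.
        elif best - score >= x_drop or score <= 0:
            break
    return best, best_len


if __name__ == "__main__":
    # Seed "ACGT" at position 0 of both sequences; the match runs for 8 bases.
    print(extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0, word_len=4))
```

A full extension would apply the same loop in the leftward direction and combine the two results; a smaller `x_drop` ends the search sooner at the cost of sensitivity.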
The term "substantially identical", in the context of two nucleotide sequences, refers to two or more sequences or subsequences that have in some embodiments at least about 70% nucleotide identity, in some embodiments at least about 75% nucleotide identity, in some embodiments at least about 80% nucleotide identity, in some embodiments at least about 85% nucleotide identity, in some embodiments at least about 90% nucleotide identity, in some embodiments at least about 95% nucleotide identity, in some embodiments at least about 97% nucleotide identity, and in some embodiments at least about 99% nucleotide identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. In one example, the substantial identity exists in nucleotide sequences of at least 17 residues, in some embodiments in nucleotide sequence of at least about 18 residues, in some embodiments in nucleotide sequence of at least about 19 residues, in some embodiments in nucleotide sequence of at least about 20 residues, in some embodiments in nucleotide sequence of at least about 21 residues, in some embodiments in nucleotide sequence of at least about 22 residues, in some embodiments in nucleotide sequence of at least about 23 residues, in some embodiments in nucleotide sequence of at least about 24 residues, in some embodiments in nucleotide sequence of at least about 25 residues, in some embodiments in nucleotide sequence of at least about 26 residues, in some embodiments in nucleotide sequence of at least about 27 residues, in some embodiments in nucleotide sequence of at least about 30 residues, in some embodiments in nucleotide sequence of at least about 50 residues, in some embodiments in nucleotide sequence of at least about 75 residues, in some embodiments in nucleotide sequence of at least about 100 residues, in some embodiments in nucleotide sequences of at least about 150 residues, and in yet another example in nucleotide sequences comprising complete coding sequences. In some embodiments, polymorphic sequences can be substantially identical sequences. The term "polymorphic" refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population. An allelic difference can be as small as one base pair. Nonetheless, one of ordinary skill in the art would recognize that the polymorphic sequences correspond to the same gene.
Another indication that two nucleotide sequences are substantially identical is that the two molecules specifically or substantially hybridize to each other under stringent conditions. In the context of nucleic acid hybridization, two nucleic acid sequences being compared can be designated a "probe sequence" and a "test sequence". A "probe sequence" is a reference nucleic acid molecule, and a "test sequence" is a test nucleic acid molecule, often found within a heterogeneous population of nucleic acid molecules.
An exemplary nucleotide sequence employed for hybridization studies or assays includes probe sequences that are complementary to or mimic in some embodiments at least an about 14 to 40 nucleotide sequence of a nucleic acid molecule of the presently disclosed subject matter. In one example, probes comprise 14 to 20 nucleotides, or even longer where desired, such as 30, 40, 50, 60, 100, 200, 300, or 500 nucleotides or up to the full length of a given gene. Such fragments can be readily prepared by, for example, directly synthesizing the fragment by chemical synthesis, by application of nucleic acid amplification technology, or by introducing selected sequences into recombinant vectors for recombinant production.
The phrase "hybridizing specifically to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex nucleic acid mixture (e.g., total cellular DNA or RNA).
By way of non-limiting example, hybridization can be carried out in 5x SSC, 4x SSC, 3x SSC, 2x SSC, 1x SSC, or 0.2x SSC for at least about 1 hour, 2 hours, 5 hours, 12 hours, or 24 hours (see Sambrook & Russell, 2001, for a description of SSC buffer and other hybridization conditions). The temperature of the hybridization can be increased to adjust the stringency of the reaction, for example, from about 25°C (room temperature), to about 45°C, 50°C, 55°C, 60°C, or 65°C. The hybridization reaction can also include another agent affecting the stringency; for example, hybridization conducted in the presence of 50% formamide increases the stringency of hybridization at a defined temperature.
The hybridization reaction can be followed by a single wash step, or two or more wash steps, which can be at the same or a different salinity and temperature. For example, the temperature of the wash can be increased to adjust the stringency from about 25°C (room temperature), to about 45°C, 50°C, 55°C, 60°C, 65°C, or higher. The wash step can be conducted in the presence of a detergent, e.g., SDS. For example, hybridization can be followed by two wash steps at 65°C each for about 20 minutes in 2x SSC, 0.1% SDS, and optionally two additional wash steps at 65°C each for about 20 minutes in 0.2x SSC, 0.1% SDS.
The following are examples of hybridization and wash conditions that can be used to clone homologous nucleotide sequences that are substantially identical to reference nucleotide sequences of the presently disclosed subject matter: a probe nucleotide sequence hybridizes in one example to a target nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5M NaPO4, 1 mM ethylenediamine tetraacetic acid (EDTA) at 50°C followed by washing in 2X SSC, 0.1% SDS at 50°C; in some embodiments, a probe and test sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5M NaPO4, 1 mM EDTA at 50°C followed by washing in 1X SSC, 0.1% SDS at 50°C; in some embodiments, a probe and test sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5M NaPO4, 1 mM EDTA at 50°C followed by washing in 0.5X SSC, 0.1% SDS at 50°C; in some embodiments, a probe and test sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5M NaPO4, 1 mM EDTA at 50°C followed by washing in 0.1X SSC, 0.1% SDS at 50°C; in yet another example, a probe and test sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5M NaPO4, 1 mM EDTA at 50°C followed by washing in 0.1X SSC, 0.1% SDS at 65°C.
Additional exemplary stringent hybridization conditions include overnight hybridization at 42°C in a solution comprising or consisting of 50% formamide, 10x Denhardt's (0.2% Ficoll, 0.2% polyvinylpyrrolidone, 0.2% bovine serum albumin) and 200 mg/ml of denatured carrier DNA, e.g., sheared salmon sperm DNA, followed by two wash steps at 65°C each for about 20 minutes in 2x SSC, 0.1% SDS, and two wash steps at 65°C each for about 20 minutes in 0.2x SSC, 0.1% SDS.
Hybridization can include hybridizing two nucleic acids in solution, or a nucleic acid in solution to a nucleic acid attached to a solid support, e.g., a filter. When one nucleic acid is on a solid support, a prehybridization step can be conducted prior to hybridization. Prehybridization can be carried out for at least about 1 hour, 3 hours, or 10 hours in the same solution and at the same temperature as the hybridization (but without the complementary polynucleotide strand). Thus, upon a review of the present disclosure, stringency conditions are known to those skilled in the art or can be determined experimentally by the skilled artisan. See e.g., Ausubel et al., 1989; Sambrook & Russell, 2001; Agrawal, 1993; Tijssen, 1993; Tibanyenda et al., 1984; and Ebel et al., 1992.
The phrase "hybridizing substantially to" refers to complementary hybridization between a probe nucleic acid molecule and a target nucleic acid molecule and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired hybridization.
The term "phenotype" refers to the entire physical, biochemical, and physiological makeup of a cell or an organism, e.g., having any one trait or any group of traits. As such, phenotypes result from the expression of genes within a cell or an organism, and relate to traits that are potentially observable or assayable.
As used herein, the terms "polypeptide", "protein", and "peptide", which are used interchangeably herein, refer to a polymer of the 20 protein amino acids, or amino acid analogs, regardless of its size or function.
Although "protein" is often used in reference to relatively large polypeptides, and "peptide" is often used in reference to small polypeptides, usage of these terms in the art overlaps and varies. The term "polypeptide" as used herein refers to peptides, polypeptides and proteins, unless otherwise noted. As used herein, the terms "protein", "polypeptide", and "peptide" are used interchangeably herein when referring to a gene product. The term "polypeptide" encompasses proteins of all functions, including enzymes. Thus, exemplary polypeptides include gene products, naturally occurring proteins, homologs, orthologs, paralogs, fragments, and other equivalents, variants and analogs of the foregoing.
The terms "polypeptide fragment" or "fragment", when used in reference to a reference polypeptide, refer to a polypeptide in which amino acid residues are deleted as compared to the reference polypeptide itself, but where the remaining amino acid sequence is usually identical to the corresponding positions in the reference polypeptide. Such deletions can occur at the amino-terminus or carboxy-terminus of the reference polypeptide, or alternatively both. Fragments typically are at least 5, 6, 8 or 10 amino acids long, at least 14 amino acids long, at least 20, 30, 40 or 50 amino acids long, at least 75 amino acids long, or at least 100, 150, 200, 300, 500 or more amino acids long. A fragment can retain one or more of the biological activities of the reference polypeptide. Further, fragments can include a sub-fragment of a specific region, which sub-fragment retains a function of the region from which it is derived.
As used herein, the term "primer" refers to a sequence comprising in some embodiments two or more deoxyribonucleotides or ribonucleotides, in some embodiments more than three, in some embodiments more than eight, and in some embodiments at least about 20 nucleotides of an exonic or intronic region. Such oligonucleotides are in some embodiments between ten and thirty bases in length.
The term "purified" refers to an object species that is the predominant species present (i.e., on a molar basis it is more abundant than any other individual species in the composition). A "purified fraction" is a composition wherein the object species comprises at least about 50 percent (on a molar basis) of all species present. In making the determination of the purity of a species in solution or dispersion, the solvent or matrix in which the species is dissolved or dispersed is usually not included in such determination; instead, only the species (including the one of interest) dissolved or dispersed are taken into account. Generally, a purified composition will have one species that comprises more than about 80 percent of all species present in the composition, more than about 85%, 90%, 95%, 99% or more of all species present. The object species can be purified to essential homogeneity (contaminant species cannot be detected in the composition by conventional detection methods) wherein the composition consists essentially of a single species. A skilled artisan can purify a polypeptide of the presently disclosed subject matter using standard techniques for protein purification in light of the teachings herein. Purity of a polypeptide can be determined by a number of methods known to those of skill in the art, including for example, amino-terminal amino acid sequence analysis, gel electrophoresis, and mass-spectrometry analysis.
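The molar-basis criteria above can be made concrete with a short sketch. It assumes that amounts are given in moles per species and that the solvent or matrix is named explicitly so it can be excluded; the function names, the species labels, and the 0.5 threshold argument are hypothetical illustrations of the definitions, not a prescribed assay.

```python
def molar_fraction(moles: dict[str, float], species: str,
                   solvent: tuple[str, ...] = ()) -> float:
    """Molar fraction of one species among all non-solvent species."""
    solutes = {name: n for name, n in moles.items() if name not in solvent}
    return solutes[species] / sum(solutes.values())


def is_purified(moles: dict[str, float], species: str,
                solvent: tuple[str, ...] = ()) -> bool:
    """True if the species is more abundant (molar basis) than any other solute."""
    solutes = {name: n for name, n in moles.items() if name not in solvent}
    return all(solutes[species] > n for name, n in solutes.items() if name != species)


def is_purified_fraction(moles: dict[str, float], species: str,
                         solvent: tuple[str, ...] = (),
                         threshold: float = 0.5) -> bool:
    """True if the species is at least `threshold` of all solutes (molar basis)."""
    return molar_fraction(moles, species, solvent) >= threshold


if __name__ == "__main__":
    # Hypothetical composition in moles; water is the solvent and is ignored.
    sample = {"water": 55.0, "proteinA": 0.006, "proteinB": 0.003, "proteinC": 0.001}
    print(is_purified(sample, "proteinA", solvent=("water",)))
    print(is_purified_fraction(sample, "proteinA", solvent=("water",)))
```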
A "reference sequence" is a defined sequence used as a basis for a sequence comparison. A reference sequence can be a subset of a larger sequence, for example, as a segment of a full-length nucleotide or amino acid sequence, or can comprise a complete sequence. Generally, when used to refer to a nucleotide sequence, a reference sequence is at least 200, 300 or 400 nucleotides in length, frequently at least 600 nucleotides in length, and often at least 800 nucleotides in length. Because two proteins can each (1) comprise a sequence (i.e., a portion of the complete protein sequence) that is similar between the two proteins, and (2) can further comprise a sequence that is divergent between the two proteins, sequence comparisons between two (or more) proteins are typically performed by comparing sequences of the two proteins over a "comparison window" (defined hereinabove) to identify and compare local regions of sequence similarity.
The term "regulatory sequence" is a generic term used throughout the specification to refer to polynucleotide sequences, such as initiation signals, enhancers, regulators, promoters, and termination sequences, which are necessary or desirable to affect the expression of coding and non-coding sequences to which they are operatively linked. Exemplary regulatory sequences are described in Goeddel, 1990, and include, for example, the early and late promoters of simian virus 40 (SV40), adenovirus or cytomegalovirus immediate early promoter, the lac system, the trp system, the TAC or TRC system, T7 promoter whose expression is directed by T7 RNA polymerase, the major operator and promoter regions of phage lambda, the control regions for fd coat protein, the promoter for 3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid phosphatase, e.g., Pho5, the promoters of the yeast α-mating factors, the polyhedron promoter of the baculovirus system and other sequences known to control the expression of genes of prokaryotic or eukaryotic cells or their viruses, and various combinations thereof. The nature and use of such control sequences can differ depending upon the host organism. In prokaryotes, such regulatory sequences generally include promoter, ribosomal binding site, and transcription termination sequences. The term "regulatory sequence" is intended to include, at a minimum, components the presence of which can influence expression, and can also include additional components the presence of which is advantageous, for example, leader sequences and fusion partner sequences.
In certain embodiments, transcription of a polynucleotide sequence is under the control of a promoter sequence (or other regulatory sequence) that controls the expression of the polynucleotide in a cell-type in which expression is intended. It will also be understood that the polynucleotide can be under the control of regulatory sequences that are the same or different from those sequences which control expression of the naturally occurring form of the polynucleotide. In some embodiments, a promoter sequence is a DNA-dependent RNA polymerase III promoter (e.g., a promoter for an H1, 5S, or U6 gene, or an Arabidopsis thaliana At7SL4 gene promoter, such as that disclosed as SEQ ID NO: 162). In some embodiments, a promoter sequence is selected from the group consisting of an adenovirus VA1 promoter sequence, a Vault promoter sequence, a telomerase RNA promoter sequence, and a tRNA gene promoter sequence. It is understood that the entire promoter identified for any promoter (for example, the promoters listed herein) need not be employed, and that a functional derivative thereof can be used. As used herein, the phrase "functional derivative" refers to a nucleic acid sequence that comprises sufficient sequence to direct transcription of another operatively linked nucleic acid molecule. As such, a "functional derivative" can function as a minimal promoter, as that term is defined herein.
Termination of transcription of a polynucleotide sequence is typically regulated by an operatively linked transcription termination sequence (for example, an RNA polymerase III termination sequence). In certain instances, transcriptional terminators are also responsible for correct mRNA polyadenylation. The 3' non-transcribed regulatory DNA sequence includes from in some embodiments about 50 to about 1,000, and in some embodiments about 100 to about 1,000, nucleotide base pairs and contains plant transcriptional and translational termination sequences. Appropriate transcriptional terminators and those that are known to function in plants include the cauliflower mosaic virus (CaMV) 35S terminator, the tml terminator, the nopaline synthase terminator, the pea rbcS E9 terminator, the terminator for the T7 transcript from the octopine synthase gene of Agrobacterium tumefaciens, and the 3' end of the protease inhibitor I or II genes from potato or tomato, although other 3' elements known to those of skill in the art can also be employed. Alternatively, a gamma coixin, oleosin 3, or other terminator from the genus Coix can be used. In some embodiments, an RNA polymerase III termination sequence comprises the nucleotide sequence TTTTTTT.
The term "reporter gene" refers to a nucleic acid comprising a nucleotide sequence encoding a protein that is readily detectable either by its presence or activity, including, but not limited to, luciferase, fluorescent protein (e.g., green fluorescent protein), chloramphenicol acetyl transferase, β-galactosidase, secreted placental alkaline phosphatase, β-lactamase, human growth hormone, and other secreted enzyme reporters. Generally, a reporter gene encodes a polypeptide not otherwise produced by the host cell, which is detectable by analysis of the cell(s), e.g., by the direct fluorometric, radioisotopic or spectrophotometric analysis of the cell(s) and typically without the need to kill the cells for signal analysis. In certain instances, a reporter gene encodes an enzyme, which produces a change in fluorometric properties of the host cell, which is detectable by qualitative, quantitative, or semiquantitative function or transcriptional activation. Exemplary enzymes include esterases, β-lactamase, phosphatases, peroxidases, proteases (tissue plasminogen activator or urokinase), and other enzymes whose function can be detected by appropriate chromogenic or fluorogenic substrates known to those skilled in the art or developed in the future.
As used herein, the term "sequencing" refers to determining the ordered linear sequence of nucleic acids or amino acids of a DNA, RNA, or protein target sample, using conventional manual or automated laboratory techniques.
As used herein, the term "substantially pure" means that the polynucleotide or polypeptide is substantially free of the sequences and molecules with which it is associated in its natural state, and of those molecules used in the isolation procedure. The term "substantially free" means that the sample is in some embodiments at least 50%, in some embodiments at least 70%, in some embodiments 80%, and in some embodiments 90% free of the materials and compounds with which it is associated in nature.
As used herein, the term "target cell" refers to a cell, into which it is desired to insert a nucleic acid sequence or polypeptide, or to otherwise effect a modification from conditions known to be standard in the unmodified cell. A nucleic acid sequence introduced into a target cell can be of variable length. Additionally, a nucleic acid sequence can enter a target cell as a component of a plasmid or other vector or as a naked sequence.
As used herein, the term "target gene" refers to a gene expressed in a cell the expression of which is targeted for modulation using the methods and compositions of the presently disclosed subject matter. A target gene, therefore, comprises a nucleic acid sequence the expression level of which is downregulated by an miRNA. Similarly, the terms "target RNA" or "target mRNA" refer to the transcript of a target gene to which the miRNA is intended to bind, leading to modulation of the expression of the target gene. The target gene can be a gene derived from a cell, an endogenous gene, a transgene, or exogenous genes such as genes of a pathogen, for example a virus, which is present in the cell after infection thereof. The cell containing the target gene can be derived from or contained in any organism, for example a plant, animal, protozoan, virus, bacterium, or fungus.
As used herein, the term "transcription" refers to a cellular process involving the interaction of an RNA polymerase with a gene that directs the expression as RNA of the structural information present in the coding sequences of the gene. The process includes, but is not limited to, the following steps: (a) transcription initiation; (b) transcript elongation; (c) transcript splicing; (d) transcript capping; (e) transcript termination; (f) transcript polyadenylation; (g) nuclear export of the transcript; (h) transcript editing; and (i) stabilizing the transcript.
As used herein, the term "transcription factor" refers to a cytoplasmic or nuclear protein which binds to a gene, or binds to an RNA transcript of a gene, or binds to another protein which binds to a gene or an RNA transcript or another protein which in turn binds to a gene or an RNA transcript, so as to thereby modulate expression of the gene. Such modulation can additionally be achieved by other mechanisms; the essence of a "transcription factor for a gene" pertains to a factor that alters the level of transcription of the gene in some way.
The term "transfection" refers to the introduction of a nucleic acid, e.g., an expression vector, into a recipient cell, which in certain instances involves nucleic acid-mediated gene transfer. The term "transformation" refers to a process in which a cell's genotype is changed as a result of the cellular uptake of exogenous nucleic acid. For example, a transformed cell can express a recombinant form of a polypeptide of the presently disclosed subject matter. The transformation of a cell with an exogenous nucleic acid (for example, an expression vector) can be characterized as transient or stable. As used herein, the term "stable" refers to a state of persistence that is of a longer duration than that which would be understood in the art as "transient". These terms can be used both in the context of the transformation of cells (for example, a stable transformation), or for the expression of a transgene (for example, the stable expression of a vector-encoded miRNA) in a transgenic cell. In some embodiments, a stable transformation results in the incorporation of the exogenous nucleic acid molecule (for example, an expression vector) into the genome of the transformed cell. As a result, when the cell divides, the vector DNA is replicated along with the plant genome so that progeny cells also contain the exogenous DNA in their genomes.
In some embodiments, the term "stable expression" relates to expression of a nucleic acid molecule (for example, a vector-encoded miRNA) over time. Thus, stable expression requires that the cell into which the exogenous DNA is introduced express the encoded nucleic acid at a consistent level over time. Additionally, stable expression can occur over the course of generations. When the expressing cell divides, at least a fraction of the resulting daughter cells can also express the encoded nucleic acid, and at about the same level. It should be understood that it is not necessary that every cell derived from the cell into which the vector was originally introduced express the nucleic acid molecule of interest. Rather, particularly in the context of a whole plant, the term "stable expression" requires only that the nucleic acid molecule of interest be stably expressed in tissue(s) and/or location(s) of the plant in which expression is desired. In some embodiments, stable expression of an exogenous nucleic acid is achieved by the integration of the nucleic acid into the genome of the host cell.
The term "vector" refers to a nucleic acid capable of transporting another nucleic acid to which it has been linked. One type of vector that can be used in accord with the presently disclosed subject matter is an Agrobacterium binary vector, i.e., a nucleic acid capable of integrating the nucleic acid sequence of interest into the host cell (for example, a plant cell) genome. Other vectors include those capable of autonomous replication and expression of nucleic acids to which they are linked. Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as "expression vectors". In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. In the present specification, "plasmid" and "vector" are used interchangeably as the plasmid is the most commonly used form of vector. However, the presently disclosed subject matter is intended to include such other forms of expression vectors which serve equivalent functions and which become known in the art subsequently hereto.
The term "expression vector" as used herein refers to a DNA sequence capable of directing expression of a particular nucleotide sequence in an appropriate host cell, comprising a promoter operatively linked to the nucleotide sequence of interest which is operatively linked to transcription termination sequences. It also typically comprises sequences required for proper translation of the nucleotide sequence. The construct comprising the nucleotide sequence of interest can be chimeric. The construct can also be one that is naturally occurring but has been obtained in a recombinant form useful for heterologous expression. The nucleotide sequence of interest, including any additional sequences designed to effect proper expression of the nucleotide sequences, can also be referred to as an "expression cassette".
The terms "heterologous gene", "heterologous DNA sequence", "heterologous nucleotide sequence", "exogenous nucleic acid molecule", or "exogenous DNA segment", as used herein, each refer to a sequence that originates from a source foreign to an intended host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified, for example by mutagenesis or by isolation from native transcriptional regulatory sequences. The terms also include non-naturally occurring multiple copies of a naturally occurring nucleotide sequence. Thus, the terms refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position within the host cell nucleic acid wherein the element is not ordinarily found.
The term "promoter" or "promoter region" each refers to a nucleotide sequence within a gene that is positioned 5' to a coding sequence and functions to direct transcription of the coding sequence. The promoter region comprises a transcriptional start site, and can additionally include one or more transcriptional regulatory elements. In some embodiments, a method of the presently disclosed subject matter employs an RNA polymerase III promoter.
A "minimal promoter" is a nucleotide sequence that has the minimal elements required to enable basal level transcription to occur. As such, minimal promoters are not complete promoters but rather are subsequences of promoters that are capable of directing a basal level of transcription of a reporter construct in an experimental system. Minimal promoters include but are not limited to the cytomegalovirus (CMV) minimal promoter, the herpes simplex virus thymidine kinase (HSV-tk) minimal promoter, the simian virus 40 (SV40) minimal promoter, the human β-actin minimal promoter, the human EF2 minimal promoter, the adenovirus E1B minimal promoter, and the heat shock protein (hsp) 70 minimal promoter. Minimal promoters are often augmented with one or more transcriptional regulatory elements to influence the transcription of an operatively linked gene. For example, cell-type-specific or tissue-specific transcriptional regulatory elements can be added to minimal promoters to create recombinant promoters that direct transcription of an operatively linked nucleotide sequence in a cell-type-specific or tissue-specific manner. As used herein, the term "minimal promoter" also encompasses a functional derivative of a promoter disclosed herein, including, but not limited to an RNA polymerase III promoter (for example, an H1, 7SL, 5S, or U6 promoter), an adenovirus VA1 promoter, a Vault promoter, a telomerase RNA promoter, and a tRNA gene promoter.
Different promoters have different combinations of transcriptional regulatory elements. Whether or not a gene is expressed in a cell is dependent on a combination of the particular transcriptional regulatory elements that make up the gene's promoter and the different transcription factors that are present within the nucleus of the cell. As such, promoters are often classified as "constitutive", "tissue-specific", "cell-type-specific", or "inducible", depending on their functional activities in vivo or in vitro. For example, a constitutive promoter is one that is capable of directing transcription of a gene in a variety of cell types (in some embodiments, in all cell types) of an organism. Exemplary constitutive promoters include the promoters for the following genes which encode certain constitutive or "housekeeping" functions: hypoxanthine phosphoribosyl transferase (HPRT), dihydrofolate reductase (DHFR; Scharfmann et al., 1991), adenosine deaminase, phosphoglycerate kinase (PGK), pyruvate kinase, phosphoglycerate mutase, the β-actin promoter (see e.g., Williams et al., 1993), and other constitutive promoters known to those of skill in the art. "Tissue-specific" or "cell-type-specific" promoters, on the other hand, direct transcription in some tissues or cell types of an organism but are inactive in some or all other tissues or cell types. Exemplary tissue-specific promoters include those promoters described in more detail hereinbelow, as well as other tissue-specific and cell-type-specific promoters known to those of skill in the art.
When used in the context of a promoter, the term "linked" as used herein refers to a physical proximity of promoter elements such that they function together to direct transcription of an operatively linked nucleotide sequence.
The term "transcriptional regulatory sequence" or "transcriptional regulatory element", as used herein, each refers to a nucleotide sequence within the promoter region that enables responsiveness to a regulatory transcription factor. Responsiveness can encompass a decrease or an increase in transcriptional output and is mediated by binding of the transcription factor to the DNA molecule comprising the transcriptional regulatory element. In some embodiments, a transcriptional regulatory sequence is a transcription termination sequence, alternatively referred to herein as a transcription termination signal.
The term "transcription factor" generally refers to a protein that modulates gene expression by interaction with the transcriptional regulatory element and cellular components for transcription, including RNA Polymerase, Transcription Associated Factors (TAFs), chromatin-remodeling proteins, and any other relevant protein that impacts gene transcription.
As used herein, "significance" or "significant" relates to a statistical analysis of the probability that there is a non-random association between two or more entities. To determine whether or not a relationship is "significant" or has "significance", statistical manipulations of the data can be performed to calculate a probability, expressed as a "p-value". Those p-values that fall below a user-defined cutoff point are regarded as significant. In one example, p-values less than or equal to 0.05, in some embodiments less than 0.01, in some embodiments less than 0.005, and in some embodiments less than 0.001, are regarded as significant.
As used herein, the phrase "target RNA" refers to an RNA molecule (for example, an mRNA molecule encoding a plant gene product) that is a target for modulation. Similarly, the phrase "target site" refers to a sequence within a target RNA that is "targeted" for cleavage mediated by an miRNA or siRNA construct that contains sequences within its antisense strand that are complementary to the target site. Also similarly, the phrase "target cell" refers to a cell that expresses a target RNA and into which an miRNA is intended to be introduced. A target cell is in some embodiments a cell in a plant. For example, a target cell can comprise a target RNA expressed in a plant.
An miRNA or an siRNA is "targeted to" an RNA molecule if it has sufficient nucleotide similarity to the RNA molecule that it would be expected to modulate the expression of the RNA molecule under conditions sufficient for the miRNA/siRNA and the RNA molecule to interact. In some embodiments, the interaction occurs within a plant cell. In some embodiments the interaction occurs under physiological conditions. As used herein, the phrase "physiological conditions" refers to in vivo conditions within a plant cell, whether that plant cell is part of a plant or a plant tissue, or that plant cell is being grown in vitro. Thus, as used herein, the phrase "physiological conditions" refers to the conditions within a plant cell under any conditions that the plant cell can be exposed to, either as part of a plant or when grown in vitro.
As used herein, the phrase "detectable level of cleavage" refers to a degree of cleavage of target RNA (and formation of cleaved product RNAs) that is sufficient to allow detection of cleavage products above the background of RNAs produced by random degradation of the target RNA. Production of miRNA-mediated cleavage products from at least 1-5% of the target RNA is sufficient to allow detection above background for most detection methods.
The terms "microRNA" and "miRNA" are used interchangeably and refer to a nucleic acid molecule of about 17-24 nt that is produced from a pri-miRNA, a pre-miRNA, or a functional equivalent. As discussed in more detail herein, miRNAs are to be contrasted with siRNAs described hereinbelow, although in the context of exogenously supplied miRNAs and siRNAs, this distinction might be somewhat artificial. The distinction to keep in mind is that an miRNA is necessarily the product of nuclease activity on a hairpin molecule such as has been described herein, and an siRNA can be generated from a fully double-stranded RNA molecule or a hairpin molecule. Thus, while the distinction might be to some extent artificial, as used herein an miRNA is designed to hybridize to an mRNA derived from a gene of interest and an siRNA is designed to hybridize to an miRNA precursor such as a pri-miRNA or a pre-miRNA.
miRNAs isolated from P. trichocarpa as disclosed herein are named using the general formula "PtmiR X", where X is a number. This is in contrast to P. trichocarpa genes encoding miRNAs, which are named using the general formula "PtMIR X", wherein X is a number sometimes followed by a lowercase letter. Thus, as referred to herein, miRNA names and miRNA-encoding gene names have the "MI" in lowercase and uppercase, respectively.
The terms "small interfering RNA", "short interfering RNA", and "siRNA" are used interchangeably and refer to a ribonucleic acid or a modified ribonucleic acid that is designed to hybridize to a single-stranded loop region of an miRNA precursor. As used herein, the term "miRNA precursor" refers to any ribonucleic acid derived from a DNA sequence encoding an miRNA. Exemplary miRNA precursors include pri-miRNAs and pre-miRNAs, although the term is not limited to only these species. In some embodiments, the siRNA comprises a single stranded polynucleotide having self-complementary sense and antisense regions, wherein either the sense or the antisense region comprises a sequence complementary to a loop region of a pri-miRNA or a pre-miRNA. In some embodiments, the siRNA comprises a single stranded polynucleotide having one or more loop structures and a stem comprising self-complementary sense and antisense regions, wherein the antisense region comprises a sequence complementary to a loop region of a pri-miRNA or a pre-miRNA, and wherein the polynucleotide can be processed either in vivo or in vitro to generate an active siRNA capable of mediating cleavage of the miRNA precursor.
The methods of the presently disclosed subject matter can employ siRNA molecules of the general structure shown in Figure 1, wherein N is any nucleotide, provided that in the loop structure identified as N5-9 above, all 5-9 nucleotides remain in a single-stranded conformation. Similarly, N1-8 can be any sequence of 1-8 nucleotides or modified nucleotides, provided that the nucleotides remain in a single-stranded conformation in the siRNA molecule. The duplex represented in Figure 1 as "17-30 bases of an miRNA precursor" can be formed using any contiguous 17-30 base sequence of a transcription product of an miRNA-encoding nucleic acid sequence. In some embodiments, a contiguous 17-30 base sequence of a transcription product of an miRNA-encoding nucleic acid sequence comprises a subsequence that is predicted to hybridize to a single-stranded region of an miRNA precursor (for example, the loop region of a stem-loop conformation). In constructing an siRNA molecule of the presently disclosed subject matter, this 17-30 base sequence is followed (in a 5' to 3' direction) by 5-9 random nucleotides (N5-9 above), the reverse-complement of the 17-30 base sequence, and finally 1-8 random nucleotides (N1-8 above).
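The 5' to 3' assembly order described above (17-30 base segment, 5-9 nucleotide loop, reverse complement, 1-8 nucleotide tail) can be illustrated with a minimal Python sketch. The function names, default lengths, and the example segment are hypothetical and are not taken from the disclosure. The sketch also does not verify that the loop and tail actually remain single-stranded, which a real design would have to check with a folding tool.

```python
import random

# Watson-Crick partners for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def build_sirna_hairpin(segment: str, loop_len: int = 7,
                        tail_len: int = 2, seed: int = 0) -> str:
    """Assemble segment + random loop + reverse complement + random tail.

    `segment` stands in for a contiguous 17-30 base stretch of an miRNA
    precursor; the loop (5-9 nt) and tail (1-8 nt) are drawn at random.
    All parameter names and defaults are illustrative assumptions.
    """
    if set(segment) - set("ACGU"):
        raise ValueError("segment must contain only A, C, G, U")
    if not 17 <= len(segment) <= 30:
        raise ValueError("segment must be 17-30 bases long")
    if not 5 <= loop_len <= 9:
        raise ValueError("loop must be 5-9 nucleotides long")
    if not 1 <= tail_len <= 8:
        raise ValueError("tail must be 1-8 nucleotides long")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    loop = "".join(rng.choice("ACGU") for _ in range(loop_len))
    tail = "".join(rng.choice("ACGU") for _ in range(tail_len))
    return segment + loop + reverse_complement(segment) + tail


if __name__ == "__main__":
    # Hypothetical 19-base segment, not a sequence from the disclosure.
    print(build_sirna_hairpin("AUGGCUAGCUAGGAUCCGA"))
```

Because the loop and tail are random, a practical workflow would likely generate several candidates and discard those predicted to pair with the stem.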
As used herein, the term "RNA" refers to a molecule comprising at least one ribonucleotide residue. By "ribonucleotide" is meant a nucleotide with a hydroxyl group at the 2' position of a β-D-ribofuranose moiety. The terms encompass double stranded RNA, single stranded RNA, RNAs with both double stranded and single stranded regions, isolated RNA such as partially purified RNA, essentially pure RNA, synthetic RNA, and recombinantly produced RNA. Thus, RNAs include, but are not limited to mRNA transcripts, miRNAs and miRNA precursors, and siRNAs. As used herein, the term "RNA" is also intended to encompass altered RNA, or analog RNA, which are RNAs that differ from naturally occurring RNA by the addition, deletion, substitution, and/or alteration of one or more nucleotides. Such alterations can include addition of non-nucleotide material, such as to the end(s) of the RNA or internally, for example at one or more nucleotides of the RNA. Nucleotides in the RNA molecules of the presently disclosed subject matter can also comprise non-standard nucleotides, such as non-naturally occurring nucleotides or chemically synthesized nucleotides or deoxynucleotides. These altered RNAs can be referred to as analogs or analogs of a naturally occurring RNA.
As used herein, the phrase "double stranded RNA" refers to an RNA molecule at least a part of which is in Watson-Crick base pairing forming a duplex. As such, the term is to be understood to encompass an RNA molecule that is either fully or only partially double stranded. Exemplary double stranded RNAs include, but are not limited to molecules comprising at least two distinct RNA strands that are either partially or fully duplexed by intermolecular hybridization. Additionally, the term is intended to include a single RNA molecule that by intramolecular hybridization can form a double stranded region (for example, a hairpin). Thus, as used herein the phrases "intermolecular hybridization" and "intramolecular hybridization" refer to double stranded molecules for which the nucleotides involved in the duplex formation are present on different molecules or the same molecule, respectively.
As used herein, the phrase "double stranded region" refers to any region of a nucleic acid molecule that is in a double stranded conformation via hydrogen bonding between the nucleotides including, but not limited to hydrogen bonding between cytosine and guanosine, adenosine and thymidine, adenosine and uracil, and any other nucleic acid duplex as would be understood by one of ordinary skill in the art. The length of the double stranded region can vary from about 15 consecutive basepairs to several thousand basepairs. In some embodiments, the double stranded region is at least 15 basepairs, in some embodiments between 15 and 300 basepairs, and in some embodiments between 15 and about 60 basepairs. As described hereinabove, the formation of the double stranded region results from the hybridization of complementary RNA strands (for example, a sense strand and an antisense strand), either via an intermolecular hybridization (i.e., involving 2 or more distinct RNA molecules) or via an intramolecular hybridization, the latter of which can occur when a single RNA molecule contains self-complementary regions that are capable of hybridizing to each other on the same RNA molecule. These self-complementary regions are typically separated by a short stretch of nucleotides (for example, about 5-10 nucleotides) such that the intramolecular hybridization event forms what is referred to in the art as a "hairpin" or a "stem-loop structure".
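As a rough illustration of the intramolecular case, the following Python sketch scans a single RNA string for two self-complementary arms of at least 15 basepairs separated by a 5-10 nucleotide loop. It is a simplification under stated assumptions: only perfect Watson-Crick pairs are counted (no G:U wobble, bulges, or mismatches), no thermodynamics are considered, and the function name and return format are hypothetical.

```python
# Watson-Crick base pairs allowed in the stem (assumption: no G:U wobble).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def find_hairpin(rna: str, min_stem: int = 15,
                 min_loop: int = 5, max_loop: int = 10):
    """Return (stem_start, stem_len, loop_len) for the first hairpin found.

    For every candidate loop position, the stem is grown outward from the
    loop while the flanking bases pair. The first hit is returned, which
    is not necessarily the longest possible stem. Returns None if no
    stem of at least `min_stem` basepairs exists.
    """
    n = len(rna)
    for last_5p in range(n):  # index of the 5' arm base adjacent to the loop
        for loop_len in range(min_loop, max_loop + 1):
            left, right = last_5p, last_5p + loop_len + 1
            stem_len = 0
            while left >= 0 and right < n and (rna[left], rna[right]) in PAIRS:
                stem_len += 1
                left -= 1
                right += 1
            if stem_len >= min_stem:
                return (left + 1, stem_len, loop_len)
    return None


if __name__ == "__main__":
    # Hypothetical 17-base arm; the 3' arm is its reverse complement.
    arm = "GGAUCCGAUUCGAAGCU"
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    three_prime = "".join(comp[b] for b in reversed(arm))
    print(find_hairpin(arm + "AAAAAA" + three_prime))
```

A dedicated RNA folding program would be the usual choice for real sequences; the scan above only shows what "self-complementary regions separated by a short stretch of nucleotides" means in terms of string positions.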
III. Methods of Modulating Gene Expression
The presently disclosed subject matter provides in some embodiments methods for modulating gene expression in a plant. In some embodiments, the presently disclosed subject matter provides a method for stably modulating expression of a plant gene comprising (a) providing a vector encoding a microRNA (miRNA) targeted to the plant gene; and (b) transforming a plant cell with the vector, whereby stable expression of the miRNA in the plant cell is provided. Thus, in some embodiments the presently disclosed subject matter concerns stably transforming a plant cell (for example, a cell from a tree) with a vector encoding a miRNA under the control of a promoter (and other transcriptional regulatory elements as necessary, such as a transcription termination signal) that is functional in that cell. In some embodiments, an miRNA precursor is produced via the activity of the promoter in the plant cell, which is then processed using endogenous miRNA pathways to generate an miRNA target in the plant cell. This promoter can be capable of binding any RNA polymerase, including, for example, an RNA polymerase II and an RNA polymerase III. Representative promoters are disclosed hereinbelow, and include, but are not limited to an RNA polymerase III H1 promoter, an Arabidopsis thaliana 7SL RNA promoter, an RNA polymerase III 5S promoter, an RNA polymerase III U6 promoter, an adenovirus VA1 promoter, a Vault promoter, a telomerase RNA promoter, a tRNA gene promoter, and functional derivatives thereof. These promoters can be naturally occurring or artificially produced. An exemplary promoter has the sequence disclosed in SEQ ID NO: 162.
In some embodiments, a method for stably modulating expression of a plant gene comprises:
(a) transforming a plurality of plant cells with a vector comprising a nucleic acid sequence encoding a microRNA (miRNA) operatively linked to a promoter and a transcription termination sequence;
(b) growing the plant cells under conditions sufficient to select for a plurality of transformed plant cells that have integrated the vector into their genomes;
(c) screening the plurality of transformed plant cells for expression of the miRNA encoded by the vector;
(d) selecting a transformed plant cell that expresses the miRNA; and
(e) regenerating the plant from the transformed plant cell that expresses the miRNA, whereby expression of the plant gene is stably modulated.
The presently disclosed subject matter also provides methods for enhancing the expression of a gene in a plant cell. In some embodiments, the method comprises introducing into the plant cell a vector encoding a short interfering RNA (siRNA) molecule comprising a sequence that hybridizes to a loop region, stem region, or antisense sequence of an miRNA of a pre-microRNA that comprises a microRNA (miRNA) that modulates expression of the gene, thereby resulting in downregulation of expression of the miRNA and enhanced expression of the gene.
In some embodiments, the disclosed methods are employed to modulate the expression of a gene in a tree cell. Representative, non-limiting tree species for which the disclosed methods can be employed include trees of the genus Populus and of the genus Pinus, including, but not limited to Populus trichocarpa and Pinus taeda.
IV. Target Genes
The presently disclosed subject matter provides methods for stably modulating expression of plant genes using miRNAs. The methods are applicable to any gene expressed in the plant. In some embodiments, the methods are used to modulate the expression of genes in trees. In some embodiments, the methods are used to modulate the expression of genes in members of the genus Populus, including, but not limited to Populus trichocarpa. In some embodiments, the methods are used to modulate the expression of genes in members of the genus Pinus, including, but not limited to Pinus taeda.
Representative P. trichocarpa miRNAs are presented in SEQ ID NOs: 1-59 and 1247-1295. These miRNAs were identified using the techniques disclosed in Examples 1-6, and are summarized in Table 1. Additionally, using the techniques disclosed in the Examples, miRNA precursor sequences present in a representative plant, P. trichocarpa, were identified, and these sequences (SEQ ID NOs: 60-156 and 1296-1375) are also summarized in Table 1. Further analysis of the P. trichocarpa genome revealed target genes that the miRNAs of SEQ ID NOs: 1-59 and 1247-1295 modulate, which are summarized in Table 2.
Representative Pinus taeda miRNAs are presented in SEQ ID NOs: 1662-1712. These miRNAs were also identified using the techniques disclosed in Examples 1-6, and are summarized in Table 4. Additionally, using the techniques disclosed in the Examples, miRNA precursor sequences present in a second representative plant, Pinus taeda, were identified, and these sequences (SEQ ID NOs: 1713-1748) are also summarized in Table 4. Further analysis of the P. taeda genome revealed target genes that the miRNAs of SEQ ID NOs: 1662-1712 can modulate, which are also summarized in Table 2.
By comparing the nucleotide sequences of SEQ ID NOs: 1-59 and 1247-1295 to genomic and EST sequence data, plant gene sequences (for example, gene sequences from Populus sp. including, but not limited to Populus trichocarpa) that can be targeted by the miRNAs of SEQ ID NOs: 1-59 and 1247-1295 can be identified. In view of the ability of miRNAs to tolerate various degrees of mismatches between the miRNA molecule and the target molecule (for example, 1, 2, 3, 4 or 5 mismatches between the miRNA and the target), numerous particular target gene sequences were identified. These target gene sequences are presented in SEQ ID NOs: 176-781 and 1376-1553, and are summarized in Table 3.
Similarly, by comparing the nucleotide sequences of SEQ ID NOs: 1662-1712 to genomic and EST sequence data, plant gene sequences (for example, gene sequences from Pinus sp. including, but not limited to Pinus taeda) that can be targeted by the miRNAs of SEQ ID NOs: 1662-1712 can be identified. In view of the ability of miRNAs to tolerate various degrees of mismatches between the miRNA molecule and the target molecule (for example, 1, 2, 3, 4 or 5 mismatches between the miRNA and the target), numerous particular target gene sequences were identified. These target gene sequences are presented in SEQ ID NOs: 1749-1837, and are summarized in Table 5.
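The comparison of miRNA sequences against transcript data with a tolerance of up to 5 mismatches can be sketched as a simple sliding-window search. This is only an illustrative Python sketch, not the procedure of Examples 1-6: the function names are hypothetical, every mismatch is weighted equally regardless of position, G:U wobble pairs are counted as mismatches, and gaps are not considered.

```python
# Watson-Crick partners for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_target_sites(mirna: str, transcript: str, max_mismatches: int = 5):
    """List (position, mismatches) for candidate target sites.

    A perfectly matched site on the transcript reads as the reverse
    complement of the miRNA. Each window of the transcript is compared
    with that ideal site, and windows with at most `max_mismatches`
    differing positions are reported.
    """
    ideal_site = reverse_complement(mirna)
    width = len(ideal_site)
    hits = []
    for pos in range(len(transcript) - width + 1):
        window = transcript[pos:pos + width]
        mismatches = sum(1 for a, b in zip(window, ideal_site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical miRNA and transcript, not sequences from the disclosure.
    mirna = "AUGCUAGCUAGGAUCCGAUCG"
    transcript = "ACGUACGUACGU" + reverse_complement(mirna) + "GGCC"
    print(find_target_sites(mirna, transcript, max_mismatches=2))
```

Lowering `max_mismatches` makes the search stricter; the value of 5 mirrors the upper end of the example range given in the text.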
Attorney Docket No. 297/206 PCT
Table 1
Comparisons of P. trichocarpa and Arabidopsis miRNAs and miRNA Genes
[Table 1 is presented as page images in the source: Figure imgf000053_0001 through Figure imgf000068_0001.]
Table 2
Potential Targets of Populus trichocarpa and Pinus taeda miRNAs
P. trichocarpa miRNA ID / A. thaliana miRNA ID / Putative Function of Predicted Targets
PtMIR 133 AtMIR 172 APETALA2-like protein
PtMIR 104 AtMIR 162 DEAD/DEAH box helicase carpel factory / CAF identical to RNA helicase/RNAseIII CAF protein
PtMIR 29 AtMIR 159, 40 MYB-related proteins
PtMIR 71/PtMIR 142 AtMIR 319 MYB-related proteins
PtMIR 183 AtMIR 170, 171, 179 scarecrow-like transcription factor
PtMIR 156 AtMIR 157 squamosa promoter binding protein
PtMIR 61 AtMIR 164 transcription activator containing NAC1 domain
PtMIR 115 AtMIR 160 transcriptional factor B3 family protein/similar to auxin-responsive factor (ARF10)
PtMIR 56 AtMIR 168 ARGONAUTE
PtMIR 6 (UVR8) UVB-resistance protein
PtMIR 13 (ERD4) early-responsive to dehydration protein-related
plastocyanin
PtMIR69 pentatricopeptide (PPR) repeat-containing protein/F-box protein
UDP-glucoronosyl/UDP-glucosyl transferase family protein
protein kinase family protein
PtMIR73 disease resistance protein (TIR-NBS-LRR class)
PtMIR109 pentatricopeptide (PPR) repeat-containing protein
UDP-glucoronosyl/UDP-glucosyl transferase family protein
protein kinase family protein
PtMIR 122 GRAS domain transcription factor / similar to (RGL1) gibberellin regulatory protein
PtMIR 139 putative sulfate transporter
PtMIR 160 disease resistance protein (TIR-NBS-LRR class)
PtMIR 180 Intron of ubiquitin activating enzyme, putative (ECR1)
clathrin adaptor complex small chain family protein
PtMIR 181 putative bifunctional aspartate kinase/homoserine dehydrogenase
lectin protein kinase family protein
PtMIR 172 (CAD) cinnamyl-alcohol dehydrogenase disease resistance protein-related LIM domain-containing protein
putative TCP family transcription factor
PtMIR 184 lipase class 3 family protein
PtMIR 185 UDP-glucoronosyl/UDP-glucosyl transferase protein kinase family protein mitogen-activated protein kinase luminal binding protein 1 (BiP-1) lipase class 3 family protein
ABC transporter family protein
PtMIR 186 disease resistance protein
PtMIR 241 Flavoprotein monooxygenase laccase pseudo-response regulator 5
SPla/RYanodine receptor (SPRY) domain-containing protein
polyphenol oxidase SET domain-containing protein KH domain-containing protein
PtMIR 245 isoflavone reductase family protein trehalose-6-phosphate phosphatase
PtMIR 252 AthMIR 398 selenium-binding protein, putative
PtMIR 255 SEC14 cytosolic factor family protein
PtMIR 257 GCN5-related N-acetyltransferase gibberellin regulatory protein (RGL1) homeodomain transcription factor (KNAT7)
PtMIR 274 AthMIR 166 homeobox-leucine zipper family protein no apical meristem (NAM) family protein
PtMIR 275 AthMIR 167 auxin-responsive factor (ARF8) Squamosa promoter binding protein auxin-responsive factor (ARF6) multi-copper oxidase
S-adenosylmethionine synthetase 2 (SAM2)
PtMIR 277 AthMIR 396 beta-fructofuranosidase, putative DNAJ heat shock protein
PPR trypsin and protease inhibitor family protein calcium-binding EF hand family protein calcium-transporting ATPase 4 disease resistance protein transcription activator GRL1 and GRL5 expressed protein similar to auxin down-regulated protein ARG10
malate synthase protein kinase family protein short vegetative phase protein (SVP)
SWAP (Suppressor-of-White-APricot)/surp domain-containing protein
PtMIR282 homeobox protein knotted-1 like 1 (KNAT1) ribosomal protein L1 family protein two-component responsive regulator family protein
PtMIR283 indigoidine synthase A family protein pectate lyase family protein eukaryotic release factor 1 family protein
PtMIR284 AthMIR390 auxin transport protein leucine-rich repeat family protein phosphate transporter (PT2) subtilase family protein
PtMIR287 ankyrin repeat family protein beta-fructosidase disease resistance protein leucine-rich repeat family protein oxidoreductase, 2OG-Fe(II) oxygenase family protein
translationally controlled tumor family protein
PtMIR291 AthMIR 171 acyl-CoA:1-acylglycerol-3-phosphate acyltransferase
phosphatidylinositol-4-phosphate 5-kinase family protein scarecrow transcription factor
PtMIR295 F-box family protein
PtMIR298 ATP-binding cassette transport protein disease resistance protein glutathione S-conjugate ABC transporter (MRP2)
PtMIR 302 cytochrome P450 71B36
rhomboid family protein
PtMIR 315 BAG domain-containing protein leucine-rich repeat family protein
LpMIR 100 AMP-dependent synthetase elongation factor Tu, putative / EF-Tu expressed protein contains 3 transmembrane domains
peroxidase family protein similar to cationic peroxidase
LpMIR 119 DEAD box RNA helicase, putative (RH20) disease resistance protein lipase
MYB transcription factor ubiquitin activating enzyme zinc finger (C2H2 type)
LpMIR 176 ABC transporter family protein
AWPM-19-like membrane family protein fructose-bisphosphate aldolase osmotin-like protein (OSM34) pyrophosphate-energized vacuolar membrane proton pump
F-box family protein (FBX1) E3 ubiquitin ligase
LpMIR 178 AthMIR 156 actin aspartyl protease family protein cellulose synthase endo-(1,3)-alpha-glucanase homeobox-leucine zipper protein 13 (HB-13) lateral organ boundaries domain protein 4 (LBD4) nitrate reductase 2 (NR2) peptidyl-tRNA hydrolase protein kinase family protein
Squamosa promoter binding protein
LpMIR 26 disease resistance protein leucine-rich repeat family protein mob1/phocein family protein oxidoreductase family protein
RuBisCO subunit binding-protein alpha subunit
LpMIR 27 3-deoxy-D-manno-octulosonic acid transferase chlorophyll A-B binding family protein hydrolase, alpha/beta fold family protein
nodulin MtN3 family protein thioredoxin family protein zinc finger (CCCH-type/C3HC4-type RING finger) family protein
LpMIR 28 60S ribosomal protein L24, putative abscisic acid-responsive HVA22 family protein aspartyl protease family protein lipase class 3 family protein microtubule organization 1 protein (MOR1)
SAR DNA-binding protein
LpMIR 7 AthMIR 159,319 acyl-ACP thioesterase
ERF domain protein
MYB transcription factor ethylene-responsive protein ubiquitin carboxyl-terminal hydrolase family protein
17.8 kDa class I heat shock protein calcium-dependent protein kinase
GDSL-motif lipase/hydrolase family protein
LpMIR 77 chloroplast nucleoid DNA-binding protein protein kinase family protein
LpMIR 82 disease resistance protein leucine-rich repeat family protein protein phosphatase 2C family protein
LpMIR 89 sterol isomerase
LpMIR 9 AthMIR 160 auxin-responsive AUX/IAA family protein transcriptional factor B3 family protein
LpMIR 95 auxin-responsive GH3 protein C2 domain-containing protein MYB transcription factor PQ-loop repeat family protein glycosyl hydrolase family 29 YbaK/prolyl-tRNA synthetase-related zinc finger (C3HC4-type RING finger)
Table 3
Populus trichocarpa miRNA Target Sequences
[Table 3 is presented as page images in the source: Figure imgf000079_0001 through Figure imgf000120_0001.]
Table 4
Comparisons of Pinus taeda and Arabidopsis miRNAs and miRNA Genes
[Table 4 is presented as page images in the source: Figure imgf000120_0002 through Figure imgf000126_0001.]
Table 5
[Table 5 is presented as page images in the source: Figure imgf000126_0002 through Figure imgf000132_0001.]
n.d.: not determined
Thus, in some embodiments, a plant gene that is targeted for modulation has a nucleic acid sequence comprising any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837, and encodes a polypeptide having an amino acid sequence comprising any of SEQ ID NOs: 782-1246, 1554-1661, and 1838-1907. Furthermore, based on the knowledge that miRNAs can tolerate mismatches with their targets and still modulate the expression of those targets, in some embodiments a plant gene that is targeted for modulation comprises a nucleic acid sequence at least about 70% identical to any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837, and encodes a polypeptide comprising an amino acid sequence having 5 or fewer (e.g., 5, 4, 3, 2, or 1) changed amino acids as compared to the amino acids disclosed as SEQ ID NOs: 782-1246, 1554-1661, and 1838-1907.
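As a rough illustration of the two thresholds just described (at least about 70% nucleotide identity, and 5 or fewer changed amino acids), the following Python sketch scores a pair of pre-aligned, equal-length sequences. The function names and the defaults are hypothetical conveniences, not part of the disclosure, and the sketch assumes the sequences have already been aligned by a dedicated alignment tool.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that match (inputs must be pre-aligned)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def changed_residues(pep_a: str, pep_b: str) -> int:
    """Number of positions that differ between two equal-length peptides."""
    if len(pep_a) != len(pep_b):
        raise ValueError("peptides must be of equal length")
    return sum(a != b for a, b in zip(pep_a.upper(), pep_b.upper()))


def within_thresholds(nt_ref: str, nt_query: str,
                      aa_ref: str, aa_query: str,
                      min_identity: float = 70.0,
                      max_changes: int = 5) -> bool:
    """True if the query meets both the identity and the residue-change limits."""
    return (percent_identity(nt_ref, nt_query) >= min_identity
            and changed_residues(aa_ref, aa_query) <= max_changes)
```

Identity here is computed over the aligned length only; how gaps are counted is a design choice that a real comparison would need to state explicitly.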
Using the techniques disclosed in Examples 1-6, additional plant genes can be selected and miRNAs designed to modulate the expression of the genes in any desired plant. Additionally, the basic methodology disclosed in these Examples can be used to isolate miRNAs from any desired plant and to identify genes that can be targeted using the methods disclosed herein.
For example, the techniques disclosed in Examples 1-6 were employed to identify genes from Pinus taeda and to design miRNAs to modulate the expression of genes in Pinus sp. These sequences are summarized in Table 4.
In addition, knowledge of the sequence of a gene and/or a gene product can be used to design miRNAs to target the expression of the gene in any plant. For example, in some embodiments, genes associated with lignin biosynthesis are targeted for modulation. Lignin is a major component of wood, and the regulation of its biosynthesis can have a major impact on paper and pulping processes. Several genes have been identified that are involved in the biosynthesis of lignin including, but not limited to, sinapyl alcohol dehydrogenase (SAD), cinnamyl alcohol dehydrogenase (CAD), 4-coumarate:CoA ligase (4CL), cinnamoyl CoA O-methyltransferase (CCoAOMT; also referred to as CCOMT), caffeate O-methyltransferase (COMT), ferulate-5-hydroxylase (F5H), cinnamate-4-hydroxylase (C4H), p-coumarate-3-hydroxylase (C3H), and phenylalanine ammonia lyase (PAL) (reviewed in Anterola & Lewis, 2002; Boerjan et al., 2003). Reduction in the activities of one or more of these genes has been shown to result in reduced lignin deposition (see Anterola & Lewis, 2002; Boerjan et al., 2003), and thus these genes provide potential targets for miRNA-mediated gene expression modulation.
In some embodiments, genes associated with cellulose biosynthesis are targeted for modulation. Representative, non-limiting genes that have been identified that are associated with cellulose biosynthesis include cellulose synthase (CeS; also referred to as CESA in some plants), cellulose synthase-like (CSL), glucosidase, glucan synthase, Korrigan endocellulase, callose synthase, and sucrose synthase.
In some embodiments, other plant genes are targeted for modulation using miRNAs. A non-limiting list of gene families that can be targeted includes hormone-related genes, including but not limited to isopentenyl transferase (ipt), gibberellic acid (GA) oxidase, auxin (AUX), auxin-responsive and auxin-induced genes, and members of the rooting locus (ROL) gene family; hemicellulose-related genes; disease-related genes; stress-related genes; growth-related genes; and transcription factors. It is understood that the target genes listed hereinabove are exemplary only, and that the methods and compositions of the presently disclosed subject matter can be applied to modulate the expression of any desired gene in any desired plant.
V. Nucleic Acids
The nucleic acid molecules employed in accordance with the presently disclosed subject matter include any nucleic acid molecule encoding a plant gene product, as well as the nucleic acid molecules that are used in accordance with the presently disclosed subject matter to modulate the expression of a plant gene. Thus, the nucleic acid molecules employed in accordance with the presently disclosed subject matter include, but are not limited to, the nucleic acid molecules described herein (for example, SEQ ID NOs: 1-1907); sequences substantially identical to those described herein (for example, sequences at least 70% identical to any of SEQ ID NOs: 1-1907); and subsequences and elongated sequences thereof. The presently disclosed subject matter also encompasses genes, cDNAs, chimeric genes, and vectors comprising the disclosed nucleic acid sequences.
An exemplary nucleotide sequence employed in the methods disclosed herein comprises sequences that are complementary to each other, the complementary regions being capable of forming a duplex of, in some embodiments, at least about 15 to 300 basepairs, and in some embodiments, at least about 15-24 basepairs. One strand of the duplex comprises a nucleic acid sequence of at least 15 contiguous bases having a nucleic acid sequence of a nucleic acid molecule of the presently disclosed subject matter. In one example, one strand of the duplex comprises a nucleic acid sequence comprising 15, 16, 17, or 18 nucleotides, or even longer where desired, such as 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides, or up to the full length of any of those nucleic acid sequences described herein. Such fragments can be readily prepared by, for example, directly synthesizing the fragment by chemical synthesis, by application of nucleic acid amplification technology, or by introducing selected sequences into recombinant vectors for recombinant production.
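The complementarity requirement above can be checked mechanically. The following Python sketch, offered only as an illustration, computes a reverse complement and the longest perfectly base-paired stretch that two strands can form, then compares it with the 15-basepair minimum mentioned in the text. All function names are hypothetical, and the sketch assumes strict Watson-Crick pairing with no G:U wobble pairs or mismatches.

```python
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Reverse complement, returned as upper-case DNA (U is read as T)."""
    return seq.translate(_COMPLEMENT)[::-1].upper()


def _longest_common_run(a: str, b: str) -> int:
    """Length of the longest substring shared by a and b (dynamic programming)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def longest_duplex(strand: str, partner: str) -> int:
    """Longest perfectly paired stretch (in basepairs) between two strands."""
    normalized = strand.upper().replace("U", "T")
    return _longest_common_run(normalized, reverse_complement(partner))


def forms_duplex(strand: str, partner: str, min_bp: int = 15) -> bool:
    """True if the two strands can pair over at least min_bp contiguous bases."""
    return longest_duplex(strand, partner) >= min_bp
```

A duplex containing tolerated mismatches would need a scoring scheme rather than the exact-match test used here.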
The phrase "hybridizing specifically to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex nucleic acid mixture (e.g., total cellular DNA or RNA). The term "subsequence" refers to a sequence of a nucleic acid molecule or amino acid molecule that comprises a part of a longer nucleic acid or amino acid sequence. An exemplary subsequence is a sequence that comprises part of a duplexed region of a pri-miRNA or a pre-miRNA including, but not limited to the nucleotides that become the mature miRNA after nuclease action or a single-stranded region in an miRNA precursor.
The term "elongated sequence" refers to an addition of nucleotides (or other analogous molecules) incorporated into the nucleic acid. For example, a polymerase (e.g., a DNA polymerase) can add sequences at the 3' terminus of the nucleic acid molecule. In addition, the nucleotide sequence can be combined with other DNA sequences, such as promoters, promoter regions, enhancers, polyadenylation signals, intronic sequences, additional restriction enzyme sites, multiple cloning sites, and other coding segments. Nucleic acids of the presently disclosed subject matter can be cloned, synthesized, recombinantly altered, mutagenized, or subjected to combinations of these techniques. Standard recombinant DNA and molecular cloning techniques used to isolate nucleic acids are known in the art. Exemplary, non-limiting methods are described by Silhavy et al., 1984; Ausubel et al., 1989; Glover & Hames, 1995; and Sambrook & Russell, 2001. Site-specific mutagenesis to create base pair changes, deletions, or small insertions is also known in the art as exemplified by publications (see e.g., Adelman et al., 1983; Sambrook & Russell, 2001).
VI. Vectors
In some embodiments of the presently disclosed subject matter, miRNA precursor molecules are expressed from transcription units inserted into nucleic acid vectors (alternatively referred to generally as "recombinant vectors" or "expression vectors"). A vector is used to deliver a nucleic acid molecule encoding an miRNA into a plant cell to target a specific plant gene.
The recombinant vectors can be, for example, DNA plasmids or viral vectors. Various expression vectors are known in the art. The selection of the appropriate expression vector can be made on the basis of several factors including, but not limited to the cell type wherein expression is desired. For example, Agrobacterium-based expression vectors can be used to express the nucleic acids of the presently disclosed subject matter when stable expression of the vector insert is sought in a plant cell.
In some embodiments, a vector is also used to deliver a nucleic acid molecule encoding an siRNA into a plant cell to target a specific miRNA precursor.
VI.A. Promoters
The expression of the nucleotide sequence in the expression cassette can be under the control of a constitutive promoter or an inducible promoter that initiates transcription only when the host cell is exposed to some particular external stimulus. For bacterial production of an miRNA and/or an siRNA, exemplary promoters include Simian virus 40 early promoter, a long terminal repeat promoter from retrovirus, an actin promoter, a heat shock promoter, and a metallothionein promoter. For in vivo production of an miRNA and/or an siRNA in plants, exemplary constitutive promoters are derived from the CaMV 35S, rice actin, and maize ubiquitin genes, each described herein below. Exemplary inducible promoters for this purpose include the chemically inducible PR-1a promoter and a wound-inducible promoter, also described herein below.
Selected promoters can direct expression in specific cell types (such as leaf epidermal cells, mesophyll cells, root cortex cells) or in specific tissues or organs (roots, leaves or flowers, for example). Exemplary tissue-specific promoters include well-characterized root-, pith-, and leaf-specific promoters, each described herein below.
Depending upon the host cell system utilized, any one of a number of suitable promoters can be used. Promoter selection can be based on expression profile and expression level. The following are non-limiting examples of promoters that can be used in the expression cassettes.
VI.A.1. Constitutive Expression
35S Promoter. The CaMV 35S promoter can be used to drive constitutive gene expression. Construction of the plasmid pCGN1761 is described in the published patent application EP 0 392 225, which is hereby incorporated by reference. pCGN1761 contains the "double" CaMV 35S promoter and the tml transcriptional terminator with a unique EcoRI site between the promoter and the terminator and has a pUC-type backbone. A derivative of pCGN1761 is constructed which has a modified polylinker that includes NotI and XhoI sites in addition to the existing EcoRI site. This derivative is designated pCGN1761ENX. pCGN1761ENX is useful for the cloning of cDNA sequences or gene sequences (including microbial open reading frame (ORF) sequences) within its polylinker for the purpose of their expression under the control of the 35S promoter in transgenic plants. The entire 35S promoter-gene sequence-tml terminator cassette of such a construction can be excised by HindIII, SphI, SalI, and XbaI sites 5' to the promoter and XbaI, BamHI, and BglI sites 3' to the terminator for transfer to transformation vectors such as those described below. Furthermore, the double 35S promoter fragment can be removed by 5' excision with HindIII, SphI, SalI, XbaI, or PstI, and 3' excision with any of the polylinker restriction sites (EcoRI, NotI, or XhoI) for replacement with another promoter.
Actin Promoter. Several isoforms of actin are known to be expressed in most cell types and consequently the actin promoter is a good choice for a constitutive promoter. In particular, the promoter from the rice Act1 gene has been cloned and characterized (McElroy et al., 1990). A 1.3 kb fragment of the promoter was found to contain all the regulatory elements required for expression in rice protoplasts. Furthermore, numerous expression vectors based on the Act1 promoter have been constructed specifically for use in monocotyledons (McElroy et al., 1991). These incorporate the Act1-intron 1, Adh1 5' flanking sequence and Adh1-intron 1 (from the maize alcohol dehydrogenase gene) and sequence from the CaMV 35S promoter. Vectors showing highest expression were fusions of 35S and Act1 intron or the Act1 5' flanking sequence and the Act1 intron. Optimization of sequences around the initiating ATG (of the β-glucuronidase (GUS) reporter gene) also enhanced expression. The promoter expression cassettes described by McElroy et al., 1991 can be easily modified for gene expression and are particularly suitable for use in monocotyledonous hosts. For example, promoter-containing fragments are removed from the McElroy constructions and used to replace the double 35S promoter in pCGN1761ENX, which is then available for the insertion of specific gene sequences. The fusion genes thus constructed can then be transferred to appropriate transformation vectors. In a separate report, the rice Act1 promoter with its first intron has also been found to direct high expression in cultured barley cells (Chibbar et al., 1993).
Ubiquitin Promoter. Ubiquitin is another gene product known to accumulate in many cell types and its promoter has been cloned from several species for use in transgenic plants (e.g. sunflower by Binet et al., 1991 and maize by Christensen et al., 1989). The maize ubiquitin promoter has been developed in transgenic monocot systems and its sequence and vectors constructed for monocot transformation are disclosed in the patent publication EP 0 342 926, which is herein incorporated by reference. Taylor et al., 1993 describe a vector (pAHC25) that comprises the maize ubiquitin promoter and first intron and its high activity in cell suspensions of numerous monocotyledons when introduced via microprojectile bombardment. The ubiquitin promoter is suitable for gene expression in transgenic plants, especially monocotyledons. Suitable vectors are derivatives of pAHC25 or any of the transformation vectors described in this application, modified by the introduction of the appropriate ubiquitin promoter and/or intron sequences.
VI.A.2. Inducible Expression
Chemically Inducible PR-1a Promoter. The double 35S promoter in pCGN1761ENX can be replaced with any other promoter of choice that will result in suitably high expression levels. By way of example, one of the chemically regulatable promoters described in U.S. Patent No. 5,614,395 can replace the double 35S promoter. The promoter of choice is preferably excised from its source by restriction enzymes, but can alternatively be PCR-amplified using primers that carry appropriate terminal restriction sites. Should PCR-amplification be undertaken, then the promoter should be re-sequenced to check for amplification errors after the cloning of the amplified promoter in the target vector. The chemical/pathogen regulated tobacco PR-1a promoter is cleaved from plasmid pCIB1004 (for construction, see EP 0 332 104, which is hereby incorporated by reference) and transferred to plasmid pCGN1761ENX (Uknes et al., 1992). pCIB1004 is cleaved with NcoI and the resultant 3' overhang of the linearized fragment is rendered blunt by treatment with T4 DNA polymerase. The fragment is then cleaved with HindIII and the resultant PR-1a promoter-containing fragment is gel purified and cloned into pCGN1761ENX from which the double 35S promoter has been removed. This is done by cleavage with XhoI and blunting with T4 DNA polymerase, followed by cleavage with HindIII and isolation of the larger vector-terminator-containing fragment into which the pCIB1004 promoter fragment is cloned. This generates a pCGN1761ENX derivative with the PR-1a promoter and the tml terminator and an intervening polylinker with unique EcoRI and NotI sites. The selected coding sequence can be inserted into this vector, and the fusion products (i.e., promoter-gene-terminator) can subsequently be transferred to any selected transformation vector, including those described below. Various chemical regulators can be employed to induce expression of the selected coding sequence in the plants transformed according to the present invention, including the benzothiadiazole, isonicotinic acid, and salicylic acid compounds disclosed in U.S. Patent Nos. 5,523,311 and 5,614,395, herein incorporated by reference.
Wound-Inducible Promoters. Wound-inducible promoters can also be suitable for gene expression. Numerous such promoters have been described (e.g. Xu et al., 1993; Logemann et al., 1989; Rohrmeier & Lehle, 1993; Firek et al., 1993; Warner et al., 1993) and all are suitable for use with the presently disclosed subject matter. Logemann et al., 1989 describe the 5' upstream sequences of the dicotyledonous potato wun1 gene. Xu et al., 1993 show that a wound-inducible promoter from the dicotyledon potato (pin2) is active in the monocotyledon rice. Further, Rohrmeier & Lehle, 1993 describe the cloning of the maize Wip1 cDNA, which is wound induced and which can be used to isolate the cognate promoter using standard techniques. Similarly, Firek et al., 1993 and Warner et al., 1993 have described a wound-induced gene from the monocotyledon Asparagus officinalis, which is expressed at local wound and pathogen invasion sites. Using cloning techniques well known in the art, these promoters can be transferred to suitable vectors, fused to the genes pertaining to this invention, and used to express these genes at the sites of plant wounding.
VI.A.3. Tissue-Specific Expression
Root Promoter. Another pattern of gene expression is root expression. A suitable root promoter is described by de Framond, 1991 and also in the published patent application EP 0 452 269, which is herein incorporated by reference. This promoter is transferred to a suitable vector such as pCGN1761ENX for the insertion of a selected gene and subsequent transfer of the entire promoter-gene-terminator cassette to a transformation vector of interest.
Pith Promoter. PCT International Publication No. WO 93/07278, which is herein incorporated by reference, describes the isolation of the maize trpA gene, which is preferentially expressed in pith cells. The gene sequence and promoter extending up to -1726 basepairs (bp) from the start of transcription are presented. Using standard molecular biological techniques, this promoter, or parts thereof, can be transferred to a vector such as pCGN1761 where it can replace the 35S promoter and be used to drive the expression of a foreign gene in a pith-preferred manner. In fact, fragments containing the pith-preferred promoter or parts thereof can be transferred to any vector and modified for utility in transgenic plants.
Leaf Promoter. A maize gene encoding phosphoenolpyruvate carboxylase (PEPC) has been described by Hudspeth & Grula, 1989. Using standard molecular biological techniques, the promoter for this gene can be used to drive the expression of any gene in a leaf-specific manner in transgenic plants.
VI.B. Transcriptional Terminators
A variety of transcriptional terminators are available for use in expression cassettes. These are responsible for the termination of transcription beyond the transgene and its correct polyadenylation. Appropriate transcriptional terminators are those that are known to function in plants and include the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator, and the pea rbcS E9 terminator. With regard to RNA polymerase III terminators, these terminators typically comprise a run of 5 or more consecutive thymidine residues. In some embodiments, an RNA polymerase III terminator comprises the sequence TTTTTTT. These can be used in both monocotyledons and dicotyledons.
VI.C. Sequences for the Enhancement or Regulation of Expression
Numerous sequences have been found to enhance the expression of an operatively linked nucleic acid sequence, and these sequences can be used in conjunction with the nucleic acids of the presently disclosed subject matter to increase their expression in transgenic plants.
Various intron sequences have been shown to enhance expression, particularly in monocotyledonous cells. For example, the introns of the maize Adh1 gene have been found to significantly enhance the expression of the wild-type gene under its cognate promoter when introduced into maize cells. Intron 1 was found to be particularly effective and enhanced expression in fusion constructs with the chloramphenicol acetyltransferase gene (Callis et al., 1987). In the same experimental system, the intron from the maize bronze1 gene had a similar effect in enhancing expression. Intron sequences have been routinely incorporated into plant transformation vectors, typically within the non-translated leader.
A number of non-translated leader sequences derived from viruses are also known to enhance expression, and these are particularly effective in dicotyledonous cells. Specifically, leader sequences from Tobacco Mosaic Virus (TMV, the "Ω-sequence"), Maize Chlorotic Mottle Virus (MCMV), and Alfalfa Mosaic Virus (AMV) have been shown to be effective in enhancing expression (e.g. Gallie et al., 1987; Skuzeski et al., 1990).
VII. Recombinant Expression Vectors
Suitable expression vectors that can be used include, but are not limited to, the following vectors or their derivatives: yeast vectors, bacteriophage vectors (e.g., lambda phage), and plasmid and cosmid DNA vectors.
Numerous vectors available for plant transformation can be prepared and employed in the present methods. Exemplary vectors include pCIB200, pCIB2001, pCIB10, pCIB3064, pSOG19, pSOG35, and pSIT, each described herein. The selection of vector can depend upon the chosen transformation technique and the target species for transformation.
VII.A. Agrobacterium Transformation Vectors
Many vectors are available for transformation using Agrobacterium tumefaciens. These typically carry at least one T-DNA border sequence and include vectors such as pBIN19 (Bevan, 1984) and pXYZ. Below, the construction of two typical vectors suitable for Agrobacterium transformation is described.
pCIB200 and pCIB2001. The binary vectors pCIB200 and pCIB2001 are used for the construction of recombinant vectors for use with Agrobacterium and are constructed in the following manner. pTJS75kan is created by NarI digestion of pTJS75 (Schmidhauser & Helinski, 1985) allowing excision of the tetracycline-resistance gene, followed by insertion of an AccI fragment from pUC4K carrying an NPTII gene (Messing & Vierra, 1982; Bevan et al., 1983; McBride et al., 1990). XhoI linkers are ligated to the EcoRV fragment of pCIB7, which contains the left and right T-DNA borders, a plant selectable nos/nptII chimeric gene and the pUC polylinker (Rothstein et al., 1987), and the XhoI-digested fragment is cloned into SalI-digested pTJS75kan to create pCIB200 (see also EP 0 332 104, herein incorporated by reference). pCIB200 contains the following unique polylinker restriction sites: EcoRI, SstI, KpnI, BglII, XbaI, and SalI. pCIB2001 is a derivative of pCIB200 created by the insertion into the polylinker of additional restriction sites. Unique restriction sites in the polylinker of pCIB2001 are EcoRI, SstI, KpnI, BglII, XbaI, SalI, MluI, BclI, AvrII, ApaI, HpaI, and StuI. pCIB2001, in addition to containing these unique restriction sites, also has plant and bacterial kanamycin selection, left and right T-DNA borders for Agrobacterium-mediated transformation, the RK2-derived trfA function for mobilization between E. coli and other hosts, and the OriT and OriV functions also from RK2. The pCIB2001 polylinker is suitable for the cloning of plant expression cassettes containing their own regulatory signals.
pCIB10 and Hygromycin Selection Derivatives Thereof. The binary vector pCIB10 contains a gene encoding kanamycin resistance for selection in plants and T-DNA right and left border sequences and incorporates sequences from the wide host-range plasmid pRK252 allowing it to replicate in both E. coli and Agrobacterium. Its construction is described by Rothstein et al., 1987. Various derivatives of pCIB10 are constructed which incorporate the gene for hygromycin B phosphotransferase described by Gritz et al., 1983. These derivatives enable selection of transgenic plant cells on hygromycin only (pCIB743), or hygromycin and kanamycin (pCIB715, pCIB717).
pSIT. pSIT is an Agrobacterium binary vector that can be used to stably express exogenous nucleic acids (for example, miRNAs and/or siRNAs) in plants. pSIT encodes two transcription units. The first is a transcription unit encoding a selectable marker under control of a promoter-transcription terminator pair that functions in plant cells. The second transcription unit encodes the gene of interest (for example, an miRNA and/or siRNA) under the control of a second promoter-transcription terminator pair, which specifically directs the transcription to generate a functional miRNA and/or siRNA in plant cells and which can be the same as or different from the one operatively linked to the selectable marker. In some embodiments, an miRNA and/or siRNA is operatively linked to an RNA polymerase III promoter (for example, the At7SL4 promoter) and the RNA-polymerase-III-recognized transcription terminator (for example, TTTTTTT). Integration of the miRNA and/or siRNA cassette is guaranteed if the transformants survive the antibiotic selection process, owing to the expression of the selection marker gene incorporated in the binary vector. The hpt (hygromycin phosphotransferase) selection marker gene is under the control of a Pnos promoter and Nos terminator pair. Other pairs of promoter and terminator that can drive selection marker gene expression also are suitable for the purpose.
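The two-unit layout of pSIT and the RNA polymerase III terminator described in the text (a run of 5 or more consecutive thymidines, for example TTTTTTT) can be sketched as a small data structure. The Python below is illustrative only: the class and function names are hypothetical, promoters are carried as labels rather than sequences, and the rejection of precursors containing an internal T-run is an assumed design check (such a run might act as a premature terminator), not a step stated in the disclosure.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

POL3_TERMINATOR = "TTTTTTT"        # terminator sequence given in the text
T_RUN = re.compile(r"T{5,}")       # run of 5 or more consecutive thymidines


@dataclass(frozen=True)
class TranscriptionUnit:
    promoter: str      # label only, e.g. "At7SL4" or "Pnos"
    insert: str        # DNA sequence or gene label to be transcribed
    terminator: str    # terminator sequence or label


def internal_t_runs(insert: str) -> List[Tuple[int, str]]:
    """Return (0-based position, run) for each run of >= 5 T's in the insert."""
    return [(m.start(), m.group()) for m in T_RUN.finditer(insert.upper())]


def build_mirna_unit(precursor: str, promoter: str = "At7SL4") -> TranscriptionUnit:
    """Pair a precursor with a Pol III promoter label and the TTTTTTT terminator."""
    runs = internal_t_runs(precursor)
    if runs:
        # Assumed safeguard: an internal T-run could end transcription early.
        raise ValueError(f"possible internal Pol III terminator(s): {runs}")
    return TranscriptionUnit(promoter, precursor.upper(), POL3_TERMINATOR)


def build_psit_like_units(precursor: str) -> Tuple[TranscriptionUnit, TranscriptionUnit]:
    """Selectable-marker unit (Pnos-hpt-Nos) plus the miRNA/siRNA unit."""
    marker_unit = TranscriptionUnit("Pnos", "hpt", "Nos")
    return marker_unit, build_mirna_unit(precursor)
```

Keeping the marker unit and the miRNA unit as separate records mirrors the description of two independent promoter-terminator pairs on one binary vector.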
VII.B. Other Plant Transformation Vectors
Transformation without the use of Agrobacterium tumefaciens circumvents the requirement for T-DNA sequences in the chosen transformation vector, and consequently vectors lacking these sequences can be utilized in addition to vectors such as the ones described above which contain T-DNA sequences. Transformation techniques that do not rely on Agrobacterium include transformation via particle bombardment, protoplast uptake (e.g. polyethylene glycol (PEG) and electroporation), and microinjection. The choice of vector can depend on the technique chosen for the species being transformed. Below, the construction of typical vectors suitable for non-Agrobacterium transformation is described.
pCIB3064. pCIB3064 is a pUC-derived vector suitable for direct gene transfer techniques in combination with selection by the herbicide BASTA® (or phosphinothricin). The plasmid pCIB246 comprises the CaMV 35S promoter in operational fusion to the E. coli β-glucuronidase (GUS) gene and the CaMV 35S transcriptional terminator and is described in PCT International Publication No. WO 93/07278. The 35S promoter of this vector contains two ATG sequences 5' of the start site. These sites are mutated using standard PCR techniques in such a way as to remove the ATGs and generate the restriction sites SspI and PvuII. The new restriction sites are 96 and 37 bp away from the unique SalI site and 101 and 42 bp away from the actual start site. The resultant derivative of pCIB246 is designated pCIB3025.
The GUS gene is then excised from pCIB3025 by digestion with SalI and SacI, the termini rendered blunt and religated to generate plasmid pCIB3060. The plasmid pJIT82 is obtained from the John Innes Centre (Norwich, United Kingdom), and a 400 bp SmaI fragment containing the bar gene from Streptomyces viridochromogenes is excised and inserted into the HpaI site of pCIB3060 (Thompson et al., 1987). This generates pCIB3064, which comprises the bar gene under the control of the CaMV 35S promoter and terminator for herbicide selection, a gene for ampicillin resistance (for selection in E. coli) and a polylinker with the unique sites SphI, PstI, HindIII, and BamHI. This vector is suitable for the cloning of plant expression cassettes containing their own regulatory signals.
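Whether a restriction site is unique in a given construct can be checked by counting occurrences of its recognition sequence. The Python sketch below does this for the four enzymes named for the pCIB3064 polylinker, using their standard recognition sequences. The function names are hypothetical, and the sketch treats the input as a linear sequence; a circular plasmid would also need the junction between its last and first bases examined. Because these four sites are palindromic, scanning one strand is sufficient.

```python
from typing import Dict, List

# Standard recognition sequences for the enzymes named in the text.
RECOGNITION_SITES: Dict[str, str] = {
    "SphI": "GCATGC",
    "PstI": "CTGCAG",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}


def count_sites(sequence: str, site: str) -> int:
    """Count occurrences of a recognition site, including overlapping ones."""
    sequence, site = sequence.upper(), site.upper()
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


def unique_sites(sequence: str,
                 enzymes: Dict[str, str] = RECOGNITION_SITES) -> List[str]:
    """Names of enzymes that cut the (linear) sequence exactly once."""
    return sorted(name for name, site in enzymes.items()
                  if count_sites(sequence, site) == 1)
```

Run against a full vector sequence, an enzyme missing from the returned list either does not cut or cuts more than once, so it would not be suitable as a unique cloning site.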
pSOG19 and pSOG35. pSOG35 is a transformation vector that utilizes the E. coli gene dihydrofolate reductase (DHFR) as a selectable marker conferring resistance to methotrexate. PCR is used to amplify the 35S promoter (~800 bp), intron 6 from the maize Adh1 gene (~550 bp) and 18 bp of the GUS untranslated leader sequence from pSOG10. A 250-bp fragment encoding the E. coli dihydrofolate reductase type II gene is also amplified by PCR and these two PCR fragments are assembled with a SacI-PstI fragment from pBI221 (Clontech, Palo Alto, California, United States of America) that comprises the pUC19 vector backbone and the nopaline synthase terminator. Assembly of these fragments generates pSOG19, which contains the 35S promoter in fusion with the intron 6 sequence, the GUS leader, the DHFR gene and the nopaline synthase terminator. Replacement of the GUS leader in pSOG19 with the leader sequence from Maize Chlorotic Mottle Virus (MCMV) generates the vector pSOG35. pSOG19 and pSOG35 carry a β-lactamase gene from the pUC vector for ampicillin resistance and have HindIII, SphI, PstI and EcoRI sites available for the cloning of foreign substances.
VII.C. Selectable Markers
For certain target species, different antibiotic or herbicide selection markers can be preferred. Selection markers used routinely in transformation include the nptII gene, which confers resistance to kanamycin and related antibiotics (Messing & Vierra, 1982; Bevan et al., 1983), the bar gene, which confers resistance to the herbicide phosphinothricin (White et al., 1990; Spencer et al., 1990), the hph gene, which confers resistance to the antibiotic hygromycin (Blochlinger & Diggelmann, 1984), the dhfr gene, which confers resistance to methotrexate (Bourouis & Jarry, 1983), and the 5-enolpyruvylshikimate-3-phosphate (EPSP) synthase gene, which confers resistance to glyphosate (U.S. Patent Nos. 4,940,935 and 5,188,642).
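Because the choice of vector depends in part on the selection agent suited to the target species, the marker and vector descriptions in this section can be collected into a simple lookup. The Python sketch below restates only what the text says about each marker gene and vector; the dictionary and function names are hypothetical, and the table covers plant selection only (bacterial markers such as ampicillin resistance are omitted).

```python
from typing import Dict, FrozenSet, List

# Marker gene -> selection agent, as listed in the text.
MARKER_TO_AGENT: Dict[str, str] = {
    "nptII": "kanamycin",
    "bar": "phosphinothricin",
    "hph": "hygromycin",
    "dhfr": "methotrexate",
    "EPSP synthase": "glyphosate",
}

# Vector -> plant selection agent(s), as described for each vector in the text.
VECTOR_SELECTION: Dict[str, FrozenSet[str]] = {
    "pCIB200": frozenset({"kanamycin"}),
    "pCIB2001": frozenset({"kanamycin"}),
    "pCIB10": frozenset({"kanamycin"}),
    "pCIB743": frozenset({"hygromycin"}),
    "pCIB715": frozenset({"hygromycin", "kanamycin"}),
    "pCIB717": frozenset({"hygromycin", "kanamycin"}),
    "pCIB3064": frozenset({"phosphinothricin"}),
    "pSOG19": frozenset({"methotrexate"}),
    "pSOG35": frozenset({"methotrexate"}),
    "pSIT": frozenset({"hygromycin"}),
}


def marker_genes_for(agent: str) -> List[str]:
    """Marker genes that confer resistance to the given selection agent."""
    return sorted(g for g, a in MARKER_TO_AGENT.items() if a == agent.lower())


def vectors_selectable_on(agent: str) -> List[str]:
    """Vectors whose transformants can be selected on the given agent."""
    return sorted(v for v, agents in VECTOR_SELECTION.items()
                  if agent.lower() in agents)
```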
VIII. Transformation
Once a nucleic acid sequence of the presently disclosed subject matter has been cloned into an expression system, it is transformed into a plant cell. The receptor and target expression cassettes of the presently disclosed subject matter can be introduced into the plant cell in a number of art-recognized ways. Methods for regeneration of plants are also well known in the art. For example, Ti plasmid vectors have been utilized for the delivery of foreign DNA, as have direct DNA uptake, liposomes, electroporation, microinjection, and microprojectiles. In addition, bacteria from the genus Agrobacterium can be utilized to transform plant cells.
The presently disclosed subject matter also provides a method for stably modulating expression of a gene in a plant. In some embodiments, the method comprises:
(a) transforming a plurality of plant cells with a vector comprising a nucleic acid sequence encoding a microRNA (miRNA) operatively linked to a promoter and a transcription termination sequence;
(b) growing the plant cells under conditions sufficient to select for a plurality of transformed plant cells that have integrated the vector into their genomes;
(c) screening the plurality of transformed plant cells for expression of the miRNA encoded by the vector;
(d) selecting a transformed plant cell that expresses the miRNA; and
(e) regenerating the plant from the transformed plant cell that expresses the miRNA, whereby expression of the plant gene is stably modulated.
In some embodiments, the method comprises:
(a) transforming a plurality of plant cells with an Agrobacterium tumefaciens binary vector comprising (i) a nucleic acid sequence encoding a selectable marker; and (ii) a nucleic acid sequence encoding a microRNA (miRNA) operatively linked to a promoter and a transcription termination sequence;
(b) treating the plant cells with a drug under conditions sufficient to kill those plant cells that did not receive the binary vector, wherein the selectable marker provides resistance to the drug, to create a first plurality of transformed plant cells;
(c) growing the first plurality of transformed plant cells under conditions sufficient to select for a second plurality of transformed plant cells that have integrated the binary vector into their genomes;
(d) screening the second plurality of transformed plant cells for expression of the miRNA encoded by the expression vector;
(e) selecting a transformed plant cell that expresses the miRNA; and
(f) regenerating the plant from the transformed plant cell that expresses the miRNA, whereby expression of the gene in the plant is stably modulated.
The presently disclosed subject matter is based on the introduction of stable and heritable miRNAs and/or siRNAs into plant cells to specifically manipulate a gene of interest. As disclosed herein, this concept has been demonstrated through Agrobacterium transformation, but would also be applicable to other approaches for transformation, such as bombardment. Thus, it should be understood that the mechanism of transformation of a plant cell is not limited to the Agrobacterium-mediated techniques disclosed in certain embodiments herein. Any transformation technique that results in stable expression of a nucleic acid (for example, an miRNA and/or siRNA) of the presently disclosed subject matter can be employed with the methods disclosed herein. Below are descriptions of representative techniques for transforming both dicotyledonous and monocotyledonous plants, as well as a representative plastid transformation technique.
VIII.A. Transformation of Dicotyledons
Transformation techniques for dicotyledons are well known in the art and include Agrobacterium-based techniques and techniques that do not require Agrobacterium. Non-Agrobacterium techniques involve the uptake of exogenous genetic material directly by protoplasts or cells. This can be accomplished by PEG or electroporation-mediated uptake, particle bombardment-mediated delivery, or microinjection. Examples of these techniques are disclosed in Paszkowski et al., 1984; Potrykus et al., 1985; Reich et al., 1986; and Klein et al., 1987. In each case the transformed cells are regenerated to whole plants using standard techniques known in the art.
Agrobacterium-mediated transformation is a useful technique for transformation of dicotyledons because of its high efficiency of transformation and its broad utility with many different species.
Agrobacterium transformation typically involves the transfer of the binary vector carrying the foreign DNA of interest (e.g., pSIT) to an appropriate Agrobacterium strain, the choice of which can depend on the complement of vir genes carried by the host Agrobacterium strain either on a co-resident Ti plasmid or chromosomally (e.g., strain C58 or strains pCIB542 for pCIB200 and pCIB2001; Uknes et al., 1993). The transfer of the recombinant binary vector to Agrobacterium is accomplished by a triparental mating procedure using E. coli carrying the recombinant binary vector and a helper E. coli strain that carries a plasmid such as pRK2013 and is able to mobilize the recombinant binary vector to the target Agrobacterium strain. Alternatively, the recombinant binary vector can be transferred to Agrobacterium by DNA transformation (Höfgen & Willmitzer, 1988). Transformation of the target plant species by recombinant Agrobacterium usually involves co-cultivation of the Agrobacterium with explants from the plant and follows protocols well known in the art. Transformed tissue is regenerated on selectable medium carrying the antibiotic or herbicide resistance marker present between the binary plasmid T-DNA borders.
Another approach to transforming plant cells with a gene involves propelling inert or biologically active particles at plant tissues and cells. This technique is disclosed in U.S. Patent Nos. 4,945,050; 5,036,006; and 5,100,792; all to Sanford et al. Generally, this procedure involves propelling inert or biologically active particles at the cells under conditions effective to penetrate the outer surface of the cell and afford incorporation within the interior thereof. When inert particles are utilized, the vector can be introduced into the cell by coating the particles with the vector containing the desired gene. Alternatively, the target cell can be surrounded by the vector so that the vector is carried into the cell by the wake of the particle. Biologically active particles (e.g., dried yeast cells, dried bacterium, or a bacteriophage, each containing DNA sought to be introduced) can also be propelled into plant cell tissue.
VIII.B. Transformation of Monocotyledons
Transformation of most monocotyledon species has now also become routine. Exemplary techniques include direct gene transfer into protoplasts using PEG or electroporation, and particle bombardment into callus tissue. Transformations can be undertaken with a single DNA species or multiple DNA species (i.e., co-transformation), and both these techniques are suitable for use with the presently disclosed subject matter. Co-transformation can have the advantage of avoiding complete vector construction and of generating transgenic plants with unlinked loci for the gene of interest and a selectable marker, enabling the removal of the selectable marker in subsequent generations, should this be regarded as desirable. However, a disadvantage of the use of co-transformation is the less than 100% frequency with which separate DNA species are integrated into the genome (Schocher et al., 1986). Patent Applications EP 0 292 435, EP 0 392 225, and WO 93/07278 describe techniques for the preparation of callus and protoplasts from an elite inbred line of maize, transformation of protoplasts using PEG or electroporation, and the regeneration of maize plants from transformed protoplasts. Gordon-Kamm et al., 1990 and Fromm et al., 1990 have published techniques for transformation of an A188-derived maize line using particle bombardment. Furthermore, WO 93/07278 and Koziel et al., 1993 describe techniques for the transformation of elite inbred lines of maize by particle bombardment. This technique utilizes immature maize embryos of 1.5-2.5 mm length excised from a maize ear 14-15 days after pollination and a PDS-1000He biolistic particle delivery device (DuPont Biotechnology, Wilmington, Delaware, United States of America) for bombardment.
Transformation of rice can also be undertaken by direct gene transfer techniques utilizing protoplasts or particle bombardment. Protoplast-mediated transformation has been disclosed for Japonica-types and Indica-types (Zhang et al., 1988; Shimamoto et al., 1989; Datta et al., 1990). Both types are also routinely transformable using particle bombardment (Christou et al., 1991). Furthermore, WO 93/21335 describes techniques for the transformation of rice via electroporation. Patent Application EP 0 332 581 describes techniques for the generation, transformation, and regeneration of Pooideae protoplasts. These techniques allow the transformation of Dactylis and wheat. Furthermore, wheat transformation has been disclosed in Vasil et al., 1992 using particle bombardment into cells of type C long-term regenerable callus, and also by Vasil et al., 1993 and Weeks et al., 1993 using particle bombardment of immature embryos and immature embryo-derived callus.
A representative technique for wheat transformation, however, involves the transformation of wheat by particle bombardment of immature embryos and includes either a high sucrose or a high maltose step prior to gene delivery. Prior to bombardment, embryos (0.75-1 mm in length) are plated onto MS medium with 3% sucrose (Murashige & Skoog, 1962) and 3 mg/l 2,4-dichlorophenoxyacetic acid (2,4-D) for induction of somatic embryos, which is allowed to proceed in the dark. On the chosen day of bombardment, embryos are removed from the induction medium and placed onto the osmoticum (i.e., induction medium with sucrose or maltose added at the desired concentration, typically 15%). The embryos are allowed to plasmolyze for 2-3 hours and are then bombarded. Twenty embryos per target plate are typical, although not critical. An appropriate gene-carrying plasmid (such as pCIB3064 or pSOG35) is precipitated onto micrometer-size gold particles using standard procedures. Each plate of embryos is shot with the DuPont biolistics helium device using a burst pressure of about 1000 pounds per square inch (psi) and a standard 80 mesh screen. After bombardment, the embryos are placed back into the dark to recover for about 24 hours (still on osmoticum). After 24 hours, the embryos are removed from the osmoticum and placed back onto induction medium, where they stay for about a month before regeneration. Approximately one month later, the embryo explants with developing embryogenic callus are transferred to regeneration medium (MS + 1 mg/liter naphthaleneacetic acid (NAA), 5 mg/liter GA), further containing the appropriate selection agent (10 mg/l BASTA® in the case of pCIB3064 and 2 mg/l methotrexate in the case of pSOG35). After approximately one month, developed shoots are transferred to larger sterile containers known as "GA7s" which contain half-strength MS, 2% sucrose, and the same concentration of selection agent.
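By way of a non-limiting illustration, the concentrations and burst pressure recited in the representative wheat protocol above can be converted to other common laboratory units. The short sketch below assumes that the recited percentages are weight/volume values (the disclosure does not state this), and the function names are hypothetical helpers introduced only for this illustration.

```python
# Unit conversions for the representative wheat bombardment protocol.
# Assumption (not stated in the disclosure): percentages are weight/volume,
# so 1% corresponds to 1 g per 100 ml, i.e. 10 g per liter.

PSI_TO_KPA = 6.894757  # 1 pound per square inch expressed in kilopascals


def percent_wv_to_g_per_l(percent: float) -> float:
    """Convert a % (w/v) concentration to grams per liter."""
    return percent * 10.0


def psi_to_kpa(psi: float) -> float:
    """Convert a pressure in psi to kilopascals."""
    return psi * PSI_TO_KPA


if __name__ == "__main__":
    # Values recited in the protocol: induction, osmoticum, and GA7 media.
    for label, pct in [("induction sucrose", 3), ("osmoticum sugar", 15), ("GA7 sucrose", 2)]:
        print(f"{label}: {pct}% = {percent_wv_to_g_per_l(pct):.0f} g/l")
    print(f"burst pressure: 1000 psi = {psi_to_kpa(1000):.0f} kPa")
```

Under the stated assumption, the induction medium would contain 30 g/l sucrose and the osmoticum 150 g/l sucrose or maltose.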
Transformation of monocotyledons using Agrobacterium has also been disclosed. See WO 94/00977 and U.S. Patent No. 5,591,616, both of which are incorporated herein by reference. See also Negrotto et al., 2000, incorporated herein by reference. Like other Agrobacterium-mediated binary vector systems used for the transformation of monocotyledons, pSIT can also be employed to modify monocotyledons.
VIII.C. Transformation of Plastids
Seeds of Nicotiana tabacum c.v. 'Xanthi nc' are germinated seven per plate in a 1" circular array on T agar medium and bombarded 12-14 days after sowing with 1 μm tungsten particles (M10, Biorad, Hercules, California, United States of America) coated with DNA from representative plasmids essentially as disclosed (Svab & Maliga, 1993). Bombarded seedlings are incubated on T medium for two days, after which leaves are excised and placed abaxial side up in bright light (350-500 μmol photons/m2/s) on plates of RMOP medium (Svab et al., 1990) containing 500 μg/ml spectinomycin dihydrochloride (Sigma, St. Louis, Missouri, United States of America). Resistant shoots appearing underneath the bleached leaves three to eight weeks after bombardment are subcloned onto the same selective medium and allowed to form callus, and secondary shoots are isolated and subcloned. Complete segregation of transformed plastid genome copies (homoplasmicity) in independent subclones is assessed by standard techniques of Southern blotting (Sambrook & Russell, 2001). BamHI/EcoRI-digested total cellular DNA is separated on 1% Tris-borate-EDTA (TBE) agarose gels, transferred to nylon membranes (Amersham Biosciences, Piscataway, New Jersey, United States of America), and probed with 32P-labeled random primed DNA sequences corresponding to a 0.7 kb BamHI/HindIII DNA fragment from pC8 containing a portion of the rps7/12 plastid targeting sequence. Homoplasmic shoots are rooted aseptically on spectinomycin-containing MS/IBA medium (McBride et al., 1994) and transferred to the greenhouse.
IX. Plants, Breeding, and Seed Production
IX.A. Plants
The presently disclosed subject matter also provides plants comprising the disclosed compositions. In some embodiments, the plant is characterized by a modification of a phenotype or measurable characteristic of the plant, the modification being attributable to the presence of an expression cassette comprising a nucleic acid molecule of the presently disclosed subject matter. In some embodiments, the modification involves, for example, nutritional enhancement, increased nutrient uptake efficiency, enhanced production of endogenous compounds, or production of heterologous compounds. In some embodiments, the modification includes having increased or decreased resistance to an herbicide, environmental stress, or a pathogen. In some embodiments, the modification includes having enhanced or diminished requirement for light, water, nitrogen, or trace elements. In some embodiments, the modification includes being enriched for an essential amino acid as a proportion of a polypeptide fraction of the plant. In some embodiments, the polypeptide fraction can be, for example, total seed polypeptide, soluble polypeptide, insoluble polypeptide, water-extractable polypeptide, and lipid-associated polypeptide. In some embodiments, the modification includes overexpression, underexpression, antisense modulation, sense suppression, inducible expression, inducible repression, or inducible modulation of a gene. In alternative embodiments, the modifications can include decreased or increased lignin content, lignin composition and/or structure changes, decreased or increased cellulose content, crystallinity and degree of polymerization (DP) changes, fiber property and morphology modifications, and/or increased resistance to pathogens, common diseases, and environmental stresses in a tree.
IX.B. Breeding
The plants obtained via transformation with a nucleic acid sequence of the presently disclosed subject matter can be any of a wide variety of plant species, including monocots and dicots, and angiosperms and gymnosperms; however, the plants used in the method for the presently disclosed subject matter are selected in some embodiments from the list of agronomically important target crops set forth hereinabove. The modification of expression of a gene in accordance with the presently disclosed subject matter in combination with other characteristics important for production and quality can be incorporated into plant lines through breeding. Breeding approaches and techniques are known in the art. See e.g., Welsh, 1981; Wood, 1983; Mayo, 1987; Singh, 1986; Wricke & Weber, 1986. The genetic properties engineered into the transgenic seeds and plants disclosed above are passed on by sexual reproduction or vegetative growth and can thus be maintained and propagated in progeny plants. Generally, maintenance and propagation make use of known agricultural methods developed to fit specific purposes such as tilling, sowing, or harvesting. Specialized processes such as hydroponics or greenhouse technologies can also be applied. As the growing crop is vulnerable to attack and damage caused by insects or infections as well as to competition by weed plants, measures are undertaken to control weeds, plant diseases, insects, nematodes, and other adverse conditions to improve yield. These include mechanical measures such as tillage of the soil or removal of weeds and infected plants, as well as the application of agrochemicals such as herbicides, fungicides, gametocides, nematicides, growth regulants, ripening agents, and insecticides.
Use of the advantageous genetic properties of the transgenic plants and seeds according to the presently disclosed subject matter can further be made in plant breeding, which aims at the development of plants with improved properties such as tolerance of pests, herbicides, or abiotic stress, improved nutritional value, increased yield, or improved structure causing less loss from lodging or shattering. The various breeding steps are characterized by well-defined human intervention such as selecting the lines to be crossed, directing pollination of the parental lines, or selecting appropriate progeny plants. Depending on the desired properties, different breeding measures are taken. The relevant techniques are well known in the art and include, but are not limited to, hybridization, inbreeding, backcross breeding, multi-line breeding, variety blend, interspecific hybridization, aneuploid techniques, etc. Hybridization techniques can also include the sterilization of plants to yield male or female sterile plants by mechanical, chemical, or biochemical means. Cross-pollination of a male sterile plant with pollen of a different line assures that the genome of the male sterile but female fertile plant will uniformly obtain properties of both parental lines. Thus, the transgenic seeds and plants according to the presently disclosed subject matter can be used for the breeding of improved plant lines that, for example, increase the effectiveness of conventional methods such as herbicide or pesticide treatment or allow one to dispense with said methods due to their modified genetic properties. Alternatively, new crops with improved stress tolerance can be obtained, which, due to their optimized genetic "equipment", yield harvested product of better quality than products that were not able to tolerate comparable adverse developmental conditions (for example, drought).
IX.C. Seed Production
Embodiments of the presently disclosed subject matter also provide seed from plants modified using the disclosed methods.
In seed production, germination quality and uniformity of seeds are essential product characteristics. As it is difficult to keep a crop free from other crop and weed seeds, to control seedborne diseases, and to produce seed with good germination, fairly extensive and well-defined seed production practices have been developed by seed producers who are experienced in the art of growing, conditioning, and marketing of pure seed. Thus, it is common practice for the farmer to buy certified seed meeting specific quality standards instead of using seed harvested from his own crop. Propagation material to be used as seeds is customarily treated with a protectant coating comprising herbicides, insecticides, fungicides, bactericides, nematicides, molluscicides, or mixtures thereof. Customarily used protectant coatings comprise compounds such as captan, carboxin, thiram (tetramethylthiuram disulfide; TMTD®; available from R. T. Vanderbilt Company, Inc., Norwalk, Connecticut, United States of America), metalaxyl (APRON XL®; available from Syngenta Corp., Wilmington, Delaware, United States of America), and pirimiphos-methyl (ACTELLIC®; available from Agriliance, LLC, St. Paul, Minnesota, United States of America). If desired, these compounds are formulated together with further carriers, surfactants, and/or application-promoting adjuvants customarily employed in the art of formulation to provide protection against damage caused by bacterial, fungal, or animal pests. The protectant coatings can be applied by impregnating propagation material with a liquid formulation or by coating with a combined wet or dry formulation. Other methods of application are also possible, such as treatment directed at the buds or the fruit.
X. Transgenic Plants
A "transgenic plant" is one that has been genetically modified to contain and express an miRNA and/or an siRNA. A transgenic plant can be genetically modified to contain and express at least one homologous or heterologous DNA sequence operatively linked to and under the regulatory control of transcriptional control sequences which function in plant cells or tissue or in whole plants. As used herein, a transgenic plant also refers to progeny of the initial transgenic plant where those progeny contain and are capable of expressing the homologous or heterologous coding sequence under the regulatory control of the plant-expressible transcription control sequences described herein. Seeds containing transgenic embryos are encompassed within this definition as are cuttings and other plant materials for vegetative propagation of a transgenic plant.
When plant expression of a homologous or heterologous gene or coding sequence of interest is desired, that coding sequence is operatively linked in the sense orientation to a suitable promoter and advantageously under the regulatory control of DNA sequences which quantitatively regulate transcription of a downstream sequence in plant cells or tissue or in planta, in the same orientation as the promoter, so that a sense (i.e., functional for translational expression) mRNA is produced. A transcription termination signal, for example, a polyadenylation signal, functional in a plant cell is advantageously placed downstream of an miRNA- and/or siRNA-encoding sequence, and a selectable marker which can be expressed in a plant can be covalently linked to the inducible expression unit so that after this DNA molecule is introduced into a plant cell or tissue, its presence can be selected and plant cells or tissue not so transformed will be killed or prevented from growing.
Where tissue specific expression of the plant-expressible miRNA and/or siRNA coding sequence is desired, the skilled artisan can choose from a number of well-known sequences to mediate that form of gene expression as disclosed herein. Environmentally regulated promoters are also well known in the art and are disclosed herein, and the skilled artisan can choose from well-known transcription regulatory sequences to achieve the desired result. Summarily, the presently disclosed subject matter can be employed, among other applications, to perform the following:
1. Specifically downregulate a target gene in a stable and heritable manner;
2. Enhance target gene expression by downregulating negative regulators;
3. Regulate transcriptional activity of a target promoter; and
4. Molecular regulation through miRNA-induced silencing signal movement.
EXAMPLES
The following Examples have been included to illustrate modes of the presently disclosed subject matter. These Examples illustrate standard laboratory practices of the co-inventors. In light of the present disclosure and the general level of skill in the art, those of skill will appreciate that the following Examples are intended to be exemplary only and that numerous changes, modifications, and alterations can be employed without departing from the scope of the presently disclosed subject matter.
EXAMPLE 1
Isolation of Small RNAs from P. trichocarpa
Total RNA was isolated from developing xylem tissue of P. trichocarpa or P. taeda, from pooled tension- and compression-stressed developing xylem of P. trichocarpa stems (bent for 4 days), from P. trichocarpa in vitro plants, or from pooled P. trichocarpa in vitro plants with or without exposure to cold (4°C for 24 hours), heat (37°C for 24 hours), dehydration (drought for 14 hours), salinity (300 mM NaCl for 14 hours), or water (plants covered with water for 14 hours), using the cetyl trimethyl ammonium bromide (CTAB) method as described in Chang et al., 1993.
Cloning of miRNAs was performed as described (Lau et al., 2001; Lagos-Quintana et al., 2002; Elbashir et al., 2001b). Briefly, isolated total RNA was separated on a 12% denaturing polyacrylamide gel. A band corresponding to RNA of about 16-36 nt in size was excised and the RNA was recovered from the gel slice. The recovered RNA was dephosphorylated with alkaline phosphatase, and a 3'-adaptor oligonucleotide with the sequence 5'-CTGTAGGCACCATTCATCAC-3' (SEQ ID NO: 155), carrying a 5'-phosphate and a 3'-amino-modifier C-7 (i.e., a seven-carbon spacer with a primary amino group), was then ligated to the dephosphorylated RNA. The ligated products were separated from non-ligated RNA and the adaptor oligonucleotide on a 12% denaturing polyacrylamide gel. A band corresponding to the ligation product was excised from the gel, and the ligated RNA was recovered. The RNA was phosphorylated at the 5' end, and a new 5' adaptor oligonucleotide (5'-ATGTCGTGaggcacctgaaa-3'; SEQ ID NO: 156; the sequence in uppercase is a DNA strand and in lowercase is an RNA strand) containing hydroxyl groups at both 5' and 3' ends was ligated to the 5'-phosphorylated ligation product from the previous step. The new ligation product was gel purified and eluted from the gel slice.
Reverse transcription was performed by using an RT primer (5'-GATGAATGGTGCCTAC-3'; SEQ ID NO: 157), followed by PCR using a 5' primer (5'-GTCGTGAGGCACCTGAAA-3'; SEQ ID NO: 158) and a 3' primer (5'-GATGAATGGTGCCTACAG-3'; SEQ ID NO: 159). The PCR product was then digested with Ban I and concatamerized using T4 DNA ligase. The products of the ligation reaction were separated on an agarose gel, and a gel slice corresponding to concatamers of a size range of larger than 500 basepairs (bp) was isolated and the nucleic acids recovered from the gel slice. The single-stranded regions of the ends of the concatamers were filled in by incubation with Taq polymerase, and the DNA product was directly ligated into the pCR2.1-TOPO® vector using the TOPO TA CLONING® kit (Invitrogen Corp., Carlsbad, California, United States of America).
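As a non-limiting illustration of how the adaptor and primer sequences recited above relate to one another, the following sketch checks that the RT primer and the 3' PCR primer are contained in the reverse complement of the 3' adaptor, that the 5' PCR primer matches the 5' adaptor, and that each PCR primer carries a Ban I site. The sequences are copied from SEQ ID NOs: 155-159, with the RNA portion of the 5' adaptor written in DNA letters. The Ban I recognition sequence GGYRCC is taken from general knowledge and is an assumption not stated in this disclosure; the function names are hypothetical.

```python
import re

# Sequences as recited in Example 1 (5' to 3').
ADAPTOR_3P = "CTGTAGGCACCATTCATCAC"  # SEQ ID NO: 155
ADAPTOR_5P = "ATGTCGTGAGGCACCTGAAA"  # SEQ ID NO: 156 (RNA part shown as DNA)
RT_PRIMER = "GATGAATGGTGCCTAC"       # SEQ ID NO: 157
PCR_5P = "GTCGTGAGGCACCTGAAA"        # SEQ ID NO: 158
PCR_3P = "GATGAATGGTGCCTACAG"        # SEQ ID NO: 159

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Assumption: Ban I recognizes GGYRCC (Y = C or T, R = A or G).
BAN_I = re.compile("GG[CT][AG]CC")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def ban_i_sites(seq: str) -> list:
    """Return 0-based start offsets of putative Ban I sites in seq."""
    return [m.start() for m in BAN_I.finditer(seq)]


if __name__ == "__main__":
    rc = reverse_complement(ADAPTOR_3P)
    print("RT primer anneals to 3' adaptor:", RT_PRIMER in rc)
    print("3' PCR primer anneals to 3' adaptor:", PCR_3P in rc)
    print("5' PCR primer matches 5' adaptor:", PCR_5P in ADAPTOR_5P)
    print("Ban I sites (5' primer, 3' primer):", ban_i_sites(PCR_5P), ban_i_sites(PCR_3P))
```

Under the stated assumption, a Ban I site on each flank of the amplified small RNA insert is consistent with the digestion and concatamerization step described above.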
EXAMPLE 2
Isolation of P. trichocarpa miRNAs
After the subcloning described in Example 1, inserts were sequenced from P. trichocarpa. After excluding sequences corresponding to rRNA, tRNA, snRNA, retrotransposons/transposons, and small RNAs with 2 nt or more mismatches with the P. trichocarpa genome, the remaining small RNA sequences and their surrounding sequences from the P. trichocarpa genome were used to predict the secondary structures of these small RNAs using the mfold program (Zuker, 2003). 52 miRNA families were identified (Table 1) based on their authentic pre-miRNA stem-loop structures (see Figure 2, showing two examples) or their significant homology to miRNAs identified in other species.
These miRNAs were subjected to BLAST analyses against the GENBANK® database (available from the National Center for Biotechnology Information (NCBI) website) and the miRBase sequence database (available from the website of the Wellcome Trust Sanger Institute). According to the results from BLAST analyses, the cloned sequences were divided into two groups: group I and group II. Of these, 19 had either identical or highly homologous sequences to those of some Arabidopsis miRNAs (Palatnik et al., 2003; Sunkar & Zhu, 2004; see Table 1). The other 33 miRNA sequences did not show significant homology to Arabidopsis miRNAs. Interestingly, only 3 (PtmiR 73, PtmiR 132, and PtmiR 181) of these 33 miRNAs were found in Arabidopsis, indicating that a majority of the identified P. trichocarpa xylem miRNAs are unique to wood formation.
EXAMPLE 3
Isolation of P. taeda miRNAs
After the subcloning described in Example 1, inserts were sequenced from P. taeda. After excluding sequences corresponding to rRNA, tRNA, snRNA, and retrotransposons/transposons, the remaining small RNA sequences and their surrounding sequences from the P. taeda expressed sequence tags (ESTs) deposited in dbEST of the GENBANK® database were used to predict the secondary structures of these small RNAs using the mfold program (Zuker, 2003). 15 miRNA families were identified (Table 4: LpMIR1, LpMIR2, LpMIR7, LpMIR9, LpMIR178, LpMIR26, LpMIR27, LpMIR28, LpMIR77, LpMIR82, LpMIR89, LpMIR95, LpMIR100, LpMIR119, and LpMIR176) based on their authentic pre-miRNA stem-loop structures or their significant homology to miRNAs identified in other species.
These miRNAs were subjected to BLAST analyses against the GENBANK® database and the miRBase sequence database (available from the website of the Wellcome Trust Sanger Institute). According to the results from BLAST analyses, the cloned sequences were divided into two groups: group I and group II. Of these, 3 had either identical or highly homologous sequences to those of some Arabidopsis miRNAs (Palatnik et al., 2003; Sunkar & Zhu, 2004; see Table 1). The other 12 miRNA sequences did not show significant homology to Arabidopsis miRNAs.
EXAMPLE 4
Identification of Additional miRNAs from P. trichocarpa
When the genomic sequences surrounding the closely related homologs (i.e., the P. trichocarpa miRNAs that showed 1 and 2 mismatches to the isolated P. trichocarpa miRNAs) were analyzed, 66 additional loci were identified. Some of the isolated miRNAs showed high homology to each other, for example, PtmiR 71 and PtmiR 142 (Table 1), resulting in 3 loci each of which had a sequence showing high homology to two miRNAs. Among these 3 loci, one locus had a sequence showing a 1 nt mismatch to both PtmiR 71 and PtmiR 142, and the other two loci each had a sequence showing a 1 nt mismatch to PtmiR 71 and a 2 nt mismatch to PtmiR 142.
Moreover, one locus (PtMIR 156-1) harboring an miRNA with two mismatches to PtmiR 156 was able to form stable stem-loop structures with the miRNA sequences present in either the 5' or the 3' arm, and two stem-loop structures (one shorter and one longer) were found when the miRNA was present in the 3' arm (see Figure 3). Moreover, the four PtmiR 71 genes had a sequence showing a 1 nt mismatch to PtmiR 142.
EXAMPLE 5
Identification of Additional miRNAs from P. taeda
When the EST sequences surrounding the closely related homologs (i.e., the P. taeda miRNAs that showed 1 and 2 mismatches to the isolated P. taeda miRNAs) were analyzed, 17 additional loci were identified (Table 4). Whether any of the P. trichocarpa miRNA families are present in P. taeda has also been investigated. By allowing zero to two nucleotide substitutions, the sequences of some PtmiRNAs were searched against the P. taeda EST database to identify their P. taeda homologs and the surrounding sequences. Analysis of the LpmiRNA sequence-containing loci in P. taeda by the mfold program (Zuker, 2003) resulted in the identification of 5 novel P. taeda miRNA families (LpMIR170, LpMIR274, LpMIR277, LpMIR279, and LpMIR472), represented by 10 additional loci (Table 4).
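By way of a non-limiting illustration, a search that allows zero to two nucleotide substitutions, as described above, can be sketched as an ungapped sliding-window comparison. The sketch below is not the search tool used by the co-inventors; the function names and the toy sequences in the usage example are hypothetical and were chosen only for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_near_matches(query: str, subject: str, max_subs: int = 2) -> list:
    """Return (offset, substitutions) for every ungapped window of the subject
    that differs from the query by at most max_subs substitutions."""
    n = len(query)
    hits = []
    for i in range(len(subject) - n + 1):
        d = hamming(query, subject[i:i + n])
        if d <= max_subs:
            hits.append((i, d))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the sequence listing.
    query = "ACGTACGTAC"
    subject = "TTTACGTTCGTACGGG"
    for offset, subs in find_near_matches(query, subject):
        print(f"offset {offset}: {subs} substitution(s)")
```

A practical search would also scan the reverse complement of each EST and would retain the flanking sequence of each hit for secondary structure prediction.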
EXAMPLE 6
Identification of Potential miRNA Target Genes
Based on the miRNA sequences, target genes for the isolated Populus trichocarpa miRNAs were identified by searching the genome and predicted transcripts of P. trichocarpa with the program PATSCAN (Dsouza & Larsen, 1997), which can be used to identify mRNAs capable of base pairing with one of the miRNAs with a score of 3.0 or less (see Jones-Rhoades et al., 2004 for a detailed description of the scoring method). The same method was used to identify potential target genes for miRNAs isolated from Pinus taeda by searching through the Pine Gene Index Release 6.0 produced by The Institute for Genomic Research (TIGR; available at the website of TIGR). This included potential target genes for 35 poplar and pine miRNAs that did not show any homology to Arabidopsis miRNAs (Table 2).
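As a non-limiting illustration of a score-based target search, the sketch below scores an ungapped miRNA/target-site duplex and applies the 3.0 cutoff recited above. The penalty values (0 for a Watson-Crick pair, 0.5 for a G:U wobble pair, 1.0 for any other mismatch) are an assumption about the cited scoring method and should be checked against that reference; bulges, which the cited method may also penalize, are not modeled. The function names and test sequences are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

SCORE_CUTOFF = 3.0  # cutoff recited in Example 6


def pairing_score(mirna: str, site: str) -> float:
    """Score an ungapped duplex between a miRNA and an equal-length mRNA site.

    Both sequences are given 5' to 3'; the miRNA pairs antiparallel to the site.
    Assumed penalties: Watson-Crick 0, G:U wobble 0.5, other mismatch 1.0.
    """
    m = mirna.upper().replace("T", "U")
    s = site.upper().replace("T", "U")
    if len(m) != len(s):
        raise ValueError("miRNA and site must have equal length")
    score = 0.0
    for a, b in zip(m, reversed(s)):
        if (a, b) in WATSON_CRICK:
            continue
        score += 0.5 if (a, b) in WOBBLE else 1.0
    return score


def is_candidate_target(mirna: str, site: str) -> bool:
    """Return True when the duplex score is at or below the cutoff."""
    return pairing_score(mirna, site) <= SCORE_CUTOFF


if __name__ == "__main__":
    # Hypothetical 6-nt toy sequences for illustration only.
    print(pairing_score("UGGACU", "AGUCCA"))  # fully complementary site
    print(pairing_score("UGGACU", "AGUCUA"))  # one G:U wobble pair
    print(pairing_score("UGGACU", "CGUCUA"))  # one wobble plus one mismatch
```

In use, each window of a predicted transcript would be scored against each miRNA, and windows passing the cutoff would be reported as candidate target sites.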
Discussion of Example 6
The predicted targets comprise, in general, regulatory and defense-related genes. While some of the targets are associated with development and/or with cellulose biosynthesis, many of them are implicated in the lignin biosynthesis network. For example, LpMIR 178 was found to target a cellulose synthase, an enzyme involved in the synthesis of the backbone of the cell wall. The predicted target of PtmiR 6 encodes a UVR8 protein, which positively regulates phenylpropanoid metabolism associated with cinnamate 4-hydroxylase (C4H) in response to UV-B induction (Hu et al., 1998; Jin et al., 2000; Kliebenstein et al., 2002). Also, PtmiR 241 and PtmiR 13 each target genes that encode laccases and a mononuclear blue copper protein family member. These two protein families were suggested to be involved in lignin formation (Nersissian et al., 1999). Common targets of PtmiR 29, 71, and 142 encode MYB factor proteins, which are transcription factors known to bind promoters of a variety of lignin biosynthetic pathway genes encoding, for example, PAL, C4H, 4-coumaroyl-CoA ligase (4CL), 5-hydroxyconiferaldehyde O-methyltransferase (COMT), and cinnamyl alcohol dehydrogenase (CAD; Tamagnone et al., 1998; Borevitz et al., 2000). Down- or up-regulating these genes results in drastic lignin reduction or augmentation, respectively (Tamagnone et al., 1998; Borevitz et al., 2000). Suppression of a LIM protein, a predicted target of PtmiR 172, also inhibited PAL, 4CL, and CAD expression, resulting in significant lignin reduction (Kawaoka et al., 2000; Kawaoka & Ebinuma, 2001). The most striking discovery was the perfect sequence complementarity between PtmiR 172 and another target, the G lignin-specific CAD, suggesting a role for PtmiR 172 in a negative feedback mechanism, perhaps in controlling the preferential biosynthesis of specific lignin types.
EXAMPLE 7
Expression of PtmiR Nucleic Acids in P. trichocarpa Tissues
The expression of some of the PtmiRs in various P. trichocarpa tissues was characterized by Northern analysis (Figure 4). This included xylem tissues under tension stress, taken from tension wood (TW), and under compression stress, taken from stem wood opposite to TW, called opposite wood (OW). TW and OW can be easily created by bending the tree stem. The tested PtmiRs are all expressed at some level in woody tissues (for example, phloem, secondary growth, tension wood, and opposite wood).
Northern hybridization was performed essentially as described in Hutvagner et al., 2000. Total RNA (30 μg) was denatured for 10 minutes at 65-70°C, separated on a 12% polyacrylamide/8 M urea gel (Amersham Biosciences, Piscataway, New Jersey, United States of America) in a PROTEAN II apparatus (Bio-Rad Laboratories, Inc., Hercules, California, United States of America), and electro-blotted onto a HYBOND™-N+ membrane (Amersham) using a Trans-Blot SD Semi-Dry Electrophoretic Transfer Cell (Bio-Rad). After UV cross-linking and air drying, blots were prehybridized in ULTRAHYB™-Oligo hybridization buffer (Ambion Inc., Austin, Texas, United States of America), and hybridized with [γ-32P]ATP-labeled DNA oligonucleotides complementary to small RNA sequences. The hybridization was carried out overnight in ULTRAHYB™-Oligo buffer at 37°C. After hybridization, blots were washed twice with a wash buffer containing 2x SSC and 0.5% SDS at 37°C for 0.5 hour each time. Signals were visualized by autoradiography at -80°C.
Interestingly, while PtmiR 29 is expressed strongly in xylem, its Arabidopsis homolog (AtmiR159) was not expressed in Arabidopsis stem, as reported by Park et al. See Park et al., 2002. Instead, AtmiR159 was found most highly expressed in Arabidopsis leaves, contrasting directly with the considerably lower expression of its P. trichocarpa homolog, PtmiR 29, in leaves than in lignifying tissues. Thus, miRNA sequence conservation between plant species might not suggest conserved miRNA functions in these species.
Discussion of Example 7
Based on the expression patterns of these PtmiRs showing high levels of transcripts in wood forming tissues, xylem in particular, and on the predicted target mRNAs (see Table 2), the disclosed PtmiRs might play significant roles in regulating wood development in plants. The expression patterns and predicted target mRNA functions also point to critical roles for these PtmiRs in regulating lignin, cellulose, and hemicellulose biosynthesis. The strong expression of PtmiR 73 in leaf together with its target gene function associated with disease resistance (see Table 2) is direct evidence for the involvement of PtmiR 73 in the regulation of disease and stress tolerance.
EXAMPLE 8 Identification of Potential siRNA Target Sites in Any RNA Sequence
The sequence of an RNA target of interest, such as a plant mRNA transcript, is screened for target sites, for example by using a computer-based folding algorithm. In a non-limiting example, the sequence of a gene or RNA gene transcript derived from a database, such as the GENBANK® database or any other database containing nucleotide sequence data (for example, a database containing sequence data from plants, such as Arabidopsis, P. trichocarpa, rice, etc.) is used to generate siRNA targets having complementarity to the target. Such sequences can be obtained from a database, or can be determined experimentally as disclosed herein and/or known in the art. Known target sites can also be used to design siRNA molecules targeting those sites. Such sites include, for example, target sites determined to be effective in studies with other nucleic acid molecules, such as ribozymes or antisense molecules, and target sites known to be associated with a disease or condition, such as sites containing mutations or deletions.
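As a rough illustration of the screening step described above, the following Python sketch enumerates every window of a fixed length in a transcript and keeps those whose GC content falls within a chosen range. The 19 nt window length matches the siRNA templates used in Example 9, but the GC thresholds and the function name are assumptions made for this example only and are not values taught herein.

```python
# Sketch only: enumerate candidate siRNA target windows in a transcript.
# The GC-content bounds are illustrative assumptions, not disclosed values.

def candidate_sites(transcript: str, length: int = 19,
                    gc_min: float = 0.30, gc_max: float = 0.55):
    """Return (start, window, gc_fraction) for each window passing the filter.

    `start` is a 0-based offset into the transcript; RNA input (with U)
    is converted to the DNA alphabet before scanning.
    """
    seq = transcript.upper().replace("U", "T")
    sites = []
    for start in range(len(seq) - length + 1):
        window = seq[start:start + length]
        gc = (window.count("G") + window.count("C")) / length
        if gc_min <= gc <= gc_max:
            sites.append((start, window, round(gc, 2)))
    return sites
```

In practice such a list would only be a starting point. Secondary structure, position within the transcript, and similarity to other transcripts would still need to be assessed before candidates are screened for efficacy.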
Target sites can include single-stranded regions of miRNA precursors. As disclosed herein and shown in Figure 2, miRNA precursors adopt a stem-loop structure consisting of double-stranded and single-stranded regions. siRNA molecules are designed that hybridize to the double-stranded or single-stranded regions of an miRNA precursor or to the miRNA sequence, thus causing aberrant processing of the precursor and inhibiting miRNA production. Various parameters can be used to determine which sites are the most suitable target sites within the target RNA sequence. These parameters include, but are not limited to, secondary or tertiary RNA structure, the nucleotide base composition of the target sequence, the degree of homology between various regions of the target sequence, and the relative position of the target sequence within the RNA transcript. Based on these determinations, any number of target sites within the RNA transcript can be chosen to screen siRNA molecules for efficacy, for example by using in vitro RNA cleavage assays, cell culture, or animal models. In a non-limiting example, anywhere from 1 to 1000 target sites are chosen within the transcript based on the size of the siRNA construct to be used. High throughput screening assays can be developed for screening siRNA molecules using methods known in the art, such as with multi-well or multi-plate assays to determine efficient reduction in target gene expression.
EXAMPLE 9 siRNA-mediated Modulation of GUS Gene Expression in Transgenic Tobacco
Design of siRNAs Directed Against the GUS Gene
Based on the standard design rules (Elbashir et al., 2002), two 19 nt sequences (designated GT1 and GT2) targeting two distinct sites in the GUS mRNA were selected for constructing the expression vectors. Individual siRNA templates comprised the 19 nt fragment linked via a 9 nt spacer to the reverse complement of the same 19 nt sequence. Each template was cloned into a vector comprising a human H1 RNA transcription unit under the control of its cognate gene promoter (Figure 9). The resulting transcript was predicted to adopt an inverted hairpin RNA structure containing one (for GT1) or two (for GT2) 3' overhanging uridines, giving rise to siRNA-like transcripts containing GT1 or GT2 sequences (Figure 9). As shown in Figure 9, GT1 produces an siRNA-like transcript comprising SEQ ID NO: 172 - 9 nt spacer - SEQ ID NO: 173 (bottom left), and GT2 produces a transcript comprising SEQ ID NO: 174 - 9 nt spacer - SEQ ID NO: 175.
RNA Silencing with Human H1 Promoter-Containing Constructs. Agrobacterium tumefaciens C58 cells were transformed with the GT1 and GT2 vectors and used to transform a transgenic tobacco line expressing a GUS transgene (Hu et al., 1998). For transfer to tobacco, GUS-containing tobacco leaf disks were infected with the Agrobacterium C58 strain harboring the siRNA construct. Transformants were selected on MS104 containing 25 mg/L hygromycin and 300 mg/L claforan. The hygromycin-resistant shoots were placed on hormone-free MSO agar medium containing 25 mg/L hygromycin and 300 mg/L claforan for root regeneration, and transgenic tobacco seedlings were planted in soil and grown in a greenhouse.
Twenty-three transgenic plants were produced from the GT1 construct and nineteen from the GT2 construct. Transgenic plants and GUS-carrying control plants were characterized at about one month old. The stem, leaf, and root of a majority of the GT1 and GT2 transgenics exhibited either reduced or no GUS staining (Figure 5A). Assays of GUS protein activity in leaves indicated that 74% of the GT1 transgenics had a reduction in GUS activity ranging from 12 to 94%, and 84% of the GT2 transgenics exhibited a reduction in GUS activity of 31 to 97%. The reduction in GUS activity (see Figure 5B) reflected diminished GUS mRNA levels in these plants (see Figures 5C and 5D). Small discrete RNAs of about 21 nt in length were present in the transgenic lines having reduced GUS mRNA and protein activity, but absent from the control line (see Figure 5E). Overall, the abundance of this 21 nt RNA was inversely correlated with the abundance of GUS mRNA in these plants (see Figures 5C and 5E).
The gene silencing efficiency appeared to be independent of the GUS mRNA target sites and of the number of uridine residues (1 vs. 2) in the engineered siRNA transcripts. Furthermore, the silencing effect remained in about 90% of the T1 plants analyzed.
Cloning of the Arabidopsis 7SL4 Promoter. Two oligonucleotides corresponding to the promoter region of the Arabidopsis thaliana At7SL4 gene were designed based upon data present in the publicly available Arabidopsis database (see the website for the Institute for Genomic Research). These primers are SLpF (5'-GGAATTCTGCGTTTGAAGAAGAGTGTTTGA-3'; SEQ ID NO: 160) as the forward primer (with the addition of an Eco RI site at the 5' end) and SLpR (5'-GCCCGGGAAGATCGGTTCGTGTAATATAT-3'; SEQ ID NO: 161) as the reverse primer (with the addition of a Sma I site at the 5' end). These two primers flank the At7SL4 gene promoter at both ends and were used for PCR amplification of the promoter fragment from Arabidopsis thaliana (Columbia ecotype) genomic DNA.
The PCR product amplified from Arabidopsis genomic DNA using primers SLpF and SLpR was cloned into the pCR®2.1-TOPO® system
(Invitrogen Corp., Carlsbad, California, United States of America) and the sequence of the promoter fragment confirmed by sequencing. The resulting At7SL4 promoter clone was named pCRSLp7, and contained the following At7SL4 promoter sequence: GGAATTCTGCGTTTGAAGAAGAGTGTTTGA TGTTCTCAAGTAAGTGAGTCTTATTGGGAATAATATTAACTCATGTTCTT
CTTGCATTTGATTTCTTTGCCGCTCTCTTCTTCTATCTCAAATCTGTCTCT TCAATTTCACAGTTGGGCTTTTTATTAGTCTATAATGGGACTCAAAATAA GGCTTTGGCCCACATCAAAAAGATAAGTCAAATGAAAACTAAATTCAGT CTTTTGTCCCACATCGATCACTCTACTCGTTTTGTGTTTGTTTATATATTA CACGAACCGATCTTCCCGGGC (SEQ ID NO: 162). The sequences of the SLpF and SLpR primers are underlined.
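The primer design described above can be checked in silico. The following Python sketch confirms that SLpF carries an Eco RI recognition site (GAATTC) and SLpR a Sma I recognition site (CCCGGG) near their 5' ends, and that the two primers match the two ends of the cloned promoter sequence. The primer sequences and the two terminal fragments of SEQ ID NO: 162 are copied from the text above, while the function and variable names are assumptions introduced for this example.

```python
# Sketch only: sanity-check the SLpF/SLpR primers against the ends of the
# cloned At7SL4 promoter fragment (SEQ ID NO: 162) as printed above.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


SLPF = "GGAATTCTGCGTTTGAAGAAGAGTGTTTGA"        # SEQ ID NO: 160
SLPR = "GCCCGGGAAGATCGGTTCGTGTAATATAT"         # SEQ ID NO: 161
PROMOTER_5_END = "GGAATTCTGCGTTTGAAGAAGAGTGTTTGA"  # first 30 nt of SEQ ID NO: 162
PROMOTER_3_END = "ATATATTACACGAACCGATCTTCCCGGGC"   # last 29 nt of SEQ ID NO: 162

ECO_RI_SITE = "GAATTC"
SMA_I_SITE = "CCCGGG"


def check_promoter_primers() -> dict:
    """Return a dict of named boolean checks on the primer pair."""
    return {
        "forward_has_EcoRI_at_5_end": ECO_RI_SITE in SLPF[:8],
        "reverse_has_SmaI_at_5_end": SMA_I_SITE in SLPR[:8],
        "forward_matches_5_end": PROMOTER_5_END.startswith(SLPF),
        "reverse_matches_3_end": reverse_complement(SLPR) == PROMOTER_3_END,
    }
```

The same kind of check could be applied to any primer pair that adds restriction sites to the ends of an amplified fragment.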
Cloning of the Arabidopsis At7SL4 Gene 3' Non-translated Sequence. To clone the 3'-NTS of the At7SL4 gene, two oligos were synthesized based on sequence information available in the Arabidopsis database as described hereinabove. The primers used were as follows: SLtF (5'-GTCTAGATTTTGATTTTGTTTTCCAAAACTTTCTACG-3'; SEQ ID NO: 163) was used as the forward primer (adds an Xba I site to the 5' end of the 3'-NTS); and SLtR (5'-GAAGCTTGGTGTTGATCACAACGATACA-3'; SEQ ID NO: 164) was used as the reverse primer (adds a Hind III site to the 3' end of the 3'-NTS). PCR was employed to amplify a nucleic acid molecule comprising the 3'-NTS using these two primers and Arabidopsis thaliana (Columbia ecotype) genomic DNA. The amplified nucleic acid molecule was cloned into the pCR®2.1-TOPO® system (Invitrogen Corp.) and sequenced
(plasmid referred to herein as pCRSLt2). The correct At7SL4-3'-NTS nucleotide sequence was determined to be: GTCTAGATTTTGATTTT GTTTTCCAAAACTTTCTACGCTTTTTGTTTTTGGGTTTAATGCTTTAAGAG GGAACAAAAACAAAGCTGTGAAAACTGAAAGCAAACTTTGAACAAAGCA AGAGACTTAAGAGTTGTATTTACAGCTTTTGTTCGATGTATGGAAATGTA
CAATTTTTTTGCTACTCAAAGAAATGAGACTTAAGAGTCAACGTTAAAAG AGCCAGGAGTAAAATGTCTAGGTATGATCTCAATTGTATCGTTGTGATC AACACCAAGCTTC (SEQ ID NO: 165). The sequences of the SLtF and SLtR primers are underlined.
Assembly of the siRNA Delivery Cassette. The 7SL4-RNA promoter sequence was released from pCRSLp7 by digestion with Eco RI and Sma I and then inserted into a pUC19 vector at the Eco RI and Sma I cloning sites, yielding a plasmid referred to herein as pUCSLp7-1. To assemble the siRNA delivery cassette including the elements of the 7SL4-RNA promoter and the 3'-NTS fragment, the At7SL4-3'-NTS sequence was released from pCRSLt2 by digestion with Xba I and Hind III. The At7SL4-3'-NTS sequence was thereafter ligated into the Xba I and Hind III cloning sites of pUCSLp7-1 to produce a construct named pUCSL1. This construct contained the siRNA delivery cassette in a pUC19 backbone vector. The siRNA expression cassette contains the At7SL4 promoter sequence and the At7SL4-3'-NTS sequence. Between these two elements is a multiple cloning site (MCS) including sites for Sma I, Bam HI, and Xba I for insertion of target sequences (see Figure 6).
Plant 7SL Promoter-mediated siRNA Silencing of GUS Expression in Transgenic Tobacco. A plant promoter-based system was also tested. DNA-dependent RNA polymerase III 7SL RNA genes from Arabidopsis thaliana were employed, because the transcription of these small genes is controlled exclusively by their upstream external regulatory sequence elements (USE and TATA) and terminates at a run of five to seven thymidines. These features allowed for the incorporation of these sequences into expression vectors to efficiently produce siRNA duplexes that contained three to four 3' overhanging uridines. From an A. thaliana At7SL4, the promoter and 3'-NTS region were cloned by PCR amplification as disclosed hereinabove. The plasmid containing the At7SL4 promoter and 3'-NTS was named pUCSL1 (see Figure 6).
In addition to the GT1 and GT2 sequences described hereinabove, an additional 19 nt GUS mRNA sequence, referred to herein as GT3, was selected for constructing an additional siRNA template, following the general design described hereinabove. siRNA templates corresponding to GT1, GT2, and GT3 were cloned into the pSIT expression vector (see Figure 7), which was then mobilized into A. tumefaciens C58 cells for transforming the transgenic GUS tobacco line described hereinabove (see also Hu et al., 1998). A total of 89 plants were produced containing one of these three expression constructs.
The same analysis schemes described hereinabove were employed to screen transgenic plants. It was determined that 83% of these transgenic plants exhibited a reduction in GUS enzyme activity ranging from 20 to 99%. No apparent difference in overall GUS activity reduction efficiency was observed among these three expression constructs. The observed reduction in GUS enzyme activity correlated with diminished GUS mRNA level, and with the appearance/abundance of GUS-specific siRNAs. Together, these results validated a plant promoter-based siRNA gene silencing system.
EXAMPLE 10 pSIT System for Stable Transformation of Plants
In order to introduce stably expressed miRNAs and/or siRNAs to plant tissues, a binary vector transformation system mediated by Agrobacterium was developed. The binary vector construct contained an siRNA delivery cassette and a selectable marker gene under the control of separate promoters, and is referred to herein as pSIT (small interfering RNA transformation system). See Figure 7. Cloning sites for Sma I, Bam HI, and Xba I have been included in pSIT, and can be used for the insertion of target gene sequences in a structure designed to form a double-stranded RNA when the target gene sequences are transcribed. The insert structure is in some embodiments a 19 to 26-nucleotide sequence corresponding to the sense strand of a target gene followed by the complementary antisense sequence. The sense and antisense sequences are separated by a 9-nucleotide spacer (5'-TTCAGATGA-3'; see Figure 8). At the 3'-end of the structure, a string of several thymidines (in some embodiments, a string of 7) was added to signal termination of transcription from the promoter.
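The insert structure described above (sense sequence, 9-nucleotide spacer, antisense sequence, thymidine run) can be sketched as a simple string assembly. The following Python example is an illustration only. The spacer and the run of seven thymidines are taken from the text, whereas the function name, the input validation, and the example sense sequence are assumptions introduced here.

```python
# Sketch only: assemble a pSIT-style hairpin insert (DNA, 5'->3') from a
# sense-strand target sequence. The example sense sequence is a placeholder.

SPACER = "TTCAGATGA"      # 9-nucleotide spacer given in the text
TERMINATOR = "T" * 7      # run of thymidines signalling termination

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def hairpin_insert(sense: str) -> str:
    """Return sense + spacer + antisense + terminator for a 19-26 nt target."""
    sense = sense.upper()
    if not 19 <= len(sense) <= 26:
        raise ValueError("sense sequence should be 19 to 26 nucleotides long")
    if set(sense) - set("ACGT"):
        raise ValueError("sense sequence should contain only A, C, G, and T")
    return sense + SPACER + reverse_complement(sense) + TERMINATOR


example_insert = hairpin_insert("GATTACAGATTACAGATTA")  # placeholder 19-mer
```

Because transcription from the promoter terminates at a run of thymidines, a designer might also wish to avoid sense sequences that themselves contain long thymidine runs; that additional filter is a suggestion and is not part of the sketch.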
EXAMPLE 11 siRNA-Based Modulation of miRNA Genes
An siRNA-based gene modification system can be used for modulating gene expression in plants (for example, trees). Representative, non-limiting genes the expression of which can be modulated include genes encoding the miRNAs disclosed as SEQ ID NOs: 1-59, 1247-1295, and 1662-1712 (i.e. genes comprising the nucleotide sequences disclosed as SEQ ID NOs: 60-156, 1296-1375, and 1713-1748), as well as miRNA genes involved in the regulation of the lignin and cellulose biochemical pathways. Moreover, the system is particularly useful for the manipulation of the miRNA genes that modulate multiple family members. Only a short sequence of the target gene is needed in the siRNA system, allowing the design of an siRNA target sequence to be highly specific and discernible from the other miRNA family member genes or other unknown genes which share a high sequence homology with the target member.
Based on the predicted stem-loop structure of an miRNA precursor, the nucleotide sequence of a loop region is determined. An siRNA is synthesized that hybridizes to this loop region, and an siRNA delivery cassette is generated. The siRNA delivery cassette is cloned into pSIT using the techniques described herein, and the vector is transformed into a plant cell. The transformed plant cell is used to regenerate a plant, and the expression of the plant gene targeted by the miRNA is determined in the regenerated plant and compared to the expression of the same plant gene in a wild type plant (i.e. a plant that has not been transformed with the pSIT construct).
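The first step above, determining the loop region from a predicted stem-loop structure, can be illustrated as follows. This Python sketch assumes that the structure prediction is available in dot-bracket notation, as produced by common RNA folding programs, and that the precursor forms a single stem-loop. The function name and the toy sequence are hypothetical and serve only to show the idea.

```python
# Sketch only: locate the terminal loop of a single stem-loop precursor from
# a dot-bracket structure string. Input values below are toy placeholders.

def terminal_loop(sequence: str, structure: str):
    """Return (start, end, loop_sequence) for the terminal loop.

    `start` and `end` are 0-based, end-exclusive offsets. The structure is
    expected to be a single hairpin: every "(" precedes every ")".
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    last_open = structure.rfind("(")
    first_close = structure.find(")")
    if last_open == -1 or first_close == -1 or first_close < last_open:
        raise ValueError("structure is not a single stem-loop")
    start, end = last_open + 1, first_close
    return start, end, sequence[start:end]
```

For a toy input such as the sequence GGGAAAACCC with the structure (((....))), the function reports the four unpaired nucleotides between the two arms of the stem. An siRNA directed to the loop would then be designed against the returned subsequence, optionally extended into the flanking stem to reach the desired length.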
References
The references listed below as well as all references cited in the specification are incorporated herein by reference to the extent that they supplement, explain, provide a background for or teach methodology, techniques and/or compositions employed herein.
Adelman et al. (1983) DNA 2:183-193.
Agrawal S (ed.) Methods in Molecular Biology, volume 20, Humana Press, Totowa, New Jersey, United States of America.
Altschul et al. (1990) J Mol Biol 215:403-410.
Ambros et al. (2003) Curr Biol 13:807-818.
Anterola & Lewis (2002) Phytochemistry 61:221-94.
Aravin et al. (2003) Dev Cell 5:337-350.
Ausubel et al., eds (1989) Current Protocols in Molecular Biology. Wiley, New York, New York, United States of America.
Bartel (2004) Cell 116:281-297.
Bartel & Bartel (2003) Plant Physiol 132:709-717.
Bevan (1984) Nucl Acids Res 12:8711-21.
Bevan et al. (1983) Nature 304:184-187.
Binet et al. (1991) Plant Mol Biol 17:395-407.
Blochinger & Diggelmann (1984) Mol Cell Biol 4:2929-2931.
Boerjan et al. (2003) Annu Rev Plant Biol 54:519-46.
Borevitz et al. (2000) Plant Cell 12:2383-2393.
Bourouis & Jarry (1983) EMBO J 2:1099-1104.
Callis et al. (1987) Genes Dev 1:1183-1200.
Chang et al. (1993) Plant Mol Biol Rep 11:113-116.
Chibbar et al. (1993) Plant Cell Rep 12:506-509.
Christensen & Quail (1989) Plant Mol Biol 12:619-632.
Christou et al. (1991) Bio/Technology 9:957-962.
Datta et al. (1990) Bio/Technology 8:736-740.
de Framond (1991) FEBS Lett 290:103-6.
Dsouza et al. (1997) Trends Genet 13:497-8.
Dostie et al. (2003) RNA 9:631-632.
Ebel et al. (1992) Biochem 31:12083-12086.
Elbashir et al. (2001a) Nature 411:494-498.
Elbashir et al. (2001b) Genes Dev 15:188-200.
Elbashir et al. (2002) Methods 26:199-213.
EP 0 292 435
EP0332104
EP0332581
EP0392225
EP0452269
Firek et al. (1993) Plant Mol Biol 22:129-142.
Freier et al. (1986) Proc Natl Acad Sci USA 83:9373-9377.
Fromm (1990) Biotechnology (NY) 8:833-839.
Gallie et al. (1987) Nucl Acids Res 15:8693-8711.
Glover & Harries (1995) DNA Cloning: A Practical Approach, 2nd ed. IRL Press at Oxford University Press, Oxford; New York.
Goeddel (1990) Gene Expression Technology. Methods in Enzymology. Volume 185, Academic Press, San Diego, California, United States of America.
Gritz & Davies (1983) Gene 25:179-188.
Hamilton & Baulcombe (1999) Science 286:950-952.
Henikoff & Henikoff (1992) Proc Natl Acad Sci USA 89:10915-10919.
Hofgen & Willmitzer (1988) Nucl Acids Res 16:9877.
Houbaviy et al. (2003) Dev Cell 5:351-358.
Hu et al. (1998) Proc Natl Acad Sci USA 95:5407-5412.
Hudspeth & Grula (1989) Plant Mol Biol 12:579-589.
Hutvagner & Zamore (2002) Curr Opin Genet Dev 12:225-232.
Hutvagner et al. (2000) RNA 6:1445-1454.
Jefferson et al. (1987) EMBO J 6:3901-3907.
Jin et al. (2000) EMBO J 19:6150-6161.
Jones-Rhoades et al. (2004) Molecular Cell 14:787-799.
Karlin & Altschul (1993) Proc Natl Acad Sci USA 90:5873-5877.
Kasschau et al. (2003) Dev Cell 4:205-217.
Kawaoka & Ebinuma (2001) Phytochemistry 57:1149-1157.
Kawaoka et al. (2000) Plant J 22:289-301.
Kawasaki & Taira (2003) Nature 423:838-842.
Kliebenstein et al. (2002) Plant Physiol 130:234-243.
Koziel et al. (1993) Bio/Technology 11:194-200.
Lagos-Quintana et al. (2001) Science 294:853-858.
Lagos-Quintana et al. (2003) RNA 9:175-179.
Lagos-Quintana et al. (2002) Curr Biol 12:735-739.
Lau et al. (2001) Science 294:858-862.
Lee et al. (2002) Nature Biotechnol 20:500-505.
Lee & Ambros (2001) Science 294:862-864.
Lee et al. (1993) Cell 75:843-854.
Lee et al. (2003) Nature 425:415-419.
Lee et al. (2002) EMBO J 21:4663-4670.
Lim et al. (2003a) Science 299:1540.
Lim et al. (2003b) Genes Dev 17:991-1008.
Llave et al. (2002) Science 297:2053-2056.
Logemann et al. (1989) Plant Cell 1:151-158.
Mayo (1987) The Theory of Plant Breeding, Second Edition, Clarendon Press, New York, New York, United States of America.
McBride et al. (1994) Proc Natl Acad Sci USA 91:7301-7305.
McBride & Summerfelt (1990) Plant Mol Biol 14:269-276.
McElroy et al. (1991) Mol Gen Genet 231:150-160.
McElroy et al. (1990) Plant Cell 2:163-71.
Messing & Vieira (1982) Gene 19:259-268.
Michael et al. (2003) Mol Cancer Res 1:882-891.
Mourelatos et al. (2002) Genes Dev 16:720-728.
Murashige & Skoog (1962) Physiol Plant 15:473-497.
Needleman & Wunsch (1970) J Mol Biol 48:443-453.
Negrotto et al. (2000) Plant Cell Reports 19:798-803.
Nersissian et al. (1999) Protein Sci 7:1915-1929.
Palatnik et al. (2003) Nature 425:257-263.
Park et al. (2002) Curr Biol 12:1484-1495.
Paszkowski et al. (1984) EMBO J 3:2717-2722.
PCT International Publication No. WO 93/07278
PCT International Publication No. WO 93/21335
PCT International Publication No. WO 94/00977
Pearson & Lipman (1988) Proc Natl Acad Sci USA 85:2444-2448.
Potrykus et al. (1985) Mol Gen Genet 199:169-177.
Reinhart et al. (2002) Genes Dev 16:1616-1626.
Rhoades et al. (2002) Cell 110:513-520.
Rohrmeier & Lehle (1993) Plant Mol Biol 22:783-792.
Rothstein et al. (1987) Gene 53:153-161.
Sambrook & Russell (2001) Molecular Cloning: A Laboratory Manual, 3rd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Scharfmann et al. (1991) Proc Natl Acad Sci USA 88:4626-4630.
Schmidhauser & Helinski (1985) J Bacteriol 164:446-455.
Schocher et al. (1986) Bio/Technology 4:1093-1096.
Shimamoto et al. (1989) Nature 338:274-276.
Silhavy (1984) Experiments with Gene Fusions. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America.
Singh (1986) Breeding for Resistance to Diseases and Insect Pests, Springer-Verlag, New York, New York, United States of America.
Skuzeski et al. (1990) Plant Mol Biol 15:65-79.
Smith & Waterman (1981) Adv Appl Math 2:482-489.
Spencer et al. (1990) Theor Appl Genet 79:625-631.
Sunkar & Zhu (2004) Plant Cell 16:2001-19.
Svab et al. (1990) Proc Natl Acad Sci USA 87:8526-8530.
Svab & Maliga (1993) Proc Natl Acad Sci USA 90:913-917.
Tamagnone et al. (1998) Plant Cell 10:135-154.
Thompson et al. (1987) EMBO J 6:2519-2523.
Tibanyenda et al. (1984) Eur J Biochem 139:19-27.
Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology - Hybridization with Nucleic Acid Probes. Elsevier, New York, United States of America.
Turner et al. (1987) Cold Spring Harb Symp Quant Biol LII:123-133.
Uknes et al. (1993) Plant Cell 5:159-169.
Uknes et al. (1992) Plant Cell 4:645-656.
U.S. Patent Nos. 4,940,935; 4,945,050; 5,036,006; 5,100,792; 5,188,642; 5,523,311; 5,591,616; and 5,614,395.
Vasil et al. (1992) Bio/Technology 10:667-674.
Vasil et al. (1993) Bio/Technology 11:1553-1558.
Wang et al. (2004) Nucleic Acids Res 32:1688-1695.
Warner et al. (1993) Plant J 3:191-201.
Weeks et al. (1993) Plant Physiol 102:1077-1084.
Welsh (1981) Fundamentals of Plant Genetics and Breeding, John Wiley & Sons, New York, New York, United States of America.
White et al. (1990) Nucl Acids Res 18:1062.
Wightman et al. (1993) Cell 75:855-862.
Williams et al. (1993) J Clin Invest 92:503-508.
Wood, ed. (1983) Crop Breeding, American Society of Agronomy, Madison, Wisconsin, United States of America.
Wricke & Weber (1986) Quantitative Genetics and Selection Plant Breeding. Walter de Gruyter and Co., Berlin, Germany.
Xu et al. (1993) Plant Mol Biol 22:573-588.
Zeng & Cullen (2003) RNA 9:112-123.
Zhang et al. (1988) Plant Cell Reports 7:379-384.
Zuker (2003) Nucleic Acids Res 31:3406-15.
It will be understood that various details of the presently disclosed subject matter can be changed without departing from the scope of the presently disclosed subject matter. Furthermore, the foregoing description is for the purpose of illustration only, and not for the purpose of limitation.

Claims

CLAIMS
What is claimed is:
1. A method for stably modulating expression of a plant gene, the method comprising:
(a) providing a vector encoding a microRNA (miRNA) targeted to the plant gene; and
(b) transforming a plant cell with the vector, whereby stable expression of the miRNA in the plant cell is provided.
2. The method of claim 1, wherein the modulating is inhibiting.
3. The method of claim 1, wherein the vector is an Agrobacterium binary vector.
4. The method of claim 1, wherein the vector comprises:
(a) a promoter operatively linked to a nucleic acid molecule encoding the miRNA molecule; and
(b) a transcription termination sequence.
5. The method of claim 4, wherein the vector is an Agrobacterium binary vector.
6. The method of claim 4, wherein the promoter is a DNA-dependent RNA polymerase III promoter.
7. The method of claim 6, wherein the promoter is selected from the group consisting of an RNA polymerase III H1 promoter, an Arabidopsis thaliana 7SL RNA promoter, an RNA polymerase III 5S promoter, an RNA polymerase III U6 promoter, an adenovirus VA1 promoter, a Vault promoter, a telomerase RNA promoter, a tRNA gene promoter, and functional derivatives thereof.
8. The method of claim 7, wherein the Arabidopsis thaliana 7SL RNA gene promoter comprises the sequence presented in SEQ ID NO: 162.
9. The method of claim 4, wherein the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a sense region, an antisense region, and a loop region, positioned in relation to each other such that upon transcription, a resulting RNA transcript is capable of forming a hairpin structure via intramolecular hybridization of the sense strand and the antisense strand.
10. The method of claim 9, wherein the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712, and sequences at least 70% identical to any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
11. The method of claim 1, wherein the plant gene comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837, and sequences at least 80% identical to any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837.
12. The method of claim 1, wherein the plant is a dicot.
13. The method of claim 1, wherein the plant is a monocot.
14. The method of claim 1, wherein the plant is a tree.
15. The method of claim 14, wherein the tree is an angiosperm.
16. The method of claim 14, wherein the tree is a gymnosperm.
17. The method of claim 14, wherein the tree is a member of the genus Populus.
18. The method of claim 1, wherein the stable expression of the microRNA (miRNA) in the plant occurs in a location or tissue selected from the group consisting of epidermis, root, vascular tissue, xylem, meristem, cambium, cortex, pith, leaf, flower, seed, and combinations thereof.
19. A method for stably modulating expression of a plant gene, the method comprising:
(a) transforming a plurality of plant cells with an Agrobacterium tumefaciens binary vector comprising:
(i) a nucleic acid sequence encoding a selectable marker; and
(ii) a nucleic acid sequence encoding a microRNA (miRNA) operatively linked to a promoter and a transcription termination sequence;
(b) treating the plant cells with a drug under conditions sufficient to kill those plant cells that did not receive the binary vector, wherein the selectable marker provides resistance to the drug, to create a first plurality of transformed plant cells;
(c) growing the first plurality of transformed plant cells under conditions sufficient to select for a second plurality of transformed plant cells that have integrated the binary vector into their genomes;
(d) screening the second plurality of transformed plant cells for expression of the miRNA encoded by the expression vector;
(e) selecting a transformed plant cell that expresses the miRNA; and
(f) regenerating the plant from the transformed plant cell that expresses the miRNA, whereby expression of the gene in the plant is stably modulated.
20. A vector for stably expressing a microRNA (miRNA) molecule in a plant, the vector comprising:
(a) a promoter operatively linked to a nucleic acid molecule encoding the miRNA molecule; and
(b) a transcription termination sequence.
21. The vector of claim 20, wherein the vector is an Agrobacterium binary vector.
22. The vector of claim 20, wherein the promoter is a DNA-dependent RNA polymerase III promoter.
23. The vector of claim 22, wherein the promoter is selected from the group consisting of RNA polymerase III H1 promoter, an Arabidopsis thaliana 7SL RNA promoter, an RNA polymerase III 5S promoter, an RNA polymerase III U6 promoter, an adenovirus VA1 promoter, a Vault promoter, a telomerase RNA promoter, a tRNA gene promoter, and functional derivatives thereof.
24. The vector of claim 23, wherein the Arabidopsis thaliana 7SL RNA gene promoter comprises the sequence presented in SEQ ID NO: 162.
25. The vector of claim 20, wherein the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a sense region, an antisense region, and a loop region, positioned in relation to each other such that upon transcription, a resulting RNA transcript is capable of forming a hairpin structure via intramolecular hybridization of the sense strand and the antisense strand.
26. The vector of claim 25, wherein the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712, and sequences at least 70% identical to any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
27. The vector of claim 20, wherein the plant gene has a nucleotide sequence comprising a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837, and nucleotide sequences at least 80% identical to any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837.
28. A kit comprising the vector of claim 20 and at least one reagent for introducing a vector of claim 18 into a plant cell.
29. The kit of claim 28, further comprising instructions for introducing the vector into a plant cell.
30. A plant cell comprising a vector of claim 20.
31. A transgenic plant comprising a vector of claim 20.
32. Transgenic seed or progeny from a transgenic plant of claim 31.
33. A method for stably inhibiting the expression of a gene in a plant cell, the method comprising stably transforming the plant cell with a vector encoding a microRNA (miRNA) molecule, wherein the miRNA molecule comprises a nucleotide sequence at least 70% identical to a contiguous 17-24 nucleotide subsequence of the gene.
34. The method of claim 33, wherein the gene is selected from the group consisting of coniferaldehyde-5-hydroxylase (Cald5H), a lignin-related gene, a cellulose-related gene, a hemicellulose-related gene, a hormone-related gene, a disease-related gene, a stress-related gene, a growth-related gene, and a transcription factor gene.
35. The method of claim 34, wherein the lignin-related gene is selected from the group consisting of sinapyl alcohol dehydrogenase (SAD), cinnamyl alcohol dehydrogenase (CAD), 4-coumarate:CoA ligase (4CL), cinnamoyl CoA O-methyltransferase (CCoAOMT), caffeate O-methyltransferase (COMT), ferulate-5-hydroxylase (F5H), cinnamate-4-hydroxylase (C4H), p-coumarate-3-hydroxylase (C3H), and phenylalanine ammonia lyase (PAL).
36. The method of claim 34, wherein the cellulose-related gene is selected from the group consisting of cellulose synthase, cellulose synthase-like, glucosidase, glucan synthase, and sucrose synthase.
37. The method of claim 34, wherein the hormone-related gene is selected from the group consisting of isopentyl transferase (ipt), gibberellic acid (GA) oxidase, auxin (AUX), and a rooting locus (ROL) gene.
38. The method of claim 33, wherein the miRNA molecule is encoded by a nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712, and sequences at least 70% identical to any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
39. The method of claim 33, wherein the plant gene comprises a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837, and nucleotide sequences at least 80% identical to any of SEQ ID NOs: 176-781, 1376-1553, and 1749-1837.
40. A method for enhancing the expression of a gene in a plant cell, the method comprising introducing into the plant cell a vector encoding a short interfering RNA (siRNA) molecule comprising a sequence that hybridizes under physiological conditions to a loop region or a stem region of a pre-microRNA that comprises a microRNA (miRNA) that modulates expression of the gene, thereby resulting in downregulation of expression of the miRNA and enhanced expression of the gene.
41. The method of claim 40, wherein the microRNA (miRNA) comprises a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712, and nucleotide sequences at least 70% identical to any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
42. An expression vector comprising a nucleic acid sequence encoding a microRNA (miRNA) molecule that stably down regulates expression of a plant gene.
43. The expression vector of claim 42, wherein the nucleic acid sequence encoding the microRNA (miRNA) molecule comprises a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712, and sequences at least 70% identical to any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
44. The expression vector of claim 42, wherein the miRNA comprises a nucleotide sequence of about 17-24 contiguous nucleotides with up to 5 mismatches of a ribonucleic acid (RNA) transcribed from a gene selected from the group consisting of a lignin-related gene, a cellulose-related gene, a hemicellulose-related gene, a hormone-related gene, a disease-related gene, a stress-related gene, a medicine-related gene, and a transcription factor gene.
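The sequence criterion in claim 44 (about 17-24 contiguous nucleotides with up to 5 mismatches relative to a transcribed RNA) can be illustrated with a short scan. This is only a sketch under stated assumptions: it reads the criterion as complementarity between the miRNA and a window of the transcript, counts every non-Watson-Crick position as a mismatch (G:U wobble pairs are not treated specially), and uses hypothetical function names that do not appear in the application.

```python
# Illustrative sketch only: function names and the complementarity
# interpretation are assumptions, not taken from the application.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_target_sites(mirna, transcript, max_mismatches=5):
    """List (start, mismatches) for transcript windows that pair with mirna.

    A window qualifies when it differs from the perfect complementary
    site at no more than max_mismatches positions.
    """
    # Accept DNA-style input by converting T to U.
    mirna = mirna.upper().replace("T", "U")
    transcript = transcript.upper().replace("T", "U")
    if not 17 <= len(mirna) <= 24:
        raise ValueError("miRNA length is expected to be 17-24 nucleotides")

    perfect_site = reverse_complement(mirna)
    width = len(perfect_site)
    hits = []
    for start in range(len(transcript) - width + 1):
        window = transcript[start:start + width]
        mismatches = sum(1 for a, b in zip(window, perfect_site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits
```

Calling `find_target_sites(candidate, transcript)` returns every offset in the transcript at which the candidate would pair within the mismatch budget; an empty list means no window qualifies under this simplified reading.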
45. The expression vector of claim 44, wherein the lignin-related gene is selected from the group consisting of sinapyl alcohol dehydrogenase (SAD), cinnamyl alcohol dehydrogenase (CAD), 4-coumarate:CoA ligase (4CL), cinnamoyl CoA O-methyltransferase (CCoAOMT), caffeate O-methyltransferase (COMT), ferulate-5-hydroxylase (F5H), cinnamate-4-hydroxylase (C4H), p-coumarate-3-hydroxylase (C3H), and phenylalanine ammonia lyase (PAL).
46. The expression vector of claim 44, wherein the cellulose-related gene is selected from the group consisting of cellulose synthase, cellulose synthase-like, glucosidase, glucan synthase, and sucrose synthase.
47. The expression vector of claim 44, wherein the hormone-related gene is selected from the group consisting of isopentyl transferase (ipt), gibberellic acid (GA) oxidase, auxin (AUX), and a rooting locus (ROL) gene.
48. A plant cell comprising an expression vector of claim 42.
49. The plant cell of claim 48, wherein the plant cell is from a plant selected from the group consisting of poplar, pine, eucalyptus, sweetgum, other tree species, tobacco, Arabidopsis, rice, corn, wheat, cotton, potato, and cucumber.
50. A vector for the stable expression of a microRNA (miRNA) in a plant, wherein the vector comprises a promoter for expressing the miRNA, a transcription termination sequence, and a cloning site between the promoter and the transcription termination sequence into which a nucleic acid molecule encoding the miRNA can be cloned.
51. The vector of claim 50, wherein the microRNA (miRNA) comprises a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712, and sequences at least 70% identical to any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
52. The vector of claim 51, wherein the promoter is a DNA-dependent RNA polymerase III promoter.
53. The vector of claim 52, wherein the promoter is selected from the group consisting of RNA polymerase III H1 promoter, an Arabidopsis thaliana 7SL RNA promoter, an RNA polymerase III 5S promoter, an RNA polymerase III U6 promoter, an adenovirus VA1 promoter, a Vault promoter, a telomerase RNA promoter, and a tRNA gene promoter, or a functional derivative thereof.
54. The vector of claim 53, wherein the Arabidopsis thaliana 7SL RNA gene promoter comprises SEQ ID NO: 162.
55. The vector of claim 51, wherein the vector is a plasmid vector.
56. The vector of claim 55, wherein the vector further comprises a selectable marker.
57. The vector of claim 55, wherein the cloning site comprises a recognition sequence for at least one restriction enzyme that is not present elsewhere in the plasmid vector.
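The uniqueness condition in claim 57 (a recognition sequence in the cloning site that occurs nowhere else in the plasmid) can be checked with a simple count. The sketch below is a hypothetical illustration: the function names are assumptions, the plasmid is treated as circular so that sites spanning the origin are found, and both strands are searched. Any enzyme names and recognition sequences passed in are supplied by the caller, not taken from the application.

```python
# Illustrative sketch only: helper names are assumptions.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement_dna(seq):
    """Return the reverse complement of a DNA string (A, C, G, T)."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq))


def count_sites(plasmid, recognition, circular=True):
    """Count occurrences of a recognition sequence on either strand."""
    plasmid = plasmid.upper()
    recognition = recognition.upper()
    width = len(recognition)
    # For a circular molecule, append the first width-1 bases so that
    # windows crossing the origin are also examined.
    text = plasmid + plasmid[:width - 1] if circular else plasmid
    # A set collapses palindromic sites, which equal their own
    # reverse complement, into a single pattern.
    patterns = {recognition, reverse_complement_dna(recognition)}
    return sum(
        1
        for i in range(len(text) - width + 1)
        if text[i:i + width] in patterns
    )


def unique_cutters(plasmid, enzymes):
    """Return names of enzymes whose site occurs exactly once.

    enzymes maps an enzyme name to its recognition sequence.
    """
    return [
        name
        for name, site in enzymes.items()
        if count_sites(plasmid, site) == 1
    ]
```

An enzyme returned by `unique_cutters` has a single site in the whole construct, so it is a candidate for the cloning site; whether that single site actually lies between the promoter and the terminator still has to be confirmed from the vector map.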
58. A method for stably modulating expression of a plant gene, the method comprising:
(a) transforming a plurality of plant cells with a vector comprising a nucleic acid sequence encoding a microRNA (miRNA) operatively linked to a promoter and a transcription termination sequence;
(b) growing the plant cells under conditions sufficient to select for a plurality of transformed plant cells that have integrated the vector into their genomes;
(c) screening the plurality of transformed plant cells for expression of the miRNA encoded by the vector;
(d) selecting a transformed plant cell that expresses the miRNA; and
(e) regenerating the plant from the transformed plant cell that expresses the miRNA, whereby expression of the plant gene is stably modulated.
59. The method of claim 58, wherein the nucleic acid sequence encoding the microRNA (miRNA) comprises:
(a) a sense region;
(b) an antisense region; and
(c) a loop region, wherein the sense, antisense, and loop regions are positioned in relation to each other such that upon transcription, a resulting RNA transcript is capable of forming a hairpin structure via intramolecular hybridization of the sense strand and the antisense strand.
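The arrangement recited in claim 59 (sense region, loop region, antisense region, with the two arms able to pair intramolecularly) can be sketched as a simple string construction. This is a minimal illustration under assumptions: the sense-loop-antisense order, the function names, and any example loop sequence are hypothetical choices for the sketch, not sequences or designs from the application.

```python
# Illustrative sketch only: names and the sense-loop-antisense order
# are assumptions made for this example.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T)."""
    return "".join(PAIRS[base] for base in reversed(dna))


def build_hairpin_insert(sense, loop):
    """Assemble sense + loop + antisense as a single DNA insert.

    The antisense arm is the reverse complement of the sense arm, so
    the transcript can fold back on itself with the loop unpaired.
    """
    sense = sense.upper()
    loop = loop.upper()
    return sense + loop + reverse_complement(sense)


def arms_can_pair(insert, arm_length):
    """Check that the last arm_length bases pair with the first ones."""
    five_prime_arm = insert[:arm_length]
    three_prime_arm = insert[-arm_length:]
    return reverse_complement(three_prime_arm) == five_prime_arm
```

For example, `build_hairpin_insert(sense, loop)` followed by `arms_can_pair(insert, len(sense))` confirms that the two arms are fully complementary; real precursors typically contain bulges and mismatches in the stem, which this sketch does not model.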
60. The method of claim 58, wherein the vector is an Agrobacterium binary vector that comprises a nucleic acid encoding a selectable marker operatively linked to a promoter.
61. The method of claim 58, wherein the nucleic acid sequence encoding the miRNA comprises a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712, and sequences at least 70% identical to any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
62. The method of claim 58, wherein the plant gene comprises a nucleotide sequence selected from the group consisting of any of SEQ ID NOs: 60-156, 1296-1375, and 1713-1748, and nucleotide sequences at least 80% identical to any of SEQ ID NOs: 60-156, 1296-1375, and 1713-1748.
63. An isolated microRNA (miRNA) comprising a nucleotide sequence of one of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712, and sequences at least 70% identical to any of SEQ ID NOs: 1-59, 1247-1295, and 1662-1712.
64. The isolated microRNA (miRNA) of claim 63, wherein the miRNA modulates expression of a gene expressed in a tree of the genus Populus.
65. The isolated microRNA (miRNA) of claim 64, wherein the tree is a Populus trichocarpa tree.
66. The isolated microRNA (miRNA) of claim 63, wherein the miRNA modulates expression of a gene expressed in a tree of the genus Pinus.
67. The isolated microRNA (miRNA) of claim 66, wherein the tree is a Pinus taeda tree.
PCT/US2005/033879 2004-09-20 2005-09-20 Micrornas (mirnas) for plant growth and development WO2006034368A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US61129004P 2004-09-20 2004-09-20
US60/611,290 2004-09-20

Publications (2)

Publication Number Publication Date
WO2006034368A2 (en) 2006-03-30
WO2006034368A3 WO2006034368A3 (en) 2007-02-01

Family

ID=36090658

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/033879 WO2006034368A2 (en) 2004-09-20 2005-09-20 Micrornas (mirnas) for plant growth and development

Country Status (2)

Country Link
US (1) US20060236427A1 (en)
WO (1) WO2006034368A2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150225781A1 (en) * 2012-08-22 2015-08-13 Seoulin Bioscience Co., Ltd. Silver nanocluster probe and target polynucleotide detection method using same, and silver nanocluster probe design method
WO2018206535A1 (en) * 2017-05-08 2018-11-15 Novozymes A/S Carbohydrate-binding domain and polynucleotides encoding the same
CN112063631A (en) * 2020-09-17 2020-12-11 东北林业大学 PtrLBD4-3 gene of populus trichocarpa as well as encoding protein and application thereof
CN113906138A (en) * 2019-03-14 2022-01-07 热带生物科学英国有限公司 Introduction of silencing activity into multiple functionally deregulated RNA molecules and modification of their specificity for a gene of interest

Families Citing this family (13)

Publication number Priority date Publication date Assignee Title
US20060200878A1 (en) 2004-12-21 2006-09-07 Linda Lutfiyya Recombinant DNA constructs and methods for controlling gene expression
US8192938B2 (en) * 2005-02-24 2012-06-05 The Ohio State University Methods for quantifying microRNA precursors
WO2008063203A2 (en) * 2006-01-27 2008-05-29 Whitehead Institute For Biomedical Research Compositions and methods for efficient gene silencing in plants
EP2985353A1 (en) * 2006-10-12 2016-02-17 Monsanto Technology LLC Plant micrornas and methods of use thereof
EP2202314B1 (en) * 2007-01-15 2014-03-12 BASF Plant Science GmbH Use of subtilisin (RNR9) polynucleotides for achieving a pathogen resistance in plants
US20110145951A1 (en) * 2007-02-21 2011-06-16 Nagarjuna Energy Private Limited Transgenic sweet sorghum with altered lignin composition and process of preparation thereof
US20080235820A1 (en) * 2007-03-23 2008-09-25 Board Of Trustees Of Michigan State University Lignin reduction and cellulose increase in crop biomass via genetic engineering
CN101921766A (en) * 2010-06-29 2010-12-22 清华大学 miRNA-1425 in response to H2O2 and application thereof
US9334505B2 (en) * 2011-08-12 2016-05-10 Purdue Research Foundation Using corngrass1 to engineer poplar as a bioenergy crop
GB201501941D0 (en) 2015-02-05 2015-03-25 British American Tobacco Co Method
CN112011522A (en) * 2019-05-30 2020-12-01 中国农业大学 MxRHEL protein and novel application of encoding gene thereof
US11926833B2 (en) 2022-01-25 2024-03-12 Living Carbon PBC Compositions and methods for enhancing biomass productivity in plants
CN117683103B (en) * 2023-11-24 2024-05-14 南京林业大学 Small peptide miPEP i and application thereof in plant tissue culture

Citations (1)

Publication number Priority date Publication date Assignee Title
US20040268441A1 (en) * 2002-07-19 2004-12-30 University Of South Carolina Compositions and methods for the modulation of gene expression in plants

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20050166289A1 (en) * 2003-12-01 2005-07-28 North Carolina State University Small interfering RNA (siRNA)-mediated heritable gene manipulation in plants

Non-Patent Citations (2)

Title
BARTEL B. ET AL.: 'MicroRNAs: At the root of plant development?' PLANT PHYSIOL. vol. 132, June 2003, pages 709 - 717, XP002978691 *
KASSCHAU K.D. ET AL.: 'P1/HC-Pro, a viral suppressor of RNA silencing, interferes with Arabidopsis development and miRNA function' DEVELOPMENTAL CELL vol. 4, no. 2, February 2003, pages 205 - 217, XP009037925 *

Cited By (5)

Publication number Priority date Publication date Assignee Title
US20150225781A1 (en) * 2012-08-22 2015-08-13 Seoulin Bioscience Co., Ltd. Silver nanocluster probe and target polynucleotide detection method using same, and silver nanocluster probe design method
US9777313B2 (en) * 2012-08-22 2017-10-03 Seoulin Bioscience Co., Ltd. Silver nanocluster probe and target polynucleotide detection method using same, and silver nanocluster probe design method
WO2018206535A1 (en) * 2017-05-08 2018-11-15 Novozymes A/S Carbohydrate-binding domain and polynucleotides encoding the same
CN113906138A (en) * 2019-03-14 2022-01-07 热带生物科学英国有限公司 Introduction of silencing activity into multiple functionally deregulated RNA molecules and modification of their specificity for a gene of interest
CN112063631A (en) * 2020-09-17 2020-12-11 东北林业大学 PtrLBD4-3 gene of populus trichocarpa as well as encoding protein and application thereof

Also Published As

Publication number Publication date
WO2006034368A3 (en) 2007-02-01
US20060236427A1 (en) 2006-10-19

Similar Documents

Publication Publication Date Title
US20060236427A1 (en) MicroRNAs (miRNAs) for plant growth and development
US11214812B2 (en) Cotton plant with seed-specific reduction in gossypol
Wang et al. MiR397b regulates both lignin content and seed number in Arabidopsis via modulating a laccase involved in lignin biosynthesis
EP1838144B1 (en) Method to trigger rna interference
EP1809748B1 (en) Micrornas
US20070130653A1 (en) Methods and compositions for gene silencing
US9163233B2 (en) Compositions and methods for modulating expression of gene products
WO2008118394A1 (en) Methods of affecting nitrogen assimilation in plants
AU2011273004A1 (en) Novel microRNA precursor and use thereof in regulation of target gene expression
US20110004958A1 (en) Compositions for silencing the expression of gibberellin 2-oxidase and uses thereof
WO2014151749A1 (en) Maize microrna sequences and targets thereof for agronomic traits
US20050166289A1 (en) Small interfering RNA (siRNA)-mediated heritable gene manipulation in plants
KR101595866B1 (en) miR399f precusor DNA from rice increasing drought stress tolerance of plant and uses thereof
CN104829699A (en) Plant adverse resistance associated protein Gshdz4 and coding gene and application thereof
Rathore et al. Cotton plant with seed-specific reduction in gossypol
WO2019080727A1 (en) Lodging resistance in plants
CA2832639A1 (en) Methods and compositions for silencing gene families using artificial micrornas
MX2014004958A (en) Methods and compositions for silencing genes using artificial micrornas.
KR101666207B1 (en) MicroRNA precusor DNA from rice increasing drought stress tolerance of plant and uses thereof
US20070118926A1 (en) Engineering broad spectrum virus disease resistance in plants based on the regulation of expression of the RNA dependant RNA polymerase 6 gene
WO2009104181A1 (en) Plants having genetically modified lignin content and methods of producing same
WO2012112518A1 (en) Interfering rnas that promote root growth
WO2013067259A2 (en) Regulatory nucleic acids and methods of use
ZA200703832B (en) Micrornas

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase