WO2014004638A2 - Methods and compositions for enhancing gene expression - Google Patents

Publication number
WO2014004638A2
Authority
WO
WIPO (PCT)
Prior art keywords
organism
intron
gene
expression
utr
Application number
PCT/US2013/047837
Other languages
French (fr)
Other versions
WO2014004638A3 (en
Inventor
Tedd D. Elich
Original Assignee
Monsanto Technology Llc
Application filed by Monsanto Technology Llc
Publication of WO2014004638A2
Publication of WO2014004638A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells

Definitions

  • sequence listing is submitted electronically via EFS-Web as an ASCII formatted sequence listing with a file named 435025SeqLst.txt, created on June 26, 2013, and having a size of 93.7 kilobytes, and is filed concurrently with the specification.
  • the sequence listing contained in this ASCII formatted document is part of the specification and is herein incorporated by reference in its entirety.
  • transgenic cells and organisms comprising a heterologous gene sequence are now routinely practiced by molecular biologists. Methods for incorporating an isolated gene sequence into an expression cassette, producing transformation vectors, and transforming many types of cells and organisms are well known.
  • the regulation or control of expression of the heterologous gene and the protein encoded by the gene can often be critical in the development of a transgenic organism for commercial use. For example, in transgenic plants comprising a heterologous gene that confers tolerance to herbicide that is normally toxic to the plant, it can be critical to have the heterologous gene expressed in a temporal and spatial manner that corresponds to when the plant is exposed to the herbicide and to what parts of the plant the herbicide normally exerts its phytotoxic effect.
  • a number of genetic regulatory elements are known to play a role in regulating the expression of a gene in plants and other organisms including, for example, promoters, 5'-untranslated regions (UTRs), 3'-untranslated regions, and expression-enhancing introns.
  • Silencing of transgenes previously showing stable expression can also be triggered 'de novo' when a new transgene is added by crossing or re-transformation if, for example, the same promoter has been used in both transgenes in an effort to promote coordinated expression (Halpin (2005) Plant Biotech. J. 3:141-155).
  • the use of the same promoter in multiple transgenes in a single plant is due to the lack of more than one promoter that gives the desired pattern and level of expression.
  • the Cauliflower mosaic virus (CaMV) 35S promoter is frequently used as the promoter in plant transgenes because it provides for high-level constitutive expression of an operably linked gene of interest.
  • the CaMV 35S promoter is often used to drive the high-level constitutive expression of two or more transgenes in the same plant.
  • additional promoters and other genetic regulatory elements are needed to avoid gene silencing that might be caused by the use of a particular genetic regulatory element more than once when two, three, four, or more transgenes are stacked in a single crop plant.
  • a common approach for identifying additional promoters that can be used to drive high-level, constitutive expression of an operably linked heterologous nucleotide sequence in plants involves screening plants to identify plant genes that display the high-level constitutive expression across most tissue and/or cell types.
  • Methods are provided for making an expression construct for enhancing gene expression in an organism.
  • the methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism.
  • the first intron is the first intron from the 5' end of the transcribed region of a gene.
  • the methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism.
  • the methods further comprise synthesizing an expression construct comprising the promoter operably linked to a polynucleotide.
  • the polynucleotide comprises a 5'-untranslated region (5'-UTR), the first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron.
  • the expression construct provides for enhanced expression of the operably linked polynucleotide in a target organism when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron.
  • the methods comprise introducing into at least one cell of a target organism an expression construct comprising a promoter operably linked to a polynucleotide.
  • the polynucleotide comprises a 5'-UTR, a first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron, which is derived from a native gene that is highly expressed in a constitutive manner in an organism.
  • the promoter can be derived from the native gene or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism.
  • the methods can further comprise regenerating a target organism from at least one cell comprising the expression construct.
  • the target organism that is produced by the methods of the present invention is capable of expressing the polynucleotide when the target organism or cell thereof is exposed to conditions favorable for the expression of the polynucleotide.
  • the target organism is capable of enhanced expression of the polynucleotide when compared to the expression in the target organism of the polynucleotide from a control expression construct which lacks the first intron.
  • the methods comprise obtaining a target organism comprising an expression construct or at least one cell thereof.
  • the expression construct comprises a promoter operably linked to a polynucleotide and exposing the target organism or cell thereof to conditions favorable for the expression of the polynucleotide for a sufficient period of time, whereby the polynucleotide is expressed.
  • the polynucleotide comprises a 5'-UTR, a first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron.
  • the first intron is derived from a native gene that is highly expressed in a constitutive manner in an organism.
  • the promoter can be derived from the native gene or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism.
  • expression of the polynucleotide is increased in a target organism comprising the expression construct or in at least one cell thereof, when compared to the expression of the polynucleotide in the target organism comprising a control expression construct which lacks the first intron or in at least one cell thereof.
  • the expression level of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
  • Methods are provided for making a regulatory construct.
  • the methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism.
  • the first intron is the first intron from the 5' end of the gene, if the gene contains more than one intron.
  • the methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism.
  • the methods further comprise synthesizing a regulatory construct comprising the promoter operably linked to a 5'-UTR, which comprises the first intron.
  • the first intron is at or near the 3' end of the 5'-UTR.
  • the regulatory construct provides for enhanced expression of an operably linked gene of interest in a target organism when compared to the expression of the gene of interest in the target organism from a control regulatory construct which lacks the first intron.
  • the methods can further comprise operably linking a gene of interest to the regulatory construct for expression of the gene of interest in a target organism.
  • Nucleic acid molecules comprising the expression constructs and regulatory constructs of the present invention are provided. Additionally provided are organisms and host cells comprising the expression constructs and regulatory constructs of the present invention. In one embodiment of the invention, the organisms and host cells include, for example, plants, seeds, plant parts, and plant cells comprising at least one expression construct and/or at least one regulatory construct of the present invention.
  • nucleotide sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases.
  • the nucleotide sequences follow the standard convention of beginning at the 5' end of the sequence and proceeding forward (i.e., from left to right in each line) to the 3' end. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood to be included by any reference to the displayed strand.
  • SEQ ID NO: 1 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G13440.
  • the first intron is located at nucleotide positions 1111 to 1203.
  • the start of transcription is at nucleotide 955. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1108, 1109, 1110, 1204, and 1205.
  • SEQ ID NO: 2 sets forth the nucleotide sequence of SEQ ID NO: 1 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 3 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G22840.
  • the first intron is located at nucleotide positions 296 to 774.
  • the start of transcription is at nucleotide 196. Additional nucleotides added to make a consensus splice site are at nucleotide positions 293, 294, 295, 775, and 776.
  • SEQ ID NO: 4 sets forth the nucleotide sequence of SEQ ID NO: 3 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 5 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G52300.
  • the first intron is located at nucleotide positions 1100 to 1201.
  • the start of transcription is at nucleotide 1017. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1099, 1202, and 1203.
  • SEQ ID NO: 6 sets forth the nucleotide sequence of SEQ ID NO: 5 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 7 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT4G37830.
  • the first intron is located at nucleotide positions 861 to 1203.
  • the start of transcription is at nucleotide 786. Additional nucleotides added to make a consensus splice site are at nucleotide positions 858, 859, 860, 1204, and 1205.
  • SEQ ID NO: 8 sets forth the nucleotide sequence of SEQ ID NO: 7 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 9 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G51650.
  • the first intron is located at nucleotide positions 819 to 1567.
  • the start of transcription is at nucleotide 751. Additional nucleotides added to make a consensus splice site are at nucleotide positions 816, 817, 818, 1568, and 1569.
  • SEQ ID NO: 10 sets forth the nucleotide sequence of SEQ ID NO: 9 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 11 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT3G48140.
  • the first intron is located at nucleotide positions 1045 to 1201.
  • the start of transcription is at nucleotide 929. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1044, 1202, and 1203.
  • SEQ ID NO: 12 sets forth the nucleotide sequence of SEQ ID NO: 11 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 13 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G02780.
  • the first intron is located at nucleotide positions 1003 to 1343.
  • the start of transcription is at nucleotide 926. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1000, 1001, 1002, 1344, and 1345.
  • SEQ ID NO: 14 sets forth the nucleotide sequence of SEQ ID NO: 13 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 15 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT3G01280.
  • the first intron is located at nucleotide positions 604 to 1102.
  • the start of transcription is at nucleotide 448. Additional nucleotides added to make a consensus splice site are at nucleotide positions 601, 602, 603, 1103, and 1104.
  • SEQ ID NO: 16 sets forth the nucleotide sequence of SEQ ID NO: 15 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 17 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G67430.
  • the first intron is located at nucleotide positions 1783 to 1891.
  • the start of transcription is at nucleotide 1730. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1780, 1781, 1782, 1892, and 1893.
  • SEQ ID NO: 18 sets forth the nucleotide sequence of SEQ ID NO: 17 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 19 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G76200. The first intron is located at nucleotide positions 758 to 1073. The start of transcription is at nucleotide 654. Additional nucleotides added to make a consensus splice site are at nucleotide positions 755, 756, 757, 1074, and 1075.
  • SEQ ID NO: 20 sets forth the nucleotide sequence of SEQ ID NO: 19 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 21 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT2G31490.
  • the first intron is located at nucleotide positions 704 to 1430.
  • the start of transcription is at nucleotide 624. Additional nucleotides added to make a consensus splice site are at nucleotide positions 701, 702, 703, 1431, and 1432.
  • SEQ ID NO: 22 sets forth the nucleotide sequence of SEQ ID NO: 21 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 23 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT5G08690.
  • the first intron is located at nucleotide positions 776 to 1077.
  • the start of transcription is at nucleotide 747. Additional nucleotides added to make a consensus splice site are at nucleotide positions 773, 774, 775, 1078, and 1079.
  • SEQ ID NO: 24 sets forth the nucleotide sequence of SEQ ID NO: 23 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 25 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G07600.
  • the first intron is located at nucleotide positions 1504 to 1783.
  • the start of transcription is at nucleotide 1501. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1502, 1503, 1784, and 1785.
  • SEQ ID NO: 26 sets forth the nucleotide sequence of SEQ ID NO: 25 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 27 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G78380.
  • the first intron is located at nucleotide positions 1504 to 2004.
  • the start of transcription is at nucleotide 1414. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2005, and 2006.
  • SEQ ID NO: 28 sets forth the nucleotide sequence of SEQ ID NO: 27 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 29 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT2G33040.
  • the first intron is located at nucleotide positions 552 to 952.
  • the start of transcription is at nucleotide 415. Additional nucleotides added to make a consensus splice site are at nucleotide positions 551, 953, and 954.
  • SEQ ID NO: 30 sets forth the nucleotide sequence of SEQ ID NO: 29 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 31 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os03g21940.
  • the first intron is located at nucleotide positions 1504 to 2482.
  • the start of transcription is at nucleotide 1406. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2483, and 2484.
  • SEQ ID NO: 32 sets forth the nucleotide sequence of SEQ ID NO: 31 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 33 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g45950.
  • the first intron is located at nucleotide positions 1504 to 1656.
  • the start of transcription is at nucleotide 1419. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1657, and 1658.
  • SEQ ID NO: 34 sets forth the nucleotide sequence of SEQ ID NO: 33 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 35 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os11g47760.
  • the first intron is located at nucleotide positions 729 to 2633.
  • the start of transcription is at nucleotide 638. Additional nucleotides added to make a consensus splice site are at nucleotide positions 726, 727, 728, 2634, and 2635.
  • SEQ ID NO: 36 sets forth the nucleotide sequence of SEQ ID NO: 35 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 37 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os02g02130. The first intron is located at nucleotide positions 1504 to 1586. The start of transcription is at nucleotide 1501. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1502, 1503, 1587, and 1588.
  • SEQ ID NO: 38 sets forth the nucleotide sequence of SEQ ID NO: 37 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 39 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os03g56190.
  • the first intron is located at nucleotide positions 1504 to 1615.
  • the start of transcription is at nucleotide 1437. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1616, and 1617.
  • SEQ ID NO: 40 sets forth the nucleotide sequence of SEQ ID NO: 39 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 41 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g47980.
  • the first intron is located at nucleotide positions 940 to 1553.
  • the start of transcription is at nucleotide 829. Additional nucleotides added to make a consensus splice site are at nucleotide positions 937, 938, 939, 1554, and 1555.
  • SEQ ID NO: 42 sets forth the nucleotide sequence of SEQ ID NO: 41 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 43 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os01g46610.
  • the first intron is located at nucleotide positions 1504 to 2228.
  • the start of transcription is at nucleotide 1384. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2229, and 2230.
  • SEQ ID NO: 44 sets forth the nucleotide sequence of SEQ ID NO: 43 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 45 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os04g28180.
  • the first intron is located at nucleotide positions 1504 to 1646.
  • the start of transcription is at nucleotide 1399. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1647, and 1648.
  • SEQ ID NO: 46 sets forth the nucleotide sequence of SEQ ID NO: 45 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 47 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g01820.
  • the first intron is located at nucleotide positions 1504 to 2453.
  • the start of transcription is at nucleotide 1229. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2454, and 2455.
  • SEQ ID NO: 48 sets forth the nucleotide sequence of SEQ ID NO: 47 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 49 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os11g11390.
  • the first intron is located at nucleotide positions 1504 to 2798.
  • the start of transcription is at nucleotide 1431. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2799, and 2780.
  • SEQ ID NO: 50 sets forth the nucleotide sequence of SEQ ID NO: 49 without the first intron and any nucleotides that were added to form a consensus splice site.
  • SEQ ID NO: 51 sets forth the nucleotide sequence of the regulatory construct comprising the promoter from gene accession AT4G37830 and the first intron from gene accession AT1G52300.
  • the first intron is located at nucleotide positions 861 to 960.
  • the start of transcription is at nucleotide 786.
  • SEQ ID NO: 52 sets forth the nucleotide sequence of the regulatory construct comprising the promoter from gene accession AT1G52300 and the first intron from gene accession AT4G37830.
  • the first intron is located at nucleotide positions 1100 to 1442.
  • the start of transcription is at nucleotide 1017.
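The coordinate convention used in the sequence descriptions above (1-based, inclusive nucleotide positions for the first intron and for the nucleotides added to form a consensus splice site) can be illustrated with a minimal Python sketch. The helper name `strip_intron` and the toy sequence are hypothetical and are not taken from the sequence listing; the sketch only shows how an intron-less variant (for example, SEQ ID NO: 2 relative to SEQ ID NO: 1) relates positionally to its parent construct.

```python
def strip_intron(seq, intron_start, intron_end, added_positions=()):
    """Return seq without the intron and without the added splice-site nucleotides.

    All positions are 1-based and inclusive, matching the sequence descriptions.
    This is an illustrative helper, not a method described in the source.
    """
    drop = set(range(intron_start, intron_end + 1)) | set(added_positions)
    return "".join(base for pos, base in enumerate(seq, start=1) if pos not in drop)


# Toy example (hypothetical sequence, NOT from the sequence listing):
# positions 1-4 flank, 5-6 added, 7-12 intron, 13 added, 14-17 flank.
toy = "ACGT" + "AG" + "gtaaag" + "G" + "TTCA"
print(strip_intron(toy, 7, 12, added_positions=(5, 6, 13)))  # ACGTTTCA

# Intron length implied by the coordinates stated for SEQ ID NO: 1 (1111 to 1203).
print(1203 - 1111 + 1)  # 93
```

With inclusive coordinates, the length of an intron is its end position minus its start position plus one, which is why the stated range for SEQ ID NO: 1 corresponds to 93 nucleotides.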
  • FIG. 1 is a graphical representation of root expression enhancement of Arabidopsis promoters by cognate first introns.
  • Expression constructs comprising a promoter, a 5'-UTR comprising the cognate first intron, and a translated region comprising the coding sequence of a reporter gene were tested for expression in Arabidopsis roots and compared to a control expression construct lacking the first intron (i.e., - intron variant).
  • Average intron-mediated enhancement is expressed on the y-axis as 2^n-fold enhancement (e.g., 2^2 and 2^4 stand for 4-fold and 16-fold expression enhancement, respectively).
  • the individual promoters used are listed below the x-axis.
  • FIG. 2 is a graphical representation of expression enhancement of rice promoters by cognate first introns.
  • Expression constructs comprising a promoter, a 5'-UTR comprising the cognate first intron, and a translated region comprising the coding sequence of a reporter gene were tested for expression in corn and compared to control expression constructs lacking the first intron.
  • Average intron-mediated enhancement (IME) is expressed on the y-axis as 2^n-fold enhancement (e.g., 2^2 and 2^4 stand for 4-fold and 16-fold expression enhancement, respectively).
  • the individual promoters used are listed below the x-axis.
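The power-of-two scale described for the y-axis of FIGS. 1 and 2 is a base-2 logarithmic scale. As a minimal sketch, assuming Python and hypothetical function names, the conversion between the exponent and the fold enhancement can be written as follows; the reporter values in the last line are illustrative placeholders, not data from the figures.

```python
import math


def fold_from_exponent(n):
    """Convert an exponent n on the power-of-two axis into a fold enhancement."""
    return 2.0 ** n


def exponent_from_levels(with_intron, without_intron):
    """Exponent n such that with_intron / without_intron == 2 ** n."""
    return math.log2(with_intron / without_intron)


print(fold_from_exponent(2))  # 4.0
print(fold_from_exponent(4))  # 16.0

# Illustrative reporter readings only (not measured values from the source).
print(exponent_from_levels(800.0, 100.0))  # 3.0
```

On such a scale, each unit step corresponds to a doubling of expression relative to the control construct that lacks the first intron.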
  • "an element" means one or more elements.
  • the word "comprise,” or variations such as “comprises” or “comprising,” will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
  • expression construct refers to a recombinant DNA or nucleic acid, which comprises in a 5'-to-3' order and in operable linkage a promoter, a 5'-untranslated region (5'-UTR), and a translated region, wherein the 5'-UTR comprises a first intron from a native gene of an organism.
  • the transcribed region of the expression construct comprises the 5'-UTR, the first intron, and the translated region.
  • the expression constructs of the present invention can further comprise one or more each of one or more of the following elements: an enhancer, an additional intron, a 3'-untranslated region, a transcriptional terminator, and a chromatin control element.
  • regulatory construct refers to a recombinant DNA or nucleic acid, which comprises in a 5'-to-3' order and in operable linkage a promoter and a 5'-UTR, wherein the 5'-UTR comprises a first intron from a native gene of an organism.
  • the regulatory constructs of the present invention can further comprise one or more each of one or more of the following elements: an enhancer, an additional intron, a translated region or coding sequence, a 3'-untranslated region, a transcriptional terminator, and a chromatin control element.
  • RNA transcript of a gene or an expression construct or a regulatory construct of the present invention also comprises a 5'-UTR.
  • any introns that occur in a 5'-UTR of a gene or expression construct or regulatory construct are not found in the corresponding mature RNA transcript produced in vivo as such introns are typically spliced out by the host organism or cell thereof, unless the intron is non-functional in the host organism or the host organism is incompetent for splicing out such introns.
  • first intron refers to the first intron from the 5' end of a native gene of an organism.
  • the first intron can be found within the 5 '-UTR or the translated region of the native gene.
  • the first intron is between the first protein coding exon and the second protein coding exon.
  • typically the 5' end of a first intron that is capable of enhancing expression as disclosed herein is within about the first 1000 base pairs (bp) after the transcriptional start site (in a 5' to 3' direction) and is preferably within about the first 500 bp after the transcription start site.
  • an “expression-enhancing intron” or “enhancing intron” is an intron that is capable of causing an increase in the expression of a gene or polynucleotide to which it is operably linked.
  • a “first intron” of the present invention is an expression-enhancing intron. While the present invention is not known to depend on a particular biological mechanism, it is believed that the expression-enhancing introns of the present invention enhance expression through intron-mediated enhancement (IME). It is recognized that naturally occurring introns that enhance expression through IME are typically found within 1 Kb of the transcription start site of their native genes (see, Rose et al. (2008) Plant Cell 20:543-551).
  • Such introns are usually the first intron, whether the first intron is in the 5'-UTR or the coding sequence, and are in a transcribed region.
  • Introns that enhance expression solely through IME do not enhance gene expression when they are inserted into a non-transcribed region of a gene, such as, for example, a promoter. That is, they do not function as transcriptional enhancers.
  • the first introns of the present invention are capable of enhancing gene expression when they are found in a transcribed region of a gene but not when they occur in a non-transcribed region such as, for example, a promoter.
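The positional guideline given above for a first intron (its 5' end typically within about the first 1000 bp after the transcription start site, and preferably within about the first 500 bp) can be checked against the coordinates stated for the regulatory constructs. The following Python sketch uses hypothetical helper names, and because the text qualifies both distances with "about", the hard cutoffs are a simplifying assumption for illustration only.

```python
def intron_offset_from_tss(tss, intron_start):
    """Distance in nucleotides from the transcription start site to the intron's 5' end."""
    return intron_start - tss


def classify_offset(offset, typical=1000, preferred=500):
    """Label an offset using the approximate 1000 bp and 500 bp guideline values.

    The exact thresholds are an assumption; the source says "about".
    """
    if offset < 0:
        raise ValueError("intron start lies upstream of the transcription start site")
    if offset <= preferred:
        return "within about 500 bp"
    if offset <= typical:
        return "within about 1000 bp"
    return "beyond about 1000 bp"


# Coordinates as stated for SEQ ID NO: 1 (start of transcription 955, intron from 1111).
offset = intron_offset_from_tss(955, 1111)
print(offset, classify_offset(offset))  # 156 within about 500 bp

# Coordinates as stated for SEQ ID NO: 47 (start of transcription 1229, intron from 1504).
offset = intron_offset_from_tss(1229, 1504)
print(offset, classify_offset(offset))  # 275 within about 500 bp
```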
  • the term "translated region” refers to the portion of a gene or expression construct of the present invention or its corresponding RNA transcript that encodes a polypeptide or protein of interest.
  • the translated region comprises the start codon (e.g., ATG) for translation through the last codon of the protein or polypeptide encoded thereby. It is recognized that the translated region of a gene or expression construct can comprise one or more introns.
  • any introns that occur in the translated region of a gene or expression construct of the present invention are not typically found in the corresponding mature RNA transcript produced in vivo as such introns are normally spliced out by the host organism or cell thereof unless the intron or introns are non-functional and/or the host organism is incompetent for splicing out such introns.
  • native gene refers to a gene that is part of a natural genome of an organism and that was not introduced into the organism or a progenitor thereof by artificial means that do not involve the transfer of genes from one organism to another organism by sexual reproduction.
  • Such artificial means include, for example, any methods involving the introduction of recombinant DNA or other recombinant nucleic acid molecules into the organism or a progenitor thereof.
  • a gene that is introduced into a progenitor of an organism by artificial means does not become a native gene when it is transferred from the progenitor to the organism via sexual reproduction.
  • recombinant DNA refers to DNA and other recombinant nucleic acid molecules that are an artificial or non-naturally occurring combination of nucleic acid fragments, e.g., regulatory and coding sequences that are not all found together in the same form in nature.
  • recombinant nucleic acid molecules may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature.
  • an "expression construct” and a "regulatory construct” each comprise recombinant DNA.
  • enhancing gene expression is intended to mean enhancing or increasing the expression of a gene or its gene product, particularly a protein or polypeptide.
  • Gene expression can be determined by monitoring the formation of a transcript of a polynucleotide or gene of interest of the present invention, a protein encoded by the transcript, or even an activity or function of the encoded protein. In preferred embodiments of the present invention, gene expression is determined by monitoring the level of a protein encoded by the gene or the activity or function of the encoded protein.
  • the expression of a polynucleotide or gene of interest of the present invention can be assessed in an organism or at least one cell thereof by determining the level of the protein encoded by the translated region of the polynucleotide or gene of interest or the activity or function of the encoded protein.
  • the polynucleotide or gene of interest comprises a translated region which encodes green fluorescent protein (GFP), and expression of the polynucleotide or gene of interest can be determined by measuring green fluorescence emitted from the GFP protein when it is exposed to blue light.
  • the polynucleotide or gene of interest comprises a translated region which encodes β-glucuronidase (GUS) and expression of the polynucleotide can be determined by measuring GUS activity using the MUG fluorometric assay.
  • a "promoter” refers to a nucleic acid that is capable of controlling the expression of an operably linked coding sequence or other sequence encoding an RNA.
  • the promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, nucleic acid fragments of some variation may have identical promoter activity.
  • an “enhancer” is a DNA sequence that can stimulate promoter activity, and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Enhancers may be found in both non-transcribed and transcribed regions of a gene. Typically, the promoter stimulating activity of an enhancer is insensitive with respect to the position and orientation (i.e., can be inverted) of the enhancer within a gene.
  • Promoters that cause an operably linked gene or polynucleotide to be expressed in most cell types of an organism and at most times are commonly referred to as “constitutive promoters".
  • the constitutive promoters of the present invention cause an operably linked gene or polynucleotide to be expressed in all or substantially all tissues and stages of development while being minimally responsive to abiotic stimuli.
  • Expression of a gene or polynucleotide in most cell types of an organism and at most times is referred to herein as “constitutive gene expression” or “constitutive expression”, “expressed constitutively”, “expression in a constitutive manner”, or expression in a “constitutive pattern”. It is understood that, for the terms “constitutive promoter” and “constitutive expression”, some variation in absolute levels of expression or activity can exist among different tissues and stages of development of an organism.
  • the present invention provides novel expression constructs comprising a promoter operably linked to a polynucleotide. It is recognized that nucleic acid molecules comprising such novel expression constructs can be synthesized or produced using a number of methods known in the art. As used herein, “synthesizing an expression construct” or “producing an expression construct” are interchangeable terms that are intended to mean the making of an expression construct by any known method including, but not limited to, chemical synthesis of the entire nucleic acid molecule or part or parts thereof, modification of a pre-existing nucleic acid molecule by molecular biology methods such as, for example, restriction endonuclease digestion, DNA amplification by polymerase and ligation, and the combination of chemical synthesis and modification.
  • progeny comprises any subsequent generation of an organism or a host cell, whether the result of sexual reproduction or asexual reproduction.
  • a progeny of the present invention is made by the methods of the present invention and/or comprises an expression construct of the present invention.
  • progenitor or “progenitor organism” refers to an ancestor of an organism or host cell.
  • methods are described that can involve the use of an organism or cell comprising an expression construct of the present invention wherein the organism or cell is descended from a progenitor into which the expression construct was introduced.
  • the expression construct was stably introduced into the genome of the progenitor by, for example, a stable transformation method described herein or otherwise known in the art.
  • an “organism” refers to any life form that has genetic material comprising nucleic acids including, but not limited to, prokaryotes, eukaryotes, and viruses.
  • Organisms include, for example, plants, animals, fungi, bacteria, and viruses, and cells and parts thereof.
  • Preferred organisms of the present invention are eukaryotic organisms, including, for example, plants, animals, fungi, and protists.
  • a "target organism” is the organism into which an expression construct of the present invention is introduced, particularly for the purpose of expressing the protein encoded by the translated region of the expression construct.
  • by “gene of interest” is intended any nucleotide sequence that can be expressed when operably linked to a promoter or a regulatory construct of the present invention.
  • a gene of interest of the present invention may, but need not, encode a protein.
  • a translated region of the present invention can be a gene of interest.
  • the gene of interest does not by itself comprise a functional promoter.
  • the gene of interest does not comprise a full-length 5'-UTR. More preferably, the gene of interest is a translated region.
  • heterologous gene is any nucleic acid molecule or polynucleotide that is expressed from a nucleotide construct of the present invention.
  • a heterologous gene can comprise a nucleotide sequence that is native or endogenous to an organism or can be foreign.
  • while the present invention does not depend on a particular method of determining if the expression construct of the present invention is capable of enhancing gene expression in a target organism, typically gene expression is determined by transforming the target organism or at least one cell thereof with a polynucleotide construct comprising the expression construct.
  • the expression construct can further comprise additional genetic regulatory elements, if desired or necessary for expression in the translated region in the organism or at least one cell thereof.
  • determining whether the expression construct is capable of enhancing the expression of an operably linked gene in the desired manner in the target organism or any other organism of interest can depend on any number of factors including, for example, the type of genetic regulatory element (e.g., a promoter, a 5'-untranslated region (UTR), a 3'-untranslated region, an intron, a terminator, a chromatin control element), the presence of additional genetic elements in the construct, the gene of interest to be expressed, the organism or part or cell thereof in which expression is assayed, the expression assay, the detection method (e.g., GFP visible fluorescence, detection of GFP RNA by qPCR), the environmental conditions during the assay, and the like.
  • a "control expression construct” is the same or substantially the same as an expression construct of the present invention but lacks a first intron and can be used as a control in gene expression determinations as disclosed herein.
  • a control expression construct lacks a first intron but otherwise comprises the same promoter, 5'-UTR, and translated region as an expression construct of the present invention.
  • a control expression construct lacks a first intron but otherwise has the same nucleotide sequence as an expression construct of the present invention, except for the missing portion that would correspond to the first intron in the expression construct.
  • a "control regulatory construct” is the same or substantially the same as a regulatory construct of the present invention but lacks a first intron and can be used as a control in gene expression determinations as disclosed herein.
  • a control regulatory construct lacks a first intron but otherwise comprises the same promoter and 5'-UTR as a regulatory construct of the present invention.
  • a control regulatory construct lacks a first intron but otherwise has the same nucleotide sequence as a regulatory construct of the present invention, except for the missing portion that would correspond to the first intron in the regulatory construct.
  • reporter refers to a nucleic acid molecule encoding a detectable marker.
  • Reporter genes include, for example, luciferase (e.g., firefly luciferase or Renilla luciferase), β-galactosidase, β-glucuronidase (GUS), chloramphenicol acetyl transferase (CAT), and a fluorescent protein (e.g., green fluorescent protein (GFP), red fluorescent protein (DsRed), yellow fluorescent protein, blue fluorescent protein, cyan fluorescent protein, or variants thereof, including enhanced variants such as enhanced GFP (eGFP)).
  • Reporter genes are detectable by a reporter assay. Reporter assays can measure the level of reporter gene expression or activity by any number of means, including, for example, measuring the level of reporter mRNA, the level of reporter protein, or the amount of reporter protein activity. Reporter assays are known in the art or otherwise disclosed herein.
  • the present invention provides methods and compositions for enhancing gene expression in organisms, particularly eukaryotic organisms. Such methods and compositions can be used for the expression of polynucleotides, particularly the proteins encoded thereby, constitutively and at a high level in a target organism. Thus, the methods and compositions of the present invention find use in the production of any protein of interest in a eukaryotic organism or cells thereof.
  • the target organisms are plants, particularly monocot and dicot plants, more particularly monocot and dicot plants that are crop plants or that are suitable for the production of a protein of interest when grown in fields, greenhouses and/or controlled-environment facilities.
  • the present invention was made during the course of research related to the discovery and characterization of promoters that can be used to drive the expression of operably linked polynucleotides constitutively and at a high level in plants.
  • promoters are known as strong constitutive promoters.
  • the present inventors discovered that the expression of a polynucleotide can be increased by adding to a polynucleotide construct comprising a constitutive promoter an operably linked intron from the same plant gene as the promoter or an intron from a different plant gene that is also known to be expressed constitutively and at a high level.
  • the present invention provides methods for making an expression construct for enhancing gene expression in an organism.
  • the methods comprise selecting a first intron that is derived from a first gene that is highly expressed in a constitutive manner in a first organism.
  • the first intron is the first intron from the 5' end of the first gene, and the first gene is a gene that is native to the first organism.
  • Such a native gene is part of the natural genome of the first organism and was not introduced into the organism or a progenitor organism by artificial means.
  • the methods further comprise selecting a promoter.
  • the promoter can be selected before, after, or at the same time as, the first intron is selected.
  • the promoter can be a promoter derived from the first gene or a promoter derived from a second gene that is highly expressed in a constitutive manner either in the first organism or in a second organism.
  • the second gene is native to either the first organism or the second organism.
  • the methods further comprise synthesizing an expression construct comprising the promoter operably linked to a polynucleotide, wherein the polynucleotide comprises a 5'-untranslated region (5'-UTR), the first intron, and a translated region, and wherein the 5'-UTR or translated region comprises the first intron.
  • the 5'-UTR or any part thereof can be derived from the native 5'-UTR of the first gene, the second gene, or a different gene, or can be synthetic or artificial.
  • an expression construct made by the methods disclosed herein provides for enhanced or increased expression of the polynucleotide in a target organism, when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron. More preferably, an expression construct made by the methods disclosed herein provides for enhanced or increased expression of the polynucleotide in a target organism without significantly altering the constitutive manner of expression of the
  • an expression construct made by the methods disclosed provides for at least about a 1.25, 1.5, 2, 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, or 100-fold increase in expression of the
  • an expression construct of the present invention can provide for an approximately 2 to 70-fold increase in expression of the polynucleotide in a target organism, when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron.
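The fold-increase comparison against a control expression construct is a ratio of measured expression levels. The sketch below assumes replicate reporter readings (for example, GFP fluorescence in arbitrary units); the function name `fold_increase` and all numeric values are hypothetical and are not data from the specification.

```python
from statistics import mean


def fold_increase(construct_levels, control_levels):
    """Ratio of mean expression from the intron-containing construct to the control mean."""
    control_mean = mean(control_levels)
    if control_mean <= 0:
        raise ValueError("control expression must be positive to compute a ratio")
    return mean(construct_levels) / control_mean


# Hypothetical replicate readings, in arbitrary fluorescence units.
gfp_with_first_intron = [410.0, 390.0, 400.0]
gfp_control_no_intron = [52.0, 48.0, 50.0]

fold = fold_increase(gfp_with_first_intron, gfp_control_no_intron)
in_stated_range = 2.0 <= fold <= 70.0  # the approximate 2 to 70-fold range given above
```

In practice both constructs would be assayed in the same target organism under the same conditions, so that the ratio reflects the presence of the first intron rather than differences in the assay.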
  • the methods of the present invention can involve a first organism and a target organism.
  • the first organism and the target organism can be the same species or different species. In embodiments in which the first organism and the target organism are not the same species, the first organism and the target organism are typically from related species.
  • the first organism and the target organism can be two different plant species, preferably two different monocot or dicot plant species, more preferably two different plant species within the same taxonomic family, most preferably two different plant species within the same genus.
  • the methods involve a first organism, a second organism, and a target organism.
  • the first organism, the second organism, and the target organism can be the same species or two or more different species.
  • when the first organism, the second organism, and the target organism are not all the same species, they are typically from two or more related species.
  • the first organism, the second organism, and the target organism can be three different plant species, preferably three different monocot or dicot plant species, more preferably three different plant species within the same taxonomic family, most preferably three different plant species within the same genus.
  • the expression construct comprises a promoter operably linked to a polynucleotide for transcription of the polynucleotide.
  • a polynucleotide of the present invention comprises a transcribed region.
  • the polynucleotide represents the region of the expression construct that is transcribed so as to produce an RNA molecule or transcript. It is recognized that the initial RNA molecule or transcript that is produced may be further modified in the organism or cell thereof so as to produce a mature RNA transcript. Modifications can include, for example, splicing out one or more introns including, but not limited to, the first intron.
  • the polynucleotide comprises the 5'-UTR, the first intron, and the translated region, and either the 5'-UTR or the translated region comprises the first intron.
  • the first intron is between the first and second exons of the translated region.
  • the 5'-UTR comprises the first intron.
  • the first intron is at or near the 3' end of the 5'-UTR. More preferably, the first intron is at the 3' end of the 5'-UTR immediately before the translational start site.
  • the 3' end of the 5'-UTR is the nucleotide immediately before the first nucleotide of the start codon for translation.
  • the start codon will be ATG.
  • it is recognized that other start codons are known to be used by some organisms and that the present invention does not depend on a particular start codon.
  • non-intron sequences of 5'-UTRs are typically in the range of about 30 bp to about 200 bp, preferably about 50 to about 150 bp, although substantially larger or smaller 5'-UTRs are also encompassed by the present invention.
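The preferred arrangement described above, with the first intron at the 3' end of the 5'-UTR immediately before the start codon, can be sketched as string assembly. The function names and every sequence below are hypothetical toy values chosen for illustration; they are far shorter than real elements and are not sequences from the specification. The control construct is built the same way but omits the intron.

```python
def build_expression_construct(promoter, utr5_exonic, first_intron,
                               translated_region, start_codon="ATG"):
    """Promoter + non-intron 5'-UTR + first intron + translated region, in 5' to 3' order."""
    if not translated_region.upper().startswith(start_codon):
        raise ValueError("translated region must begin with the start codon")
    return promoter + utr5_exonic + first_intron + translated_region


def build_control_construct(promoter, utr5_exonic, translated_region):
    """Same elements as the expression construct, but lacking the first intron."""
    return promoter + utr5_exonic + translated_region


# Toy sequences (assumptions for the example only).
PROMOTER = "TATAAAGGC"
UTR5 = "ACCTTCTCTAGA"
INTRON = "GTAAGTTTTCAG"
CDS = "ATGGCCAAATAA"

construct = build_expression_construct(PROMOTER, UTR5, INTRON, CDS)
control = build_control_construct(PROMOTER, UTR5, CDS)

# Position of the start codon: the intron ends at the base immediately before it.
start_codon_pos = len(PROMOTER) + len(UTR5) + len(INTRON)
```

The two constructs differ only by the intron sequence, which matches the description of a control expression construct as otherwise having the same nucleotide sequence.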
  • the expression constructs and regulatory constructs of the present invention comprise promoters and first introns that are derived from native genes.
  • the 5'-UTR or portion thereof can also be derived from a native gene.
  • the promoters, first introns, and the 5'-UTRs can be identical to or substantially the same as the corresponding element in its native gene. It is recognized that promoters, first introns, and 5'-UTRs of the present invention that are each derived from a native gene can be modified so that their sequences are no longer identical to the corresponding sequences in the native gene.
  • modifications include, for example, the addition of consensus splice sites on one or both ends of an intron, removal of cryptic splice sites, and sequence modifications that increase transcription. Generally, any such modifications will not alter constitutive expression of the promoters and the function of the first introns but it is recognized such modifications may enhance gene expression.
  • the first introns comprise consensus splice sites on both the 5' and 3' ends.
  • the first introns comprise consensus splice sites on both the 5' and 3' ends, wherein the consensus splice sites are selected, or designed to be, efficiently spliced out when present in a transcript in the organism of interest.
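A quick sanity check on the intron ends can be sketched as follows. The text does not state which consensus sequences are used, so this example assumes only the widely known canonical rule that most eukaryotic nuclear introns begin with GT and end with AG on the DNA sense strand; full consensus splice sites are longer and organism-dependent, and the function name is hypothetical.

```python
def has_canonical_splice_ends(intron: str) -> bool:
    """True if the intron starts with GT (GU in RNA) and ends with AG.

    This checks only the terminal dinucleotides, an assumed minimal stand-in
    for the consensus splice sites referred to in the text.
    """
    seq = intron.upper().replace("U", "T")
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")
```

Passing this check does not show that an intron will be spliced efficiently in the organism of interest; it only flags introns whose ends would clearly need modification.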
  • a first intron that is not spliced out may be disruptive to translation when the first intron is located in the 5'-UTR, particularly when located near the 3'-end of the 5'-UTR.
  • first introns that are located within the translated region and that are not spliced out at all or spliced out inefficiently can have the unintended effect of reducing or eliminating the expression of the protein of interest.
  • the methods of the present invention can comprise selecting a first intron and/or a promoter that is derived from a gene that is highly expressed in a constitutive manner in an organism.
  • the selected first intron and promoter can be derived from the same gene, from different genes in the same organism, or even from different genes in different organisms.
  • the first intron and/or a promoter can be selected from the promoters and first introns of genes that are known to be highly expressed in a constitutive manner.
  • Such promoters and first introns and methods for identifying them are generally known in the art. See, for example: U.S. Patent Application No.
  • the methods of the present invention can further comprise identifying highly expressed constitutive genes from any organism and then selecting first introns and/or promoters from the newly identified highly expressed constitutive genes. Any method known in the art for the identification of highly expressed constitutive genes can be used in the methods disclosed herein. See, for example: U.S. Patent Application No. 13/528,515, filed June 20, 2012; WO 2011/079197; and WO 2012/006426.
  • the present invention provides methods for making a regulatory construct.
  • the methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism.
  • the first intron is the first intron from the 5' end of the gene, if the gene contains more than one intron.
  • the methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner in the same organism as the gene from which the first intron was derived or in a different organism.
  • the methods further comprise synthesizing a regulatory construct comprising the promoter operably linked to a 5'-UTR, which comprises the first intron.
  • the first intron is at or near the 3' end of the 5'- UTR.
  • the regulatory construct provides for enhanced expression of an operably linked gene of interest in a target organism when compared to the expression of the gene of interest in the target organism from a control regulatory construct which lacks the first intron.
  • the methods can further comprise operably linking a gene of interest to the regulatory construct for expression of the gene of interest in a target organism.
  • a regulatory construct of the present invention is essentially the same as an expression construct of the present invention but without an operably linked translated region.
  • the descriptions herein of the various elements of, and the arrangement within, the expression constructs of the present invention are also germane to the regulatory constructs of the present invention with the exception that the regulatory constructs are not required to comprise an operably linked translated region.
  • the expression constructs of the present invention find use in the making of organisms or cells that express a heterologous gene in a constitutive manner and at high level.
  • the present invention provides methods for making an organism for expressing a heterologous gene. The methods comprise introducing into at least one cell of a target organism an expression construct of the present invention.
  • Such an expression construct comprises a promoter operably linked to a polynucleotide, wherein:
  • the polynucleotide comprises a 5'-UTR and a translated region
  • the 5'-UTR or the translated region comprises a first intron
  • the first intron is the first intron from the 5' end of the first gene
  • the first gene is native to the first organism
  • the promoter is derived from the first gene or from a second gene
  • the second gene is native to at least one of the first organism and
  • the methods for making an organism for expressing a heterologous gene can further comprise regenerating from the at least one cell a target organism comprising the expression construct.
  • the target organism or cell is capable of expressing the polynucleotide when the target organism or cell is exposed to conditions favorable for the expression of the polynucleotide for a sufficient period of time, and the polynucleotide is expressed at an increased level in the target organism or at least one cell thereof when compared to the expression of the polynucleotide in the target organism or at least one cell thereof comprising a control expression construct which lacks the first intron.
  • expression of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
  • the methods for making an organism for expressing a heterologous gene can further comprise producing additional organisms or progeny by one or more rounds of sexual or asexual reproduction and optionally selecting for progeny comprising the expression construct.
  • the methods of the present invention are not limited to making the initial organism or the initial cell into which the expression construct was introduced but also encompass all progeny cells and organisms, however produced, that are descended from the initial organism and/or the initial cell and that comprise the expression construct.
  • the expression constructs of the present invention find use in methods for expressing a heterologous gene in an organism.
  • the present invention provides methods for expressing a heterologous gene in an organism. The methods involve obtaining a target organism comprising an expression construct of the present invention or at least one cell thereof and exposing the target organism or cell thereof to conditions favorable for the expression of the polynucleotide for a sufficient period of time, whereby the polynucleotide is expressed.
  • the polynucleotide is expressed at an increased level in the target organism or cell thereof when compared to the expression of the polynucleotide in the target organism or cell thereof comprising a control expression construct which lacks the first intron.
  • the methods further comprise producing the target organism or a progenitor thereof by introducing the expression construct into at least one cell of an organism and regenerating the at least one cell into the target organism or a progenitor thereof comprising the expression construct.
  • the methods for expressing a heterologous gene in an organism can further comprise making the expression construct as described herein above.
  • the present invention additionally provides nucleic acid molecules, vectors, expression cassettes comprising at least one of the expression constructs and/or at least one of the regulatory constructs of the present invention. Further provided are non-human organisms and non-human host cells comprising at least one of the expression constructs and/or at least one of the regulatory constructs as disclosed herein.
  • the invention further provides expression cassettes, plants, plant parts, plant cells, seeds and host cells comprising at least one of the expression constructs and/or at least one of the regulatory constructs of the present invention.
  • expression constructs and regulatory constructs can comprise promoters, first introns, and/or 5'UTRs that are identical in nucleotide sequence to corresponding promoters, first introns, and/or 5'UTRs in one or more native genes
  • the expression constructs and regulatory constructs of the present invention are not known to be naturally occurring.
  • the expression constructs and regulatory constructs of the present invention are recombinant nucleic acids that are not native to the genome of an organism.
  • the invention encompasses isolated or substantially purified nucleic acid molecule or polynucleotide compositions.
  • An "isolated” or “purified” nucleic acid molecule or polynucleotide, or biologically active portion thereof, is substantially or essentially free from components that normally accompany or interact with the nucleic acid molecule or polynucleotide as found in its naturally occurring environment.
  • an isolated or purified nucleic acid molecule or polynucleotide is substantially free of other cellular material or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
  • the present invention encompasses fragments and variants of the disclosed nucleic acid molecules or polynucleotides.
  • by “fragment” is intended a portion of the nucleic acid molecule or polynucleotide. Fragments of a polynucleotide comprising nucleic acid sequences retain biological activity of the full-length nucleic acid molecule or polynucleotide. Alternatively, fragments of a polynucleotide that are useful as hybridization probes generally do not encode proteins that retain biological activity or do not retain promoter activity. Thus, fragments of a nucleotide sequence may range from at least about 20 nucleotides, about 50 nucleotides, about 100 nucleotides, and up to the full-length polynucleotide of the invention.
  • a fragment of a polynucleotide of the invention may encode a biologically active portion of a polynucleotide.
  • a biologically active portion of a polynucleotide can be prepared by isolating a portion of one of the polynucleotides of the invention that comprises the genetic regulatory element and assessing activity as described herein.
  • Polynucleotides that are fragments of a nucleotide sequence of the present invention comprise at least 16, 20, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 800, 900, 1,000, 1,100, 1,200, 1,300, 1,400, 1,500, 1,600, 1,700, 1,800, 1,900, 2,000, 2,100, 2,200, 2,300, 2,400, 2,500, 2,600, or 2,700 contiguous nucleotides, or up to the number of nucleotides present in a full-length polynucleotide disclosed herein.
  • a variant comprises a polynucleotide having deletions (i.e., truncations) at the 5' and/or 3' end; deletion and/or addition of one or more nucleotides at one or more internal sites in the reference polynucleotide; and/or substitution of one or more nucleotides at one or more sites in the reference polynucleotide.
  • a "reference" polynucleotide comprises a nucleotide sequence produced by the methods disclosed herein.
  • Variant polynucleotides also include synthetically derived polynucleotides, such as those generated, for example, by using site-directed mutagenesis but which still comprise biological activity.
  • variants of a particular polynucleotide or nucleic acid molecule of the invention will have at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to that particular polynucleotide as determined by sequence alignment programs and parameters as described elsewhere herein.
  • Variant polynucleotides also encompass sequences derived from a mutagenic and recombinogenic procedure such as DNA shuffling.
  • Strategies for such DNA shuffling are known in the art. See, for example, Stemmer (1994) PNAS 91:10747-10751; Stemmer (1994) Nature 370:389-391; Crameri et al. (1997) Nature Biotech. 15:436-438; Moore et al. (1997) J. Mol. Biol. 272:336-347; Zhang et al. (1997) PNAS 94:4504-4509; Crameri et al. (1998) Nature 391:288-291; and U.S. Patent Nos.
  • oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any plant of interest.
  • Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, New York). See also Innis et al., eds. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, eds.
  • PCR PCR Strategies
  • nested primers single specific primers
  • degenerate primers gene-specific primers
  • vector-specific primers partially-mismatched primers
  • polynucleotide molecules of the present invention encompass polynucleotide molecules comprising a nucleotide sequence that is sufficiently identical to one of the nucleotide sequences set forth in any one or more of SEQ ID NOS: 1-52.
  • the term "sufficiently identical" is used herein to refer to a first nucleotide sequence that contains a sufficient or minimum number of identical or equivalent nucleotides to a second nucleotide sequence such that the first and second nucleotide sequences have a common structural domain and/or common functional activity.
  • nucleotide sequences that contain a common structural domain having at least about 85% or 90% identity, preferably 95% identity, more preferably 96%, 97%, 98%, or 99% identity are defined herein as sufficiently identical.
  • the sequences are aligned for optimal comparison purposes.
  • the two sequences are the same length.
  • the percent identity between two sequences can be determined using techniques similar to those described below, with or without allowing gaps. In calculating percent identity, typically exact matches are counted.
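The exact-match counting described above can be sketched as follows. This is a minimal illustration only, not the BLAST or MUSCLE procedure named elsewhere in this document: it assumes the two sequences have already been aligned to equal length with `-` marking gaps, and the function name `percent_identity` and the `count_gaps` switch are hypothetical names introduced here.

```python
def percent_identity(aligned_a: str, aligned_b: str, count_gaps: bool = True) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    '-' marks a gap. With count_gaps=True a column gapped in one sequence
    counts as a compared (non-matching) position; with count_gaps=False
    such columns are skipped. Only exact matches are counted as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # gap in both sequences carries no information
        if a == "-" or b == "-":
            if count_gaps:
                compared += 1  # gapped column counts against identity
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

A result from such a function could then be compared against whichever identity threshold is of interest, for example the 85% figure mentioned above.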
  • the determination of percent identity between two sequences can be accomplished using a mathematical algorithm.
  • a preferred, nonlimiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin and Altschul (1990) PNAS 87:2264, modified as in Karlin and Altschul
  • PSI-Blast can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra. When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See
  • sequence identity values for pairs of sequences provided herein refer to the value obtained using the BLAST 2.0 suite of programs using default parameters (Altschul et al. (1997) Nucleic Acids Res. 25:3389-402) using the full-length sequences of the invention.
  • sequence identity values for multiple sequence alignments provided herein refer to the value obtained using MUSCLE (Version 3.8) using default parameters using the full-length sequences of the invention. MUSCLE is available at http://www.drive5.com/muscle/ or http://www.ebi.ac.uk/Tools/msa/muscle/. See, Edgar (2004) Nucleic Acids Res.
  • the use of the terms "polynucleotide" and "nucleic acid" is not intended to limit the present invention to polynucleotides and nucleic acids comprising DNA.
  • polynucleotides and nucleic acids can comprise ribonucleotides and combinations of ribonucleotides and deoxyribonucleotides. Such deoxyribonucleotides and ribonucleotides include both naturally occurring molecules and synthetic analogues.
  • the polynucleotides and nucleic acids of the invention also encompass all forms of sequences including, but not limited to, single-stranded forms, double-stranded forms, hairpins, stem-and-loop structures, and the like.
  • the expression constructs and regulatory constructs of the present invention can be provided in expression cassettes for expression in the plant or other organism or host cell of interest. It is recognized that the expression constructs of the present invention and expression cassettes comprising one or more of such expression constructs can be used for the expression in both human and non-human host cells including, but not limited to, host cells from plants, animals, fungi, protists, and algae. In one
  • the host cells are human host cells or a host cell line that is incapable of differentiating into a human being.
  • the expression cassette can include additional 5' and 3' regulatory sequences operably linked to the expression construct or regulatory construct.
  • "Operably linked" is intended to mean a functional linkage between two or more elements.
  • an operable linkage between one or more genetic regulatory elements and a gene of interest is a functional link between the gene of interest and the one or more genetic regulatory elements that allows for expression of the gene of interest.
  • Operably linked elements may be contiguous or non-contiguous.
  • an "operably linked intron” is an intron that is functional and splices out of a polynucleotide when in a host organism capable of splicing out such a functional intron.
  • an "operably linked intron” is one that is functional and splices out of a coding region or translated region of an RNA without disrupting the reading frame for translation when the polynucleotide is in a host organism capable of splicing out such a functional intron. It is understood that the term "in operable linkage” as used herein has the same meaning as “operably linked”.
  • the expression cassette may additionally contain at least one additional gene to be co-transformed into the organism.
  • the additional gene(s) can be provided on multiple expression cassettes.
  • Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the polynucleotide to be under the transcriptional regulation of the regulatory regions.
  • the expression cassette may additionally contain selectable marker genes.
  • the expression cassette can comprise in the 5'-3' direction of transcription, a transcriptional initiation region (i.e., a promoter), a translational initiation region, nucleotide sequence to be expressed, a translational stop site, and a transcriptional termination region (i.e., termination region) functional in plants or other organism or host cell.
  • the expression cassette further comprises a first intron either in the 5'-UTR or coding region.
  • the regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) and/or the polynucleotide to be expressed may be native/analogous to the host cell or to each other. Alternatively, any of the regulatory regions and/or the polynucleotide to be expressed may be
  • heterologous in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
  • a promoter operably linked to a heterologous polynucleotide is from a species different from the species from which the polynucleotide was derived, or, if from the same/analogous species, one or both are substantially modified from their original form and/or genomic locus, or the promoter is not the native promoter for the operably linked polynucleotide.
  • a chimeric gene comprises a coding sequence operably linked to a transcription initiation region that is heterologous to the coding sequence.
  • the termination region may be native with the transcriptional initiation region, may be native with the operably linked polynucleotide of interest, may be native with the plant host, or may be derived from another source (i.e., foreign or heterologous) to the promoter, the polynucleotide of interest, the plant host, or any combination thereof.
  • Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also
  • a promoter of the present invention for gene expression in plants is capable of directing the constitutive expression of an operably linked gene of interest in a plant, a plant part, and/or a plant cell.
  • the genes of interest may be optimized for increased expression in the transformed plant. That is, the polynucleotides can be synthesized using plant-preferred codons for improved expression. See, for example, Campbell and Gowri (1990) Plant Physiol. 92:1-11 for a discussion of host-preferred codon usage. Methods are available in the art for synthesizing plant-preferred genes. See, for example, U.S. Patent Nos. 5,380,831 and 5,436,391, and Murray et al. (1989) Nucleic Acids Res. 17:477-498, herein incorporated by reference.
  • Additional sequence modifications are known to enhance gene expression in a cellular host. These include elimination of sequences encoding spurious polyadenylation signals, exon-intron splice site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression.
  • the G-C content of the sequence may be adjusted to levels average for a given cellular host, as calculated by reference to known genes expressed in the host cell. When possible, the sequence is modified to avoid predicted hairpin secondary mRNA structures.
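As a rough illustration of the G-C adjustment described above, the sketch below measures how far a candidate sequence's G-C fraction sits from the average of a set of host genes. The function names and the idea of reporting a simple difference are assumptions made for this example; the source does not specify how the host average is computed or how the sequence is then modified.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T, U)."""
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(1 for b in bases if b in "GC") / len(bases)


def mean_host_gc(host_genes: list[str]) -> float:
    """Average G-C fraction over known genes expressed in the host."""
    if not host_genes:
        raise ValueError("at least one host gene is required")
    return sum(gc_content(g) for g in host_genes) / len(host_genes)


def gc_gap(candidate: str, host_genes: list[str]) -> float:
    """Positive when the candidate is more G-C rich than the host average."""
    return gc_content(candidate) - mean_host_gc(host_genes)
```

A large positive or negative gap would suggest the coding sequence is a candidate for synonymous changes that move its composition toward the host average.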
  • the expression cassettes may additionally contain heterologous 5' UTRs (also known as 5' leader sequences). Such 5' UTRs can act to enhance translation.
  • Translation leaders are known in the art and include: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5' noncoding region) (Elroy-Stein et al. (1989) PNAS USA 86:6126-6130); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) (Gallie et al. (1995) Gene 165(2):233-238), MDMV leader (Maize Dwarf Mosaic Virus) (Virology 154:9-20), and human immunoglobulin heavy-chain binding protein (BiP) (Macejak et al.
  • AMV RNA 4 alfalfa mosaic virus
  • TMV tobacco mosaic virus leader
  • MCMV maize chlorotic mottle virus leader
  • the various DNA fragments may be manipulated, so as to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame.
  • adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like.
  • in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions may be involved.
  • the expression cassette can also comprise a selectable marker gene for the selection of transformed cells.
  • Selectable marker genes are utilized for the selection of transformed cells or tissues.
  • Marker genes include genes encoding antibiotic resistance, such as those encoding neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT), as well as genes conferring resistance to herbicidal compounds, such as glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D).
  • Additional selectable markers include phenotypic markers such as β-galactosidase and fluorescent proteins such as green fluorescent protein (GFP) (Su et al.
  • selectable marker genes are not meant to be limiting. Any selectable marker gene can be used in the present invention.
  • the methods of the invention involve introducing an expression construct or regulatory construct into an organism.
  • by "introducing" is intended presenting to the organism the expression construct in such a manner that the construct gains access to the interior of a cell of the organism.
  • the methods of the invention do not depend on a particular method for introducing an expression construct or regulatory construct into an organism, only that the expression construct or regulatory construct gains access to the interior of at least one cell of the organism.
  • Methods for introducing expression constructs, regulatory constructs, and other polynucleotides into various organisms such as, for example, plants, animals, fungi, protists, and bacteria are known in the art including, but not limited to, stable transformation methods, transient transformation methods, and virus-mediated methods.
  • by "stable transformation" is intended that the polynucleotide construct introduced into an organism integrates into a genome of the organism and is capable of being inherited by progeny thereof.
  • by "transient transformation" is intended that a polynucleotide construct introduced into an organism does not integrate into a genome of the organism.
  • the expression constructs and regulatory constructs of the invention are inserted using standard techniques into any vector known in the art that is suitable for expression of the nucleotide sequences in the organism or host cell.
  • the selection of the vector depends on the preferred transformation technique and the species of target organism or host cell to be transformed.
  • the expression constructs and regulatory constructs of the invention are inserted using standard techniques into any vector known in the art that is suitable for expression of the nucleotide sequences in a plant or plant cell.
  • the selection of the vector depends on the preferred transformation technique and the target plant species to be transformed.
  • nucleic acid molecules, expression constructs, and regulatory constructs of the invention may be introduced into plants by contacting plants with a virus or viral nucleic acids. Generally, such methods involve incorporating a nucleic acid molecule or an expression construct of the invention within a viral DNA or RNA molecule. It is recognized that a protein of the invention may be initially synthesized as part of a viral polyprotein, which later may be processed by proteolysis in vivo or in vitro to produce the desired recombinant protein.
  • the cells that have been transformed may be grown into plants in accordance with conventional ways. See, for example, McCormick et al. (1986) Plant Cell Reports 5:81-84. These plants may then be grown, and either pollinated with the same transformed strain or different strains, and the resulting hybrid having constitutive expression of the desired phenotypic characteristic identified. Two or more generations may be grown to ensure that expression of the desired phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure expression of the desired phenotypic characteristic has been achieved. In this manner, the present invention provides transformed seed (also referred to as "transgenic seed") having a polynucleotide construct of the invention, for example, an expression cassette of the invention, stably incorporated into their genome.
  • the nucleic acid molecules, expression constructs, and regulatory constructs of the present invention can be provided to a plant or other organism using a variety of transient transformation methods.
  • transient transformation methods include, but are not limited to, the introduction of the sequence or variants and fragments thereof directly into the plant or other organism or the introduction of a transcript into the plant.
  • Such methods include, for example, microinjection, electroporation, or particle bombardment. See, for example, Crossway et al. (1986) Mol. Gen. Genet. 202:179-185; Nomura et al. (1986) Plant Sci. 44:53-58; Hepler et al. (1994) PNAS 91:2176-2180 and Hush et al.
  • polynucleotide can be transiently transformed into the plant or other organism using any other technique known in the art.
  • nucleic acid molecules, expression constructs, and regulatory constructs of the present invention can be used for transformation of any plant species, including, but not limited to, monocots and dicots.
  • plant species of interest include, but are not limited to, Arabidopsis thaliana, peppers (Capsicum spp.; e.g., Capsicum annuum, C. baccatum, C. chinense, C. frutescens, C.
  • juncea particularly those Brassica species useful as sources of seed oil, alfalfa (Medicago sativa), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), green millet (Setaria viridis), finger millet (Eleusine coracana)), sunflower (Helianthus annuus), safflower (Carthamus tinctorius), wheat (Triticum aestivum), soybean (Glycine max), potato (Solanum tuberosum), peanuts (Arachis hypogaea), cotton (Gossypium barbadense, Gossypium hirsutum), sweet potato (Ipomoea batatas), cassava (Manihot esculenta), coffee (
  • Wolffiella spp., and Wolffia spp.), algae (e.g., Chlamydomonas reinhardtii, Botryococcus braunii, Chlorella spp., Dunaliella tertiolecta, Gracilaria spp.), oats, barley, vegetables, ornamentals, and conifers.
  • the term plant includes plant cells, plant protoplasts, plant cell or tissue cultures from which plants can be regenerated, plant calli, plant clumps, and plant cells that are intact in plants or parts of plants such as embryos, pollen, ovules, seeds, leaves, flowers, branches, fruits, roots, root tips, anthers, and the like. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention, provided that these parts comprise the introduced expression constructs or polynucleotides.
  • Various changes in phenotype are of interest including modifying the fatty acid composition in a plant, altering the amino acid content of a plant, altering a plant's pathogen defense mechanism, and the like. These results can be achieved by providing expression of heterologous products or increased expression of endogenous products in plants.
  • the present invention provides methods for expressing heterologous genes in organisms.
  • a heterologous gene of the present invention can be any gene of interest that can be expressed by the methods of the present invention.
  • Genes of interest encode proteins of interest.
  • a translated region of the present invention can comprise a gene of interest that encodes a protein of interest.
  • genes of interest are reflective of the commercial markets and interests of those involved in the development of the crop. Crops and markets of interest change, and as developing nations open up world markets, new crops and technologies will emerge also. In addition, as our understanding of agronomic traits and characteristics such as yield and heterosis increase, the choice of genes for transformation will change accordingly.
  • General categories of genes of interest include, for example, those genes involved in information, such as zinc fingers, those involved in communication, such as kinases, and those involved in housekeeping, such as heat shock proteins. More specific categories of transgenes, for example, include genes encoding important traits for agronomics, insect resistance, disease resistance, herbicide resistance, sterility, grain characteristics, yield, abiotic stress tolerance, and commercial products. Genes of interest include, generally, those involved in oil, starch, carbohydrate, or nutrient metabolism. In addition, genes of interest include genes encoding enzymes and other proteins from plants and other sources including prokaryotes and other eukaryotes.
  • Promoters and introns from Arabidopsis and rice highly expressed constitutive genes were used to make expression constructs comprising a promoter and intron from the same gene operably linked to a reporter gene and control expression constructs comprising a promoter operably linked to the reporter gene.
  • the Arabidopsis and rice genes were previously identified as being highly expressed constitutive genes as reported in WO 2011/079197 (see also, U.S. Patent Application No. 13/528,515, filed June 20, 2012) and WO 2012/006426.
  • the accession numbers of the genes are listed in Tables 1 and 2 along with cross-references to the sequence identifiers for the constructs in these publications. Table 1. Gene Accessions from Arabidopsis thaliana
  • intron-mediated enhancement was calculated as average expression with a + intron construct divided by average expression with the corresponding - intron construct.
  • the constructs from Arabidopsis were tested for GFP expression in Arabidopsis thaliana by calculating the GFP index, and the rice constructs were tested for GUS expression in corn (Zea mays) by determining GUS enzymatic activity, as described in WO 2011/079197 and WO 2012/006426.
  • IME was calculated for each of the 14 (Arabidopsis) or 10 (corn) tissues/zones/stages, and then these values were averaged for presentation in Figures 1 and 2.
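The IME calculation described above (a ratio of average expression per tissue, then an average across tissues) can be sketched as follows. The function names and the dictionary layout are assumptions made for this example, and any numbers passed in would be placeholders rather than data from this document.

```python
from statistics import mean


def ime_for_tissue(plus_intron: list[float], minus_intron: list[float]) -> float:
    """Average + intron expression divided by average - intron expression."""
    return mean(plus_intron) / mean(minus_intron)


def mean_ime(
    measurements: dict[str, tuple[list[float], list[float]]],
) -> tuple[dict[str, float], float]:
    """Per-tissue IME values and their average across tissues.

    `measurements` maps a tissue/zone/stage name to a pair of lists:
    (expression values from + intron lines, values from - intron lines).
    """
    per_tissue = {
        tissue: ime_for_tissue(plus, minus)
        for tissue, (plus, minus) in measurements.items()
    }
    return per_tissue, mean(list(per_tissue.values()))
```

Note that this averages the per-tissue ratios, following the order of operations in the text, which is not the same as taking a single ratio of expression pooled over all tissues.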
  • the presence of the first introns in the constructs enhanced expression in most cases (12 of 15 cases in Arabidopsis, 10 of 10 cases in rice), with the expression enhancement ranging from 2- to 70-fold in both Arabidopsis and corn.
  • median IME in corn was 11.5-fold
  • the starred Arabidopsis IME values in Figure 1 and all of the corn IME values in Figure 2 are minimal estimates for IME because there was no detectable expression in the absence of an intron in one or more of the tissues tested.
  • the IME value is calculated using background GFP or GUS values, respectively, for the tissues with no detectable expression in the -intron transgenics.
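One way to read the background substitution described above is sketched below: when the - intron expression is not distinguishable from background, the background value is used as the denominator and the result is flagged as a minimal estimate. The detection rule (`minus_mean <= background`) and the function name are assumptions for illustration; the source does not state how "no detectable expression" was decided.

```python
def ime_lower_bound(
    plus_mean: float, minus_mean: float, background: float
) -> tuple[float, bool]:
    """Return (IME, is_minimal_estimate) for one tissue.

    If the - intron value is at or below background (assumed here to mean
    "no detectable expression"), divide by background instead. The true
    denominator could be smaller, so the ratio is only a lower bound.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    if minus_mean <= background:
        return plus_mean / background, True
    return plus_mean / minus_mean, False
```

Carrying the flag alongside the value makes it straightforward to mark such tissues, as is done with the starred values in Figure 1.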
  • Arabidopsis expression measurements are from the root epidermis, cortex, endodermis, and stele in each of the meristematic, elongation, and maturation zones, as well as the root cap and quiescent center (14 measurements throughout root
  • Corn expression measurements are from V3-root, V7-root, VT-root, V3-leaf, V7-leaf, VT-leaf, VT-anther, VT-silk, 21-DAP-embryo, and 21-DAP-endosperm (10 measurements throughout plant development total) from R0 seedlings.
  • IME was also determined in shoot tissue from two representative Arabidopsis promoters using quantitative PCR analysis (qRT-PCR) and northern blot analysis.
  • Plant tissues were harvested from 2-3 week old seedlings and homogenized in liquid nitrogen by grinding with mortar and pestle.
  • Total RNA was extracted from tissues using the RNeasy kit (Qiagen). Gel resolution, transfer and crosslinking were done with the NorthernMax kit (Ambion).
  • Probes for GFP and the housekeeping gene ATPK1 were labeled with the Prime-A-Gene kit (Promega). Unincorporated labels were removed via Micro Bio-spin P30 Tris chromatography columns (BioRad). Following overnight hybridization, membranes were washed in 2X SSC with 0.1% SDS, dried, and screened at ⁇ ⁇ using the Scan PhosphorImager. Bands were quantified via ImageQuant software.
  • cDNA was generated from total RNA using SuperScript III reverse transcriptase (Invitrogen) per manufacturer's instructions. Quantitative PCR was performed with iQ Multiplex Powermix (Bio-Rad) supplemented with the appropriate primers and probes (see below) on an iCycler iQ real-time detection system (Bio-Rad) using the following thermal-cycler program: (1) 9 min at 95°C; (2) 15 s at 94°C; (3) 30 s at 57°C; (4) 30 s at 72°C; repeat 40 cycles of steps 2-4. Amplification data recorded by the iQ software (Bio-Rad) was exported to the LinRegPCR program (Ruijter et al. (2009) Nucleic Acids Res.
  • PCR efficiency and cycle threshold values were used to calculate GFP transgene copy number and expression relative to the 35S:GFP control using REST-MCS beta tool (Pfaffl et al. (2002) Nucleic Acids Res. 30(9):e36).
  • Relative GFP expression in each tissue was calculated by normalizing the amplification of GFP in cDNA to the amplification of ubiquitin-conjugating enzyme 9 (UBC9), a "housekeeping gene", and subsequent normalization to 35S:GFP.
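The two-step normalization described above (first to the UBC9 reference gene, then to the 35S:GFP control) resembles the commonly used efficiency-corrected relative expression ratio. The sketch below shows that ratio under stated assumptions: efficiencies are expressed as fold amplification per cycle (2.0 for perfect doubling), and the function and parameter names are hypothetical. It is not the REST-MCS tool used in the source, which additionally performs statistical testing.

```python
def relative_expression(
    e_target: float,
    ct_target_sample: float,
    ct_target_control: float,
    e_ref: float,
    ct_ref_sample: float,
    ct_ref_control: float,
) -> float:
    """Efficiency-corrected expression of a target gene relative to a control.

    target = GFP, ref = the housekeeping gene (UBC9 in the text),
    control = the 35S:GFP line. A result of 1.0 means the sample expresses
    the target at the same reference-normalized level as the control.
    """
    # Fold difference in target amplification, control versus sample.
    target_ratio = e_target ** (ct_target_control - ct_target_sample)
    # Same quantity for the reference gene, correcting for input amount.
    ref_ratio = e_ref ** (ct_ref_control - ct_ref_sample)
    return target_ratio / ref_ratio
```

For instance, with perfect efficiencies, a sample whose GFP threshold cycle is two cycles earlier than the control's, at equal reference-gene cycles, would come out at four times the control level.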
  • PDS 1 Probe 5' - 5TEX615/TCGGTGTTAGAGCCGTTGCGATTGAA/3IAbRQSp.
  • 5TEX615 indicate the presence of 5' fluorophore modifications while 3IAbRQSp and 3IABkFQ indicate the presence of 3' quencher modifications
  • IME Expression enhancement
  • Tables 4 and 5 demonstrate the absolute expression activity of the + intron variants when compared to well-known, high constitutive expressing control promoters.
  • expression constructs with Arabidopsis promoters and cognate introns were compared to the CaMV 35S promoter for expression in Arabidopsis roots.
  • GFP expression in Arabidopsis was measured as the GFP index as described in WO
  • tissue/stages from 5-10 lines per promoter.
  • the introns that have been identified can enhance the expression of heterologous promoters.
  • introns were swapped between two promoters from Figure 1 and tested for expression enhancement by northern analysis of shoot tissue as described above.
  • the result in Table 6 for the AT1G52300/AT4G37830 construct is based on 1 single-copy homozygous line of each of the - and + intron variants.
  • the result in Table 6 for the AT4G37830/AT1G52300 construct is based on 2 (- intron variant) and 5 (+ intron variant) single-copy, homozygous lines.
  • Table 6 Intron-Mediated Enhancement (IME) of Heterologous Promoters
  • the present invention demonstrates how to identify enhancing introns - by taking the first introns from genes selected for particular properties (e.g., high and uniform expression in all cell types, organs, tissues).
  • the first introns are usually in the coding region but as disclosed herein the enhancing property of the first introns is modular because the first introns can be moved to the 3' end of 5'-UTRs of cloned promoters and still provide effective enhancement. This is important because the present invention demonstrates that it is not necessary to make fusion constructs comprising a first intron inserted within the translated region of a gene of interest.
  • regulatory constructs can be prepared which comprise a promoter operably linked to a 5'-UTR which comprises a first intron preferably at or near the 3' end of the 5'-UTR.
  • a construct can be operably linked to any gene of interest with relative ease without making any modification to the translated region of the gene of interest.
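The modular layout described above can be sketched at the level of plain strings: a promoter, a 5'-UTR carrying the first intron at or near its 3' end, and a translated region attached unchanged. The class name, field names, and the `offset_from_utr_3prime` parameter are hypothetical and introduced only for this example, and simple concatenation says nothing about whether the intron would actually splice in a given host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegulatoryConstruct:
    promoter: str
    utr5: str
    first_intron: str
    # Number of 5'-UTR bases left downstream of the intron (0 = at the 3' end).
    offset_from_utr_3prime: int = 0

    def utr_with_intron(self) -> str:
        """5'-UTR with the first intron inserted at or near its 3' end."""
        if not 0 <= self.offset_from_utr_3prime <= len(self.utr5):
            raise ValueError("offset must lie within the 5'-UTR")
        cut = len(self.utr5) - self.offset_from_utr_3prime
        return self.utr5[:cut] + self.first_intron + self.utr5[cut:]

    def link(self, translated_region: str) -> str:
        """Attach any translated region without modifying it."""
        return self.promoter + self.utr_with_intron() + translated_region
```

Because the intron lives entirely in the regulatory part, the same `RegulatoryConstruct` instance can be linked to different translated regions, which mirrors the reuse argument made in the text.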

Landscapes

  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Organic Chemistry (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Wood Science & Technology (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Cell Biology (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

Methods for making expression constructs for enhancing gene expression in an organism are provided. The expression constructs comprise a constitutive promoter operably linked to a polynucleotide, which comprises a 5'-untranslated region (5'-UTR) and a translated region. The 5'-UTR comprises the first intron of a gene that is native to an organism and expressed constitutively. Methods of using the expression constructs to enhance the expression of a gene in an organism and compositions comprising the expression constructs are further provided.

Description

METHODS AND COMPOSITIONS FOR ENHANCING GENE EXPRESSION
CROSS-REFERENCE TO RELATED APPLICATION
This application claims the benefit of U.S. Application No. 61/666,318, filed June 29, 2012, which is hereby incorporated herein in its entirety by reference.
FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
This invention was made with United States Government support under STTR 0957836 awarded by the National Science Foundation. The United States Government has certain rights in the invention.
REFERENCE TO A SEQUENCE LISTING SUBMITTED AS A TEXT FILE VIA EFS WEB
The official copy of the sequence listing is submitted electronically via EFS-Web as an ASCII formatted sequence listing with a file named 435025SeqLst.txt, created on June 26, 2013, and having a size of 93.7 kilobytes, and is filed concurrently with the specification. The sequence listing contained in this ASCII formatted document is part of the specification and is herein incorporated by reference in its entirety.
BACKGROUND OF THE INVENTION
The production of transgenic cells and organisms comprising a heterologous gene sequence is now routinely practiced by molecular biologists. Methods for incorporating an isolated gene sequence into an expression cassette, producing transformation vectors, and transforming many types of cells and organisms are well known. The regulation or control of expression of the heterologous gene and the protein encoded by the gene can often be critical in the development of a transgenic organism for commercial use. For example, in transgenic plants comprising a heterologous gene that confers tolerance to a herbicide that is normally toxic to the plant, it can be critical to have the heterologous gene expressed in a temporal and spatial manner that corresponds to when the plant is exposed to the herbicide and to what parts of the plant the herbicide normally exerts its phytotoxic effect.
A number of genetic regulatory elements are known to play a role in regulating the expression of a gene in plants and other organisms including, for example, promoters, 5'-untranslated regions (UTRs), 3'-untranslated regions, and expression-enhancing introns. To express a transgene in a plant or organism, one or more of these genetic regulatory elements is operably linked for expression to a nucleic acid sequence or gene of interest.
Recently, it has become commonplace to introduce or "stack" multiple transgenes into a single transgenic crop plant. The stacking of multiple transgenes into a single transgenic plant has, however, proved to be problematic, particularly when the same genetic regulatory elements are used in more than one of the stacked transgenes. The use of multiple copies of the same regulatory sequence within two or more transgenes in a single plant is known to promote the activation of gene silencing mechanisms (Halpin (2005) Plant Biotech. J. 3:141-155). Silencing of transgenes previously showing stable expression can also be triggered 'de novo' when a new transgene is added by crossing or re-transformation if, for example, the same promoter has been used in both transgenes in an effort to promote coordinated expression (Halpin (2005) Plant Biotech. J. 3:141-155). Often, the use of the same promoter in multiple transgenes in a single plant is due to the lack of more than one promoter that gives the desired pattern and level of expression. For example, the Cauliflower mosaic virus (CaMV) 35S promoter is frequently used as the promoter in plant transgenes because it provides for high-level constitutive expression of an operably linked gene of interest. Because of a lack of suitable alternative promoters, the CaMV 35S promoter is often used to drive the high-level constitutive expression of two or more transgenes in the same plant. Thus, additional promoters and other genetic regulatory elements are needed to avoid gene silencing that might be caused by the use of a particular genetic regulatory element more than once when two, three, four, or more transgenes are stacked in a single crop plant.
A common approach for identifying additional promoters that can be used to drive high-level, constitutive expression of an operably linked heterologous nucleotide sequence in plants involves screening plants to identify plant genes that display high-level constitutive expression across most tissue and/or cell types. Often, however, this approach yields less than satisfactory results when the promoter from the plant gene is separated from its native downstream transcribed region, operably linked to a reporter gene or other gene of interest, introduced into a plant or plant cell, and assayed for the level of expression of the reporter gene or other gene of interest. In many cases, such promoters fail to display the same high-level constitutive expression of the operably linked gene as the promoter displays when it occurs in its native position operably linked to its native transcribed region. Thus, new approaches are needed to provide additional promoters suitable for driving high-level, constitutive expression of an operably linked heterologous nucleotide sequence in plants.
BRIEF SUMMARY OF THE INVENTION
Methods are provided for making an expression construct for enhancing gene expression in an organism. The methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism. The first intron is the first intron from the 5' end of the transcribed region of a gene. The methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism. The methods further comprise synthesizing an expression construct comprising the promoter operably linked to a polynucleotide. The polynucleotide comprises a 5'-untranslated region (5'-UTR), the first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron.
Preferably, the expression construct provides for enhanced expression of the operably linked polynucleotide in a target organism when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron.
Additionally provided are methods for making an organism for expressing a heterologous gene. The methods comprise introducing into at least one cell of a target organism an expression construct comprising a promoter operably linked to a polynucleotide. The polynucleotide comprises a 5'-UTR, a first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron, which is derived from a native gene that is highly expressed in a constitutive manner in an organism. The promoter can be derived from the native gene or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism. The methods can further comprise regenerating a target organism from at least one cell comprising the expression construct. The target organism that is produced by the methods of the present invention is capable of expressing the polynucleotide when the target organism or cell thereof is exposed to conditions favorable for the expression of the polynucleotide for a sufficient period of time. Preferably, the target organism is capable of enhanced expression of the polynucleotide when compared to the expression in the target organism of the polynucleotide from a control expression construct which lacks the first intron.
Further provided are methods for expressing a heterologous gene in an organism. The methods comprise obtaining a target organism comprising an expression construct, or at least one cell thereof, wherein the expression construct comprises a promoter operably linked to a polynucleotide, and exposing the target organism or cell thereof to conditions favorable for the expression of the polynucleotide for a sufficient period of time, whereby the polynucleotide is expressed. For this method, the polynucleotide comprises a 5'-UTR, a first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron. The first intron is derived from a native gene that is highly expressed in a constitutive manner in an organism. The promoter can be derived from the native gene or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism. Preferably, expression of the polynucleotide is increased in a target organism comprising the expression construct or in at least one cell thereof, when compared to the expression of the polynucleotide in the target organism comprising a control expression construct which lacks the first intron or in at least one cell thereof. In certain embodiments, the expression level of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
Methods are provided for making a regulatory construct. The methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism. The first intron is the first intron from the 5' end of the gene, if the gene contains more than one intron. The methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism. The methods further comprise synthesizing a regulatory construct comprising the promoter operably linked to a 5'-UTR, which comprises the first intron. Preferably, the first intron is at or near the 3' end of the 5'-UTR. Also preferably, the regulatory construct provides for enhanced expression of an operably linked gene of interest in a target organism when compared to the expression of the gene of interest in the target organism from a control regulatory construct which lacks the first intron. The methods can further comprise operably linking a gene of interest to the regulatory construct for expression of the gene of interest in a target organism.
Nucleic acid molecules comprising the expression constructs and regulatory constructs of the present invention are provided. Additionally provided are organisms and host cells comprising the expression constructs and regulatory constructs of the present invention. In one embodiment of the invention, the organisms and host cells include, for example, plants, seeds, plant parts, and plant cells comprising at least one expression construct and/or at least one regulatory construct of the present invention.
SEQUENCE LISTING
The nucleotide sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases. The nucleotide sequences follow the standard convention of beginning at the 5' end of the sequence and proceeding forward (i.e., from left to right in each line) to the 3' end. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood to be included by any reference to the displayed strand.
SEQ ID NO: 1 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G13440. The first intron is located at nucleotide positions 1111 to 1203. The start of transcription is at nucleotide 955. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1108, 1109, 1110, 1204, and 1205.
SEQ ID NO: 2 sets forth the nucleotide sequence of SEQ ID NO: 1 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 3 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G22840. The first intron is located at nucleotide positions 296 to 774. The start of transcription is at nucleotide 196. Additional nucleotides added to make a consensus splice site are at nucleotide positions 293, 294, 295, 775, and 776.
SEQ ID NO: 4 sets forth the nucleotide sequence of SEQ ID NO: 3 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 5 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G52300. The first intron is located at nucleotide positions 1100 to 1201. The start of transcription is at nucleotide 1017. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1099, 1202, and 1203.
SEQ ID NO: 6 sets forth the nucleotide sequence of SEQ ID NO: 5 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 7 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT4G37830. The first intron is located at nucleotide positions 861 to 1203. The start of transcription is at nucleotide 786. Additional nucleotides added to make a consensus splice site are at nucleotide positions 858, 859, 860, 1204, and 1205.
SEQ ID NO: 8 sets forth the nucleotide sequence of SEQ ID NO: 7 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 9 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G51650. The first intron is located at nucleotide positions 819 to 1567. The start of transcription is at nucleotide 751. Additional nucleotides added to make a consensus splice site are at nucleotide positions 816, 817, 818, 1568, and 1569.
SEQ ID NO: 10 sets forth the nucleotide sequence of SEQ ID NO: 9 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 11 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT3G48140. The first intron is located at nucleotide positions 1045 to 1201. The start of transcription is at nucleotide 929. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1044, 1202, and 1203.
SEQ ID NO: 12 sets forth the nucleotide sequence of SEQ ID NO: 11 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 13 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G02780. The first intron is located at nucleotide positions 1003 to 1343. The start of transcription is at nucleotide 926. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1000, 1001, 1002, 1344, and 1345.
SEQ ID NO: 14 sets forth the nucleotide sequence of SEQ ID NO: 13 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 15 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT3G01280. The first intron is located at nucleotide positions 604 to 1102. The start of transcription is at nucleotide 448. Additional nucleotides added to make a consensus splice site are at nucleotide positions 601, 602, 603, 1103, and 1104.
SEQ ID NO: 16 sets forth the nucleotide sequence of SEQ ID NO: 15 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 17 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G67430. The first intron is located at nucleotide positions 1783 to 1891. The start of transcription is at nucleotide 1730. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1780, 1781, 1782, 1892, and 1893.
SEQ ID NO: 18 sets forth the nucleotide sequence of SEQ ID NO: 17 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 19 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G76200. The first intron is located at nucleotide positions 758 to 1073. The start of transcription is at nucleotide 654. Additional nucleotides added to make a consensus splice site are at nucleotide positions 755, 756, 757, 1074, and 1075.
SEQ ID NO: 20 sets forth the nucleotide sequence of SEQ ID NO: 19 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 21 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT2G31490. The first intron is located at nucleotide positions 704 to 1430. The start of transcription is at nucleotide 624. Additional nucleotides added to make a consensus splice site are at nucleotide positions 701, 702, 703, 1431, and 1432.
SEQ ID NO: 22 sets forth the nucleotide sequence of SEQ ID NO: 21 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 23 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT5G08690. The first intron is located at nucleotide positions 776 to 1077. The start of transcription is at nucleotide 747. Additional nucleotides added to make a consensus splice site are at nucleotide positions 773, 774, 775, 1078, and 1079.
SEQ ID NO: 24 sets forth the nucleotide sequence of SEQ ID NO: 23 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 25 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G07600. The first intron is located at nucleotide positions 1504 to 1783. The start of transcription is at nucleotide 1501. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1502, 1503, 1784, and 1785.
SEQ ID NO: 26 sets forth the nucleotide sequence of SEQ ID NO: 25 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 27 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G78380. The first intron is located at nucleotide positions 1504 to 2004. The start of transcription is at nucleotide 1414. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2005, and 2006.
SEQ ID NO: 28 sets forth the nucleotide sequence of SEQ ID NO: 27 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 29 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT2G33040. The first intron is located at nucleotide positions 552 to 952. The start of transcription is at nucleotide 415. Additional nucleotides added to make a consensus splice site are at nucleotide positions 551, 953, and 954.
SEQ ID NO: 30 sets forth the nucleotide sequence of SEQ ID NO: 29 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 31 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os03g21940. The first intron is located at nucleotide positions 1504 to 2482. The start of transcription is at nucleotide 1406. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2483, and 2484.
SEQ ID NO: 32 sets forth the nucleotide sequence of SEQ ID NO: 31 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 33 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g45950. The first intron is located at nucleotide positions 1504 to 1656. The start of transcription is at nucleotide 1419. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1657, and 1658.
SEQ ID NO: 34 sets forth the nucleotide sequence of SEQ ID NO: 33 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 35 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os11g47760. The first intron is located at nucleotide positions 729 to 2633. The start of transcription is at nucleotide 638. Additional nucleotides added to make a consensus splice site are at nucleotide positions 726, 727, 728, 2634, and 2635.
SEQ ID NO: 36 sets forth the nucleotide sequence of SEQ ID NO: 35 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 37 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os02g02130. The first intron is located at nucleotide positions 1504 to 1586. The start of transcription is at nucleotide 1501. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1502, 1503, 1587, and 1588.
SEQ ID NO: 38 sets forth the nucleotide sequence of SEQ ID NO: 37 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 39 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os03g56190. The first intron is located at nucleotide positions 1504 to 1615. The start of transcription is at nucleotide 1437. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1616, and 1617.
SEQ ID NO: 40 sets forth the nucleotide sequence of SEQ ID NO: 39 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 41 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g47980. The first intron is located at nucleotide positions 940 to 1553. The start of transcription is at nucleotide 829. Additional nucleotides added to make a consensus splice site are at nucleotide positions 937, 938, 939, 1554, and 1555.
SEQ ID NO: 42 sets forth the nucleotide sequence of SEQ ID NO: 41 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 43 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os01g46610. The first intron is located at nucleotide positions 1504 to 2228. The start of transcription is at nucleotide 1384. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2229, and 2230.
SEQ ID NO: 44 sets forth the nucleotide sequence of SEQ ID NO: 43 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 45 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os04g28180. The first intron is located at nucleotide positions 1504 to 1646. The start of transcription is at nucleotide 1399. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1647, and 1648.
SEQ ID NO: 46 sets forth the nucleotide sequence of SEQ ID NO: 45 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 47 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g01820. The first intron is located at nucleotide positions 1504 to 2453. The start of transcription is at nucleotide 1229. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2454, and 2455.
SEQ ID NO: 48 sets forth the nucleotide sequence of SEQ ID NO: 47 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 49 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os11g11390. The first intron is located at nucleotide positions 1504 to 2798. The start of transcription is at nucleotide 1431. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2799, and 2780.
SEQ ID NO: 50 sets forth the nucleotide sequence of SEQ ID NO: 49 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 51 sets forth the nucleotide sequence of the regulatory construct comprising the promoter from gene accession AT4G37830 and the first intron from gene accession AT1G52300. The first intron is located at nucleotide positions 861 to 960. The start of transcription is at nucleotide 786.
SEQ ID NO: 52 sets forth the nucleotide sequence of the regulatory construct comprising the promoter from gene accession AT1G52300 and the first intron from gene accession AT4G37830. The first intron is located at nucleotide positions 1100 to 1442. The start of transcription is at nucleotide 1017.
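By way of illustration only, the relationship between an intron-containing regulatory construct and its intron-less counterpart (e.g., SEQ ID NO: 1 and SEQ ID NO: 2) can be sketched in Python. The sketch assumes that the positions recited above are 1-based and inclusive; the function name `remove_positions` and the toy sequence are hypothetical and do not appear in the sequence listing.

```python
def remove_positions(seq, intron_start, intron_end, added_positions=()):
    """Return seq without the intron and without any added splice-site nucleotides.

    Assumption: all coordinates are 1-based and inclusive.
    """
    drop = set(range(intron_start, intron_end + 1)) | set(added_positions)
    return "".join(base for pos, base in enumerate(seq, start=1) if pos not in drop)


# Hypothetical toy sequence (not a real SEQ ID NO).
# Upper case = promoter/5'-UTR, lower case = intron.
# Positions 5, 6 and 19 stand in for nucleotides added to form a consensus splice site.
toy = "CCCAAG" + "gtaaattttcag" + "GTTATG"

# The intron occupies positions 7 to 18 of the toy sequence.
intronless = remove_positions(toy, 7, 18, added_positions=(5, 6, 19))
print(intronless)
```

The same call pattern could be applied to any construct of the sequence listing by substituting the recited intron positions and added nucleotide positions for the toy values.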
BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 is a graphical representation of root expression enhancement of Arabidopsis promoters by cognate first introns. Expression constructs comprising a promoter, a 5'-UTR comprising the cognate first intron, and a translated region comprising the coding sequence of a reporter gene were tested for expression in Arabidopsis roots and compared to a control expression construct lacking the first intron (i.e., - intron variant). Average intron-mediated enhancement (IME) is expressed on the y-axis as 2^x-fold enhancement (e.g., 2^2 and 2^4 stand for 4-fold and 16-fold expression enhancement, respectively). The dashed line at 2^0 (= 1) indicates the relative expression of the - intron variants. The individual promoters used are listed below the x-axis.
FIG. 2 is a graphical representation of expression enhancement of rice promoters by cognate first introns. Expression constructs comprising a promoter, a 5'-UTR comprising the cognate first intron, and a translated region comprising the coding sequence of a reporter gene were tested for expression in corn and compared to a control expression construct lacking the first intron. Average intron-mediated enhancement (IME) is expressed on the y-axis as 2^x-fold enhancement (e.g., 2^2 and 2^4 stand for 4-fold and 16-fold expression enhancement, respectively). The individual promoters used are listed below the x-axis.
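As a non-limiting sketch of the scale used in the figures, the fold enhancement can be converted to the exponent plotted on a base-2 axis as shown below. The function name `ime_log2` and the reporter readings are hypothetical, and the calculation assumes that IME is taken as the ratio of reporter expression with the first intron to reporter expression without it.

```python
import math


def ime_log2(with_intron, without_intron):
    """Return log2 of the expression ratio (+ intron construct / - intron control)."""
    if with_intron <= 0 or without_intron <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(with_intron / without_intron)


# Hypothetical reporter readings in arbitrary units (not data from the figures).
print(ime_log2(400.0, 100.0))   # 4-fold enhancement, exponent 2
print(ime_log2(1600.0, 100.0))  # 16-fold enhancement, exponent 4
print(ime_log2(100.0, 100.0))   # no enhancement, exponent 0 (the dashed line of FIG. 1)
```

On such a scale, equal distances correspond to equal fold changes, so a 4-fold and a 16-fold enhancement are separated by the same interval as a 1-fold and a 4-fold enhancement.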
DETAILED DESCRIPTION OF THE INVENTION
The present inventions now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the inventions are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.
In the context of this disclosure, a number of terms and abbreviations are used. The following definitions are provided.
The articles "a" and "an" are used herein to refer to one or more than one (i.e., to at least one) of the grammatical object of the article. By way of example, "an element" means one or more elements.
Throughout the specification the word "comprise," or variations such as "comprises" or "comprising," will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
As used herein, the term "expression construct" refers to a recombinant DNA or nucleic acid, which comprises in a 5'-to-3' order and in operable linkage a promoter, a 5'-untranslated region (5'-UTR), and a translated region, wherein the 5'-UTR comprises a first intron from a native gene of an organism. The transcribed region of the expression construct comprises the 5'-UTR, the first intron, and the translated region. In particular embodiments, the expression constructs of the present invention can further comprise one or more of each of the following elements: an enhancer, an additional intron, a 3'-untranslated region, a transcriptional terminator, and a chromatin control element.
As used herein, the term "regulatory construct" refers to a recombinant DNA or nucleic acid, which comprises in a 5'-to-3' order and in operable linkage a promoter and a 5'-UTR, wherein the 5'-UTR comprises a first intron from a native gene of an organism. In particular embodiments, the regulatory constructs of the present invention can further comprise one or more of each of the following elements: an enhancer, an additional intron, a translated region or coding sequence, a 3'-untranslated region, a transcriptional terminator, and a chromatin control element.
As used herein, the terms "5'-untranslated region" or "5'-UTR" refer to a portion of the transcribed region of a gene or an expression construct of the present invention that extends from the transcriptional start site and ends with the nucleotide immediately before the first nucleotide of the start codon for translation. As its name implies, the 5'-UTR does not normally serve as a template for translation and thus is referred to as an untranslated region. The "5'-UTR" is, however, transcribed into RNA. Thus, an RNA transcript of a gene or an expression construct or a regulatory construct of the present invention also comprises a 5'-UTR. In most cases, any introns that occur in a 5'-UTR of a gene or expression construct or regulatory construct are not found in the corresponding mature RNA transcript produced in vivo as such introns are typically spliced out by the host organism or cell thereof, unless the intron is non-functional in the host organism or the host organism is incompetent for splicing out such introns.
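The boundaries recited in the foregoing definition can be illustrated with a short Python sketch. It assumes 1-based, inclusive coordinates; the function name `five_prime_utr` and the toy sequence are hypothetical. The slice is taken from the DNA sequence, so it would still contain any intron located in the 5'-UTR.

```python
def five_prime_utr(seq, tss, start_codon_pos):
    """Return the 5'-UTR of seq.

    The 5'-UTR runs from the transcriptional start site (tss) through the
    nucleotide immediately before the first nucleotide of the start codon.
    Assumption: tss and start_codon_pos are 1-based positions in seq.
    """
    if not 1 <= tss <= start_codon_pos <= len(seq):
        raise ValueError("coordinates out of range or out of order")
    return seq[tss - 1:start_codon_pos - 1]


# Hypothetical toy sequence: transcription starts at position 5 and the
# start codon (ATG) begins at position 10.
toy = "TTTTGACCAATGGCC"
print(five_prime_utr(toy, tss=5, start_codon_pos=10))
```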
As used herein, the term "first intron" refers to the first intron from the 5' end of a native gene of an organism. The first intron can be found within the 5'-UTR or the translated region of the native gene. When the first intron is located within the translated region of the gene, the first intron is between the first protein coding exon and the second protein coding exon. While the present invention does not depend on the location of the first intron within a native gene, typically the 5' end of a first intron that is capable of enhancing expression as disclosed herein is within about the first 1000 base pairs (bp) after the transcriptional start site (in a 5' to 3' direction) and is preferably within about the first 500 bp after the transcription start site.
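As an illustrative sketch, the distance between the transcriptional start site and the 5' end of a first intron can be computed from the positions recited in the sequence listing. The function names are hypothetical, the "about 500 bp" and "about 1000 bp" ranges are treated as hard cut-offs purely for illustration, and the positions are those of the regulatory constructs, which may differ slightly from the native genes because of nucleotides added to form consensus splice sites.

```python
def intron_offset_from_tss(tss, intron_start):
    """Return the distance in bp from the transcription start to the 5' end of the intron."""
    return intron_start - tss


def classify_offset(offset):
    """Classify an offset against the approximate ranges given in the definition."""
    if offset <= 500:
        return "within about 500 bp (preferred)"
    if offset <= 1000:
        return "within about 1000 bp"
    return "beyond about 1000 bp"


# (start of transcription, first nucleotide of the first intron) as recited above.
constructs = {
    "SEQ ID NO: 3": (196, 296),
    "SEQ ID NO: 7": (786, 861),
    "SEQ ID NO: 47": (1229, 1504),
}

for name, (tss, intron_start) in constructs.items():
    offset = intron_offset_from_tss(tss, intron_start)
    print(name, offset, classify_offset(offset))
```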
An "expression-enhancing intron" or "enhancing intron" is an intron that is capable of causing an increase in the expression of a gene or polynucleotide to which it is operably linked. A "first intron" of the present invention is an expression-enhancing intron. While the present invention is not known to depend on a particular biological mechanism, it is believed that the expression-enhancing introns of the present invention enhance expression through intron-mediated enhancement (IME). It is recognized that naturally occurring introns that enhance expression through IME are typically found within 1 kb of the transcription start site of their native genes (see, Rose et al. (2008) Plant Cell 20:543-551). Such introns are usually the first intron, whether the first intron is in the 5'-UTR or the coding sequence, and are in a transcribed region. Introns that enhance expression solely through IME do not enhance gene expression when they are inserted into a non-transcribed region of a gene, such as, for example, a promoter. That is, they do not function as transcriptional enhancers. Unless stated otherwise or apparent from the context, the first introns of the present invention are capable of enhancing gene expression when they are found in a transcribed region of a gene but not when they occur in a non-transcribed region such as, for example, a promoter.
As used herein, the term "translated region" refers to the portion of a gene or expression construct of the present invention or its corresponding RNA transcript that encodes a polypeptide or protein of interest. Thus, the translated region comprises the start codon (e.g., ATG) for translation through the last codon of the protein or polypeptide encoded thereby. It is recognized that the translated region of a gene or expression construct can comprise one or more introns. It is further recognized that any introns that occur in the translated region of a gene or expression construct of the present invention are not typically found in the corresponding mature RNA transcript produced in vivo as such introns are normally spliced out by the host organism or cell thereof unless the intron or introns are non-functional and/or the host organism is incompetent for splicing out such introns.
As used herein, "native gene" refers to a gene that is part of a natural genome of an organism and that was not introduced into the organism or a progenitor thereof by artificial means that do not involve the transfer of genes from one organism to another organism by sexual reproduction. Such artificial means include, for example, any methods involving the introduction of recombinant DNA or other recombinant nucleic acid molecules into the organism or a progenitor thereof. A gene that is introduced into a progenitor of an organism by artificial means does not become a native gene when it is transferred from the progenitor to the organism via sexual reproduction.
The terms "recombinant DNA", "recombinant nucleic acid molecule", and similar terms refer to DNA and other recombinant nucleic acid molecules that are an artificial or non-naturally occurring combination of nucleic acid fragments, e.g., regulatory and coding sequences that are not all found together in the same form in nature. For example, recombinant nucleic acid molecules may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. As used herein, an "expression construct" and a "regulatory construct" each comprise recombinant DNA.
The phrase "enhancing gene expression" is intended to mean enhancing or increasing the expression of a gene or its gene product, particularly a protein or polypeptide. Gene expression can be determined by monitoring the formation of a transcript of a gene or polynucleotide or gene of interest of the present invention, a protein encoded by the transcript, or even an activity or function of the encoded protein. In preferred embodiments of the present invention, gene expression is determined by monitoring the level of a protein encoded by the gene or the activity or function of the encoded protein. Thus, it is understood that the expression of a polynucleotide or gene of interest of the present invention can be assessed in an organism or at least one cell thereof by determining the level of the protein encoded by the translated region of the polynucleotide or gene of interest or the activity or function of the encoded protein. In some embodiments of the present invention, the polynucleotide or gene of interest comprises a translated region which encodes green fluorescent protein (GFP), and expression of the polynucleotide or gene of interest can be determined by measuring green fluorescence emitted from the GFP protein when it is exposed to blue light. In other embodiments, the polynucleotide or gene of interest comprises a translated region which encodes β-glucuronidase (GUS) and expression of the polynucleotide can be determined by measuring GUS activity using the MUG fluorometric assay.
As used herein, a "promoter" refers to a nucleic acid that is capable of controlling the expression of an operably linked coding sequence or other sequence encoding an RNA. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, nucleic acid fragments of some variation may have identical promoter activity.
An "enhancer" is a DNA sequence that can stimulate promoter activity, and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Enhancers may be found in both non-transcribed and transcribed regions of a gene. Typically, the promoter stimulating activity of an enhancer is insensitive with respect to the position and orientation (i.e., can be inverted) of the enhancer within a gene.
Promoters that cause an operably linked gene or polynucleotide to be expressed in most cell types of an organism and at most times are commonly referred to as "constitutive promoters". Preferably, the constitutive promoters of the present invention cause an operably linked gene or polynucleotide to be expressed in all or substantially all tissues and stages of development and to be minimally responsive to abiotic stimuli. Expression of a gene or polynucleotide in most cell types of an organism and at most times is referred to herein as "constitutive gene expression", "constitutive expression", "expressed constitutively", "expression in a constitutive manner", or expression in a "constitutive pattern". It is understood that, for the terms "constitutive promoter" and "constitutive expression", some variation in absolute levels of expression or activity can exist among different tissues and stages of development of an organism.
The present invention provides novel expression constructs comprising a promoter operably linked to a polynucleotide. It is recognized that nucleic acid molecules comprising such novel expression constructs can be synthesized or produced using a number of methods known in the art. As used herein, "synthesizing an expression construct" or "producing an expression construct" are interchangeable terms that are intended to mean the making of an expression construct by any known method including, but not limited to, chemical synthesis of the entire nucleic acid molecule or part or parts thereof, modification of a pre-existing nucleic acid molecule by molecular biology methods such as, for example, restriction endonuclease digestion, DNA amplification by polymerase and ligation, and the combination of chemical synthesis and modification.
As used herein, "progeny" comprises any subsequent generation of an organism or a host cell, whether the result of sexual reproduction or asexual reproduction.
Preferably, a progeny of the present invention is made by the methods of the present invention and/or comprises an expression construct of the present invention.
As used herein, "progenitor" or "progenitor organism" refers to an ancestor of an organism or host cell. In certain embodiments of the invention, methods are described that can involve the use of an organism or cell comprising an expression construct of the present invention wherein the organism or cell is descended from a progenitor into which the expression construct was introduced. Preferably, the expression construct was stably introduced into the genome of the progenitor by, for example, a stable transformation method described herein or otherwise known in the art.
As used herein, an "organism" refers to any life form that has genetic material comprising nucleic acids including, but not limited to, prokaryotes, eukaryotes, and viruses. Organisms include, for example, plants, animals, fungi, bacteria, and viruses, and cells and parts thereof. Preferred organisms of the present invention are eukaryotic organisms, including, for example, plants, animals, fungi, and protists.
As used herein, a "target organism" is the organism into which an expression construct of the present invention is introduced, particularly for the purpose of expressing the protein encoded by the translated region of the expression construct.
By "gene of interest" is intended any nucleotide sequence that can be expressed when operably linked to a promoter or a regulatory construct of the present invention. A gene of interest of the present invention may, but need not, encode a protein. A translated region of the present invention can be a gene of interest. Unless stated otherwise or readily apparent from the context, when a gene of interest of the present invention is said to be operably linked to a promoter of the invention, the gene of interest does not by itself comprise a functional promoter. Preferably, the gene of interest does not comprise a full-length 5'-UTR. More preferably, the gene of interest is a translated region.
As used herein, a "heterologous gene" is any nucleic acid molecule or polynucleotide that is expressed from a nucleotide construct of the present invention. Such a heterologous gene can comprise a nucleotide sequence that is native or endogenous to an organism or can be foreign.
While the present invention does not depend on a particular method of determining if the expression construct of the present invention is capable of enhancing gene expression in a target organism, typically gene expression is determined by transforming the target organism or at least one cell thereof with a polynucleotide construct comprising the expression construct. The expression construct can further comprise additional genetic regulatory elements, if desired or necessary for expression of the translated region in the organism or at least one cell thereof.
Those of skill in the art will appreciate that determining whether the expression construct is capable of enhancing the expression of an operably linked gene in the desired manner in the target organism or any other organism of interest can depend on any number of factors including, for example, the type of genetic regulatory element (e.g., a promoter, a 5'-untranslated region (UTR), a 3'-untranslated region, an intron, a terminator, a chromatin control element), the presence of additional genetic elements in the construct, the gene of interest to be expressed, the organism or part or cell thereof in which expression is assayed, the expression assay, the detection method (e.g., GFP visible fluorescence, detection of GFP RNA by qPCR), the environmental conditions during the assay, and the like.
As used herein, a "control expression construct" is the same or substantially the same as an expression construct of the present invention but lacks a first intron and can be used as a control in gene expression determinations as disclosed herein. Preferably, a control expression construct lacks a first intron but otherwise comprises the same promoter, 5'-UTR, and translated region as an expression construct of the present invention. More preferably, a control expression construct lacks a first intron but otherwise has the same nucleotide sequence as an expression construct of the present invention, except for the missing portion that would correspond to the first intron in the expression construct.
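The relationship between an expression construct and its control can be illustrated with a minimal sketch, assuming that each construct is represented as a plain DNA string and that the first intron occurs exactly once in it. The function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def make_control_construct(construct: str, first_intron: str) -> str:
    """Return the construct sequence with the first intron removed.

    Illustrative assumption: the intron sequence occurs exactly once, so the
    control differs from the construct only by the missing intron portion.
    """
    occurrences = construct.count(first_intron)
    if occurrences != 1:
        raise ValueError(f"expected one copy of the first intron, found {occurrences}")
    start = construct.index(first_intron)
    return construct[:start] + construct[start + len(first_intron):]


# Toy elements (hypothetical sequences, for illustration only).
promoter = "TATAAA"
utr_5prime = "ACACAC"
first_intron = "GTAAGTTTTCAG"
translated_region = "ATGGCCTAA"

construct = promoter + utr_5prime + first_intron + translated_region
control = make_control_construct(construct, first_intron)
```

In this sketch the control retains the same promoter, 5'-UTR, and translated region and lacks only the portion corresponding to the first intron.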
Similarly, as used herein, a "control regulatory construct" is the same or substantially the same as a regulatory construct of the present invention but lacks a first intron and can be used as a control in gene expression determinations as disclosed herein. Preferably, a control regulatory construct lacks a first intron but otherwise comprises the same promoter and 5'-UTR as a regulatory construct of the present invention. More preferably, a control regulatory construct lacks a first intron but otherwise has the same nucleotide sequence as a regulatory construct of the present invention, except for the missing portion that would correspond to the first intron in the regulatory construct.
As used herein, a "reporter" or a "reporter gene" refers to a nucleic acid molecule encoding a detectable marker. Reporter genes include, for example, luciferase (e.g., firefly luciferase or Renilla luciferase), β-galactosidase, β-glucuronidase (GUS), chloramphenicol acetyl transferase (CAT), and a fluorescent protein (e.g., green fluorescent protein (GFP), red fluorescent protein (DsRed), yellow fluorescent protein, blue fluorescent protein, cyan fluorescent protein, or variants thereof, including enhanced variants such as enhanced GFP (eGFP)). Reporter genes are detectable by a reporter assay. Reporter assays can measure the level of reporter gene expression or activity by any number of means, including, for example, measuring the level of reporter mRNA, the level of reporter protein, or the amount of reporter protein activity. Reporter assays are known in the art or otherwise disclosed herein.
The present invention provides methods and compositions for enhancing gene expression in organisms, particularly eukaryotic organisms. Such methods and compositions can be used for the expression of polynucleotides, particularly the proteins encoded thereby, constitutively and at a high level in a target organism. Thus, the methods and compositions of the present invention find use in the production of any protein of interest in a eukaryotic organism or cells thereof. In preferred embodiments of the invention, the target organisms are plants, particularly monocot and dicot plants, more particularly monocot and dicot plants that are crop plants or that are suitable for the production of a protein of interest when grown in fields, greenhouses and/or controlled-environment facilities.
The present invention was made during the course of research related to the discovery and characterization of promoters that can be used to drive the expression of operably linked polynucleotides constitutively and at a high level in plants. Such promoters are known as strong constitutive promoters. During the course of that research, the present inventors discovered that the expression of a polynucleotide can be increased by adding to a polynucleotide construct comprising a constitutive promoter an operably linked intron from the same plant gene as the promoter or an intron from a different plant gene that is also known to be expressed constitutively and at a high level.
In one aspect, the present invention provides methods for making an expression construct for enhancing gene expression in an organism. The methods comprise selecting a first intron that is derived from a first gene that is highly expressed in a constitutive manner in a first organism. The first intron is the first intron from the 5' end of the first gene, and the first gene is a gene that is native to the first organism. Such a native gene is part of the natural genome of the first organism and was not introduced into the organism or a progenitor organism by artificial means. The methods further comprise selecting a promoter. The promoter can be selected before, after, or at the same time as, the first intron is selected. The promoter can be a promoter derived from the first gene or a promoter derived from a second gene that is highly expressed in a constitutive manner either in the first organism or in a second organism. The second gene is native to either the first organism or the second organism. The methods further comprise synthesizing an expression construct comprising the promoter operably linked to a polynucleotide, wherein the polynucleotide comprises a 5'-untranslated region (5'-UTR), the first intron, and a translated region, and wherein the 5'-UTR or translated region comprises the first intron. The 5'-UTR or any part thereof can be derived from the native 5'-UTR of the first gene, the second gene, or a different gene, or can be synthetic or artificial.
Preferably, an expression construct made by the methods disclosed herein provides for enhanced or increased expression of the polynucleotide in a target organism, when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron. More preferably, an expression construct made by the methods disclosed herein provides for enhanced or increased expression of the polynucleotide in a target organism without significantly altering the constitutive manner of expression of the polynucleotide in the target organism, when compared to expression from a control expression construct which lacks the first intron. In preferred embodiments, an expression construct made by the methods disclosed herein provides for at least about a 1.25, 1.5, 2, 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, or 100-fold increase in expression of the polynucleotide in a target organism, when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron. Typically, an expression construct of the present invention can provide for an approximately 2 to 70-fold increase in expression of the polynucleotide in a target organism, when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron.
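By way of illustration only, the fold increase referred to above can be computed as the ratio of the expression level measured for the expression construct to the level measured for the control expression construct. The sketch below assumes both levels are positive measurements in the same arbitrary units (for example, reporter fluorescence); the function names and the numeric values are hypothetical.

```python
def fold_increase(construct_level: float, control_level: float) -> float:
    """Ratio of construct expression to control expression (same units)."""
    if control_level <= 0:
        raise ValueError("control expression level must be positive")
    return construct_level / control_level


def meets_fold_threshold(fold: float, threshold: float = 1.25) -> bool:
    """True if the fold increase is at least the chosen threshold.

    The default of 1.25 is the smallest fold value listed in the text;
    any other listed value could be passed instead.
    """
    return fold >= threshold


# Hypothetical measurements in arbitrary fluorescence units.
fold = fold_increase(construct_level=700.0, control_level=10.0)
in_typical_range = 2.0 <= fold <= 70.0
```

A ratio is used here because the text expresses enhancement as an "N-fold increase" relative to the control; other normalizations would be equally possible.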
In certain embodiments, the methods of the present invention can involve a first organism and a target organism. The first organism and the target organism can be the same species or different species. In embodiments in which the first organism and the target organism are not the same species, the first organism and the target organism are typically from related species. For example, the first organism and the target organism can be two different plant species, preferably two different monocot or dicot plant species, more preferably two different plant species within the same taxonomic family, most preferably two different plant species within the same genus.
In other embodiments, the methods involve a first organism, a second organism, and a target organism. The first organism, the second organism, and the target organism can be the same species or two or more different species. In embodiments in which the first organism, the second organism, and the target organism are not all the same species, the first organism, the second organism, and the target organism are typically from two or more related species. For example, the first organism, the second organism, and the target organism can be three different plant species, preferably three different monocot or dicot plant species, more preferably three different plant species within the same taxonomic family, most preferably three different plant species within the same genus.
In the methods of the present invention for making an expression construct for enhancing gene expression in an organism, the expression construct comprises a promoter operably linked to a polynucleotide for transcription of the polynucleotide. Thus, a polynucleotide of the present invention comprises a transcribed region. When an expression construct of the present invention is introduced into a target organism or at least one cell thereof, the polynucleotide represents the region of the expression construct that is transcribed so as to produce an RNA molecule or transcript. It is recognized that the initial RNA molecule or transcript that is produced may be further modified in the organism or cell thereof so as to produce a mature RNA transcript. Modifications can include, for example, splicing out one or more introns including, but not limited to, the first intron.
As described above, the polynucleotide comprises the 5'-UTR, the first intron, and the translated region, and either the 5'-UTR or the translated region comprises the first intron. In embodiments of the invention in which the translated region comprises the first intron, the first intron is between the first and second exons of the translated region. In other embodiments, the 5'-UTR comprises the first intron. Preferably in these embodiments, the first intron is at or near the 3' end of the 5'-UTR. More preferably, the first intron is at the 3' end of the 5'-UTR immediately before the translational start site. It is recognized that the 3' end of the 5'-UTR is the nucleotide immediately before the first nucleotide of the start codon for translation. Typically, the start codon will be ATG. However, it is recognized that other start codons are known to be used by some organisms and that the present invention does not depend on a particular start codon.
While the present invention does not depend on a 5'-UTR of a certain size, it is recognized that non-intron sequences of 5'-UTRs are typically in the range of about 30 bp to about 200 bp, preferably about 50 to about 150 bp, although substantially larger or smaller 5'-UTRs are also encompassed by the present invention.
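The more preferred arrangement described above, in which the first intron sits at the 3' end of the 5'-UTR immediately before the start codon, can be sketched as follows. This is only an illustration under stated assumptions: sequences are plain DNA strings, the intron occurs once, and the function names and toy sequences are hypothetical.

```python
def assemble_polynucleotide(utr_non_intron: str, first_intron: str,
                            translated_region: str) -> str:
    """Join 5'-UTR non-intron sequence, first intron, and translated region.

    Placing the intron last within the 5'-UTR puts it at the 3' end of the
    5'-UTR, directly upstream of the start codon.
    """
    return utr_non_intron + first_intron + translated_region


def intron_immediately_precedes_start(polynucleotide: str, first_intron: str,
                                      start_codon: str = "ATG") -> bool:
    """True if the start codon begins at the nucleotide right after the intron."""
    intron_start = polynucleotide.find(first_intron)
    if intron_start < 0:
        return False
    intron_end = intron_start + len(first_intron)
    return polynucleotide.startswith(start_codon, intron_end)


# Toy sequences (hypothetical, far shorter than real elements).
utr = "ACACACACAC"
intron = "GTAAGTTTTCAG"
cds = "ATGGCCTAA"

preferred = assemble_polynucleotide(utr, intron, cds)
# An alternative layout with the intron in the middle of the 5'-UTR.
internal = utr[:5] + intron + utr[5:] + cds
```

The `start_codon` parameter defaults to ATG but can be changed, reflecting that the arrangement does not depend on a particular start codon.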
The expression constructs and regulatory constructs of the present invention comprise promoters and first introns that are derived from native genes. In some embodiments of the invention, the 5'-UTR or portion thereof can also be derived from a native gene. In certain embodiments, the promoters, first introns, and the 5'-UTRs can be identical to or substantially the same as the corresponding element in its native gene. It is recognized that promoters, first introns, and 5'-UTRs of the present invention that are each derived from a native gene can be modified so that their sequences are no longer identical to the corresponding sequences in the native gene. Such modifications include, for example, the addition of consensus splice sites on one or both ends of an intron, removal of cryptic splice sites, and sequence modifications that increase transcription. Generally, any such modifications will not alter constitutive expression of the promoters and the function of the first introns, but it is recognized that such modifications may enhance gene expression. In preferred embodiments of the invention, the first introns comprise consensus splice sites on both the 5' and 3' ends. In particularly preferred embodiments of the invention, the first introns comprise consensus splice sites on both the 5' and 3' ends, wherein the consensus splice sites are selected or designed so that the first intron is efficiently spliced out when present in a transcript in the organism of interest. While the present invention is not bound by a particular biological mechanism, it is recognized that a first intron that is not spliced out may be disruptive to translation when the first intron is located in the 5'-UTR, particularly when located near the 3'-end of the 5'-UTR. Moreover, first introns that are located within the translated region and that are not spliced out at all or are spliced out inefficiently can have the unintended effect of reducing or eliminating the expression of the protein of interest.
The methods of the present invention can comprise selecting a first intron and/or a promoter that is derived from a gene that is highly expressed in a constitutive manner in an organism. The selected first intron and promoter can be derived from the same gene, from different genes in the same organism, or even from different genes in different organisms. Generally, the first intron and/or a promoter can be selected from the promoters and first introns of genes that are known to be highly expressed in a constitutive manner. Such promoters and first introns and methods for identifying them are generally known in the art. See, for example: U.S. Patent Application No. 13/528,515, filed June 20, 2012; WO 2011/079197; and WO 2012/006426; all of which are herein incorporated in their entirety by reference. If desired, the methods of the present invention can further comprise identifying highly expressed constitutive genes from any organism and then selecting first introns and/or promoters from the newly identified highly expressed constitutive genes. Any method known in the art for the identification of highly expressed constitutive genes can be used in the methods disclosed herein. See, for example: U.S. Patent Application No. 13/528,515, filed June 20, 2012; WO 2011/079197; and WO 2012/006426.
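As a simplified illustration of checking intron ends, the sketch below tests only for the canonical GT (5') and AG (3') terminal dinucleotides found in most eukaryotic nuclear introns. This is an assumption made for brevity: the consensus splice sites contemplated above extend beyond these dinucleotides and can differ between organisms, and the function name is hypothetical.

```python
def has_canonical_splice_boundaries(intron: str) -> bool:
    """Check for the canonical GT...AG intron boundary dinucleotides.

    Simplified proxy only: it does not evaluate the longer consensus
    sequences around each splice site, the branch point, or cryptic sites.
    Accepts DNA or RNA spelling, in either letter case.
    """
    seq = intron.upper().replace("U", "T")
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Toy introns (hypothetical sequences).
candidate = "GTAAGTTTTCAG"
mutated_5prime = "CTAAGTTTTCAG"
```

A check of this kind could serve as a first screen before a fuller, organism-specific assessment of splicing efficiency.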
In another aspect, the present invention provides methods for making a regulatory construct. The methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism. The first intron is the first intron from the 5' end of the gene, if the gene contains more than one intron. The methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner in the same organism as the gene from which the first intron was derived or in a different organism. The methods further comprise synthesizing a regulatory construct comprising the promoter operably linked to a 5'-UTR, which comprises the first intron. Preferably, the first intron is at or near the 3' end of the 5'-UTR. Also preferably, the regulatory construct provides for enhanced expression of an operably linked gene of interest in a target organism when compared to the expression of the gene of interest in the target organism from a control regulatory construct which lacks the first intron. The methods can further comprise operably linking a gene of interest to the regulatory construct for expression of the gene of interest in a target organism.
It is recognized that a regulatory construct of the present invention is essentially the same as an expression construct of the present invention but without an operably linked translated region. Thus, it is further recognized that the descriptions herein of the various elements of, and the arrangement within, the expression constructs of the present invention are also germane to the regulatory constructs of the present invention with the exception that the regulatory constructs are not required to comprise an operably linked translated region.
The expression constructs of the present invention find use in the making of organisms or cells that express a heterologous gene in a constitutive manner and at a high level. Thus, in yet another aspect, the present invention provides methods for making an organism for expressing a heterologous gene. The methods comprise introducing into at least one cell of a target organism an expression construct of the present invention. Such an expression construct comprises a promoter operably linked to a polynucleotide, wherein:
(a) the polynucleotide comprises a 5'-UTR and a translated region,
(b) the 5'-UTR or the translated region comprises a first intron,
(c) the first intron is derived from a first gene that is highly expressed in a constitutive manner in a first organism,
(d) the first intron is the first intron from the 5' end of the first gene,
(e) the first gene is native to the first organism,
(f) the promoter is derived from the first gene or from a second gene that is highly expressed in a constitutive manner in the first organism or in a second organism, and
(g) the second gene is native to at least one of the first organism and the second organism.
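The conditions listed above can be represented, purely as an illustrative sketch, as a small record that is checked item by item. All class, field, and function names below are hypothetical; the gene and organism names are placeholders, and whether a gene is "highly expressed in a constitutive manner" is assumed to have been determined separately and is supplied here as a boolean.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceGene:
    name: str
    organism: str
    is_native: bool               # native to its organism
    is_highly_constitutive: bool  # determined separately (assumed input)


@dataclass(frozen=True)
class ConstructDesign:
    promoter_source: SourceGene   # first gene, or a second gene
    intron_source: SourceGene     # the first gene
    intron_index_from_5prime: int # 1 means the first intron from the 5' end
    intron_location: str          # "5'-UTR" or "translated region"


ALLOWED_INTRON_LOCATIONS = ("5'-UTR", "translated region")


def unmet_conditions(design: ConstructDesign) -> List[str]:
    """Return the labels of listed conditions that the design does not meet."""
    checks = {
        "(b)": design.intron_location in ALLOWED_INTRON_LOCATIONS,
        "(c)": design.intron_source.is_highly_constitutive,
        "(d)": design.intron_index_from_5prime == 1,
        "(e)": design.intron_source.is_native,
        # If the promoter comes from the first gene, (f) and (g) repeat (c) and (e).
        "(f)": design.promoter_source.is_highly_constitutive,
        "(g)": design.promoter_source.is_native,
    }
    return [label for label, ok in checks.items() if not ok]


gene_a = SourceGene("geneA", "organism 1", True, True)
gene_b = SourceGene("geneB", "organism 2", True, True)
design = ConstructDesign(gene_b, gene_a, 1, "5'-UTR")
```

Condition (a), that the polynucleotide comprises a 5'-UTR and a translated region, is taken as given in this sketch because the record describes only the promoter and intron sources.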
The methods for making an organism for expressing a heterologous gene can further comprise regenerating from the at least one cell a target organism comprising the expression construct. Preferably, the target organism or cell is capable of expressing the polynucleotide when the target organism or cell is exposed to conditions favorable for the expression of the polynucleotide for a sufficient period of time, and the polynucleotide is expressed at an increased level in the target organism or at least one cell thereof when compared to the expression of the polynucleotide in the target organism or at least one cell thereof comprising a control expression construct which lacks the first intron. In preferred embodiments of the invention, expression of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
The methods for making an organism for expressing a heterologous gene can further comprise producing additional organisms or progeny by one or more rounds of sexual or asexual reproduction and optionally selecting for progeny comprising the expression construct. Thus, the methods of the present invention are not limited to making the initial organism or the initial cell into which the expression construct was introduced but also encompass all progeny cells and organisms, however produced, that are descended from the initial organism and/or the initial cell and that comprise the expression construct.
The expression constructs of the present invention, as well as the organisms and cells of the present invention that comprise such expression constructs, find use in methods for expressing a heterologous gene in an organism. Thus, in yet another aspect, the present invention provides methods for expressing a heterologous gene in an organism. The methods involve obtaining a target organism comprising an expression construct of the present invention or at least one cell thereof and exposing the target organism or cell thereof to conditions favorable for the expression of the polynucleotide for a sufficient period of time, whereby the polynucleotide is expressed. Preferably, the polynucleotide is expressed at an increased level in the target organism or cell thereof when compared to the expression of the polynucleotide in the target organism or cell thereof comprising a control expression construct which lacks the first intron. In some embodiments, the methods further comprise producing the target organism or a progenitor thereof by introducing the expression construct into at least one cell of an organism and regenerating the at least one cell into the target organism or a progenitor thereof comprising the expression construct. In certain embodiments, the methods for expressing a heterologous gene in an organism can further comprise making the expression construct as described herein above.
The present invention additionally provides nucleic acid molecules, vectors, and expression cassettes comprising at least one of the expression constructs and/or at least one of the regulatory constructs of the present invention. Further provided are non-human organisms and non-human host cells comprising at least one of the expression constructs and/or at least one of the regulatory constructs as disclosed herein. The invention further provides expression cassettes, plants, plant parts, plant cells, seeds and host cells comprising at least one of the expression constructs and/or at least one of the regulatory constructs of the present invention.
While the expression constructs and regulatory constructs can comprise promoters, first introns, and/or 5'-UTRs that are identical in nucleotide sequence to corresponding promoters, first introns, and/or 5'-UTRs in one or more native genes, the expression constructs and regulatory constructs of the present invention are not known to be naturally occurring. The expression constructs and regulatory constructs of the present invention are recombinant nucleic acids that are not native to the genome of an organism.
The invention encompasses isolated or substantially purified nucleic acid molecule or polynucleotide compositions. An "isolated" or "purified" nucleic acid molecule or polynucleotide, or biologically active portion thereof, is substantially or essentially free from components that normally accompany or interact with the nucleic acid molecule or polynucleotide as found in its naturally occurring environment. Thus, an isolated or purified nucleic acid molecule or polynucleotide is substantially free of other cellular material or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
The invention encompasses fragments and variants of the disclosed nucleic acid molecules or polynucleotides. By "fragment" is intended a portion of the nucleic acid molecule or polynucleotide. Fragments of a polynucleotide comprising nucleic acid sequences retain biological activity of the full-length nucleic acid molecule or polynucleotide. Alternatively, fragments of a polynucleotide that are useful as hybridization probes generally do not encode proteins that retain biological activity or do not retain promoter activity. Thus, fragments of a nucleotide sequence may range from at least about 20 nucleotides, about 50 nucleotides, about 100 nucleotides, and up to the full-length polynucleotide of the invention.
A fragment of a polynucleotide of the invention may encode a biologically active portion of a polynucleotide. A biologically active portion of a polynucleotide can be prepared by isolating a portion of one of the polynucleotides of the invention that comprises the genetic regulatory element and assessing activity as described herein. Polynucleotides that are fragments of a nucleotide sequence of the present invention comprise at least 16, 20, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 800, 900, 1,000, 1,100, 1,200, 1,300, 1,400, 1,500, 1,600, 1,700, 1,800, 1,900, 2,000, 2,100, 2,200, 2,300, 2,400, 2,500, 2,600, or 2,700 contiguous nucleotides, or up to the number of nucleotides present in a full-length polynucleotide disclosed herein.
"Variants" is intended to mean substantially similar sequences. For
polynucleotides, a variant comprises a polynucleotide having deletions (i.e., truncations) at the 5' and/or 3' end; deletion and/or addition of one or more nucleotides at one or more internal sites in the reference polynucleotide; and/or substitution of one or more nucleotides at one or more sites in the reference polynucleotide. As used herein, a "reference" polynucleotide comprises a nucleotide sequence produced by the methods disclosed herein. Variant polynucleotides also include synthetically derived polynucleotides, such as those generated, for example, by using site-directed mutagenesis but which still comprise biological activity. Generally, variants of a particular polynucleotide or nucleic acid molecule of the invention will have at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to that particular polynucleotide as determined by sequence alignment programs and parameters as described elsewhere herein.
Variant polynucleotides also encompass sequences derived from a mutagenic and recombinogenic procedure such as DNA shuffling. Strategies for such DNA shuffling are known in the art. See, for example, Stemmer (1994) PNAS 91:10747-10751; Stemmer (1994) Nature 370:389-391; Crameri et al. (1997) Nature Biotech. 15:436-438; Moore et al. (1997) J. Mol. Biol. 272:336-347; Zhang et al. (1997) PNAS 94:4504-4509; Crameri et al. (1998) Nature 391:288-291; and U.S. Patent Nos. 5,605,793 and 5,837,458.
For PCR amplifications of the polynucleotides disclosed herein, oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any plant of interest. Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, New York). See also Innis et al., eds. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, eds. (1995) PCR Strategies (Academic Press, New York); and Innis and Gelfand, eds. (1999) PCR Methods Manual (Academic Press, New York). Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially-mismatched primers, and the like.
It is recognized that the polynucleotide molecules of the present invention encompass polynucleotide molecules comprising a nucleotide sequence that is sufficiently identical to one of the nucleotide sequences set forth in any one or more of SEQ ID NOS: 1-52. The term "sufficiently identical" is used herein to refer to a first nucleotide sequence that contains a sufficient or minimum number of identical or equivalent nucleotides to a second nucleotide sequence such that the first and second nucleotide sequences have a common structural domain and/or common functional activity. For example, nucleotide sequences that contain a common structural domain having at least about 85% or 90% identity, preferably 95% identity, more preferably 96%, 97%, 98%, or 99% identity are defined herein as sufficiently identical.
To determine the percent identity of two nucleic acids, the sequences are aligned for optimal comparison purposes. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., percent identity = number of identical positions/total number of positions (e.g., overlapping positions) x 100). In one embodiment, the two sequences are the same length. The percent identity between two sequences can be determined using techniques similar to those described below, with or without allowing gaps. In calculating percent identity, typically exact matches are counted.
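The percent identity formula above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps; the function name percent_identity and the count_gaps switch are hypothetical names introduced for illustration only and are not part of BLAST, ALIGN, or any other program named in this specification.

```python
def percent_identity(seq_a: str, seq_b: str, count_gaps: bool = True) -> float:
    """Percent identity = identical positions / total positions x 100.

    Both inputs must be pre-aligned strings of equal length; '-' marks a gap.
    If count_gaps is False, columns containing a gap are left out of the total.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    total = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        is_gap = a == "-" or b == "-"
        if is_gap and not count_gaps:
            continue  # ignore gapped columns entirely
        total += 1
        if not is_gap and a == b:
            identical += 1  # only exact matches are counted
    if total == 0:
        raise ValueError("no positions to compare")
    return 100.0 * identical / total
```

For two ungapped sequences of the same length, both settings of count_gaps give the same value; the choice only matters when the alignment contains gaps.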
The determination of percent identity between two sequences can be accomplished using a mathematical algorithm. A preferred, nonlimiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin and Altschul (1990) PNAS 87:2264, modified as in Karlin and Altschul (1993) PNAS 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul et al. (1990) J. Mol. Biol. 215:403. BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12, to obtain nucleotide sequences homologous to the polynucleotide molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389.
Alternatively, PSI-Blast can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra. When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov. Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller (1988) CABIOS 4:11-17. Such an algorithm is incorporated into the ALIGN program (version 2.0), which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM 120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Alignment may also be performed manually by inspection.
Unless otherwise stated, sequence identity values for pairs of sequences provided herein refer to the value obtained using the BLAST 2.0 suite of programs using default parameters (Altschul et al. (1997) Nucleic Acids Res. 25:3389-402) using the full-length sequences of the invention. Unless otherwise stated, sequence identity values for multiple sequence alignments provided herein refer to the value obtained using MUSCLE (Version 3.8) using default parameters using the full-length sequences of the invention. MUSCLE is available at http://www.drive5.com/muscle/ or http://www.ebi.ac.uk/Tools/msa/muscle/. See Edgar (2004) Nucleic Acids Res. 32(5):1792-1797; herein incorporated by reference.
The use of the terms "polynucleotide" and "nucleic acid" is not intended to limit the present invention to polynucleotides and nucleic acids comprising DNA. Those of ordinary skill in the art will recognize that polynucleotides and nucleic acids can comprise ribonucleotides and combinations of ribonucleotides and deoxyribonucleotides. Such deoxyribonucleotides and ribonucleotides include both naturally occurring molecules and synthetic analogues. The polynucleotides and nucleic acids of the invention also encompass all forms of sequences including, but not limited to, single-stranded forms, double-stranded forms, hairpins, stem-and-loop structures, and the like.
The expression constructs and regulatory constructs of the present invention can be provided in expression cassettes for expression in the plant or other organism or host cell of interest. It is recognized that the expression constructs of the present invention and expression cassettes comprising one or more of such expression constructs can be used for the expression in both human and non-human host cells including, but not limited to, host cells from plants, animals, fungi, protists, and algae. In one embodiment of the invention, the host cells are human host cells or a host cell line that is incapable of differentiating into a human being. The expression cassette can include additional 5' and 3' regulatory sequences operably linked to the expression construct or regulatory construct. "Operably linked" is intended to mean a functional linkage between two or more elements. For example, an operable linkage between one or more genetic regulatory elements and a gene of interest is a functional link between the gene of interest and the one or more genetic regulatory elements that allows for expression of the gene of interest. Operably linked elements may be contiguous or non-contiguous. When used to refer to the joining of two protein coding regions, by "operably linked" is intended that the coding regions are in the same reading frame. With respect to introns, an "operably linked intron" is an intron that is functional and splices out of a polynucleotide when in a host organism capable of splicing out such a functional intron. In the case of introns within a coding region or translated region of a gene, an "operably linked intron" is one that is functional and splices out of a coding region or translated region of an RNA without disrupting the reading frame for translation when the polynucleotide is in a host organism capable of splicing out such a functional intron. It is understood that the term "in operable linkage" as used herein has the same meaning as "operably linked".
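The requirement that an intron within a coding region splice out "without disrupting the reading frame" can be illustrated with a short sketch. It assumes the intron boundaries are already known as 0-based, end-exclusive coordinates, and it treats a reading frame as intact when the spliced sequence starts with ATG, ends with a stop codon, has a length divisible by three, and contains no internal stop codon. The function names splice_out and reading_frame_intact and the toy sequences are hypothetical and are not taken from SEQ ID NOS: 1-52.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def splice_out(sequence: str, intron_start: int, intron_end: int) -> str:
    """Remove sequence[intron_start:intron_end] (0-based, end-exclusive)."""
    if not 0 <= intron_start < intron_end <= len(sequence):
        raise ValueError("intron coordinates fall outside the sequence")
    return sequence[:intron_start] + sequence[intron_end:]


def reading_frame_intact(cds: str) -> bool:
    """True if cds reads ATG ... stop, in frame, with no internal stop codon."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


# Toy example: the intron interrupts the third codon of a four-codon reading frame.
exon1, intron, exon2 = "ATGGCTA", "GTAAGTCCCTTTCAG", "AATGA"
pre_mrna = exon1 + intron + exon2
spliced = splice_out(pre_mrna, len(exon1), len(exon1) + len(intron))
```

In this toy case, removing the full intron restores the frame ATG GCT AAA TGA, whereas leaving a single intron base behind shifts the frame and the check fails.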
The expression cassette may additionally contain at least one additional gene to be co-transformed into the organism. Alternatively, the additional gene(s) can be provided on multiple expression cassettes. Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the polynucleotide to be under the transcriptional regulation of the regulatory regions. The expression cassette may additionally contain selectable marker genes.
The expression cassette can comprise in the 5'-3' direction of transcription, a transcriptional initiation region (i.e., a promoter), a translational initiation region, nucleotide sequence to be expressed, a translational stop site, and a transcriptional termination region (i.e., termination region) functional in plants or other organism or host cell. The expression cassette further comprises a first intron either in the 5'-UTR or coding region. The regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) and/or the polynucleotide to be expressed may be native/analogous to the host cell or to each other. Alternatively, any of the regulatory regions and/or the polynucleotide to be expressed may be heterologous to the host cell or to each other. As used herein, "heterologous" in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous polynucleotide is from a species different from the species from which the polynucleotide was derived, or, if from the same/analogous species, one or both are substantially modified from their original form and/or genomic locus, or the promoter is not the native promoter for the operably linked polynucleotide. As used herein, a chimeric gene comprises a coding sequence operably linked to a transcription initiation region that is heterologous to the coding sequence.
The termination region may be native with the transcriptional initiation region, may be native with the operably linked polynucleotide of interest, may be native with the plant host, or may be derived from another source (i.e., foreign or heterologous) to the promoter, the polynucleotide of interest, the plant host, or any combination thereof. Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also Guerineau et al. (1991) Mol. Gen. Genet. 262:141-144; Proudfoot (1991) Cell 64:671-674; Sanfacon et al. (1991) Genes Dev. 5:141-149; Mogen et al. (1990) Plant Cell 2:1261-1272; Munroe et al. (1990) Gene 91:151-158; Ballas et al. (1989) Nucleic Acids Res. 17:7891-7903; and Joshi et al. (1987) Nucleic Acids Res. 15:9627-9639.
Unless stated otherwise or obvious from the context, a promoter of the present invention for gene expression in plants is capable of directing the constitutive expression of an operably linked gene of interest in a plant, a plant part, and/or a plant cell.
Where appropriate, the genes of interest may be optimized for increased expression in the transformed plant. That is, the polynucleotides can be synthesized using plant-preferred codons for improved expression. See, for example, Campbell and Gowri (1990) Plant Physiol. 92:1-11 for a discussion of host-preferred codon usage. Methods are available in the art for synthesizing plant-preferred genes. See, for example, U.S. Patent Nos. 5,380,831 and 5,436,391, and Murray et al. (1989) Nucleic Acids Res. 17:477-498, herein incorporated by reference.
Additional sequence modifications are known to enhance gene expression in a cellular host. These include elimination of sequences encoding spurious polyadenylation signals, exon-intron splice site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression. The G-C content of the sequence may be adjusted to levels average for a given cellular host, as calculated by reference to known genes expressed in the host cell. When possible, the sequence is modified to avoid predicted hairpin secondary mRNA structures.
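Two of the checks described above, G-C content and the presence of sequences that may act as spurious signals, can be sketched as follows. The helper names gc_content and find_motif are hypothetical, and the use of AATAAA as an example polyadenylation-like motif is an assumption made for illustration; the specification does not name particular motifs, and a real screen would use motifs appropriate to the host.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)  # step by one to allow overlaps
    return hits


# Assumed example motif (illustration only): a candidate polyadenylation-like signal.
CANDIDATE_POLYA_MOTIF = "AATAAA"
```

A designer could compare gc_content of a candidate coding sequence against the average for known host genes and inspect each position reported by find_motif before deciding whether a synonymous change is warranted.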
The expression cassettes may additionally contain heterologous 5' UTRs (also known as 5' leader sequences). Such 5' UTRs can act to enhance translation.
Translation leaders are known in the art and include: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5' noncoding region) (Elroy-Stein et al. (1989) PNAS USA 86:6126-6130); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) (Gallie et al. (1995) Gene 165(2):233-238), MDMV leader (Maize Dwarf Mosaic Virus) (Virology 154:9-20), and human immunoglobulin heavy-chain binding protein (BiP) (Macejak et al. (1991) Nature 353:90-94); untranslated leader from the coat protein mRNA of alfalfa mosaic virus (AMV RNA 4) (Jobling et al. (1987) Nature 325:622-625); tobacco mosaic virus leader (TMV) (Gallie et al. (1989) in Molecular Biology of RNA, ed. Cech (Liss, New York), pp. 237-256); and maize chlorotic mottle virus leader (MCMV) (Lommel et al. (1991) Virology 81:382-385). See also Della-Cioppa et al. (1987) Plant Physiol. 84:965-968.
In preparing the expression cassette, the various DNA fragments may be manipulated, so as to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this end, adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be involved.
The expression cassette can also comprise a selectable marker gene for the selection of transformed cells. Selectable marker genes are utilized for the selection of transformed cells or tissues. Marker genes include genes encoding antibiotic resistance, such as those encoding neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT), as well as genes conferring resistance to herbicidal compounds, such as glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D). Additional selectable markers include phenotypic markers such as β-galactosidase and fluorescent proteins such as green fluorescent protein (GFP) (Su et al. (2004) Biotechnol Bioeng. 85:610-9 and Fetter et al. (2004) Plant Cell 16:215-28), cyan fluorescent protein (CYP) (Bolte et al. (2004) J. Cell Science 117:943-54 and Kato et al. (2002) Plant Physiol. 129:913-42), and yellow fluorescent protein (PhiYFP™ from Evrogen, see, Bolte et al. (2004) J. Cell Science 117:943-54). For additional selectable markers, see generally, Yarranton (1992) Curr. Opin. Biotech. 3:506-511; Christopherson et al. (1992) PNAS 89:6314-6318; Yao et al. (1992) Cell 71:63-72; Reznikoff (1992) Mol. Microbiol. 6:2419-2422; Barkley et al. (1980) in The Operon, pp. 177-220; Hu et al. (1987) Cell 48:555-566; Brown et al. (1987) Cell 49:603-612; Figge et al. (1988) Cell 52:713-722; Deuschle et al. (1989) PNAS 86:5400-5404; Fuerst et al. (1989) PNAS 86:2549-2553; Deuschle et al. (1990) Science 248:480-483; Gossen (1993) Ph.D. Thesis, University of Heidelberg; Reines et al. (1993) PNAS 90:1917-1921; Labow et al. (1990) Mol. Cell. Biol. 10:3343-3356; Zambretti et al. (1992) PNAS 89:3952-3956; Baim et al. (1991) PNAS 88:5072-5076; Wyborski et al. (1991) Nucleic Acids Res. 19:4647-4653; Hillenand-Wissman (1989) Topics Mol Struc. Biol. 10:143-162; Degenkolb et al. (1991) Antimicrob. Agents Chemother. 35:1591-1595; Kleinschnidt et al. (1988) Biochemistry 27:1094-1104; Bonin (1993) Ph.D. Thesis, University of Heidelberg; Gossen et al. (1992) PNAS 89:5547-5551; Oliva et al. (1992) Antimicrob. Agents Chemother. 36:913-919; Hlavka et al. (1985) Handbook of Experimental Pharmacology, Vol. 78 (Springer-Verlag, Berlin); Gill et al. (1988) Nature 334:721-724. Such disclosures are herein incorporated by reference.
The above list of selectable marker genes is not meant to be limiting. Any selectable marker gene can be used in the present invention.
Numerous plant transformation vectors and methods for transforming plants are available. See, for example, An, G. et al. (1986) Plant Physiol. 81:301-305; Fry, J., et al. (1987) Plant Cell Rep. 6:321-325; Block, M. (1988) Theor. Appl. Genet. 76:767-774; Hinchee, et al. (1990) Stadler. Genet. Symp. 203212.203-212; Cousins, et al. (1991) Aust. J. Plant Physiol. 18:481-494; Chee, P. P. and Slightom, J. L. (1992) Gene 118:255-260; Christou, et al. (1992) Trends. Biotechnol. 10:239-246; D'Halluin, et al. (1992) Bio/Technol. 10:309-314; Dhir, et al. (1992) Plant Physiol. 99:81-88; Casas et al. (1993) PNAS 90:11212-11216; Christou, P. (1993) In Vitro Cell. Dev. Biol.-Plant 29P:119-124; Davies, et al. (1993) Plant Cell Rep. 12:180-183; Dong, J. A. and Mchughen, A. (1993) Plant Sci. 91:139-148; Franklin, C. I. and Trieu, T. N. (1993) Plant. Physiol. 102:167; Golovkin, et al. (1993) Plant Sci. 90:41-52; Guo Chin Sci. Bull. 38:2072-2078; Asano, et al. (1994) Plant Cell Rep. 13; Ayeres N. M. and Park, W. D. (1994) Crit. Rev. Plant. Sci. 13:219-239; Barcelo, et al. (1994) Plant. J. 5:583-592; Becker, et al. (1994) Plant. J. 5:299-307; Borkowska et al. (1994) Acta. Physiol Plant. 16:225-230; Christou, P. (1994) Agro. Food. Ind. Hi Tech. 5:17-27; Eapen et al. (1994) Plant Cell Rep. 13:582-586; Hartman, et al. (1994) Bio-Technology 12:919923; Ritala, et al. (1994) Plant. Mol. Biol. 24:317-325; and Wan, Y. C. and Lemaux, P. G. (1994) Plant Physiol. 104:3748.
The methods of the invention involve introducing an expression construct or regulatory construct into an organism. By "introducing" is intended presenting to the organism the expression construct in such a manner that the construct gains access to the interior of a cell of the organism. The methods of the invention do not depend on a particular method for introducing an expression construct or regulatory construct into an organism, only that the expression construct or regulatory construct gains access to the interior of at least one cell of the organism. Methods for introducing expression constructs, regulatory constructs, and other polynucleotides into various organisms such as, for example, plants, animals, fungi, protists, and bacteria are known in the art including, but not limited to, stable transformation methods, transient transformation methods, and virus-mediated methods.
By "stable transformation" is intended that the polynucleotide construct introduced into an organism integrates into a genome of the organism and is capable of being inherited by progeny thereof. By "transient transformation" is intended that a polynucleotide construct introduced into an organism does not integrate into a genome of the organism.
For the transformation of target organisms and host cells, the expression constructs and regulatory constructs of the invention are inserted using standard techniques into any vector known in the art that is suitable for expression of the nucleotide sequences in the organism or host cell. The selection of the vector depends on the preferred transformation technique and the species of target organism or host cell to be transformed.
For the transformation of plants and plant cells, the expression constructs and regulatory constructs of the invention are inserted using standard techniques into any vector known in the art that is suitable for expression of the nucleotide sequences in a plant or plant cell. The selection of the vector depends on the preferred transformation technique and the target plant species to be transformed.
Methodologies for constructing plant expression cassettes and introducing foreign nucleic acids into plants are generally known in the art and have been previously described. For example, foreign DNA can be introduced into plants using tumor-inducing (Ti) plasmid vectors. Other methods utilized for the delivery of foreign DNA or other foreign nucleic acids involve the use of PEG mediated protoplast transformation, electroporation, microinjection, whiskers, and biolistics or microprojectile bombardment for direct DNA uptake. Such methods are known in the art. (U.S. Pat. No. 5,405,765 to Vasil et al.; Bilang et al. (1991) Gene 100:247-250; Scheid et al. (1991) Mol. Gen. Genet. 228:104-112; Guerche et al. (1987) Plant Science 52:111-116; Neuhause et al. (1987) Theor. Appl Genet. 75:30-36; Klein et al. (1987) Nature 327:70-73; Howell et al. (1980) Science 208:1265; Horsch et al. (1985) Science 227:1229-1231; DeBlock et al. (1989) Plant Physiology 91:694-701; Methods for Plant Molecular Biology (Weissbach and Weissbach, eds.) Academic Press, Inc. (1988) and Methods in Plant Molecular Biology (Schuler and Zielinski, eds.) Academic Press, Inc. (1989)). The method of transformation depends upon the plant cell to be transformed, stability of vectors used, expression level of gene products and other parameters.
Other suitable methods of introducing nucleotide sequences into plant cells and subsequent insertion into the plant genome include microinjection as described by Crossway et al. (1986) Biotechniques 4:320-334, electroporation as described by Riggs et al. (1986) PNAS 83:5602-5606, Agrobacterium-mediated transformation as described by Townsend et al., U.S. Patent No. 5,563,055, Zhao et al., U.S. Patent No. 5,981,840, Yukou et al., WO 94/000977, and Hideaki et al., WO 95/06722, direct gene transfer as described by Paszkowski et al. (1984) EMBO J. 3:2717-2722, and ballistic particle acceleration as described in, for example, Sanford et al., U.S. Patent No. 4,945,050; Tomes et al., U.S. Patent No. 5,879,918; Tomes et al., U.S. Patent No. 5,886,244; Bidney et al., U.S. Patent No. 5,932,782; Tomes et al. (1995) "Direct DNA Transfer into Intact Plant Cells via Microprojectile Bombardment," in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg and Phillips (Springer-Verlag, Berlin); McCabe et al. (1988) Biotechnology 6:923-926; and Lec1 transformation (WO 00/28058). Also see Weissinger et al. (1988) Rev. Genet. 22:421-477; Sanford et al. (1987) Particulate Science and Technology 5:27-37 (onion); Christou et al. (1988) Plant Physiol. 87:671-674 (soybean); McCabe et al. (1988) Bio/Technology 6:923-926 (soybean); Finer and McMullen (1991) In Vitro Cell Dev. Biol. 27P:175-182 (soybean); Singh et al. (1998) Theor. Appl. Genet. 96:319-324 (soybean); Datta et al. (1990) Biotechnology 8:736-740 (rice); Klein et al. (1988) PNAS 85:4305-4309 (maize); Klein et al. (1988) Biotechnology 6:559-563 (maize); Tomes, U.S. Patent No. 5,240,855; Buising et al., U.S. Patent Nos. 5,322,783 and 5,324,646; Tomes et al. (1995) "Direct DNA Transfer into Intact Plant Cells via Microprojectile Bombardment," in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg (Springer-Verlag, Berlin) (maize); Klein et al. (1988) Plant Physiol. 91:440-444 (maize); Fromm et al. (1990) Biotechnology 8:833-839 (maize); Hooykaas-Van Slogteren et al. (1984) Nature (London) 311:763-764; Bowen et al., U.S. Patent No. 5,736,369 (cereals); Bytebier et al. (1987) PNAS 84:5345-5349 (Liliaceae); De Wet et al. (1985) in The Experimental Manipulation of Ovule Tissues, ed. Chapman et al. (Longman, New York), pp. 197-209 (pollen); Kaeppler et al. (1990) Plant Cell Reports 9:415-418 and Kaeppler et al. (1992) Theor. Appl. Genet. 84:560-566 (whisker-mediated transformation); D'Halluin et al. (1992) Plant Cell 4:1495-1505 (electroporation); Li et al. (1993) Plant Cell Reports 12:250-255 and Christou and Ford (1995) Annals of Botany 75:407-413 (rice); Osjoda et al. (1996) Nature Biotechnology 14:745-750 (maize via Agrobacterium tumefaciens); all of which are herein incorporated by reference.
The nucleic acid molecules, expression constructs, and regulatory constructs of the invention may be introduced into plants by contacting plants with a virus or viral nucleic acids. Generally, such methods involve incorporating a nucleic acid molecule or an expression construct of the invention within a viral DNA or RNA molecule. It is recognized that a protein of the invention may be initially synthesized as part of a viral polyprotein, which later may be processed by proteolysis in vivo or in vitro to produce the desired recombinant protein.
The cells that have been transformed may be grown into plants in accordance with conventional ways. See, for example, McCormick et al. (1986) Plant Cell Reports 5:81-84. These plants may then be grown, and either pollinated with the same transformed strain or different strains, and the resulting hybrid having constitutive expression of the desired phenotypic characteristic identified. Two or more generations may be grown to ensure that expression of the desired phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure expression of the desired phenotypic characteristic has been achieved. In this manner, the present invention provides transformed seed (also referred to as "transgenic seed") having a polynucleotide construct of the invention, for example, an expression cassette of the invention, stably incorporated into their genome.
In specific embodiments, the nucleic acid molecules, expression constructs, and regulatory constructs of the present invention can be provided to a plant or other organism using a variety of transient transformation methods. Such transient transformation methods include, but are not limited to, the introduction of the sequence or variants and fragments thereof directly into the plant or other organism or the introduction of a transcript into the plant. Such methods include, for example, microinjection, electroporation, or particle bombardment. See, for example, Crossway et al. (1986) Mol. Gen. Genet. 202:179-185; Nomura et al. (1986) Plant Sci. 44:53-58; Hepler et al. (1994) PNAS 91:2176-2180; Hush et al. (1994) The Journal of Cell Science 107:775-784; Sheen, J. (2002) A transient expression assay using maize mesophyll protoplasts, http://genetics.mgh.harvard.edu/sheenweb/; and Anderson et al., U.S. Pat. No. 7,645,919 B2, all of which are herein incorporated by reference.
Alternatively, the polynucleotide can be transiently transformed into the plant or other organism using any other technique known in the art.
The nucleic acid molecules, expression constructs, and regulatory constructs of the present invention can be used for transformation of any plant species, including, but not limited to, monocots and dicots. Examples of plant species of interest include, but are not limited to, Arabidopsis thaliana, peppers (Capsicum spp.; e.g., Capsicum annuum, C. baccatum, C. chinense, C. frutescens, C. pubescens, and the like), tomatoes (Lycopersicon esculentum), tobacco (Nicotiana tabacum), eggplant (Solanum melongena), petunia (Petunia spp., e.g., Petunia x hybrida or Petunia hybrida), corn or maize (Zea mays), Brassica ssp. (e.g., B. napus, B. rapa, B. juncea), particularly those Brassica species useful as sources of seed oil, alfalfa (Medicago sativa), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), green millet (Setaria viridis), finger millet (Eleusine coracana)), sunflower (Helianthus annuus), safflower (Carthamus tinctorius), wheat (Triticum aestivum), soybean (Glycine max), potato (Solanum tuberosum), peanuts (Arachis hypogaea), cotton (Gossypium barbadense, Gossypium hirsutum), sweet potato (Ipomoea batatus), cassava (Manihot esculenta), coffee (Coffea spp.), coconut (Cocos nucifera), pineapple (Ananas comosus), citrus trees (Citrus spp.), cocoa (Theobroma cacao), tea (Camellia sinensis), banana (Musa spp.), avocado (Persea americana), fig (Ficus casica), guava (Psidium guajava), mango (Mangifera indica), olive (Olea europaea), papaya (Carica papaya), cashew (Anacardium occidentale), macadamia (Macadamia integrifolia), almond (Prunus amygdalus), sugar beets (Beta vulgaris), sugarcane (Saccharum spp.), switchgrass (Panicum virgatum), duckweed (e.g., Lemna spp., Spirodela spp., Landoltia spp., Wolffiella spp., and Wolffia spp.), algae (e.g., Chlamydomonas reinhardtii, Botryococcus braunii, Chlorella spp., Dunaliella tertiolecta, Gracilaria spp.), oats, barley, vegetables, ornamentals, and conifers.
As used herein, the term plant includes plant cells, plant protoplasts, plant cell or tissue cultures from which plants can be regenerated, plant calli, plant clumps, and plant cells that are intact in plants or parts of plants such as embryos, pollen, ovules, seeds, leaves, flowers, branches, fruits, roots, root tips, anthers, and the like. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention, provided that these parts comprise the introduced expression constructs or polynucleotides.
Various changes in phenotype are of interest including modifying the fatty acid composition in a plant, altering the amino acid content of a plant, altering a plant's pathogen defense mechanism, and the like. These results can be achieved by providing expression of heterologous products or increased expression of endogenous products in plants. The present invention provides methods for expressing heterologous genes in organisms. A heterologous gene of the present invention can be any gene of interest that can be expressed by the methods of the present invention. Genes of interest encode proteins of interest. Thus, a translated region of the present invention can comprise a gene of interest that encodes a protein of interest.
Genes of interest are reflective of the commercial markets and interests of those involved in the development of the crop. Crops and markets of interest change, and as developing nations open up world markets, new crops and technologies will emerge also. In addition, as our understanding of agronomic traits and characteristics such as yield and heterosis increases, the choice of genes for transformation will change accordingly. General categories of genes of interest include, for example, those genes involved in information, such as zinc fingers, those involved in communication, such as kinases, and those involved in housekeeping, such as heat shock proteins. More specific categories of transgenes, for example, include genes encoding important traits for agronomics, insect resistance, disease resistance, herbicide resistance, sterility, grain characteristics, yield, abiotic stress tolerance, and commercial products. Genes of interest include, generally, those involved in oil, starch, carbohydrate, or nutrient metabolism. In addition, genes of interest include genes encoding enzymes and other proteins from plants and other sources including prokaryotes and other eukaryotes.
The following examples are offered by way of illustration and not by way of limitation.
EXAMPLE 1
Promoters and introns from Arabidopsis and rice highly expressed constitutive genes were used to make expression constructs comprising a promoter and intron from the same gene operably linked to a reporter gene and control expression constructs comprising a promoter operably linked to the reporter gene. The Arabidopsis and rice genes were previously identified as being highly expressed constitutive genes as reported in WO 2011/079197 (see also, U.S. Patent Application No. 13/528,515, filed June 20, 2012) and WO 2012/006426. The accession numbers of the genes are listed in Tables 1 and 2 along with cross-references to the sequence identifiers for the constructs in these publications.
Table 1. Gene Accessions from Arabidopsis thaliana
[Table 1 image: imgf000042_0001]
Table 2. Gene Accessions from Rice (Oryza sativa)
[Table 2 image: imgf000043_0001]
In Figures 1 and 2, intron-mediated enhancement (IME) was calculated as average expression with a +intron construct divided by average expression with the corresponding -intron construct. The constructs from Arabidopsis were tested for GFP expression in Arabidopsis thaliana by calculating the GFP index, and the rice constructs were tested for GUS expression in corn (Zea mays) by determining GUS enzymatic activity, as described in WO 2011/079197 and WO 2012/006426. IME was calculated for each of the 14 (Arabidopsis) or 10 (corn) tissues/zones/stages, and then these values were averaged for presentation in Figures 1 and 2. The presence of the first introns in the constructs enhanced expression in most cases (12 of 15 cases in Arabidopsis, 10 of 10 cases in rice), with the expression enhancement ranging from 2-70 fold in both Arabidopsis and corn (median IME in Arabidopsis was 4.1-fold, median IME in corn was 11.5-fold). It is noted that the starred Arabidopsis IME values in Figure 1 and all of the corn IME values in Figure 2 are minimal estimates for IME because there was no detectable expression in the absence of an intron in one or more of the tissues tested. In these cases, the IME value is calculated using background GFP or GUS values, respectively, for the tissues with no detectable expression in the -intron transgenics. Arabidopsis expression measurements are from the root epidermis, cortex, endodermis, and stele in each of the meristematic, elongation, and maturation zones, as well as the root cap and quiescent center (14 measurements throughout root development total) of T2 seedlings. Corn expression measurements are from V3-root, V7-root, VT-root, V3-leaf, V7-leaf, VT-leaf, VT-anther, VT-silk, 21-DAP-embryo, and 21-DAP-endosperm (10 measurements throughout plant development total) from R0 seedlings.
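The IME calculation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure actually used to generate Figures 1 and 2: "no detectable expression" is assumed here to mean that the average -intron value does not exceed the background value, and the function names ime_for_tissue and mean_ime, the tissue labels, and all numbers are hypothetical.

```python
from statistics import mean


def ime_for_tissue(plus_intron, minus_intron, background):
    """Return (IME ratio, is_minimal_estimate) for one tissue.

    IME = average +intron expression / average -intron expression.
    If the -intron average is at or below background (assumed meaning of
    'no detectable expression'), background is used as the denominator
    and the ratio is flagged as a minimal estimate.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    plus_avg = mean(plus_intron)
    minus_avg = mean(minus_intron)
    minimal = minus_avg <= background
    denominator = background if minimal else minus_avg
    return plus_avg / denominator, minimal


def mean_ime(per_tissue, background):
    """Average the per-tissue IME values; flag if any is a minimal estimate."""
    results = [ime_for_tissue(plus, minus, background)
               for plus, minus in per_tissue.values()]
    ratios = [ratio for ratio, _ in results]
    return mean(ratios), any(flag for _, flag in results)


# Hypothetical measurements: tissue -> (+intron values, -intron values).
example = {
    "root": ([40.0, 60.0], [10.0, 10.0]),
    "leaf": ([30.0], [0.5]),
}
```

With a background of 1.0, the hypothetical root value is 50/10 and the leaf value is 30/1 (a minimal estimate), so mean_ime(example, 1.0) averages 5 and 30 and reports that at least one tissue relied on background.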
IME was also determined in shoot tissue from two representative Arabidopsis promoters using quantitative PCR analysis (qRT-PCR) and Northern blot analysis. For the Northern blot analysis of GFP expression, independent transgenic lines that exhibited single locus segregation of antibiotic resistance marker expression were selected for GFP transgene copy number and expression analysis. T3 seeds from the transgenic lines, as well as wild-type (WT) Col controls, were surface sterilized, stratified and grown on 1X MS media. Shoot tissues were harvested from 2-3 week old seedlings and homogenized in liquid nitrogen by grinding with mortar and pestle. Total RNA was extracted from tissues using the RNeasy kit (Qiagen). Gel resolution, transfer and crosslinking were done with the NorthernMax kit (Ambion). Probes for GFP and the housekeeping gene ATPK1 were labeled with the Prime-A-Gene kit (Promega). Unincorporated labels were removed via Micro Bio-spin P30 Tris chromatography columns (BioRad). Following overnight hybridization, membranes were washed in 2X SSC with 0.1% SDS, dried, and screened at 100 μm using the Scan PhosphoImager. Bands were quantified via ImageQuant software.
For the quantitative PCR analysis of GFP expression, independent transgenic lines that exhibited single locus segregation of antibiotic resistance marker expression were selected for GFP transgene copy number and expression analysis. T3 seeds from the tested transgenic lines, as well as a CaMV 35S:GFP construct and wild-type (WT) Col controls, were surface sterilized, stratified and sown on 90 μm nylon mesh on 1X MS media. Shoots were harvested from pools of ~100 1-week-old seedlings per line. Shoot tissue was homogenized in liquid nitrogen by bead milling followed by passage through QIAshredder columns (Qiagen). Genomic DNA and total RNA were extracted from tissues using AllPrep DNA/RNA kits (Qiagen). cDNA was generated from total RNA using SuperScript III reverse transcriptase (Invitrogen) per manufacturer's instructions. Quantitative PCR was performed with iQ Multiplex Powermix (Bio-Rad) supplemented with the appropriate primers and probes (see below) on an iCycler iQ real-time detection system (Bio-Rad) using the following thermal-cycler program: (1) 9 min at 95°C; (2) 15 s at 94°C; (3) 30 s at 57°C; (4) 30 s at 72°C; repeat 40 cycles of steps 2-4. Amplification data recorded by the iQ software (Bio-Rad) were exported to the LinRegPCR program (Ruijter et al. (2009) Nucleic Acids Res. 37(6):e45) to determine PCR efficiency and cycle threshold values, which were used to calculate GFP transgene copy number and expression relative to the 35S:GFP control using the REST-MCS beta tool (Pfaffl et al. (2002) Nucleic Acids Res. 30(9):e36). Relative GFP expression in each tissue was calculated by normalizing the amplification of GFP in cDNA to the amplification of ubiquitin-conjugating enzyme 9 (UBC9), a "housekeeping gene", and subsequent normalization to 35S:GFP. Primers used for PCR are as follows:
ER-GFP F 5' - CGTGCAGGAGAGGACCAT;
ER-GFP R 5' - TGTCTCCCTCAAACTTGACTTCAG;
ER-GFP Probe 5' - 56-FAM/TCCCGTCGTCCTTGAAGAAG/3IABkFQ;
UBC9 F 5' - ATGGAAGCATCTGCCTCGACATCT;
UBC9 R 5' - AGGATCATCTGGGTTTGGATCCGT;
UBC9 Probe 5' - 5TEX615/AGCAGTGGAGTCCTGCTCTCACAATT/3IAbRQSp;
PDS1 F 5' - TCACGGCTCTTGTCGTTCCTTCTT;
PDS1 R 5' - TGGAGAAAGCTGACTCTGCGTCTT;
PDS1 Probe 5' - 5TEX615/TCGGTGTTAGAGCCGTTGCGATTGAA/3IAbRQSp.
56-FAM and 5TEX615 indicate the presence of 5' fluorophore modifications while 3IAbRQSp and 3IABkFQ indicate the presence of 3' quencher modifications (Integrated DNA Technologies, Coralville, Iowa USA) on the real time PCR probes.
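The two-step normalization described above (GFP normalized to UBC9, then to the 35S:GFP control) can be illustrated with an efficiency-corrected ratio of the kind described by Pfaffl. The following is only a minimal sketch under stated assumptions: the function name, argument names, and all numeric values are hypothetical, efficiencies are expressed as per-cycle amplification factors (2.0 meaning perfect doubling), and the statistical testing that the REST tool performs is not reproduced.

```python
def relative_expression(e_target, e_ref,
                        ct_target_control, ct_target_sample,
                        ct_ref_control, ct_ref_sample):
    """Efficiency-corrected expression ratio of a sample versus a control.

    e_target, e_ref: per-cycle amplification factors for the target (GFP)
        and reference (UBC9) amplicons, e.g. as estimated from raw curves.
    ct_*_control: cycle thresholds measured in the control (35S:GFP) line.
    ct_*_sample: cycle thresholds measured in the tested line.
    """
    # Fold difference of the target amplicon, sample relative to control.
    target_term = e_target ** (ct_target_control - ct_target_sample)
    # Fold difference of the reference amplicon, sample relative to control.
    ref_term = e_ref ** (ct_ref_control - ct_ref_sample)
    # Dividing by the reference term corrects for differences in input cDNA.
    return target_term / ref_term


# Hypothetical values: GFP crosses threshold two cycles later in the tested
# line than in the control, while UBC9 is unchanged.
ratio = relative_expression(2.0, 2.0, 20.0, 22.0, 18.0, 18.0)
print(f"GFP expression relative to control: {ratio:.2f}")
```

A ratio below 1 indicates lower GFP expression than the control line, and a ratio above 1 indicates higher expression.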
Expression enhancement (IME) was calculated as average expression with the (+) intron construct divided by average expression with the (-) intron construct.
Measurements are the average of shoots of 3-5 independent, homozygous, single-copy T3 lines per intron variant.
Table 3. Shoot Intron Expression Enhancement (IME) of Arabidopsis Promoters by Cognate First Introns
Figure imgf000046_0001
* Minimal estimate of IME since no expression above background was detected in the -intron variants.
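The IME calculation and the minimal-estimate convention can be sketched as follows. This is an illustration only: the function name and the example values are hypothetical, and substituting the background value for the whole (-) intron average is one assumed reading of how background values enter the calculation.

```python
from statistics import mean


def ime(plus_intron, minus_intron, background=None):
    """Return (fold_enhancement, is_minimal_estimate).

    plus_intron, minus_intron: expression measurements from independent
        lines carrying the (+) intron and (-) intron constructs.
    background: optional assay background; when the (-) intron average is
        not above it, the background is used as the denominator and the
        result is flagged as a minimal estimate.
    """
    plus_avg = mean(plus_intron)
    minus_avg = mean(minus_intron)
    is_minimal = False
    if background is not None and minus_avg <= background:
        # No expression above background without the intron: the true
        # enhancement is at least this large.
        minus_avg = background
        is_minimal = True
    return plus_avg / minus_avg, is_minimal


# Hypothetical measurements, not data from the tables.
fold, minimal = ime([8.0, 12.0], [2.0, 3.0])
print(f"IME = {fold:.1f}-fold{' *' if minimal else ''}")

fold, minimal = ime([10.0], [0.0], background=0.5)
print(f"IME = {fold:.1f}-fold{' *' if minimal else ''}")
```

The trailing asterisk in the printed output mirrors the starred minimal estimates used in the tables.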
Tables 4 and 5 demonstrate the absolute expression activity of the + intron variants when compared to well-known, high constitutive expressing control promoters. In Table 4, expression constructs with Arabidopsis promoters and cognate introns were compared to the CaMV 35S promoter for expression in Arabidopsis roots. GFP expression in Arabidopsis was measured as the GFP index as described in WO 2011/079197 and WO 2012/006426. In Table 5, expression constructs with rice promoters and cognate introns were compared to an enhanced rice actin 1 (eACT1) promoter for expression in corn. GUS expression in corn was measured from GUS activity assays as described in WO 2011/079197. These results demonstrate that IME is important for achieving expression approaching and comparable to well-known, high constitutive expressing control promoters.
Table 4. Expression of Arabidopsis + Intron Promoter Constructs in Arabidopsis Roots
Figure imgf000047_0001
* Average GFP index from 14 root tissues/zones of two independent lines per promoter. Results from a CaMV 35S-promoter control (- intron) are shown for comparison.
Table 5. Expression of Rice + Intron Promoter Constructs in Corn Plants
Figure imgf000048_0001
* Average GUS activity measured in 10 corn tissue/stages from 5-10 lines per promoter. Results from an enhanced rice actin 1 (eACT1) promoter control (+ intron) are shown for comparison.
In addition to enhancing the expression of their cognate promoters, the introns that have been identified can enhance the expression of heterologous promoters. In this example, introns were swapped between two promoters from Figure 1 and tested for expression enhancement by northern analysis of shoot tissue as described above. The result in Table 6 for the AT1G52300/AT4G37830 construct is based on 1 single-copy, homozygous line of each of the - and + intron variants. The result in Table 6 for the AT4G37830/AT1G52300 construct is based on 2 (- intron variant) and 5 (+ intron variant) single-copy, homozygous lines.
Table 6. Intron-Mediated Enhancement (IME) of Heterologous Promoters
Figure imgf000049_0001
* IME was calculated as average expression with a + intron construct divided by average expression with the corresponding (-) intron construct.
The results provided in Figures 1-2 and Tables 3-6 demonstrate that the phenomenon of intron enhancement of gene expression by a first intron is widespread and that IME contributes to the expression of most highly expressed constitutive genes in both monocot and dicot plants. All of the tested promoters are from highly expressed constitutive genes. The present inventors have only been able to recapitulate high expression with cloned promoters when the first introns were included. In contrast, the current dogma has been that there were just a few monocot introns with enhancing properties, and that intron enhancement is not important in dicots. Furthermore, the present invention demonstrates how to identify enhancing introns - by taking the first introns from genes selected for particular properties (e.g., high and uniform expression in all cell types, organs, tissues). The first introns are usually in the coding region but, as disclosed herein, the enhancing property of the first introns is modular because the first introns can be moved to the 3' end of 5'-UTRs of cloned promoters and still provide effective enhancement. This is important because the present invention demonstrates that it is not necessary to make fusion constructs comprising a first intron inserted within the translated region of a gene of interest. Instead, regulatory constructs can be prepared which comprise a promoter operably linked to a 5'-UTR which comprises a first intron preferably at or near the 3' end of the 5'-UTR. Such a construct can be operably linked to any gene of interest with relative ease without making any modification to the translated region of the gene of interest.
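The modular arrangement described above, a promoter operably linked to a 5'-UTR that carries the first intron at or near its 3' end, followed by an unmodified translated region, can be pictured as a simple sequence assembly. The sketch below is purely illustrative: the function names, the toy sequences, and the GT...AG boundary check (the canonical splice-site dinucleotides of most introns) are assumptions introduced here, not sequences or procedures from this disclosure.

```python
VALID_BASES = set("ACGT")


def check_dna(name, seq):
    """Reject empty strings and anything other than uppercase A/C/G/T."""
    if not seq or set(seq) - VALID_BASES:
        raise ValueError(f"{name} must be a non-empty uppercase A/C/G/T string")


def build_regulatory_construct(promoter, utr5, first_intron,
                               bases_from_utr_3_end=0):
    """Return promoter + 5'-UTR with the intron placed near the UTR 3' end.

    bases_from_utr_3_end: number of 5'-UTR bases left downstream of the
        intron (0 places the intron at the very 3' end of the 5'-UTR).
    """
    for name, seq in (("promoter", promoter), ("utr5", utr5),
                      ("first_intron", first_intron)):
        check_dna(name, seq)
    # Assumed sanity check: canonical introns start with GT and end with AG.
    if not (first_intron.startswith("GT") and first_intron.endswith("AG")):
        raise ValueError("first_intron lacks canonical GT...AG boundaries")
    if not 0 <= bases_from_utr_3_end <= len(utr5):
        raise ValueError("bases_from_utr_3_end is outside the 5'-UTR")
    cut = len(utr5) - bases_from_utr_3_end
    return promoter + utr5[:cut] + first_intron + utr5[cut:]


def link_gene_of_interest(regulatory_construct, coding_sequence):
    """Append an unmodified translated region to the regulatory construct."""
    check_dna("coding_sequence", coding_sequence)
    if not coding_sequence.startswith("ATG"):
        raise ValueError("coding_sequence should begin with a start codon")
    return regulatory_construct + coding_sequence


# Toy placeholder sequences only.
regulatory = build_regulatory_construct("TATAAA", "ACCACC", "GTAAGTTTTCAG", 2)
expression = link_gene_of_interest(regulatory, "ATGGCCTAA")
print(expression)
```

Because the intron sits entirely within the 5'-UTR, the coding sequence passed to `link_gene_of_interest` is appended unchanged, which is the practical advantage noted in the text.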
All publications and patent applications mentioned in the specification are indicative of the level of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims

THAT WHICH IS CLAIMED:
1. A method for making an expression construct for enhancing gene expression in an organism, said method comprising:
(a) selecting a first intron, wherein the first intron is derived from a first gene that is highly expressed in a constitutive manner in a first organism, wherein the first intron is the first intron from the 5' end of the first gene, and wherein the first gene is native to the first organism;
(b) selecting a promoter, wherein the promoter is derived from the first gene or from a second gene that is highly expressed in a constitutive manner in the first organism or in a second organism, wherein the second gene is native to at least one of the first organism and the second organism; and
(c) synthesizing an expression construct comprising the promoter operably linked to a polynucleotide, wherein the polynucleotide comprises a 5'-untranslated region (5'-UTR), a first intron, and a translated region, and wherein the 5'-UTR or the translated region comprises the first intron.
2. The method of claim 1, wherein the expression construct provides for enhanced expression of the polynucleotide in a target organism when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron.
3. The method of claim 2, wherein the expression level of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
4. The method of claim 1, wherein the first organism and the second organism are from the same species.
5. The method of claim 1, wherein the first organism and the second organism are from different species.
6. The method of claim 2, wherein the target organism is the same species as at least one of the first organism and the second organism.
7. The method of claim 1, wherein the first intron, the promoter, and the 5'-UTR are derived from the same gene.
8. The method of claim 1, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same gene.
9. The method of claim 1, wherein the first intron, the promoter, and the 5'-UTR are derived from the same organism.
10. The method of claim 1, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same organism.
11. The method of claim 1, wherein the 5'-UTR comprises the first intron.
12. The method of claim 11, wherein the first intron is at or near the 3' end of the 5'-UTR.
13. The method of claim 1, wherein the translated region comprises the first intron.
14. The method of claim 13, wherein the first intron is between the first and second exons of the translated region.
15. The method of claim 1, wherein the first organism and the second organism are eukaryotic organisms.
16. The method of claim 15, wherein the eukaryotic organisms are selected from the group consisting of plants, animals, fungi, and protists.
17. The method of claim 1, wherein the first organism and the second organism are plants.
18. The method of claim 17, wherein the plant is a dicot or a monocot.
19. The method of claim 18, wherein the dicot is selected from the group consisting of Arabidopsis, soybean, cotton, tomato, potato, papaya, alfalfa, and Brassica sp.
20. The method of claim 15, wherein the monocot is selected from the group consisting of maize, sorghum, wheat, rice, switchgrass, sugarcane, millet, duckweed, Brachypodium, banana, and barley.
21. An expression construct according to any one of claims 1-20.
22. A non-human organism or a non-human host cell comprising the expression construct of claim 21.
23. A method for making an organism for expressing a heterologous gene, said method comprising introducing into at least one cell of a target organism an expression construct comprising a promoter operably linked to a polynucleotide, wherein:
(a) the polynucleotide comprises a 5'-UTR, a first intron, and a translated region,
(b) the 5'-UTR or translated region comprises the first intron,
(c) the first intron is derived from a first gene that is highly expressed in a constitutive manner in a first organism,
(d) the first intron is the first intron from the 5' end of the first gene,
(e) the first gene is native to the first organism,
(f) the promoter is derived from the first gene or from a second gene that is highly expressed in a constitutive manner in the first organism or in a second organism, and
(g) the second gene is native to at least one of the first organism and the second organism.
24. The method of claim 23, further comprising regenerating from the at least one cell a target organism comprising the expression construct.
25. The method of claim 24, wherein the target organism is capable of expressing the polynucleotide when the target organism is exposed to conditions favorable for the expression of the polynucleotide for a sufficient period of time.
26. The method of claim 25, wherein the polynucleotide is expressed at an increased level in the target organism or at least one cell thereof when compared to the expression of the polynucleotide in the target organism or at least one cell thereof comprising a control expression construct which lacks the first intron.
27. The method of claim 25, wherein the expression level of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
28. The method of claim 23, wherein the first organism and the second organism are from the same species.
29. The method of claim 23, wherein the first organism and the second organism are from different species.
30. The method of claim 23, wherein the target organism is the same species of organism as at least one of the first organism and the second organism.
31. The method of claim 23, wherein the first intron, the promoter, and the 5'-UTR are derived from the same gene.
32. The method of claim 23, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same gene.
33. The method of claim 23, wherein the first intron, the promoter, and the 5'-UTR are derived from the same organism.
34. The method of claim 23, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same organism.
35. The method of claim 23, wherein the 5'-UTR comprises the first intron.
36. The method of claim 35, wherein the first intron is at or near the 3' end of the 5'-UTR.
37. The method of claim 23, wherein the translated region comprises the first intron.
38. The method of claim 37, wherein the first intron is between the first and second exons of the translated region.
39. The method of claim 23, wherein the first organism, the second organism, and the target are eukaryotic organisms.
40. The method of claim 39, wherein the eukaryotic organisms are selected from the group consisting of plants, animals, fungi, and protists.
41. The method of claim 23, wherein the first organism, the second organism, and the target organism are plants.
42. The method of claim 31, wherein the plant is a dicot or a monocot.
43. The method of claim 42, wherein the dicot is selected from the group consisting of Arabidopsis, soybean, cotton, tomato, potato, papaya, alfalfa, and Brassica sp.
44. The method of claim 42, wherein the monocot is selected from the group consisting of maize, sorghum, wheat, rice, switchgrass, sugarcane, millet, duckweed, Brachypodium, banana, and barley.
45. The method of claim 23, wherein the target organism is a plant.
46. A plant of claim 45 or descendant thereof that comprises the expression construct.
47. The plant or descendant of claim 46, wherein the plant is a seed.
48. A non-human organism of any one of claims 23-45 or cell or descendant thereof, wherein the non-human organism or cell or descendant thereof comprises the expression construct.
49. A method for expressing a heterologous gene in an organism, said method comprising:
(a) obtaining a target organism comprising an expression construct or at least one cell thereof, wherein the nucleic acid comprises a promoter operably linked to a polynucleotide, wherein the polynucleotide comprises a 5'-UTR, a first intron, and a translated region, wherein the 5'-UTR or translated region comprises the first intron, wherein the first intron is derived from a first gene that is highly expressed in a constitutive manner in a first organism, wherein the first intron is the first intron from the 5' end of the first gene, wherein the first gene is native to the first organism, wherein the promoter is derived from the first gene or from a second gene that is highly expressed in a constitutive manner in the first organism or in a second organism, and wherein the second gene is native to at least one of the first organism and the second organism; and
(b) exposing the target organism or cell thereof to conditions favorable for the expression of the polynucleotide for a sufficient period of time, whereby the polynucleotide is expressed.
50. The method of claim 49, wherein the polynucleotide is expressed at an increased level in the target organism or cell thereof when compared to the expression of the polynucleotide in the target organism or cell thereof comprising a control expression construct which lacks the first intron.
51. The method of claim 50, wherein the expression level of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
52. The method of claim 49, wherein the target organism or a progenitor thereof is produced by introducing the expression construct into at least one cell of an organism and regenerating the at least one cell into the target organism or a progenitor thereof comprising the expression construct.
53. The method of claim 49, wherein the first organism and the second organism are from the same species.
54. The method of claim 49, wherein the first organism and the second organism are from different species.
55. The method of claim 49, wherein the target organism is the same species of organism as at least one of the first organism and the second organism.
56. The method of claim 49, wherein the first intron, the promoter, and the 5'-UTR are derived from the same gene.
57. The method of claim 49, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same gene.
58. The method of claim 49, wherein the first intron, the promoter, and the 5'-UTR are derived from the same organism.
59. The method of claim 49, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same organism.
60. The method of claim 49, wherein the 5'-UTR comprises the first intron.
61. The method of claim 60, wherein the first intron is at or near the 3' end of the 5'-UTR.
62. The method of claim 49, wherein the translated region comprises the first intron.
63. The method of claim 62, wherein the first intron is between the first and second exons of the translated region.
64. The method of claim 49, wherein the first organism, the second organism, and the target are eukaryotic organisms.
65. The method of claim 64, wherein the eukaryotic organisms are selected from the group consisting of plants, animals, fungi, and protists.
66. The method of claim 49, wherein the first organism, the second organism, and the target organism are plants.
67. The method of claim 66, wherein the plant is a dicot or a monocot.
68. The method of claim 66, wherein the dicot is selected from the group consisting of Arabidopsis, soybean, cotton, tomato, potato, papaya, alfalfa, and Brassica sp.
69. The method of claim 66, wherein the monocot is selected from the group consisting of maize, sorghum, wheat, rice, switchgrass, sugarcane, millet, duckweed, Brachypodium, banana, and barley.
70. The method of claim 49, wherein the target organism is a plant.
71. The method of claim 70, further comprising regenerating the at least one cell into a plant comprising the expression construct.
72. A plant of claim 71 or descendant thereof, wherein the descendant comprises the expression construct.
73. The plant or descendant of claim 72, wherein the plant or descendant is a seed.
74. A non-human organism of any one of claims 49-71 or cell or descendant thereof, wherein the non-human organism or cell or descendant thereof comprises the expression construct.
75. A method for making a regulatory construct, said method comprising:
(a) selecting a first intron, wherein the first intron is derived from a first gene that is highly expressed in a constitutive manner in a first organism, wherein the first intron is the first intron from the 5' end of the first gene, and wherein the first gene is native to the first organism;
(b) selecting a promoter, wherein the promoter is derived from the first gene or from a second gene that is highly expressed in a constitutive manner in the first organism or in a second organism, wherein the second gene is native to at least one of the first organism and the second organism; and
(c) synthesizing an expression construct comprising the promoter operably linked to a polynucleotide, wherein the polynucleotide comprises a 5'-untranslated region (5'-UTR), a first intron, and a translated region, and wherein the 5'-UTR or the translated region comprises the first intron.
76. The method of claim 75, wherein the regulatory construct provides for enhanced expression of an operably linked gene of interest in a target organism when compared to the expression of the gene of interest in the target organism from a control regulatory construct which lacks the first intron.
77. The method of claim 75, wherein the first organism and the second organism are from the same species.
78. The method of claim 75, wherein the first organism and the second organism are from different species.
79. The method of claim 76, wherein the target organism is the same species as at least one of the first organism and the second organism.
80. The method of claim 75, wherein the first intron, the promoter, and the 5'-UTR are derived from the same gene.
81. The method of claim 75, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same gene.
82. The method of claim 75, wherein the first intron, the promoter, and the 5'-UTR are derived from the same organism.
83. The method of claim 75, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same organism.
85. The method of claim 75, wherein the first intron is at or near the 3' end of the 5'-UTR.
86. The method of claim 1, wherein the first organism and the second organism are eukaryotic organisms.
87. The method of claim 86, wherein the eukaryotic organisms are selected from the group consisting of plants, animals, fungi, and protists.
88. The method of claim 75, wherein the first organism and the second organism are plants.
89. The method of claim 88, wherein the plant is a dicot or a monocot.
90. The method of claim 89, wherein the dicot is selected from the group consisting of Arabidopsis, soybean, cotton, tomato, potato, papaya, alfalfa, and Brassica sp.
91. The method of claim 89, wherein the monocot is selected from the group consisting of maize, sorghum, wheat, rice, switchgrass, sugarcane, millet, duckweed, Brachypodium, banana, and barley.
92. The method of claim 75, further comprising operably linking a gene of interest to the regulatory construct.
93. A regulatory construct according to any one of claims 75-92.
94. A non-human organism or a non-human host cell comprising the regulatory construct of claim 93.
PCT/US2013/047837 2012-06-29 2013-06-26 Methods and compositions for enhancing gene expression WO2014004638A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261666318P 2012-06-29 2012-06-29
US61/666,318 2012-06-29

Publications (2)

Publication Number Publication Date
WO2014004638A2 WO2014004638A2 (en) 2014-01-03
WO2014004638A3 WO2014004638A3 (en) 2014-03-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13733532

Country of ref document: EP

Kind code of ref document: A2

122 Ep: pct application non-entry in european phase

Ref document number: 13733532

Country of ref document: EP

Kind code of ref document: A2