WO2003060092A2 - Modified fatty acid hydroxylase protein and genes - Google Patents

Publication number
WO2003060092A2
Authority
WO
WIPO (PCT)
Prior art keywords
hydroxylase
modified
plant
gene
Prior art date
Application number
PCT/US2003/000341
Other languages
French (fr)
Other versions
WO2003060092A3 (en
Inventor
John N. Shanklin
John A. Broadwater
Original Assignee
Brookhaven Science Associates, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Brookhaven Science Associates, Llc
Priority to AU2003235589A (AU2003235589A1)
Publication of WO2003060092A2
Publication of WO2003060092A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C12N9/0071 Oxidoreductases (1.) acting on paired donors with incorporation of molecular oxygen (1.14)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • C12N15/8243 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine
    • C12N15/8247 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine involving modified lipid metabolism, e.g. seed oil composition

Definitions

  • the present invention concerns the identification of modified enzymes and encoding sequences, and methods related thereto, and the use of these sequences to produce genetically modified plants for the purpose of altering the composition of plant oils, waxes and related compounds.
  • Plants have the ability to produce a diverse range of structures, including more than 20,000 different terpenoids, flavonoids, alkaloids, and fatty acids. Fatty acids have been extensively exploited for industrial uses in products such as lubricants, plasticizers, and surfactants. In fact, approximately one-third of vegetable oils produced in the world are already used for non-food purposes (Ohlrogge, J (1994) Plant Physiol. 104:821-26).
  • seed oils consist primarily of storage oil in the form of triacylglycerols, with minor contributions primarily from membrane lipids, which are predominantly phospholipids.
  • Seed oils from different species of higher plants contain a total of more than 210 naturally occurring fatty acids, which differ by the number and arrangement of double or triple bonds and various functional groups, such as hydroxyls, ketones, epoxys, cyclopentenyl or cyclopropyl groups, furans or halogens (van de Loo et al. (1993) in Lipid Metabolism in Plants (Moore, TS, Jr., ed.; CRC, Boca Raton, FL) pp 91-126).
  • (X:Y indicates that the fatty acid contains X carbon atoms and Y double bonds; Δz indicates that a double bond is positioned at the zth carbon atom from the carboxyl terminus).
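The shorthand described above can be read mechanically. The following is a minimal sketch, assuming the notation is written as `X:Y` optionally followed by `Δ` and a comma-separated list of positions (for example `18:2Δ9,12`); the class and function names are hypothetical and are not taken from the application.

```python
import re
from dataclasses import dataclass

DELTA = "\u0394"  # Greek capital delta, used to mark double-bond positions

@dataclass(frozen=True)
class FattyAcid:
    carbons: int        # X: number of carbon atoms
    double_bonds: int   # Y: number of double bonds
    positions: tuple    # z values, counted from the carboxyl terminus

# Assumed textual form: "X:Y" optionally followed by delta and positions.
_PATTERN = re.compile(r"^(\d+):(\d+)(?:" + DELTA + r"(\d+(?:,\d+)*))?$")

def parse_shorthand(text):
    """Parse an X:Y(delta z) designation into a FattyAcid record."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError("unrecognized fatty acid shorthand: %r" % text)
    carbons = int(match.group(1))
    double_bonds = int(match.group(2))
    raw_positions = match.group(3)
    positions = tuple(int(p) for p in raw_positions.split(",")) if raw_positions else ()
    # When positions are given, there should be one per double bond.
    if positions and len(positions) != double_bonds:
        raise ValueError("number of positions does not match number of double bonds")
    return FattyAcid(carbons, double_bonds, positions)
```

For instance, `parse_shorthand("18:2" + DELTA + "9,12")` yields an 18-carbon fatty acid with double bonds at carbons 9 and 12.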
  • LFAH actually retains both hydroxylase and desaturase activity, indicating that these two oxidation reactions can be catalyzed by the same enzyme. It would be useful to modify the activity of the enzyme, such that the ratio of desaturation to hydroxylation activity could be controlled. This would allow the production of a spectrum of enzymes with varying ratios of hydroxylation to desaturation; preferably, such enzymes are only slightly modified from the wild-type enzyme. Such enzymes could then be used to produce oils of specified fatty acid composition, either in vivo, as in transgenic plants, or in vitro, as in fermentation reactors.
  • the present invention is directed to the biosynthesis of hydroxylated fatty acids, such as ricinoleic acid in castor (Ricinus communis) and Lesquerella fendleri seed, and to the manipulation of the catalytic activity and reaction specificity of the enzymes which synthesize them.
  • compositions comprising a modified fatty acid hydroxylase or desaturase enzyme, such that the ratio of hydroxylase to desaturase activity of the enzyme differs from that of the wild-type enzyme, and to provide compositions comprising nucleic acids encoding the same; preferably, the enzyme is a modified Lesquerella hydroxylase enzyme. It is a further object of the present invention to provide methods of using the modified fatty acid hydroxylase to modify oils produced by transgenic organisms. It is yet a further object of the present invention to provide a yeast system in which to evaluate the effects of modified hydroxylases and other modified lipid synthetic enzymes.
  • the present invention provides a composition comprising a modified Lesquerella fatty acid hydroxylase polypeptide, comprising a non-native amino acid at position 149, at position 325, or at both positions, where of amino acids at positions 63, 105, 149, 218, 296, 323 and 325 no more than three are non-native amino acids, and where a reaction specificity of the modified hydroxylase differs from a reaction specificity of the unmodified hydroxylase.
  • the modified hydroxylase comprises a non-native amino acid at position 149.
  • the non-native amino acid at position 149 is threonine or isoleucine.
  • the modified Lesquerella fatty acid hydroxylase polypeptide is a modified Lesquerella fendleri hydroxylase.
  • the present invention provides a composition comprising a modified Lesquerella fatty acid hydroxylase polypeptide comprising an amino acid sequence shown in SEQ ID NO:1, where the amino acid sequence is modified to comprise a non-native amino acid at position 149, at position 325, or at both positions, where of amino acids at positions 63, 105, 149, 218, 296, 323 and 325 no more than three are non-native amino acids, and where a reaction specificity of the modified hydroxylase differs from a reaction specificity of the unmodified hydroxylase.
  • the present invention comprises a composition comprising a modified plant fatty acid hydroxylase polypeptide comprising a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, where of amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 no more than three are non-native amino acids, and where a reaction specificity of the modified hydroxylase differs from a reaction specificity of an unmodified hydroxylase.
  • the present invention provides a composition comprising any of the modified fatty acid hydroxylase polypeptides as described above, where a ratio of hydroxylation to desaturation activity of the modified hydroxylase is decreased relative to a ratio of hydroxylation to desaturation activity of the unmodified hydroxylase.
  • the present invention comprises a composition comprising any of the modified fatty acid hydroxylase polypeptides as described above, where a ratio of hydroxylation to desaturation activity of the modified hydroxylase is increased relative to a ratio of hydroxylation to desaturation activity of the unmodified hydroxylase.
  • the present invention also provides a composition comprising a modified fatty acid hydroxylase polypeptide, comprising a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, where of amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 no more than three are non-native amino acids, and where a reaction specificity of the modified hydroxylase differs from a reaction specificity of the unmodified hydroxylase.
  • the present invention also provides a composition comprising a modified fatty acid hydroxylase polypeptide, where the fatty acid hydroxylase polypeptide is highly homologous to SEQ ID NO:1 and where the modified fatty acid hydroxylase polypeptide comprises a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, wherein of amino acids at positions corresponding to positions 63, 105, 149, 218,
  • the present invention also provides a composition comprising a nucleic acid sequence encoding any of the modified fatty acid hydroxylases as described above.
  • the present invention also provides a composition comprising a nucleic acid sequence encoding a modified fatty acid hydroxylase polypeptide, wherein the nucleic acid sequence hybridizes to a nucleic acid sequence encoding SEQ ID NO:1, and wherein the modified fatty acid hydroxylase polypeptide comprises a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, wherein of amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 no more than three are non-native amino acids, and wherein a reaction specificity of the modified hydroxylase differs from a reaction specificity of an unmodified hydroxylase.
  • the present invention provides a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence.
  • the present invention provides an expression vector comprising the recombinant DNA.
  • the present invention also provides an organism transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence.
  • the organism is a microorganism or a plant.
  • the present invention also provides a plant transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence.
  • the plant is selected from the group consisting of soybean (Glycine max), rapeseed and canola (including Brassica napus and B. campestris), sunflower (Helianthus annuus), cotton (Gossypium hirsutum), corn (Zea mays), cocoa (Theobroma cacao), safflower (Carthamus tinctorius), oil palm (Elaeis guineensis), coconut palm (Cocos nucifera), flax (Linum usitatissimum), castor (Ricinus communis) and peanut (Arachis hypogaea).
  • the present invention also provides a plant cell transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence.
  • the present invention also provides a plant seed transformed with the recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence.
  • the present invention provides an oil obtained from the transformed plant seed.
  • the present invention also provides a yeast cell transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence.
  • the yeast is S. cerevisiae strain YPH499.
  • the present invention also provides a bacterial cell transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence.
  • the present invention provides an oil obtained from the transformed bacterial cell.
  • the present invention also provides a method of producing a modified hydroxylase in a transgenic organism, comprising providing an organism transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence, and growing the organism under conditions such that a modified hydroxylase encoded by the recombinant DNA molecule is expressed.
  • the organism is a plant.
  • the recombinant DNA molecule is integrated into a genome of the plant.
  • the present invention also provides a transgenic plant which produces the modified hydroxylase according to this method.
  • the present invention also provides a method for altering the phenotype of a plant, comprising providing an expression vector comprising a recombinant DNA molecule comprising any of the nucleic acid sequences encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence, and a plant or a plant tissue or a plant cell; and transfecting the plant or plant tissue or plant cell with the vector under conditions such that the protein is expressed in a plant obtained from the plant or plant tissue or plant cell.
  • the present invention also provides a method for evaluating fatty acid desaturation or hydroxylation activity of an enzyme, comprising transforming a yeast cell of S. cerevisiae strain YPH499 with a nucleic acid sequence encoding the enzyme under control of an inducible promoter; growing the yeast cell to a culture of cells at high density; and inducing expression of the nucleic acid at about 30 degrees centigrade, such that the desaturation or hydroxylation activity of the enzyme can be evaluated.
  • Figure 1 shows the accumulation of hydroxy fatty acids in seeds of A. thaliana FAD2-deficient plants transformed with FAD2 variants under control of the seed-specific napin promoter.
  • Each circle represents data obtained from a single plant.
  • Variants are indicated by the amino acid of AtFAD2 to be modified, followed by its position number, followed by the substituted amino acid; if more than one amino acid is modified, the amino acids are separated by a slash.
  • Variants with two or three amino acid substitutions are indicated by a shorthand designation, with the letter of the enzyme source of the substituted amino acid (for example "C" for CFAH), "M" to indicate mutant, a number to indicate the number of amino acids substituted, followed by a period and a second number to indicate the particular variant.
  • the variants are identified as follows: L4M, A. thaliana FAD2 with four substitutions from LFAH (A104G/T148N/S322A/M324I); C4M, A. thaliana FAD2 with four substitutions from CFAH (A104G/T148I/S322A/M324V); single amino acid variants of AtFAD2, with one substitution from CFAH, include A104G, T148I, S322A, and M324I (the single amino acid variant of AtFAD2, with one substitution from LFAH, is designated T148N); double amino acid variants of AtFAD2, with two substitutions from CFAH, include C2M.1 (T148I/M324V), C2M.2 (A104G/S322A), C2M.3 (A104G/M324V), C2M.4 (A104G/T148I), C2M.5 (T148I/S322A), and C2M.6 (S322A/M324V); triple amino acid variants of AtFAD2, with three substitutions from CFAH, include C3M.1 (A104G/T148I/M324V), C3M.2 (A104G/T148I/S322A), C3M.3 (A104G/S322A/M324V), and C3M.4 (T148I/S322A/M324V).
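The variant naming convention described for Figure 1 (original residue, position number, substituted residue, with multiple substitutions separated by a slash) lends itself to a small parser. The sketch below is illustrative only: the function and class names are hypothetical, and the toy sequence in the usage note is a placeholder, not the AtFAD2 sequence.

```python
import re
from typing import NamedTuple

class Substitution(NamedTuple):
    original: str      # one-letter code of the amino acid being replaced
    position: int      # 1-based position in the protein
    replacement: str   # one-letter code of the substituted amino acid

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def parse_variant(designation):
    """Split a designation such as 'T148I/M324V' into Substitution records."""
    substitutions = []
    for token in designation.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError("unrecognized substitution: %r" % token)
        substitutions.append(
            Substitution(match.group(1), int(match.group(2)), match.group(3))
        )
    return substitutions

def apply_variant(sequence, designation):
    """Return a copy of sequence with the named substitutions applied."""
    residues = list(sequence)
    for sub in parse_variant(designation):
        index = sub.position - 1  # convert 1-based position to 0-based index
        if index >= len(residues):
            raise ValueError("position %d is beyond the sequence end" % sub.position)
        if residues[index] != sub.original:
            raise ValueError("expected %s at position %d" % (sub.original, sub.position))
        residues[index] = sub.replacement
    return "".join(residues)
```

As a usage example, `parse_variant("A104G/T148I/S322A/M324V")` returns the four substitutions that make up C4M, and `apply_variant("MAT", "A2G")` applied to a three-residue placeholder returns `"MGT"`.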
  • Figure 2 shows the relationship of linoleic acid and ricinoleic acid generated by the expression of the quadruple mutants of FAD2 (L4M and C4M) in seeds of FAD2-deficient A. thaliana plants.
  • Figure 3 shows the percent linoleic and ricinoleic acid products produced in S. cerevisiae expressing LFAH, FAD2, or a FAD2 variant. Variant identifications are the same as in Figure 1.
  • Figure 4 shows the amino acid sequence of Lesquerella fatty acid hydroxylase (SEQ ID NO: 1).
  • microorganism is used in its broadest sense. It includes, but is not limited to, microscopic organisms (and taxonomically related macroscopic organisms) within the categories algae, bacteria, fungi (including lichens), protozoa, viruses, and subviral agents.
  • plant is used in its broadest sense. It includes, but is not limited to, any species of woody, ornamental or decorative, crop or cereal, fruit or vegetable plant, and photosynthetic green algae (for example, Chlamydomonas reinhardtii). It also refers to a plurality of plant cells which are largely differentiated into a structure that is present at any stage of a plant's development.
  • plant tissue includes differentiated and undifferentiated tissues of plants including those present in roots, shoots, leaves, pollen, seeds and tumors, as well as cells in culture (for example, single cells, protoplasts, embryos, callus, etc.). Plant tissue may be in planta, in organ culture, tissue culture, or cell culture.
  • plant part refers to a plant structure or a plant tissue.
  • crop or "crop plant" is used in its broadest sense. The term includes, but is not limited to, any species of plant or algae edible by humans or used as a feed for animals, or used or consumed by humans, or any plant or algae used in industry or commerce.
  • oil-producing species refers to plant species which produce and store triacylglycerol in specific organs, primarily in seeds.
  • Such species include but are not limited to soybean (Glycine max), rapeseed and canola (including Brassica napus and B. campestris), sunflower (Helianthus annuus), cotton (Gossypium hirsutum), corn (Zea mays), cocoa (Theobroma cacao), safflower (Carthamus tinctorius), oil palm (Elaeis guineensis), coconut palm (Cocos nucifera), flax (Linum usitatissimum), castor (Ricinus communis) and peanut (Arachis hypogaea).
  • the group also includes non-agronomic species which are useful in developing appropriate expression vectors such as tobacco, rapid cycling Brassica species, and Arabidopsis thaliana, and wild species which may be a source of unique fatty acids.
  • plant cell or "organelle” is used in its broadest sense.
  • the term includes but is not limited to, the endoplasmic reticulum, Golgi apparatus, trans Golgi network, plastids, sarcoplasmic reticulum, glyoxysomes, mitochondrial, chloroplast, and nuclear membranes, and the like.
  • hydroxylase refers to a monooxygenase (also mixed-function oxidase and mixed-function oxygenase), which is an oxygenase which catalyses the incorporation of one atom of molecular oxygen into a substrate molecule, the other oxygen atom being reduced to water; the reducing power needed for monooxygenase activity may be supplied, for example, by NADH.
  • fatty acid hydroxylase refers to a sequence of amino acids, such as a protein, polypeptide or peptide fragment, which demonstrates the ability to catalyze the production of a hydroxy-fatty acid from a fatty acid substrate under enzyme reactive conditions; the substrate may be free fatty acid, a fatty acid salt, fatty acyl-CoA, fatty acyl-ACP or a fatty acyl-lipid.
  • enzyme reactive conditions means any necessary conditions (for example, such factors as temperature, pH, lack of inhibiting substances) which will permit the enzyme to function.
  • the catalytic activity of the enzyme is referred to as "hydroxylation."
  • the enzyme demonstrates the ability to catalyze the production of hydroxy-oleic acid.
  • ricinoleate or “ricinoleic acid” or “hydroxy-oleate” or “hydroxy-oleic acid” refer to 12D-hydroxyoctadec-cis-9-enoic acid, and include the free acids, the ACP and CoA esters, the salts of these acids, the glycerolipid esters (particularly the triacylglycerol esters), the wax esters, and the ether derivatives of these acids.
  • the term “desaturase” refers to a monooxygenase (also mixed-function oxidase and mixed-function oxygenase), which is an oxygenase which catalyses the O2-dependent insertion of a double bond between two carbon atoms; the reducing power needed for the monooxygenase activity may be supplied, for example, by NADH.
  • fatty acid desaturase refers to a sequence of amino acids, such as a protein, polypeptide or peptide fragment, which demonstrates the ability to catalyze the production of an unsaturated bond in a fatty acid substrate under enzyme reactive conditions; the substrate may be free fatty acid, a fatty acid salt, fatty acyl-CoA, fatty acyl-ACP or a fatty acyl-lipid, and it may be an unsaturated, monounsaturated or polyunsaturated fatty acid.
  • the catalytic activity of the enzyme is referred to as "desaturation."
  • enzyme reactive conditions is meant any necessary conditions (that is, such factors as temperature, pH, lack of inhibiting substances) which will permit the enzyme to function.
  • the enzyme demonstrates the ability to catalyze the production of linoleic acid from oleic acid.
  • reaction specificity refers to the proportion of hydroxylation or desaturation activity relative to the total of hydroxylation and desaturation activity of a hydroxylase or desaturase enzyme under particular conditions.
  • the specificity can be expressed as the ratio of the two activities, preferably desaturation to hydroxylation; this can be measured, for example, as the amount of desaturated fatty acid product divided by the amount of hydroxylated fatty acid product.
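The two ways of expressing reaction specificity given above (a proportion of total activity, or a ratio of desaturated to hydroxylated product) can be written out as a short calculation. This is a minimal sketch; the function names are hypothetical, and the product amounts in the usage note are made-up illustrative numbers, not data from the application.

```python
def desaturation_to_hydroxylation_ratio(desaturated, hydroxylated):
    """Amount of desaturated product divided by amount of hydroxylated product."""
    if hydroxylated <= 0:
        raise ValueError("hydroxylated product amount must be positive")
    return desaturated / hydroxylated

def hydroxylation_fraction(desaturated, hydroxylated):
    """Hydroxylation as a proportion of total (hydroxylation + desaturation) activity."""
    total = desaturated + hydroxylated
    if total <= 0:
        raise ValueError("total product amount must be positive")
    return hydroxylated / total
```

With illustrative amounts of 30.0 units of desaturated product and 10.0 units of hydroxylated product, the ratio is 3.0 and the hydroxylation fraction is 0.25. Both amounts must be in the same units and measured under the same conditions for the comparison to be meaningful.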
  • the term "Lesquerella hydroxylase" refers to an enzyme for which the protein or nucleic acid coding sequence occurs naturally in a Lesquerella plant. However, the coding sequence may be subsequently cloned, and expressed, in different organisms.
  • native when used in reference to an amino acid in a particular position in a protein means that the amino acid exists in nature in that particular position.
  • when used in reference to a Lesquerella hydroxylase, native means that the amino acid exists in the hydroxylase as it is found in Lesquerella plant cells, or as encoded by a Lesquerella plant gene for the hydroxylase.
  • non-native when used in reference to an amino acid in a particular position in a protein means that the amino acid that exists in nature in that particular position is replaced by another amino acid; thus, a "non-native" amino acid is an amino acid other than one that exists in nature in that particular position.
  • position corresponding to position "X" when used in reference to an amino acid sequence refers to a second position in a second sequence which is the same as the first position X in a first or reference sequence as determined from an alignment of two amino acid sequences based upon sequence homology, where "X" is the position in an identified referent amino acid sequence, even though the exact position of the corresponding second position within any particular second amino acid sequence may vary, due to amino acid modifications, such as additions and deletions.
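A corresponding position can be looked up once two sequences have been aligned. The sketch below assumes the alignment has already been produced by some external tool and is supplied as two equal-length strings with `-` marking gaps; the function name and the toy sequences are hypothetical. The numbering used elsewhere in this document (for example T148 in AtFAD2 versus position 149 of the Lesquerella sequence) is consistent with this kind of offset between aligned sequences.

```python
def corresponding_position(aligned_reference, aligned_other, reference_position, gap="-"):
    """Map a 1-based position in the reference to the aligned 1-based position
    in the other sequence; return None if the other sequence has a gap there."""
    if len(aligned_reference) != len(aligned_other):
        raise ValueError("aligned sequences must have the same length")
    reference_count = 0
    other_count = 0
    for reference_char, other_char in zip(aligned_reference, aligned_other):
        if reference_char != gap:
            reference_count += 1
        if other_char != gap:
            other_count += 1
        # Stop at the alignment column holding the requested reference residue.
        if reference_char != gap and reference_count == reference_position:
            return other_count if other_char != gap else None
    raise ValueError("reference position is beyond the end of the reference sequence")
```

For the toy alignment `"MKTAGL"` against `"--TAGL"`, reference position 3 corresponds to position 1 of the second sequence, while reference position 1 has no counterpart.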
  • modified when used in reference to a hydroxylase or desaturase of the present invention refers to a hydroxylase or desaturase polypeptide comprising a non-native amino acid at positions corresponding to amino acid positions in Lesquerella hydroxylase, where the modified polypeptide comprises a non-native amino acid at a position corresponding to position 149, at a position corresponding to position 325, or at positions corresponding to both positions, where no more than three of the amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 are non-native amino acids; preferably, the reaction specificity of the modified enzyme differs from the reaction specificity of an unmodified enzyme.
  • the position corresponding to position 149 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine.
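The structural part of the "modified" definition (a non-native residue at position 149, 325, or both, with no more than three non-native residues among the seven listed positions) can be expressed as a simple check. The following is a sketch under stated assumptions: residues are supplied as dictionaries keyed by position, the names are hypothetical, and the residues used in the usage note are placeholders rather than the actual residues of SEQ ID NO:1. The check does not address the functional requirement that reaction specificity differ.

```python
KEY_POSITIONS = (63, 105, 149, 218, 296, 323, 325)
REQUIRED_POSITIONS = (149, 325)
MAX_NON_NATIVE = 3

def non_native_positions(native, candidate):
    """Return the key positions at which the candidate differs from the native enzyme."""
    return [p for p in KEY_POSITIONS if candidate[p] != native[p]]

def satisfies_modified_definition(native, candidate):
    """True if at least one of positions 149/325 is non-native and no more
    than three of the seven key positions are non-native."""
    changed = non_native_positions(native, candidate)
    has_required_change = any(p in changed for p in REQUIRED_POSITIONS)
    return has_required_change and len(changed) <= MAX_NON_NATIVE
```

For example, with placeholder native residues `{p: "A" for p in KEY_POSITIONS}`, a candidate that changes only position 149 passes, a candidate that changes only positions 63 and 105 fails, and a candidate that changes positions 63, 105, 149 and 218 fails because four key positions are non-native.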
  • protein and polypeptide refer to compounds comprising amino acids joined via peptide bonds and are used interchangeably.
  • a “protein” or “polypeptide” encoded by a gene is not limited to the amino acid sequence encoded by the gene, but includes post-translational modifications of the protein.
  • amino acid sequence is recited herein to refer to an amino acid sequence of a protein molecule; an “amino acid sequence” can be deduced from the nucleic acid sequence encoding the protein.
  • portion when used in reference to a protein (as in "a portion of a given protein") refers to fragments of that protein.
  • the fragments may range in size from four amino acid residues to the entire amino acid sequence minus one amino acid.
  • chimera when used in reference to a polypeptide refers to the expression product of two or more coding sequences obtained from different genes, that have been cloned together and that, after translation, act as a single polypeptide sequence. Chimeric polypeptides are also referred to as "hybrid" polypeptides.
  • the coding sequences include those obtained from the same or from different species of organisms.
  • fusion when used in reference to a polypeptide refers to a chimeric protein containing a protein of interest joined to an exogenous protein fragment (the fusion partner).
  • the fusion partner may serve various functions, including enhancement of solubility of the polypeptide of interest, as well as providing an "affinity tag" to allow purification of the recombinant fusion polypeptide from a host cell or from a supernatant or from both. If desired, the fusion partner may be removed from the protein of interest after or during purification.
  • homolog or “homologous” when used in reference to a polypeptide refers to a high degree of sequence identity between two polypeptides, or to a high degree of similarity between the three-dimensional structure or to a high degree of similarity between the active site and the mechanism of action.
  • a homolog has a greater than about 60 percent sequence identity, and more preferably greater than about 75 percent sequence identity, and still more preferably greater than about 90 percent sequence identity, and even more preferably greater than about 96 percent sequence identity with a reference sequence.
  • the term "substantial identity" means that two peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap weights, share at least about 80 percent sequence identity, preferably at least about 90 percent sequence identity, more preferably at least about 95 percent sequence identity or more (for example, about 99 percent sequence identity).
  • residue positions which are not identical differ by conservative amino acid substitutions.
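The identity thresholds above can be applied to a pre-computed alignment. The sketch below does not reimplement GAP or BESTFIT; it assumes two already-aligned, equal-length strings with `-` for gaps, and it assumes one particular convention for the denominator (columns where neither sequence has a gap), which is only one of several conventions in use. Function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which the residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumed convention: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared

def exceeds_threshold(identity, threshold_percent):
    """True if the identity is greater than the stated threshold (e.g. 60, 75, 90, 96)."""
    return identity > threshold_percent
```

For the toy alignment `"ACDE"` against `"ACDF"`, the identity is 75.0 percent, which exceeds the 60 percent homolog threshold but not the 90 percent one.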
  • variant and mutant when used in reference to a polypeptide refer to an amino acid sequence that differs by one or more amino acids from another, usually related polypeptide.
  • the variant may have "conservative" changes, wherein a substituted amino acid has similar structural or chemical properties.
  • conservative amino acid substitutions refers to the interchangeability of residues having similar side chains.
  • a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine.
  • Preferred conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine. More rarely, a variant may have "non-conservative" changes (for example, replacement of a glycine with a tryptophan). Similar minor variations may also include amino acid deletions or insertions (that is, additions), or both.
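The preferred conservative substitution groups listed above can be encoded as sets of standard one-letter amino acid codes and used to classify a substitution. This is an illustrative sketch with hypothetical names; it covers only the five preferred groups, not the broader side-chain families.

```python
# Preferred groups from the text, as one-letter codes:
# valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine,
# alanine-valine, asparagine-glutamine.
PREFERRED_GROUPS = (
    frozenset("VLI"),
    frozenset("FY"),
    frozenset("KR"),
    frozenset("AV"),
    frozenset("NQ"),
)

def is_preferred_conservative(original, replacement):
    """True if both residues differ and share at least one preferred group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(original in group and replacement in group for group in PREFERRED_GROUPS)
```

Under this encoding, valine to isoleucine and alanine to valine are classified as preferred conservative substitutions, while glycine to tryptophan (the non-conservative example in the text) and alanine to leucine are not.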
  • the term "gene" refers to a nucleic acid (for example, DNA or RNA) sequence that comprises coding sequences necessary for the production of an RNA, or a polypeptide or its precursor (for example, proinsulin).
  • a functional polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence as long as the desired activity or functional properties (for example, enzymatic activity, ligand binding, signal transduction, etc.) of the polypeptide are retained.
  • portion when used in reference to a gene refers to fragments of that gene. The fragments may range in size from a few nucleotides to the entire gene sequence minus one nucleotide. Thus, "a nucleotide comprising at least a portion of a gene" may comprise fragments of the gene or the entire gene.
  • the term "gene" also encompasses the coding regions of a structural gene and includes sequences located adjacent to the coding region on both the 5' and 3' ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA.
  • the sequences which are located 5' of the coding region and which are present on the mRNA are referred to as 5' non-translated sequences.
  • the sequences which are located 3' or downstream of the coding region and which are present on the mRNA are referred to as 3' non-translated sequences.
  • the term “gene” encompasses both cDNA and genomic forms of a gene.
  • a genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed "introns” or “intervening regions” or “intervening sequences.”
  • Introns are segments of a gene which are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript.
  • genomic forms of a gene may also include sequences located on both the 5' and 3' ends of the sequences which are present on the RNA transcript. These sequences are referred to as "flanking" sequences or regions (these flanking sequences are located 5' or 3' to the non-translated sequences present on the mRNA transcript).
  • the 5' flanking region may contain regulatory sequences such as promoters and enhancers which control or influence the transcription of the gene.
  • the 3' flanking region may contain sequences which direct the termination of transcription, posttranscriptional cleavage and polyadenylation.
  • heterologous when used in reference to a gene refers to a gene encoding a factor that is not in its natural environment (in other words, has been altered by the hand of man).
  • a heterologous gene includes a gene from one species introduced into another species.
  • a heterologous gene also includes a gene native to an organism that has been altered in some way (for example, mutated, added in multiple copies, linked to a non-native promoter or enhancer sequence, etc.).
  • Heterologous genes may comprise plant gene sequences that comprise cDNA forms of a plant gene; the cDNA sequences may be expressed in either a sense (to produce mRNA) or anti-sense orientation (to produce an anti-sense RNA transcript that is complementary to the mRNA transcript).
  • Heterologous genes are distinguished from endogenous plant genes in that the heterologous gene sequences are typically joined to nucleotide sequences comprising regulatory elements such as promoters that are not found naturally associated with the gene for the protein encoded by the heterologous gene or with plant gene sequences in the chromosome, or are associated with portions of the chromosome not found in nature (for example, genes expressed in loci where the gene is not normally expressed).
  • nucleotide sequence of interest refers to any nucleotide sequence (for example, RNA or DNA), the manipulation of which may be deemed desirable for any reason (for example, treat disease, confer improved qualities, etc.), by one of ordinary skill in the art.
  • nucleotide sequences include, but are not limited to, coding sequences of structural genes (for example, reporter genes, selection marker genes, oncogenes, drug resistance genes, growth factors, etc.), and non-coding regulatory sequences which do not encode an mRNA or protein product (for example, promoter sequence, polyadenylation sequence, termination sequence, enhancer sequence, etc.).
  • structural when used in reference to a gene or to a nucleotide or nucleic acid sequence refers to a gene or a nucleotide or nucleic acid sequence whose ultimate expression product is a protein (such as an enzyme or a structural protein), an rRNA, an sRNA, a tRNA, etc.
  • oligonucleotide refers to a molecule comprised of two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and usually more than ten. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide.
  • the oligonucleotide may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, or a combination thereof.
  • an oligonucleotide having a nucleotide sequence encoding a gene or "a nucleic acid sequence encoding" a specified polypeptide refers to a nucleic acid sequence comprising the coding region of a gene or in other words the nucleic acid sequence which encodes a gene product.
  • the coding region may be present in either a cDNA, genomic DNA or RNA form.
  • the oligonucleotide may be single-stranded (that is, the sense strand) or double-stranded.
  • Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc.
  • the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.
  • recombinant when made in reference to a nucleic acid molecule refers to a nucleic acid molecule which is comprised of segments of nucleic acid joined together by means of molecular biological techniques.
  • recombinant when made in reference to a protein or a polypeptide refers to a protein molecule which is expressed using a recombinant nucleic acid molecule.
  • complementarity refers to polynucleotides (in other words, a sequence of nucleotides) related by the base-pairing rules. For example, the sequence "A-G-T" is complementary to the sequence "T-C-A." Complementarity may be "partial," in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be "complete" or "total" complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods which depend upon binding between nucleic acids.
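The base-pairing rules and the distinction between partial and complete complementarity can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions: sequences are DNA only, they are compared position by position in the way the "A-G-T" / "T-C-A" example is written (strand orientation is ignored), and the names `PAIRS`, `complement`, and `complementarity` are hypothetical.

```python
# Watson-Crick pairing for DNA bases (RNA and modified bases are not handled).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, e.g. AGT -> TCA."""
    return "".join(PAIRS[base] for base in seq.upper())


def complementarity(seq_a: str, seq_b: str) -> float:
    """Fraction of positions at which seq_b pairs with seq_a.

    1.0 corresponds to complete complementarity; values between 0 and 1
    correspond to partial complementarity.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matched = sum(
        PAIRS.get(a) == b for a, b in zip(seq_a.upper(), seq_b.upper())
    )
    return matched / len(seq_a)
```

With these definitions, "AGT" and "TCA" score 1.0 (complete), whereas "AGT" and "TCG" pair at only two of three positions (partial).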
  • sequence identity refers to a measure of relatedness between two or more nucleic acids or proteins, and is given as a percentage with reference to the total comparison length. The identity calculation takes into account those nucleotide or amino acid residues that are identical and in the same relative positions in their respective larger sequences. Calculations of identity may be performed by algorithms contained within computer programs such as "GAP” (Genetics Computer Group, Madison, Wis.) and “ALIGN” (DNAStar, Madison, Wis.).
  • a partially complementary sequence that at least partially inhibits (or competes with) a completely complementary sequence from hybridizing to a target nucleic acid is referred to using the functional term "substantially homologous."
  • the inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency.
  • a substantially homologous sequence or probe will compete for and inhibit the binding (in other words, the hybridization) of a sequence which is completely homologous to a target under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (in other words, selective) interaction.
  • the absence of non-specific binding may be tested by the use of a second target which lacks even a partial degree of complementarity (for example, less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.
  • reference sequence is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA sequence given in a sequence listing or may comprise a complete gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length.
  • two polynucleotides may each (1) comprise a sequence (in other words, a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) may further comprise a sequence that is divergent between the two polynucleotides.
  • sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a "comparison window" to identify and compare local regions of sequence similarity.
  • a "comparison window", as used herein, refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (in other words, gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
  • Optimal alignment of sequences for aligning a comparison window may be conducted by the local homology algorithm of Smith and Waterman (Smith and Waterman (1981) Adv. Appl.
  • sequence identity means that two polynucleotide sequences are identical (in other words, on a nucleotide-by-nucleotide basis) over the window of comparison.
  • percentage of sequence identity is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (for example, A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (in other words, the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
  • substantially identical denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 85 percent sequence identity, preferably at least 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison window of at least 20 nucleotide positions, frequently over a window of at least 25-50 nucleotides, wherein the percentage of sequence identity is calculated by comparing the reference sequence to the polynucleotide sequence which may include deletions or additions which total 20 percent or less of the reference sequence over the window of comparison.
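The percentage calculation described above (matched positions divided by window size, multiplied by 100) can be sketched in a few lines of Python. This is a minimal sketch, not the GAP or ALIGN programs named in the text: it assumes the two sequences have already been optimally aligned to the same length, uses "-" as a hypothetical gap character, and does not enforce the 20 percent limit on gaps. The function names and default values are illustrative, with the defaults taken from the lowest thresholds stated in the text (85 percent identity, window of 20 positions).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of window positions holding the identical base.

    Both inputs must be pre-aligned to equal length; '-' marks a gap and
    never counts as a matched position.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal length")
    matched = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    # matched positions / window size * 100
    return 100.0 * matched / len(aligned_a)


def is_substantially_identical(
    aligned_a: str,
    aligned_b: str,
    threshold: float = 85.0,  # minimum percent identity from the text
    min_window: int = 20,     # minimum comparison window from the text
) -> bool:
    """Apply the identity threshold over a sufficiently long window."""
    if len(aligned_a) < min_window:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 20-nucleotide sequences that differ at two positions have 18 matched positions, giving 18 / 20 × 100 = 90 percent identity, which meets the 85 percent threshold.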
  • substantially homologous when used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone refers to any probe that can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low to high stringency as described above.
  • substantially homologous when used in reference to a single-stranded nucleic acid sequence refers to any probe that can hybridize to (that is, it is the complement of) the single-stranded nucleic acid sequence under conditions of low to high stringency as described above.
  • hybridization refers to the pairing of complementary nucleic acids.
  • Hybridization and the strength of hybridization are affected by such factors as the degree of complementarity between the nucleic acids, the stringency of the conditions involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acids. A single molecule that contains pairing of complementary nucleic acids within its structure is said to be "self-hybridized."
  • Tm refers to the "melting temperature" of a nucleic acid.
  • the melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands.
  • stringency refers to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. With “high stringency” conditions, nucleic acid base pairing will occur only between nucleic acid fragments that have a high frequency of complementary base sequences.
  • conditions of "low” stringency are often required with nucleic acids that are derived from organisms that are genetically diverse, as the frequency of complementary sequences is usually less.
  • Low stringency conditions when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5X Denhardt's reagent (50X Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)) and 100 µg/ml denatured salmon sperm DNA, followed by washing in a solution comprising 5X SSPE, 0.1% SDS at 42°C when a probe of about 500 nucleotides in length is employed.
  • Medium stringency conditions when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5X Denhardt's reagent and 100 µg/ml denatured salmon sperm DNA, followed by washing in a solution comprising 1.0X SSPE, 1.0% SDS at 42°C when a probe of about 500 nucleotides in length is employed.
  • High stringency conditions when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5X Denhardt's reagent and 100 µg/ml denatured salmon sperm DNA, followed by washing in a solution comprising 0.1X SSPE, 1.0% SDS at 42°C when a probe of about 500 nucleotides in length is employed.
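The three sets of conditions above differ only in the SDS concentration of the hybridization solution and in the composition of the wash. A compact way to see this is to hold them in a small data structure; the Python sketch below is illustrative only, and the dictionary layout, key names, and the helper `by_decreasing_wash_salt` are assumptions introduced here. All values are copied from the definitions above, and every condition uses hybridization and washing at 42°C with a probe of about 500 nucleotides.

```python
# Parameters that vary between the three named conditions (values from the text).
STRINGENCY_CONDITIONS = {
    "low": {"hyb_sds_percent": 0.1, "wash_sspe_x": 5.0, "wash_sds_percent": 0.1},
    "medium": {"hyb_sds_percent": 0.5, "wash_sspe_x": 1.0, "wash_sds_percent": 1.0},
    "high": {"hyb_sds_percent": 0.5, "wash_sspe_x": 0.1, "wash_sds_percent": 1.0},
}

# Parameters shared by all three conditions.
SHARED = {"temperature_c": 42, "hyb_sspe_x": 5.0, "probe_length_nt": 500}


def by_decreasing_wash_salt() -> list:
    """Order the named conditions from highest to lowest wash SSPE strength."""
    return sorted(
        STRINGENCY_CONDITIONS,
        key=lambda name: STRINGENCY_CONDITIONS[name]["wash_sspe_x"],
        reverse=True,
    )
```

Sorting by wash SSPE concentration reproduces the low, medium, high ordering, which reflects that in these definitions the wash salt concentration falls as the named stringency rises.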
  • in devising low stringency conditions, factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (for example, the presence or absence of formamide, dextran sulfate, polyethylene glycol) are considered, and the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions.
  • the art knows conditions that promote hybridization under conditions of high stringency (for example, increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.).
  • amplification refers to a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (in other words, replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (that is, synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of "target" specificity. Target sequences are "targets" in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out. Template specificity is achieved in most amplification techniques by the choice of enzyme.
  • Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid.
  • MDV-1 RNA is the specific template for the replicase (Kacian et al. (1972) Proc. Natl. Acad. Sci. USA, 69:3038).
  • Other nucleic acid will not be replicated by this amplification enzyme.
  • this amplification enzyme has a stringent specificity for its own promoters (Chamberlin et al. (1970) Nature: 228:227).
  • amplifiable nucleic acid refers to nucleic acids that may be amplified by any amplification method. It is contemplated that "amplifiable nucleic acid” will usually comprise "sample template.”
  • sample template refers to nucleic acid originating from a sample that is analyzed for the presence of "target” (defined below).
  • background template is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
  • primer refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (that is, in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH).
  • the primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products.
  • the primer is an oligodeoxyribonucleotide.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
  • the term "probe" refers to an oligonucleotide (in other words, a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing to another oligonucleotide of interest.
  • a probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences.
  • any probe used in the present invention will be labeled with any "reporter molecule," so that it is detectable in any detection system, including, but not limited to, enzyme (for example, ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
  • the term "target,” when used in reference to the polymerase chain reaction, refers to the region of nucleic acid bounded by the primers used for polymerase chain reaction. Thus, the "target” is sought to be sorted out from other nucleic acid sequences.
  • a “segment” is defined as a region of nucleic acid within the target sequence.
  • the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule.
  • the primers are extended with a polymerase so as to form a new pair of complementary strands.
  • the steps of denaturation, primer annealing, and polymerase extension can be repeated many times (that is, denaturation, annealing and extension constitute one "cycle"; there can be numerous "cycles") to obtain a high concentration of an amplified segment of the desired target sequence.
  • the length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter.
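The two points above, that the amplified segment length follows from the primer positions and that repeated cycles raise the concentration of that segment, can be put into numbers with a short Python sketch. It is a sketch under stated assumptions rather than part of the specification: primer positions are given as hypothetical 0-based coordinates on one strand of the template, and the copy-number estimate uses the commonly cited idealization that each cycle multiplies the product by (1 + efficiency), with perfect doubling when efficiency is 1.0. Real reactions plateau and fall short of this.

```python
def amplicon_length(forward_start: int, reverse_end: int) -> int:
    """Length of the segment bounded by the two primers.

    forward_start: 0-based index of the first template base covered by the
                   forward primer.
    reverse_end:   0-based exclusive index just past the last template base
                   covered by the reverse primer.
    """
    if reverse_end <= forward_start:
        raise ValueError("reverse primer must lie downstream of forward primer")
    return reverse_end - forward_start


def ideal_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of cycles (no plateau modeled)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles
```

For example, primers spanning template positions 100 to 600 bound a 500-nucleotide segment, and under the idealized doubling assumption a single starting copy would yield 2^30 (about 1.07 × 10^9) copies after 30 cycles.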
  • with PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (for example, hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment).
  • any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules.
  • the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
  • PCR product refers to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.
  • amplification reagents refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification except for primers, nucleic acid template, and the amplification enzyme.
  • amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).
  • Gene expression can be regulated at many stages in the process.
  • Up-regulation or "activation" refers to regulation that increases the production of gene expression products (in other words, RNA or protein), while "down-regulation" or "repression" refers to regulation that decreases production.
  • the terms “in operable combination”, “in operable order” and “operably linked” refer to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule is produced.
  • the term also refers to the linkage of amino acid sequences in such a manner so that a functional protein is produced.
  • regulatory element refers to a genetic element which controls some aspect of the expression of nucleic acid sequences.
  • a promoter is a regulatory element which facilitates the initiation of transcription of an operably linked coding region.
  • Other regulatory elements are splicing signals, polyadenylation signals, termination signals, etc.
  • Promoters and enhancers consist of short arrays of DNA sequences that interact specifically with cellular proteins involved in transcription (Maniatis, et al. (1987) Science 236: 1237). Promoter and enhancer elements have been isolated from a variety of eukaryotic sources including genes in yeast, insect, mammalian and plant cells. Promoter and enhancer elements have also been isolated from viruses and analogous control elements, such as promoters, are also found in prokaryotes. The selection of a particular promoter and enhancer depends on the cell type used to express the protein of interest.
  • promoter element refers to a DNA sequence that is located at the 5' end of (in other words, precedes) the coding region of a DNA polymer. The location of most promoters known in nature precedes the transcribed region.
  • the promoter functions as a switch, activating the expression of a gene. If the gene is activated, it is said to be transcribed, or participating in transcription. Transcription involves the synthesis of mRNA from the gene.
  • the promoter, therefore, serves as a transcriptional regulatory element and also provides a site for initiation of transcription of the gene into mRNA.
  • regulatory region refers to a gene's 5' transcribed but untranslated regions, located immediately downstream from the promoter and ending just prior to the translational start of the gene.
  • promoter region refers to the region immediately upstream of the coding region of a DNA polymer, and is typically between about 500 bp and 4 kb in length, and is preferably about 1 to 1.5 kb in length.
  • Promoters may be tissue specific or cell specific.
  • tissue specific refers to a promoter that is capable of directing selective expression of a nucleotide sequence of interest to a specific type of tissue (for example, seeds) in the relative absence of expression of the same nucleotide sequence of interest in a different type of tissue (for example, leaves).
  • Tissue specificity of a promoter may be evaluated by, for example, operably linking a reporter gene to the promoter sequence to generate a reporter construct, introducing the reporter construct into the genome of a plant such that the reporter construct is integrated into every tissue of the resulting transgenic plant, and detecting the expression of the reporter gene (for example, detecting mRNA, protein, or the activity of a protein encoded by the reporter gene) in different tissues of the transgenic plant.
  • the detection of a greater level of expression of the reporter gene in one or more tissues relative to the level of expression of the reporter gene in other tissues shows that the promoter is specific for the tissues in which greater levels of expression are detected.
  • cell type specific refers to a promoter which is capable of directing selective expression of a nucleotide sequence of interest in a specific type of cell in the relative absence of expression of the same nucleotide sequence of interest in a different type of cell within the same tissue.
  • the term "cell type specific” when applied to a promoter also means a promoter capable of promoting selective expression of a nucleotide sequence of interest in a region within a single tissue. Cell type specificity of a promoter may be assessed using methods well known in the art, for example, immunohistochemical staining.
  • tissue sections are embedded in paraffin, and paraffin sections are reacted with a primary antibody which is specific for the polypeptide product encoded by the nucleotide sequence of interest whose expression is controlled by the promoter.
  • a labeled (for example, peroxidase conjugated) secondary antibody which is specific for the primary antibody is allowed to bind to the sectioned tissue and specific binding detected (for example, with avidin biotin) by microscopy.
  • Promoters may be constitutive or inducible.
  • the term "constitutive" when made in reference to a promoter means that the promoter is capable of directing transcription of an operably linked nucleic acid sequence in the absence of a stimulus (for example, heat shock, chemicals, light, etc.).
  • constitutive promoters are capable of directing expression of a transgene in substantially any cell and any tissue.
  • Exemplary constitutive plant promoters include, but are not limited to, 35S Cauliflower Mosaic Virus (CaMV 35S; see for example, U.S. Pat. No.
  • an "inducible" promoter is one which is capable of directing a level of transcription of an operably linked nucleic acid sequence in the presence of a stimulus (for example, heat shock, chemicals, light, etc.) which is different from the level of transcription of the operably linked nucleic acid sequence in the absence of the stimulus.
  • the enhancer and/or promoter may be "endogenous” or “exogenous” or “heterologous.”
  • An “endogenous” enhancer or promoter is one that is naturally linked with a given gene in the genome.
  • An "exogenous" or "heterologous" enhancer or promoter is one that is placed in juxtaposition to a gene by means of genetic manipulation (in other words, molecular biological techniques) such that transcription of the gene is directed by the linked enhancer or promoter.
  • first and second genes can be from the same species, or from different species.
  • naturally linked or “naturally located” when used in reference to the relative positions of nucleic acid sequences means that the nucleic acid sequences exist in nature in the relative positions.
  • Splicing signals mediate the removal of introns from the primary RNA transcript and consist of a splice donor and acceptor site (Sambrook, et al. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989) pp. 16.7-16.8).
  • a commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40. Efficient expression of recombinant DNA sequences in eukaryotic cells requires expression of signals directing the efficient termination and polyadenylation of the resulting transcript.
  • Transcription termination signals are generally found downstream of the polyadenylation signal and are a few hundred nucleotides in length.
  • the term "poly(A) site” or "poly(A) sequence” as used herein denotes a DNA sequence which directs both the termination and polyadenylation of the nascent RNA transcript. Efficient polyadenylation of the recombinant transcript is desirable, as transcripts lacking a poly(A) tail are unstable and are rapidly degraded.
  • the poly(A) signal utilized in an expression vector may be "heterologous" or "endogenous." An endogenous poly(A) signal is one that is found naturally at the 3' end of the coding region of a given gene in the genome.
  • a heterologous poly(A) signal is one which has been isolated from one gene and positioned 3' to another gene.
  • a commonly used heterologous poly(A) signal is the SV40 poly(A) signal.
  • the SV40 poly(A) signal is contained on a 237 bp BamHI/BclI restriction fragment and directs both termination and polyadenylation (Sambrook, supra, at 16.6-16.7).
  • vector refers to nucleic acid molecules that transfer DNA segment(s) from one cell to another.
  • vehicle is sometimes used interchangeably with “vector.”
  • expression vector or "expression cassette" refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.
  • transfection refers to the introduction of foreign DNA into cells. Transfection may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, glass beads, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, viral infection, biolistics (that is, particle bombardment) and the like.
  • stable transfection or "stably transfected" refers to the introduction and integration of foreign DNA into the genome of the transfected cell.
  • stable transfectant refers to a cell that has stably integrated foreign DNA into the genomic DNA.
  • transient transfection or "transiently transfected" refers to the introduction of foreign DNA into a cell where the foreign DNA fails to integrate into the genome of the transfected cell.
  • the foreign DNA persists in the nucleus of the transfected cell for several days. During this time the foreign DNA is subject to the regulatory controls that govern the expression of endogenous genes in the chromosomes.
  • transient transfectant refers to cells that have taken up foreign DNA but have failed to integrate this DNA.
  • calcium phosphate co-precipitation refers to a technique for the introduction of nucleic acids into a cell.
  • the uptake of nucleic acids by cells is enhanced when the nucleic acid is presented as a calcium phosphate-nucleic acid co-precipitate (Graham and van der Eb, 1973, Virol., 52:456).
  • the original technique of Graham and van der Eb has been modified by several groups to optimize conditions for particular types of cells. The art is well aware of these numerous modifications.
  • infectious and "infection" when used with a bacterium refer to co-incubation of a target biological sample (for example, cell, tissue, etc.) with the bacterium under conditions such that nucleic acid sequences contained within the bacterium are introduced into one or more cells of the target biological sample.
  • Agrobacterium refers to a soil-borne, Gram-negative, rod-shaped phytopathogenic bacterium which causes crown gall.
  • Agrobacterium includes, but is not limited to, the strains Agrobacterium tumefaciens (which typically causes crown gall in infected plants) and Agrobacterium rhizogenes (which causes hairy root disease in infected host plants). Infection of a plant cell with Agrobacterium generally results in the production of opines (for example, nopaline, agropine, octopine, etc.) by the infected cell.
  • Agrobacterium strains which cause production of nopaline are referred to as "nopaline-type" Agrobacteria.
  • Agrobacterium strains which cause production of octopine (for example, strain LBA4404, Ach5, B6) are referred to as "octopine-type" Agrobacteria.
  • Agrobacterium strains which cause production of agropine (for example, strain EHA105, EHA101, A281) are referred to as "agropine-type" Agrobacteria.
  • biolistic bombardment refers to the process of accelerating particles towards a target biological sample (for example, cell, tissue, etc.) to effect wounding of the cell membrane of a cell in the target biological sample and/or entry of the particles into the target biological sample.
  • Methods for biolistic bombardment are known in the art (for example, U.S. Patent No. 5,584,807, the contents of which are incorporated herein by reference), and are commercially available (for example, the helium gas-driven microprojectile accelerator (PDS-1000/He, BioRad)).
  • microwounding when made in reference to plant tissue refers to the introduction of microscopic wounds in that tissue. Microwounding may be achieved by, for example, particle bombardment as described herein.
  • transgene refers to a foreign gene that is placed into an organism by the process of transfection.
  • foreign gene refers to any nucleic acid (for example, gene sequence) that is introduced into the genome of an organism by experimental manipulations and may include gene sequences found in that organism so long as the introduced gene does not reside in the same location as does the naturally-occurring gene.
  • transgenic when used in reference to a plant or fruit or seed (that is, a "transgenic plant" or "transgenic fruit" or a "transgenic seed") refers to a plant or fruit or seed that contains at least one heterologous or foreign gene in one or more of its cells.
  • transgenic plant material refers broadly to a plant, a plant structure, a plant tissue, a plant seed or a plant cell that contains at least one heterologous gene in one or more of its cells.
  • host cell refers to any cell capable of replicating and/or transcribing and/or translating a heterologous gene.
  • a “host cell” refers to any eukaryotic or prokaryotic cell (for example, bacterial cells such as E. coli, yeast cells, mammalian cells, avian cells, amphibian cells, plant cells, fish cells, and insect cells), whether located in vitro or in vivo.
  • host cells may be located in a transgenic animal.
  • transformants or “transformed cells” include the primary transformed cell or tissue, cultures derived from that cell without regard to the number of transfers, and progeny derived from the transformed cell or tissue, such as a transgenic plant or bacteria. All progeny may not be precisely identical in DNA content, due to deliberate or inadvertent mutations. Mutant progeny that have the same functionality as screened for in the originally transformed cell are included in the definition of transformants.
  • selectable marker refers to a gene which encodes an enzyme having an activity that confers resistance to an antibiotic or drug upon the cell in which the selectable marker is expressed, or which confers expression of a trait which can be detected (for example, luminescence or fluorescence).
  • Selectable markers may be "positive" or "negative." Examples of positive selectable markers include the neomycin phosphotransferase (NPTII) gene, which confers resistance to G418 and to kanamycin, and the bacterial hygromycin phosphotransferase gene (hyg), which confers resistance to the antibiotic hygromycin.
  • Negative selectable markers encode an enzymatic activity whose expression is cytotoxic to the cell when grown in an appropriate selective medium.
  • the HSV-tk gene is commonly used as a negative selectable marker. Expression of the HSV-tk gene in cells grown in the presence of gancyclovir or acyclovir is cytotoxic; thus, growth of cells in selective medium containing gancyclovir or acyclovir selects against cells capable of expressing a functional HSV TK enzyme.
  • reporter gene refers to a gene encoding a protein that may be assayed.
  • reporter genes include, but are not limited to, luciferase (See, for example, deWet et al. (1987) Mol. Cell. Biol. 7:725 and U.S. Pat. Nos. 6,074,859; 5,976,796; 5,674,713; and 5,618,682; all of which are incorporated herein by reference), green fluorescent protein (for example, GenBank Accession Number U43284; a number of GFP variants are commercially available from CLONTECH Laboratories, Palo Alto, CA), chloramphenicol acetyltransferase, β-galactosidase, alkaline phosphatase, and horseradish peroxidase.
  • wild-type when made in reference to a gene refers to a gene which has the characteristics of a gene isolated from a naturally occurring source.
  • wild-type when made in reference to a gene product refers to a gene product which has the characteristics of a gene product isolated from a naturally occurring source.
  • naturally-occurring as applied to an object refers to the fact that an object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally-occurring.
  • a wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the "normal" or "wild-type" form of the gene.
  • the term “modified” or “mutant” when made in reference to a gene or to a gene product refers, respectively, to a gene or to a gene product which displays modifications in sequence and/or functional properties (in other words, altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
  • antisense refers to a deoxyribonucleotide sequence whose sequence of deoxyribonucleotide residues is in reverse 5' to 3' orientation in relation to the sequence of deoxyribonucleotide residues in a sense strand of a DNA duplex.
  • a "sense strand" of a DNA duplex refers to a strand in a DNA duplex which is transcribed by a cell in its natural state into a "sense mRNA."
  • an "antisense" sequence is a sequence having the same sequence as the non-coding strand in a DNA duplex.
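The relationship between a sense strand and an antisense sequence described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example sequence are hypothetical and do not come from the disclosure. The function returns the reverse complement of a sense-strand DNA sequence, which is the sequence of the non-coding strand read 5' to 3'.

```python
# Hypothetical helper; the name and the example sequence are illustrative only.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_of(sense_5_to_3: str) -> str:
    """Return the non-coding (antisense) strand of a DNA duplex, written 5' to 3'."""
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return sense_5_to_3.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense_of("ATGGCTTGA"))  # TCAAGCCAT
```

Applying the function twice returns the original sense sequence, which is a quick sanity check on any sequence handled this way.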
  • antisense RNA refers to a RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene by interfering with the processing, transport and/or translation of its primary transcript or mRNA.
  • the complementarity of an antisense RNA may be with any part of the specific gene transcript, in other words, at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence.
  • antisense RNA may contain regions of ribozyme sequences that increase the efficacy of antisense RNA to block gene expression.
  • Ribozyme refers to a catalytic RNA and includes sequence-specific endoribonucleases.
  • Antisense inhibition refers to the production of antisense RNA transcripts capable of preventing the expression of the target protein.
  • overexpression generally refers to the production of a gene product in transgenic organisms that exceeds levels of production in normal or non-transformed organisms.
  • cosuppression refers to the expression of a foreign gene which has substantial homology to an endogenous gene resulting in the suppression of expression of both the foreign and the endogenous gene.
  • altered levels refers to the production of gene product(s) in transgenic organisms in amounts or proportions that differ from that of normal or non-transformed organisms.
  • overexpression and "overexpressing" and grammatical equivalents are used specifically in reference to levels of mRNA to indicate a level of expression approximately 3-fold higher than that typically observed in a given tissue in a control or non-transgenic animal.
  • Levels of mRNA are measured using any of a number of techniques known to those skilled in the art including, but not limited to Northern blot analysis (See, Example 10, for a protocol for performing Northern blot analysis).
  • differences in the amount of RNA loaded from each tissue analyzed can be controlled for (for example, the amount of 28S rRNA, an abundant RNA transcript present at essentially the same amount in all tissues, present in each sample can be used as a means of normalizing or standardizing the RAD50 mRNA-specific signal observed on Northern blots).
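The normalization and the approximately 3-fold criterion described above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Lane` tuple, the function names, and in particular the use of 3.0 as a hard cutoff, since the text only says "approximately 3-fold". Band intensities would come from whatever densitometry method is used; none are supplied by the source.

```python
from typing import Tuple

# (target mRNA band signal, 28S rRNA band signal) for one lane; hypothetical layout.
Lane = Tuple[float, float]


def normalized_signal(lane: Lane) -> float:
    """Divide the target signal by the 28S rRNA loading-control signal."""
    target, rrna_28s = lane
    if rrna_28s <= 0:
        raise ValueError("28S rRNA signal must be positive")
    return target / rrna_28s


def fold_over_control(sample: Lane, control: Lane) -> float:
    """Normalized sample signal relative to the normalized control signal."""
    control_level = normalized_signal(control)
    if control_level <= 0:
        raise ValueError("control target signal must be positive")
    return normalized_signal(sample) / control_level


def is_overexpressed(sample: Lane, control: Lane, threshold: float = 3.0) -> bool:
    # Assumption: "approximately 3-fold" is treated as a hard cutoff of 3.0.
    return fold_over_control(sample, control) >= threshold
```

Because each lane is divided by its own 28S rRNA signal first, a lane loaded with twice as much total RNA does not by itself appear as a two-fold increase in expression.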
  • Southern blot analysis and "Southern blot" and "Southern" refer to the analysis of DNA on agarose or acrylamide gels in which DNA is separated or fragmented according to size followed by transfer of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane.
  • the immobilized DNA is then exposed to a labeled probe to detect DNA species complementary to the probe used.
  • the DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support.
  • Southern blots are a standard tool of molecular biologists (J. Sambrook et al, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY (1989), pp 9.31-9.58).
  • Northern blot analysis and "Northern blot" and "Northern" refer to the analysis of RNA by electrophoresis of RNA on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized RNA is then probed with a labeled probe to detect RNA species complementary to the probe used.
  • Northern blots are a standard tool of molecular biologists (J. Sambrook, et al. supra, pp 7.39-7.52).
  • Western blot analysis and "Western blot" and "Western" refer to the analysis of protein(s) (or polypeptides) immobilized onto a support such as nitrocellulose or a membrane.
  • a mixture comprising at least one protein is first separated on an acrylamide gel, and the separated proteins are then transferred from the gel to a solid support, such as nitrocellulose or a nylon membrane.
  • the immobilized proteins are exposed to at least one antibody with reactivity against at least one antigen of interest.
  • the bound antibodies may be detected by various methods, including the use of radiolabeled antibodies.
  • antigenic determinant refers to that portion of an antigen that makes contact with a particular antibody (in other words, an epitope).
  • an antigenic determinant may compete with the intact antigen (in other words, the "immunogen" used to elicit the immune response) for binding to an antibody.
  • isolated when used in relation to a nucleic acid, as in “an isolated oligonucleotide” refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids, such as DNA and RNA, are found in the state they exist in nature.
  • a given DNA sequence for example, a gene
  • RNA sequences such as a specific mRNA sequence encoding a specific protein
  • isolated nucleic acid encoding a particular protein includes, by way of example, such nucleic acid in cells ordinarily expressing the protein, where the nucleic acid is in a chromosomal location different from that of natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature.
  • the isolated nucleic acid or oligonucleotide may be present in single-stranded or double-stranded form.
  • When an isolated nucleic acid or oligonucleotide is to be utilized to express a protein, the oligonucleotide will contain at a minimum the sense or coding strand (that is, the oligonucleotide may be single-stranded), but may contain both the sense and antisense strands (that is, the oligonucleotide may be double-stranded).
  • purified refers to molecules, either nucleic or amino acid sequences, that are removed from their natural environment, isolated or separated.
  • An "isolated nucleic acid sequence" is therefore a purified nucleic acid sequence.
  • substantially purified molecules are at least 60% free, preferably at least 75% free, and more preferably at least 90% free from other components with which they are naturally associated.
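The three purity thresholds stated above (at least 60%, 75%, and 90% free from naturally associated components) can be written as a small classification function. This is a sketch only; the function name and the returned labels are hypothetical conveniences, and how the percentage itself is measured is outside its scope.

```python
def purity_tier(percent_free: float) -> str:
    """Classify a preparation by how free it is of naturally associated components."""
    if not 0.0 <= percent_free <= 100.0:
        raise ValueError("percent_free must be between 0 and 100")
    if percent_free >= 90.0:
        return "more preferred"          # at least 90% free
    if percent_free >= 75.0:
        return "preferred"               # at least 75% free
    if percent_free >= 60.0:
        return "substantially purified"  # at least 60% free
    return "not substantially purified"
```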
  • purified or “to purify” also refer to the removal of contaminants from a sample. The removal of contaminating proteins results in an increase in the percent of polypeptide of interest in the sample.
  • recombinant polypeptides are expressed in plant, bacterial, yeast, or mammalian host cells and the polypeptides are purified by the removal of host cell proteins; the percent of recombinant polypeptides is thereby increased in the sample.
  • sample is used in its broadest sense. In one sense it can refer to a plant cell or tissue. In another sense, it is meant to include a specimen or culture obtained from any source, as well as biological and environmental samples. Biological samples may be obtained from plants or animals (including humans) and encompass fluids, solids, tissues, and gases. Environmental samples include environmental material such as surface matter, soil, water, and industrial samples. These examples are not to be construed as limiting the sample types applicable to the present invention.
  • hydroxylase enzymes from castor, CFAH (van de Loo, F. J. et al. (1995) Proc. Natl. Acad. Sci. USA 92(15), 6743-6747), and Lesquerella, LFAH (Broun, P. et al. (1997) Plant J. 13, 201-210), are closely related to the common plant oleate desaturase enzyme, FAD2, which converts oleate (18:1Δ9) into linoleate (18:2Δ9,12). Indeed, LFAH actually retains both hydroxylase and desaturase activity, indicating that these two oxidation reactions can be catalyzed by the same enzyme.
  • AtFAD2 containing either all seven or the same four residues previously identified as essential to catalysis from the castor hydroxylase (at positions 104, 148, 322, and 324 in AtFAD2; the corresponding residues from CFAH were incorporated into these AtFAD2 positions) generated an enzyme which produced levels of hydroxy fatty acids similar to those previously observed upon expression of the wild type CFAH in Arabidopsis plants deficient in desaturase activity.
  • these modified AtFAD2 enzymes produced roughly equal amounts of linoleic acid and ricinoleic acid in the transgenic plants, which indicates that the modified enzymes displayed both desaturase and hydroxylase activities in about equal amounts.
  • the CFAH residues at positions 148 and 324 in AtFAD2 resulted in the production of the highest amounts of hydroxy fatty acids; however, the dramatic effect of the CFAH residue at position 324 was not additive among the two and three introduced residues, whereas the presence of the CFAH residue at position 148 resulted in even higher amounts of hydroxy fatty acids.
  • the castor hydroxylase is a more specific hydroxylase than is the Lesquerella hydroxylase, and the introduction of the four CFAH residues into AtFAD2 resulted in a more specific hydroxylase than did the introduction of the same four residues from LFAH, where specificity is determined by the relative ratio of desaturation/hydroxylation activity, as determined by the relative amounts of fatty acid products in transgenic plants. This implied that at least some of the specificity determinants of the castor hydroxylase were contained within the four CFAH residues. Only two of these residues (148 and 324) are different between LFAH (Asn-148 and Ile-324) and CFAH (Ile-148 and Val-324).
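The specificity measure described above, the ratio of desaturation to hydroxylation activity inferred from fatty acid product amounts, can be sketched as follows. The function name is hypothetical, and treating a single desaturation product (for example linoleic acid) and a single hydroxylation product (for example ricinoleic acid) as proxies for the two activities is a simplifying assumption, not a procedure taken from the source. Any numbers passed in are the caller's own measurements; none are supplied here.

```python
import math


def desaturation_to_hydroxylation_ratio(desaturation_product: float,
                                        hydroxylation_product: float) -> float:
    """Ratio of desaturation product to hydroxylation product.

    Both arguments must use the same units (for example, percent of total
    fatty acids). A lower ratio indicates a more specific hydroxylase.
    """
    if desaturation_product < 0 or hydroxylation_product < 0:
        raise ValueError("product amounts cannot be negative")
    if desaturation_product == 0 and hydroxylation_product == 0:
        raise ValueError("no product detected; ratio is undefined")
    if hydroxylation_product == 0:
        return math.inf  # pure desaturase behaviour
    return desaturation_product / hydroxylation_product
```

An enzyme producing roughly equal amounts of both products gives a ratio near 1, which matches the description of the modified AtFAD2 enzymes displaying both activities in about equal amounts.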
  • the present invention provides compositions comprising a modified Lesquerella hydroxylase polypeptide or a coding sequence for the modified Lesquerella hydroxylase polypeptide.
  • the present invention provides a modified Lesquerella hydroxylase polypeptide comprising a non-native amino acid at position 149, at position 325, or at both positions, where no more than three of the amino acids at positions 63, 105, 149, 218, 296, 323 and 325 are non-native amino acids.
  • the reaction specificity of the modified hydroxylase differs from the reaction specificity of an unmodified hydroxylase.
  • position 149 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine.
  • the unmodified Lesquerella hydroxylase is SEQ ID NO: 1 as shown in Figure 4, and is modified as described above.
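The positional rule stated above (a non-native amino acid at position 149, at position 325, or at both, with no more than three non-native amino acids among positions 63, 105, 149, 218, 296, 323 and 325) can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, positions are 1-based, and it inspects only the seven listed positions, ignoring any other differences between the two sequences. It does not contain SEQ ID NO:1; the caller supplies the sequences.

```python
KEY_POSITIONS = (63, 105, 149, 218, 296, 323, 325)  # 1-based, as in the text
REQUIRED_POSITIONS = (149, 325)


def substitute(sequence: str, position: int, residue: str) -> str:
    """Return a copy of sequence with a single residue replaced at a 1-based position."""
    return sequence[:position - 1] + residue + sequence[position:]


def satisfies_position_rule(native: str, modified: str) -> bool:
    """Check the 149/325 requirement and the at-most-three limit on the key positions."""
    if len(native) != len(modified):
        raise ValueError("sequences must have the same length")
    if len(native) < max(KEY_POSITIONS):
        raise ValueError("sequences are too short to contain position 325")
    # Key positions at which the modified sequence differs from the native one.
    changed = [p for p in KEY_POSITIONS if native[p - 1] != modified[p - 1]]
    has_required = any(p in changed for p in REQUIRED_POSITIONS)
    return has_required and len(changed) <= 3
```

For example, with a placeholder native sequence, `satisfies_position_rule(native, substitute(native, 149, "I"))` evaluates to True, while a variant changed only at positions 63 and 105 evaluates to False.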
  • hydroxylase and desaturase enzymes can also be modified, based upon the rational design principles elucidated and described above and applied to the Lesquerella hydroxylase.
  • the present invention also provides a system and a method for screening fatty acid hydroxylase enzyme activities.
  • the system and method utilize as a host strain S. cerevisiae YPH499 and as expression conditions induction at 30°C at high cell density.
  • the present invention also provides methods for using modified Lesquerella hydroxylase peptides and coding sequences; such methods include but are not limited to using the modified Lesquerella hydroxylase peptides and coding sequences in the production of hydroxylated fatty acids.
  • the description below provides specific, but not limiting, illustrative examples of embodiments of the present invention.
  • compositions comprising an isolated nucleic acid sequence encoding a modified Lesquerella hydroxylase polypeptide, where the polypeptide comprises a non-native amino acid at position 149, at position 325, or at both positions, where no more than three of the amino acids at positions 63, 105, 149, 218, 296, 323 and 325 are non-native amino acids; preferably, the reaction specificity of the modified hydroxylase differs from the reaction specificity of an unmodified hydroxylase.
  • position 149 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine.
  • An unmodified Lesquerella hydroxylase polypeptide is preferably one which occurs naturally in Lesquerella fendleri, and more preferably an unmodified hydroxylase polypeptide comprising the amino acid sequence SEQ ID NO:1 as shown in Figure 4, where such polypeptides are modified according to the present invention.
  • the present invention also provides compositions comprising a nucleic acid sequence encoding a modified fatty acid hydroxylase polypeptide, where the nucleic acid sequence hybridizes to a nucleic acid sequence encoding SEQ ID NO:1, and where the modified fatty acid hydroxylase polypeptide comprises a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, and no more than three of the amino acids at positions 63, 105, 149, 218, 296, 323 and 325 are non-native amino acids, and where the reaction specificity of the modified hydroxylase differs from a reaction specificity of an unmodified hydroxylase.
  • the nucleic acid sequence encoding a modified fatty acid hydroxylase polypeptide hybridizes to a nucleic acid sequence encoding SEQ ID NO:1 under conditions from low to high stringency; preferably, hybridization is under conditions of high stringency.
  • other hydroxylase and desaturase enzymes can also be modified, based upon the rational design principles elucidated and described above and exemplified as applied to the Lesquerella hydroxylase. Therefore, the present invention provides compositions comprising nucleic acid sequences encoding a modified hydroxylase or desaturase polypeptide, where the hydroxylase or desaturase polypeptide is highly homologous to the Lesquerella hydroxylase (at least about 80% homology to SEQ ID NO:1).
  • modified polypeptides comprise a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, where no more than three of the amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 are non-native amino acids; preferably, the reaction specificity of the modified hydroxylase or desaturase differs from the reaction specificity of an unmodified hydroxylase or desaturase.
  • the position corresponding to position 149 of SEQ ID NO: 1 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine.
  • the nucleic acid sequence can be oriented to produce sense or antisense transcripts, depending on the desired use.
  • a nucleic acid sequence according to the present invention includes sequences engineered in order to alter a sequence encoding a modified hydroxylase or desaturase, including a modified Lesquerella hydroxylase, for a variety of reasons, including but not limited to alterations that modify the cloning, processing and/or expression of the gene product (such alterations include inserting new restriction sites, altering glycosylation patterns, and changing codon preference).
  • nucleic acid sequences of the present invention are obtained by methods well known in the art.
  • a nucleic acid sequence of the present invention is obtained by modification of a sequence encoding amino acid sequence SEQ ID NO:1 as shown in Figure 4, such that a modified sequence encodes a modified hydroxylase of the present invention.
  • Methods of preparing modified sequences are well known, and include those described in the Examples.
  • the coding sequence for a modified Lesquerella hydroxylase is synthesized, in whole or in part, using chemical methods well known in the art (See for example, Caruthers et al. (1980) Nucl. Acids Res. Symp. Ser., 7:215-233; Crea and Horn (1980) Nucl. Acids Res., 9:2331; Matteucci and Caruthers (1980) Tetrahedron Lett., 21:719; and Chow and Kempe (1981) Nucl. Acids Res., 9:2807-2817).
  • the present invention provides vectors comprising a nucleic acid sequence of the present invention as described.
  • the vectors include cloning vectors and expression vectors; both types of vectors are well known in the art, and are described further below.
  • the present invention also provides compositions comprising a modified Lesquerella fatty acid hydroxylase polypeptide comprising a non-native amino acid at position 149, at position 325, or at both positions, where no more than three of the amino acids at positions 63, 105, 149, 218, 296, 323 and 325 are non-native amino acids; preferably, the reaction specificity of the modified hydroxylase differs from the reaction specificity of an unmodified hydroxylase.
  • position 149 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine.
  • An unmodified Lesquerella hydroxylase polypeptide is preferably obtained from Lesquerella fendleri, and more preferably an unmodified hydroxylase polypeptide comprises the amino acid sequence of SEQ ID NO:1 as shown in Figure 4, where such polypeptides are modified according to the present invention.
  • Modified hydroxylases and desaturases may be identified by first modifying in accordance with the present invention either a nucleic acid encoding an enzyme such that it encodes a modified enzyme or the amino acid sequence of an enzyme, and then screening the modified enzyme for activity as described below.
  • other hydroxylase and desaturase enzymes can also be modified, based upon the rational design principles elucidated and described above and applied to the Lesquerella hydroxylase. Therefore, the present invention provides compositions comprising a modified hydroxylase or desaturase polypeptide, where the hydroxylase or desaturase polypeptide is highly homologous to the Lesquerella hydroxylase (at least about 80% homology to SEQ ID NO:1).
  • modified polypeptides comprise a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, where no more than three of the amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 are non-native amino acids; preferably, the reaction specificity of the modified hydroxylase or desaturase differs from the reaction specificity of an unmodified hydroxylase or desaturase.
  • the position corresponding to position 149 of SEQ ID NO:1 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine.
  • the present invention also provides fusion proteins of the modified Lesquerella hydroxylase polypeptide.
  • the polypeptide is a purified product, while in other embodiments it is a product of chemical synthetic procedures. In still other embodiments it is produced by recombinant techniques using a prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and mammalian cells in culture).
  • the polypeptide of the present invention is glycosylated or is non-glycosylated. In other embodiments, the polypeptides of the invention may also include an initial methionine amino acid residue.
  • a modified Lesquerella hydroxylase of the present invention is a sequence of amino acids, such as a protein, polypeptide or peptide fragment, as described above and which has the ability to catalyze either the production of hydroxy-fatty acid or the production of a polyunsaturated fatty acid or both from appropriate fatty acyl substrates, under enzyme reactive conditions.
  • enzyme reactive conditions means any necessary conditions (for example, factors such as temperature, pH, lack of inhibiting substances) which will permit the enzyme to function. Appropriate enzyme reactive conditions may occur in vivo or in vitro.
  • An appropriate fatty acid substrate may be free fatty acid, a fatty acid salt, fatty acyl-CoA, fatty acyl-ACP or a fatty acyl-lipid, where the fatty acid is at least a monounsaturated fatty acid.
  • references to fatty acid substrates and products are intended to include the free acids, the ACP and CoA esters, the salts of these acids, the glycerolipid esters (particularly the phospholipid and triacylglycerol esters), the wax esters, and the ether derivatives of these acids.
  • Fatty acids may be indicated by the number of carbon atoms, with the number of double bonds following a colon; the location of the double bond is indicated by a superscript numeral following a delta. The common name, when included, follows in parentheses.
  • a fatty acid with 18 carbon atoms and one double bond between carbons 9 and 10 (numbering from the carboxyl end) is indicated as 18:1 d9 (oleate).
  • the presence and location of a hydroxyl group is indicated by an OH following a number which indicates the carbon atom to which the hydroxyl is attached.
  • a fatty acid with 18 carbon atoms and one double bond between carbons 9 and 10 (numbering from the carboxyl end) and with a hydroxyl group at carbon 12 is indicated as 12-OH, 18:1 d9 (ricinoleate).
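The shorthand described above can be generated from three pieces of information: chain length, double bond positions, and an optional hydroxyl position. The following Python sketch is one way to do that; the class and method names are hypothetical, and the plain-text layout (a lowercase "d" standing in for the delta, with positions written inline rather than as superscripts) is an assumption chosen to follow the pattern of the examples in this text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FattyAcid:
    carbons: int                            # number of carbon atoms
    double_bonds: Tuple[int, ...] = ()      # positions counted from the carboxyl end
    hydroxyl_carbon: Optional[int] = None   # carbon bearing the hydroxyl group, if any

    def shorthand(self) -> str:
        """Render as carbons:double-bond-count, delta positions, optional OH prefix."""
        text = f"{self.carbons}:{len(self.double_bonds)}"
        if self.double_bonds:
            text += " d" + ",".join(str(p) for p in self.double_bonds)
        if self.hydroxyl_carbon is not None:
            text = f"{self.hydroxyl_carbon}-OH, {text}"
        return text


if __name__ == "__main__":
    print(FattyAcid(18, (9,)).shorthand())      # 18:1 d9  (oleate)
    print(FattyAcid(18, (9,), 12).shorthand())  # 12-OH, 18:1 d9  (ricinoleate)
    print(FattyAcid(18, (9, 12)).shorthand())   # 18:2 d9,12  (linoleate)
```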
  • a modified Lesquerella hydroxylase of the present invention is used for production of hydroxylated or polyunsaturated fatty acids, by expression of the enzyme either in vitro or in vivo in transgenic organisms, such as plants, which produce the appropriate precursors.
  • products of the modified Lesquerella hydroxylase include but are not limited to: 12-OH, 16:1 d9; 9-OH, 18:1 d6; 12-OH, 18:1 d9 (ricinoleate); 14-OH, 20:1 d11 (lesqueroleate); and 16-OH, 22:1 d13.
  • the modified Lesquerella hydroxylase of the present invention is used for the production of additionally modified fatty acids that result from desaturation or elongation of hydroxylated fatty acids; such products include but are not limited to: 12-OH, 18:2 d9,15 (densipoleate); 14-OH, 20:2 d11,17 (auricoleate); 14-OH, 20:1 d11 (lesqueroleate); and 16-OH, 22:1 d13.
  • Substrates of the modified Lesquerella hydroxylase include but are not limited to: 16:1 d9 (palmitoleate); 18:1 d6 (petroselinate); 18:1 d9 (oleate); 20:1 d11 (gadoleate or eicosenoate); and 22:1 d13 (erucate or docosenoate).
  • a modified Lesquerella hydroxylase of this invention displays activity toward fatty acyl substrates.
  • fatty acids are typically covalently bound to acyl carrier protein (ACP), coenzyme A (CoA) or various cellular lipids.
  • a modified Lesquerella hydroxylase of the present invention utilizes a fatty acyl substrate which is esterified to ACP, to CoA, or to a glycerolipid backbone.
  • the fatty acid substrate is a free fatty acid.
  • the substrate is an esterified fatty acyl substrate; more preferably, the substrate is a fatty acyl glycerolipid; even more preferably, the substrate is fatty acyl phosphatidylcholine.
  • a modified Lesquerella hydroxylase catalyzes fatty acid hydroxylation, desaturation, or both reactions.
  • An example of these reactions, where oleate is the substrate, follows:
  • the enzyme in situ is believed to act on a fatty acid esterified to a lipid, and requires cytochrome b5 reductase and cytochrome b5 for activity. Moreover, the enzyme may utilize different substrates under different conditions to differing degrees of activity.
  • modified Lesquerella hydroxylase may be assayed in a number of ways.
  • the activity is determined by expressing a nucleic acid sequence encoding the modified hydroxylase in a transgenic organism, as described below and in the Examples, and then analyzing the composition of the total fatty acids.
  • the activity is measured as the presence of or increase in the amount of endogenous hydroxy-oleate and other hydroxy fatty acids, or as the presence of or increase in the amount of endogenous linoleate or other di- or polyunsaturated fatty acids (collectively referred to as the fatty acid products) in a transgenic organism which comprises a heterologous or exogenous nucleic acid sequence encoding a modified Lesquerella hydroxylase of the present invention; such transgenic organisms are obtained as described below.
  • the amount of fatty acid product in a transgenic organism is then determined. In some experiments, the amount of fatty acid product in a transgenic organism is compared to that present in a non-transgenic organism.
  • the fatty acids are analyzed from lipids extracted from samples of a transgenic organism, or from fatty acids extracted directly from such samples; for example, the samples are homogenized in methanol/chloroform (2:1, v/v) and the lipids extracted as described by Bligh and Dyer (1959) (Can. J. Biochem. Physiol. 37: 911).
  • tissue samples obtained from a transgenic organism as described below and in the Examples.
  • tissue samples include but are not limited to leaf samples (such as discs), stem and root samples, and developing and mature seed embryonic or endosperm tissue.
  • tissue samples are incubated with either precursors of fatty acid synthesis, such as 14C-acetate, or with fatty acids, such as ammonium salts of 14C-fatty acids, which can be taken up and incorporated into tissue lipids.
  • co-factors for lipid synthesis include but are not limited to ATP, CoA, MgCl2, and lyso-phospholipids, such as lysoPC.
  • Incubations generally proceed at room temperature in a buffered solution, such as 0.1M potassium phosphate at pH 7.2, for a suitable period of time. The samples are then washed in buffer, and the amount of labeled fatty acid product is determined. In some experiments, the amount of labeled fatty acid produced in a transgenic organism is compared to that produced in a non-transgenic organism.
  • the fatty acids are analyzed from lipids extracted from the tissue samples, or from fatty acids extracted directly from such samples; for example, the tissue samples are homogenized in methanol/chloroform (2:1, v/v) and the lipids extracted as described by Bligh and Dyer (1959) (Can. J. Biochem. Physiol. 37: 911).
  • the enzyme activity is determined in a sub-cellular fraction obtained from a transgenic organism as described below.
  • subcellular fractions may be obtained from any of the types of tissues described above, and include whole cell and microsomal membranes, plastids, and plastidial membrane fractions. Preparation of such fractions is well-known in the art.
  • the subcellular fraction is then incubated with fatty acids, such as ammonium salts of 14C-fatty acids, which can be taken up and incorporated into tissue lipids.
  • co-factors for lipid synthesis include but are not limited to ATP, CoA, MgCl2, and lyso-phospholipids, such as lysoPC.
  • the enzyme activity is determined from an in vitro nucleic acid expression system, in which a nucleic acid sequence encoding a modified Lesquerella hydroxylase of the present invention is added and the encoded enzyme expressed.
  • expression systems are well-known in the art and include, for example reticulocyte lysate or wheat germ homogenates. Because the enzyme is likely to be an integral membrane protein, it may be necessary to include micellar or membrane structures into which the enzyme may be incorporated during or after protein synthesis.
  • micellar structures are obtained from sources which contain related lipid synthetic capabilities such as lysophospholipid acyl transferase as well as cytochrome b5 reductase and cytochrome b5, but not the lipid synthetic capability under investigation;
  • an example of a micellar source is a plant tissue where the plant does not contain an endogenous fatty acid hydroxylase.
  • fatty acid methyl esters are prepared from an aliquot of an extracted lipid fraction by evaporating the solvent from the aliquot under N2 and resuspending and heating the lipids in 4% methanolic HCl (w/w). The fatty acid methyl esters are then separated, and for radioactive samples the radioactivity in each separated fraction determined, by radio gas-liquid chromatography (GLC) and radio-HPLC (Engeseth et al. (1996) Planta 198: 238-245).
  • fatty acid methyl esters are prepared, derivatized with bis(trimethylsilyl)trifluoroacetamide:trimethylchlorosilane to obtain TMS fatty acid methyl esters of hydroxylated fatty acids, and analyzed by GC (Broun et al. (1997) Plant Physiol. 113: 933-942).
2. Chemical synthesis of modified Lesquerella hydroxylase
  • the protein itself is produced using chemical methods to synthesize either an entire modified Lesquerella hydroxylase amino acid sequence or a portion thereof.
  • peptides are synthesized by solid phase techniques, cleaved from the resin, and purified by preparative high performance liquid chromatography (See for example, Creighton (1983) Proteins Structures And Molecular Principles, W H Freeman and Co, New York N.Y.).
  • the composition of the synthetic peptides is confirmed by amino acid analysis or sequencing (See for example, Creighton, supra). Direct peptide synthesis can be performed using various solid-phase techniques.
  • modified Lesquerella hydroxylase or any part thereof, may be altered during direct synthesis and/or combined using chemical methods with other sequences to produce a variant polypeptide. Because the enzyme is thought to be an integral membrane protein, it may be necessary to include micellar or membrane structures into which the enzyme may be incorporated during or after protein synthesis.
  • modified Lesquerella hydroxylase polypeptides purified from recombinant organisms as described below are provided.
  • modified Lesquerella hydroxylase polypeptides purified from in vitro transcription translation expression systems as described above are provided.
  • the present invention provides purified modified Lesquerella hydroxylase polypeptides.
  • the present invention also provides methods for recovering and purifying modified Lesquerella hydroxylase.
  • Purification typically begins by disruption of the cells, and preparation of cell fractions with the highest specific activity of the modified hydroxylase. Because a modified Lesquerella hydroxylase is likely to be a membrane-bound enzyme, it is contemplated that microsomal preparations contain the highest specific activity of the enzyme. Further purification of the modified Lesquerella hydroxylase is then accomplished by detergent solubilization of the enzyme, followed by column chromatography. Purification schemes have been developed for related enzymes, such as plastidial oleate desaturase (Schmidt et al. (1994) Plant Molecular Biology 26: 631-642).
  • because a modified Lesquerella hydroxylase is contemplated to exhibit a high degree of similarity to oleate desaturase, both in amino acid sequence and in the reaction catalyzed, a modified Lesquerella hydroxylase is purified by a similar scheme to that reported for the desaturase.
  • Alternative chromatographic steps include but are not limited to ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography, and size exclusion chromatography.
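The selection of fractions by specific activity can be followed with a simple purification table. The sketch below is an assumption-laden illustration only: the step names, activity units, and protein amounts are hypothetical, and the function name is introduced here for convenience. It computes specific activity (activity per unit protein), fold purification relative to the first step, and percent yield of activity.

```python
from typing import List, Tuple, Dict


def purification_table(steps: List[Tuple[str, float, float]]) -> List[Dict[str, float]]:
    """Summarize a purification.

    Each step is (name, total_activity, total_protein). The first step is
    treated as the starting material for fold purification and yield.
    """
    if not steps:
        raise ValueError("at least one step is required")
    _, start_activity, start_protein = steps[0]
    start_specific = start_activity / start_protein
    rows = []
    for name, activity, protein in steps:
        specific = activity / protein
        rows.append({
            "step": name,
            "specific_activity": specific,
            "fold_purification": specific / start_specific,
            "yield_percent": 100.0 * activity / start_activity,
        })
    return rows


if __name__ == "__main__":
    # Hypothetical values: activity in arbitrary units, protein in mg.
    example = [
        ("crude homogenate", 1000.0, 500.0),
        ("microsomal fraction", 800.0, 100.0),
    ]
    for row in purification_table(example):
        print(row)
```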
  • the present invention further provides nucleic acid sequences encoding a modified Lesquerella hydroxylase fused in frame to a marker sequence that allows for expression alone or both expression and purification of the polypeptide of the present invention.
  • a non-limiting example of a marker sequence is a hexahistidine tag that may be supplied by a vector, for example, a pQE-30 vector, which adds a hexahistidine tag to the N-terminus of a modified hydroxylase of the present invention and which results in expression of the polypeptide in the case of a bacterial host, and more preferably by vector PT-23B, which adds a hexahistidine tag to the C-terminus of a modified hydroxylase of the present invention and which results in improved ease of purification of the polypeptide fused to the marker in the case of a bacterial host.
  • Another non-limiting example is the fusion of glutathione S-transferase (GST) to the enzyme, resulting in the expression of GST-modified hydroxylase, such as was used by Maeng, CY et al. (2001, Biochem Biophys Res Commun: 282(3):787-92).
  • Other non-limiting examples include a hemagglutinin (HA) tag as a marker sequence when a mammalian host is used.
  • the HA tag corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson et al. (1984) Cell, 37:767).
  • antibodies are generated to allow for the detection and characterization of a modified Lesquerella hydroxylase protein.
  • the antibodies may be prepared using various immunogens.
  • the immunogen is a modified Lesquerella hydroxylase peptide with either threonine or isoleucine at amino position 149 to generate antibodies that recognize the modified hydroxylase.
  • Such antibodies include, but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and Fab expression libraries.
  • Various procedures known in the art may be used for the production of polyclonal antibodies directed against the modified hydroxylase of the present invention.
  • various host animals can be immunized by injection with the peptide corresponding to the modified hydroxylase epitope including but not limited to rabbits, mice, rats, sheep, goats, etc.
  • the peptide is conjugated to an immunogenic carrier (for example, diphtheria toxoid, bovine serum albumin (BSA), or keyhole limpet hemocyanin (KLH)).
  • adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels (for example, aluminum hydroxide), surface active substances (for example, lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, and dinitrophenol), and potentially useful human adjuvants such as BCG (Bacille Calmette-Guerin) and Corynebacterium parvum.
  • any technique that provides for the production of antibody molecules by continuous cell lines in culture finds use with the present invention (See for example, Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY). These include but are not limited to the hybridoma technique originally developed by Köhler and Milstein (Köhler and Milstein (1975) Nature, 256:495-497), as well as the trioma technique, the human B-cell hybridoma technique (See for example, Kozbor et al. (1983) Immunol.
  • screening for the desired antibody is accomplished by techniques known in the art (for example, radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using, for example, colloidal gold, enzyme, or radioisotope labels), Western blots, precipitation reactions, agglutination assays (for example, gel agglutination assays and hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays).
  • antibody binding is detected by detecting a label on the primary antibody.
  • the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody.
  • the secondary antibody is labeled.
  • the immunogenic peptide should be provided free of the carrier molecule used in any immunization protocol. For example, if the peptide was conjugated to KLH, it may be conjugated to BSA, or used directly, in a screening assay.
  • the foregoing antibodies are used in methods known in the art relating to the expression of the modified hydroxylase (for example, for Western blotting), measuring levels thereof in appropriate biological samples, etc.
  • the antibodies can be used to detect the modified hydroxylase in a biological sample from a plant.
  • the biological sample can be an extract of a tissue, or a sample fixed for microscopic examination.
  • the biological samples are then tested directly for the presence of modified hydroxylase using an appropriate strategy (for example, ELISA or radioimmunoassay) and format (for example, microwells or dipstick, for example as described in International Patent Publication WO 93/03367).
  • proteins in the sample can be size separated (for example, by polyacrylamide gel electrophoresis (PAGE), in the presence or absence of sodium dodecyl sulfate (SDS)), and the presence of the modified hydroxylase detected by immunoblotting (Western blotting). Immunoblotting techniques are generally more effective with antibodies generated against a peptide corresponding to an epitope of a protein, and hence, are particularly suited to the present invention.
  • compositions comprising variants of the modified Lesquerella hydroxylase, where the amino acid sequence of the modified Lesquerella hydroxylase other than residues 63, 105, 149, 218, 296, 323, and 325, which are modified according to the invention as described above, may be varied; these variants include mutants, fragments, fusion proteins, or functional equivalents of the modified Lesquerella hydroxylases, provided that the activity of the variants of the modified Lesquerella hydroxylase is essentially unchanged by such sequence variations.
  • any variant generated according to the guidelines outlined below can be evaluated in order to determine whether it is a member of the genus of variant modified Lesquerella hydroxylase of the present invention as defined functionally, rather than structurally.
  • the activity of variant modified Lesquerella hydroxylase is evaluated by the methods described above and in the Examples.
  • the present invention provides compositions comprising nucleic acid sequences encoding variant modified Lesquerella hydroxylases.
  • Nucleic acid sequences of the present invention are engineered in order to alter a modified Lesquerella hydroxylase coding sequence for a variety of reasons, including but not limited to alterations that modify the cloning, processing and/or expression of the gene product (such alterations include inserting new restriction sites, altering glycosylation patterns, and changing codon preferences).
  • Mutant modified Lesquerella hydroxylases of the present invention can be generated according to the following guidelines. For example, it is contemplated that isolated replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar replacement of an amino acid with a structurally related amino acid (in other words, conservative mutations) will not have a major effect on the biological activity of the resulting molecule. Accordingly, some embodiments of the present invention provide variants of modified Lesquerella hydroxylases disclosed herein containing conservative replacements. Conservative replacements are those that take place within a family of amino acids that are related in their side chains.
  • Genetically encoded amino acids can be divided into four families: (1) acidic (aspartate, glutamate); (2) basic (lysine, arginine, histidine); (3) nonpolar (alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan); and (4) uncharged polar (glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine). Phenylalanine, tryptophan, and tyrosine are sometimes classified jointly as aromatic amino acids.
  • alternatively, the amino acid repertoire can be grouped as (1) acidic (aspartate, glutamate); (2) basic (lysine, arginine, histidine); (3) aliphatic (glycine, alanine, valine, leucine, isoleucine, serine, threonine), with serine and threonine optionally grouped separately as aliphatic-hydroxyl; (4) aromatic (phenylalanine, tyrosine, tryptophan); (5) amide (asparagine, glutamine); and (6) sulfur-containing (cysteine and methionine) (for example, Stryer ed. (1981) Biochemistry, pg.
  • Whether a change in the amino acid sequence of a peptide results in a functional homolog can be readily determined by assessing the ability of the variant peptide to function in a fashion similar to the modified protein. Peptides having more than one replacement can readily be tested in the same manner. More rarely, a mutant includes "nonconservative" changes (for example, replacement of a glycine with a tryptophan). Analogous minor variations can also include amino acid deletions or insertions, or both.
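The four-family grouping recited above can be expressed as a small lookup, which may be convenient when cataloguing candidate replacements. The following is a minimal sketch under that grouping only; the function names are hypothetical, one-letter amino acid codes are assumed, and a replacement classified as conservative here would still need to be tested for activity as described.

```python
# Families of genetically encoded amino acids, using one-letter codes,
# following the four-family grouping given in the text.
FAMILIES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "nonpolar": set("AVLIPFMW"),
    "uncharged_polar": set("GNQCSTY"),
}


def family_of(residue: str) -> str:
    """Return the name of the side-chain family containing the residue."""
    for name, members in FAMILIES.items():
        if residue in members:
            return name
    raise ValueError(f"not a standard one-letter amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall within the same side-chain family."""
    return family_of(original) == family_of(replacement)


if __name__ == "__main__":
    # Leucine to isoleucine, aspartate to glutamate, threonine to serine,
    # and the nonconservative glycine to tryptophan example.
    for pair in [("L", "I"), ("D", "E"), ("T", "S"), ("G", "W")]:
        print(pair, is_conservative(*pair))
```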
  • Mutants of modified Lesquerella hydroxylase can be generated by any suitable method well known in the art, including but not limited to site-directed mutagenesis, randomized "point" mutagenesis, and domain-swap mutagenesis.
  • Mutant modified Lesquerella hydroxylases may also be produced by methods such as directed evolution or other techniques for producing combinatorial libraries of variants.
  • the present invention further contemplates a method of generating sets of combinatorial mutants of the present modified Lesquerella hydroxylase proteins, as well as truncation mutants, and is especially useful for identifying potential variant sequences (in other words, homologs) that possess the biological activity of the modified Lesquerella hydroxylase.
  • nucleic acids encoding a modified Lesquerella hydroxylase can be utilized as starting nucleic acids for directed evolution. These techniques can be utilized to develop modified Lesquerella hydroxylase variants having desirable properties
  • artificial evolution is performed by random mutagenesis (for example, by utilizing error-prone PCR to introduce random mutations into a given coding sequence).
  • This method requires that the frequency of mutation be finely tuned.
  • beneficial mutations are rare, while deleterious mutations are common. This is because the combination of a deleterious mutation and a beneficial mutation often results in an inactive enzyme.
  • the ideal number of base substitutions for a targeted gene is usually between 1.5 and 5 (Moore and Arnold (1996) Nat. Biotech., 14, 458-67; Leung et al.
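The effect of tuning the mutation frequency can be explored in silico before any library is made. The sketch below is a simplified model offered for illustration only: it assumes that each base is substituted independently with a probability chosen so that the expected number of substitutions per gene equals a target value (for example, between 1.5 and 5), and it ignores the mutational biases of real polymerases. The function names and the example sequence are hypothetical.

```python
import random

BASES = "ACGT"


def error_prone_copy(seq: str, mean_substitutions: float, rng: random.Random) -> str:
    """Copy seq, substituting each base independently.

    The per-base probability is mean_substitutions / len(seq), so the expected
    number of substitutions per copy equals mean_substitutions.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    p = mean_substitutions / len(seq)
    copied = []
    for base in seq:
        if rng.random() < p:
            # Replace with one of the three other bases.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    rng = random.Random(0)
    template = "ATGGCTACCGGTCTGAAAGAT" * 10  # hypothetical coding sequence
    counts = [hamming(template, error_prone_copy(template, 3.0, rng))
              for _ in range(1000)]
    print(sum(counts) / len(counts))  # average substitutions per copy
```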
  • the polynucleotides ofthe present invention are used in gene shuffling or sexual PCR procedures (for example, Smith (1994) Nature, 370:324-25; U.S. Pat. Nos. 5,837,458; 5,830,721; 5,811,238; 5,733,731).
  • Gene shuffling involves random fragmentation of several mutant DNAs followed by their reassembly by PCR into full length molecules. Examples of various gene shuffling procedures include, but are not limited to, assembly following DNase treatment, the staggered extension process (STEP), and random priming in vitro recombination.
  • DNA segments isolated from a pool of positive mutants are cleaved into random fragments with DNase I and subjected to multiple rounds of PCR with no added primer.
  • the lengths of random fragments approach that of the uncleaved segment as the PCR cycles proceed, resulting in mutations present in different clones becoming mixed and accumulating in some of the resulting sequences.
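How mutations from different clones become mixed can be pictured with a toy model. The following sketch does not simulate DNase I digestion or primerless PCR; it assumes aligned parent sequences of equal length and simply builds a chimera by switching between randomly chosen parents at random crossover points. All names and sequences are hypothetical.

```python
import random
from typing import List


def shuffle_once(parents: List[str], n_crossovers: int, rng: random.Random) -> str:
    """Assemble one chimera from aligned, equal-length parent sequences.

    Crossover points are drawn at random; each resulting segment is copied
    from a randomly chosen parent.
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned and of equal length")
    points = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + points + [length]
    pieces = []
    for start, end in zip(bounds, bounds[1:]):
        template = rng.choice(parents)
        pieces.append(template[start:end])
    return "".join(pieces)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical mutants, each carrying one change (lowercase) relative
    # to a common parent; a chimera may carry neither, either, or both.
    mutants = ["ATGGCtACCGGTCTGAAA", "ATGGCAACCGGTCTcAAA"]
    for _ in range(5):
        print(shuffle_once(mutants, 2, rng))
```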
  • Multiple cycles of selection and shuffling have led to the functional enhancement of several enzymes (Stemmer (1994) Nature, 370:389-91; Stemmer (1994) Proc. Natl. Acad. Sci. USA, 91, 10747-51; Crameri et al. (1996) Nat. Biotech., 14:315-19; Zhang et al.
  • Variants produced by directed evolution can be screened for modified Lesquerella hydroxylase activity by the methods described (see for example above and the Examples).
  • compositions comprising modified Lesquerella hydroxylase homologs, and compositions comprising nucleic acid sequences encoding the same.
  • modified Lesquerella hydroxylase homologs have intracellular half-lives dramatically different than the corresponding wild-type protein.
  • the altered protein is rendered either more stable or less stable to proteolytic degradation or other cellular processes that result in destruction of, or otherwise inactivate, modified Lesquerella hydroxylase.
  • Such homologs, and the genes that encode them, can be utilized to alter the activity of modified Lesquerella hydroxylase by modulating the half-life of the protein. For instance, a short half-life can give rise to more transient hydroxylase biological effects.
  • Other homologs have characteristics which are either similar to modified Lesquerella hydroxylase, or which differ in one or more respects from modified Lesquerella hydroxylase.
  • the amino acid sequences for a population of modified Lesquerella hydroxylase homologs are aligned, preferably to promote the highest homology possible.
  • a population of variants can include, for example, Lesquerella hydroxylase homologs from one or more species, or from the same species but which differ due to mutation, outside of amino acid residues 63, 105, 149, 218, 296, 323, and 325 modified in accordance with the present invention.
  • Amino acids that appear at each position of the aligned sequences are selected to create a degenerate set of combinatorial sequences.
  • the combinatorial library is produced by way of a degenerate library of genes encoding a library of polypeptides that each include at least a portion of candidate protein sequences, including 63, 105, 149, 218, 296, 323, and 325 modified in accordance with the present invention.
  • a mixture of synthetic oligonucleotides is enzymatically ligated into gene sequences such that the degenerate set of candidate sequences are expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (for example, for phage display) containing the set of hydroxylase sequences therein.
  • the library of potential hydroxylase homologs can be generated from a degenerate oligonucleotide sequence.
  • chemical synthesis of a degenerate gene sequence is carried out in an automatic DNA synthesizer, and the synthetic genes are ligated into an appropriate gene for expression.
  • the purpose of a degenerate set of genes is to provide, in one mixture, all of the sequences encoding the desired set of potential hydroxylase sequences.
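The size of a degenerate set grows as the product of the number of residues allowed at each aligned position, which the following sketch makes concrete. It is an illustration under simplifying assumptions: the aligned sequences are short hypothetical peptides without gaps, the function names are introduced here, and the optional `fixed` argument (0-based positions) is a hypothetical way of holding selected residues constant, as would be done for the residues modified in accordance with the invention.

```python
from itertools import product
from math import prod
from typing import Dict, List, Optional


def residues_per_position(aligned: List[str],
                          fixed: Optional[Dict[int, str]] = None) -> List[List[str]]:
    """List the residues observed at each aligned column.

    Positions named in `fixed` (0-based) are restricted to the given residue.
    """
    fixed = fixed or {}
    columns = []
    for i, column in enumerate(zip(*aligned)):
        columns.append([fixed[i]] if i in fixed else sorted(set(column)))
    return columns


def library_size(aligned: List[str], fixed: Optional[Dict[int, str]] = None) -> int:
    """Number of distinct sequences in the degenerate set."""
    return prod(len(r) for r in residues_per_position(aligned, fixed))


def enumerate_library(aligned: List[str],
                      fixed: Optional[Dict[int, str]] = None) -> List[str]:
    """Enumerate every sequence in the degenerate set (small cases only)."""
    return ["".join(c) for c in product(*residues_per_position(aligned, fixed))]


if __name__ == "__main__":
    homologs = ["MAT", "MSI", "MAT"]  # hypothetical aligned peptides
    print(library_size(homologs), enumerate_library(homologs))
    print(enumerate_library(homologs, fixed={2: "T"}))
```

Because the size is multiplicative, even modest variation at many positions quickly exceeds what can be enumerated, which is why the set is supplied as a single degenerate mixture.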
  • the synthesis of degenerate oligonucleotides is well known in the art (See for example, Narang (1983) Tetrahedron Lett., 39:3 9; Itakura et al.
  • the present invention provides compositions comprising fragments (or in other words truncation mutants) of a modified Lesquerella hydroxylase which comprises amino acids at original positions 105, 149, 218, 296, 323, and 325, modified in accordance with the present invention, and compositions comprising nucleic acid sequences encoding them.
  • a modified Lesquerella hydroxylase fragment is biologically active.
  • removal of an N-terminal methionine, encoded by the ATG start codon, can be achieved either in vivo by expressing such recombinant polypeptides in a host that produces methionine aminopeptidase (MAP) (for example, E. coli or CM89 or S. cerevisiae), or in vitro by use of purified MAP.
  • the present invention also provides compositions comprising fusion proteins incorporating all or part of a modified Lesquerella hydroxylase comprising amino acids at original positions 105, 149, 218, 296, 323, and 325, modified in accordance with the present invention, and compositions comprising the nucleic acid sequences encoding such fusion proteins.
  • the fusion proteins have a modified Lesquerella hydroxylase functional domain with a fusion partner.
  • the coding sequences for the polypeptide (for example, a modified Lesquerella hydroxylase functional domain) are incorporated as a part of a fusion gene including a nucleotide sequence encoding a different polypeptide.
  • chimeric constructs code for a fusion protein comprising all or part of the modified hydroxylase of the present invention, and all or a part of a cytochrome b5, such that each protein is active.
  • Similar fusion proteins have been reported, as for example a cDNA isolated from ripening sunflower embryos which encoded a fusion protein, of which one portion was highly homologous to a membrane bound desaturase and the other N terminal portion was highly homologous to cytochrome b5 (Sperling et al. (1995) Eur. J. Biochem. 232: 798-805).
  • chimeric constructs code for fusion proteins containing a modified hydroxylase ofthe present invention and at least a portion of another gene.
  • the fusion proteins have biological activity similar to a modified Lesquerella hydroxylase of the present invention (for example, they have at least one desired biological activity of the modified hydroxylase).
  • chimeric constructs code for fusion proteins containing a modified Lesquerella hydroxylase of the present invention and a leader sequence.
  • leader sequences are well-known in the art, and function to direct the protein to targeted cellular locations.
  • Exemplary leader sequences include but are not limited to those disclosed in Gavel Y, and von Heijne G (1990, FEBS Lett 261(2):45), and Emanuelsson O et al. (2000, J Mol Biol 300:1005-16).
  • fusion proteins can also facilitate the expression and/or purification of proteins, such as a modified Lesquerella hydroxylase protein of the present invention.
  • a modified hydroxylase is generated as a glutathione-S-transferase (in other words, GST) fusion protein. It is contemplated that such GST fusion proteins facilitate purification of modified hydroxylase, such as by the use of glutathione-derivatized matrices (See for example, Ausubel et al., eds. (1991) Current Protocols in Molecular Biology, John Wiley & Sons, NY).
  • a fusion gene coding for a purification leader sequence such as a poly-(His)/enterokinase cleavage site sequence at the N-terminus of the desired portion of modified hydroxylase allows purification of the expressed modified hydroxylase fusion protein by affinity chromatography using a Ni2+ metal resin.
  • the purification leader sequence is then subsequently removed by treatment with enterokinase (See for example, Hochuli et al. (1987) J. Chromatogr. 411:177; and Janknecht et al., Proc. Natl. Acad. Sci. USA 88:8972).
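The arrangement of a poly-(His)/enterokinase leader and its later removal can be sketched at the sequence level. The fragment below is illustrative only: it assumes a six-histidine tag and the commonly used enterokinase recognition sequence Asp-Asp-Asp-Asp-Lys with cleavage after the lysine, and the target peptide, constant names, and function names are hypothetical rather than taken from any construct described herein.

```python
from typing import Tuple

HIS_TAG = "HHHHHH"   # assumed hexahistidine purification tag
EK_SITE = "DDDDK"    # assumed enterokinase recognition sequence


def build_fusion(target: str) -> str:
    """Prepend Met, the His tag, and the enterokinase site to the target."""
    return "M" + HIS_TAG + EK_SITE + target


def cleave_with_enterokinase(fusion: str) -> Tuple[str, str]:
    """Split the fusion after the first enterokinase site.

    Returns (leader, released_polypeptide).
    """
    idx = fusion.find(EK_SITE)
    if idx < 0:
        raise ValueError("no enterokinase site found")
    cut = idx + len(EK_SITE)
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    fusion = build_fusion("TESTPEP")  # hypothetical target peptide
    print(fusion)
    print(cleave_with_enterokinase(fusion))
```

A target that itself contains the recognition sequence would also be cut in practice, so candidate sequences are worth checking for internal sites before such a leader is chosen.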
  • a fusion protein comprising a purification sequence appended to either the N or the C terminus allows for affinity purification; one example is addition of a hexahistidine tag to the carboxy terminus of modified Lesquerella hydroxylase which may result in improved affinity purification.
  • fusion genes are well known. Essentially, the joining of various nucleic acid fragments coding for different polypeptide sequences is performed in accordance with conventional techniques, employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation.
  • the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers.
  • PCR amplification of gene fragments is carried out using anchor primers that give rise to complementary overhangs between two consecutive gene fragments that can subsequently be annealed to generate a chimeric gene sequence (See for example, Current Protocols in Molecular Biology, supra).
  • a wide range of techniques are known in the art for screening gene products of combinatorial libraries made by point mutations, and for screening cDNA libraries for gene products having a certain property. Such techniques are generally adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis of hydroxylase homologs.
  • the most widely used techniques for screening large gene libraries typically comprise cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates relatively easy isolation of the vector encoding the gene whose product was detected.
  • the candidate hydroxylase gene products are displayed on the surface of a cell or viral particle, and the ability of particular cells or viral particles to catalyze the reaction resulting from the modification of the hydroxylase is assayed using the techniques described above and in the Examples.
  • the gene library is cloned into the gene for a surface membrane protein of a bacterial cell, and the resulting fusion protein detected by panning (WO 88/06630; Fuchs et al. (1991) BioTechnol.
  • fluorescently labeled molecules that bind the modified Lesquerella hydroxylase can be used to score for potentially functional hydroxylase homologs.
  • Cells are visually inspected and separated under a fluorescence microscope, or, where the morphology of the cell permits, separated by a fluorescence-activated cell sorter.
  • the gene library is expressed as a fusion protein on the surface of a viral particle.
  • foreign peptide sequences are expressed on the surface of infectious phage in the filamentous phage system, thereby conferring two significant benefits.
  • E. coli filamentous phages M13, fd, and f1 are most often used in phage display libraries, as either of the phage gIII or gVIII coat proteins can be used to generate fusion proteins without disrupting the ultimate packaging of the viral particle (See for example, WO 90/02909; WO 92/09690; Marks et al. (1992) J. Biol. Chem. 267:16007-16010; Griffiths et al. (1993) EMBO J. 12:725-734; Clackson et al. (1991) Nature 352:624-628; and Barbas et al. (1992) Proc. Natl. Acad. Sci. 89:4457-4461).
  • the recombinant phage antibody system (for example, RPAS, Pharmacia Catalog number 27-9400-01) is modified for use in expressing and screening of hydroxylase combinatorial libraries.
  • the pCANTAB 5 phagemid of the RPAS kit contains the gene that encodes the phage gIII coat protein.
  • the modified hydroxylase combinatorial gene library is cloned into the phagemid adjacent to the gIII signal sequence such that it is expressed as a gIII fusion protein.
  • the phagemid is used to transform competent E. coli TG1 cells after ligation.
  • transformed cells are subsequently infected with M13KO7 helper phage to rescue the phagemid and its candidate hydroxylase gene insert.
  • the resulting recombinant phage contain phagemid DNA encoding a specific candidate modified hydroxylase protein and display one or more copies of the corresponding fusion coat protein.
  • the phage-displayed candidate proteins that are capable of, for example, metabolizing a hydroperoxide, are selected or enriched by panning.
  • the bound phage is then isolated, and if the recombinant phage express at least one copy of the wild type gIII coat protein, they will retain their ability to infect E. coli.
  • successive rounds of reinfection of E. coli and panning will greatly enrich for modified hydroxylase homologs, which can then be screened for further biological activities in order to differentiate agonists and antagonists.
  • modified Lesquerella hydroxylase homologs can be generated and screened using, for example, alanine scanning mutagenesis and the like (Ruf et al. (1994) Biochem. 33:1565-1572; Wang et al. (1994) J. Biol. Chem. 269:3095-3099; Balint (1993) Gene 137:109-118 ; Grodberg et al. (1993) Eur. J. Biochem. 218:597-601; Nagashima et al. (1993) J.
  • a nucleic acid sequence encoding a modified Lesquerella hydroxylase according to the present invention is used to generate a recombinant DNA molecule that directs the expression of the encoded protein product in appropriate host cells.
  • a nucleic acid sequence corresponding to the antisense sequence of a modified Lesquerella hydroxylase is used.
  • codons preferred by a particular prokaryotic or eukaryotic host can be selected, for example, to increase the rate of hydroxylase expression or to produce recombinant RNA transcripts having desirable properties, such as a longer half-life, than transcripts produced from naturally occurring sequence.
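Selecting host-preferred codons amounts to back-translating the polypeptide with one chosen codon per amino acid. The sketch below is a simplified illustration: the codon table is an assumed set of codons often regarded as frequent in E. coli and is not taken from this disclosure, so a measured codon usage table for the actual host should be consulted in practice. The function name and the example peptide are hypothetical.

```python
# Assumed preferred codon for each amino acid (illustrative only; each codon
# does encode the amino acid shown under the standard genetic code).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str) -> str:
    """Return a coding sequence that uses one preferred codon per residue."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein)
    except KeyError as exc:
        raise ValueError(f"unknown amino acid code: {exc.args[0]!r}") from None


if __name__ == "__main__":
    print(back_translate("MTI"))  # hypothetical three-residue peptide
```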
  • nucleic acid sequences of the present invention may be employed for producing polypeptides by recombinant techniques.
  • the nucleic acid sequence may be included in any one of a variety of expression vectors for expressing a polypeptide.
  • vectors include, but are not limited to, chromosomal, nonchromosomal and synthetic DNA sequences (for example, derivatives of SV40, bacterial plasmids, phage DNA; baculovirus, yeast plasmids, vectors derived from combinations of plasmids and phage DNA, and viral DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies). It is contemplated that any vector may be used as long as it is replicable and viable in the host.
  • some embodiments of the present invention provide recombinant constructs comprising one or more of the nucleic acid sequences of the present invention as described above.
  • the constructs comprise a vector, such as a plasmid or viral vector, into which a nucleic acid sequence of the invention has been inserted, in a forward or reverse orientation.
  • the appropriate nucleic acid sequence is inserted into the vector using any of a variety of procedures.
  • the nucleic acid sequence is inserted into an appropriate restriction endonuclease site(s) by procedures known in the art. Large numbers of suitable vectors are known to those of skill in the art, and are commercially available.
  • Such vectors include, but are not limited to, the following vectors: 1) Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pBS, pD10, phagescript, psiX174, pbluescript SK, pBSKS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); and 2) Eukaryotic: pWLNEO, pSV2CAT, pOG44, pXT1, pSG (Stratagene), pSVK3, pBPV, pMSG, and pSVL (Pharmacia).
  • plant expression vectors comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation sites, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences.
  • DNA sequences derived from the SV40 splice, and polyadenylation sites may be used to provide the required nontranscribed genetic elements.
  • the nucleic acid sequence in the expression vector is operatively linked to at least one suitable regulatory sequence.
  • suitable regulatory sequences include, but are not limited to, an appropriate expression control sequence(s)
  • Promoters useful in the present invention include, but are not limited to, the LTR or SV40 promoter, the E. coli lac or trp, the phage lambda PL, the cytomegalovirus (CMV), herpes simplex virus (HSV) thymidine kinase, and mouse metallothionein-I promoters, and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses.
  • Exemplary plant promoters include but are not limited to the 35S promoter of Cauliflower Mosaic Virus (CaMV).
  • recombinant expression vectors include origins of replication and selectable markers permitting transformation of the host cell (for example, dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or tetracycline or ampicillin resistance in E. coli).
  • transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes is increased by inserting an enhancer sequence into the vector.
  • Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act on a promoter to increase its transcription.
  • Enhancers useful in the present invention include, but are not limited to, the SV40 enhancer on the late side of the replication origin (bp 100 to 270), a cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
  • the expression vector also contains a ribosome binding site for translation initiation and a transcription terminator.
  • the vector may also include appropriate sequences for amplifying expression.
  • the present invention provides host cells containing the above-described constructs.
  • the host cell is a higher eukaryotic cell (for example, a plant cell).
  • the host cell is a lower eukaryotic cell (for example, a yeast cell).
  • the host cell can be a prokaryotic cell (for example, a bacterial cell).
  • host cells include, but are not limited to, Escherichia coli, Salmonella typhimurium, Bacillus subtilis, and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, as well as Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila S2 cells, Spodoptera Sf9 cells, Chinese hamster ovary (CHO) cells, COS-7 lines of monkey kidney fibroblasts (Gluzman (1981) Cell 23:175), 293T, C127, 3T3, HeLa and BHK cell lines, NT-1 (tobacco cell culture line), root cell and cultured roots in rhizosecretion (Gleba et al. (1999) Proc Natl Acad Sci USA 96: 5973-5977) and other plant cells, which can be cultivated in fermenters or which can be regenerated into entire plants.
  • the constructs in host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence.
  • introduction of the construct into the host cell can be accomplished by calcium phosphate transfection, DEAE-Dextran mediated transfection, or electroporation (See for example, Davis et al. (1986) Basic Methods in Molecular Biology).
  • Proteins can be expressed in eukaryotic cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, NY.
  • the selected promoter is induced by appropriate means (for example, temperature shift or chemical induction) and cells are cultured for an additional period.
  • cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.
  • microbial cells and other cultured cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents.
  • fatty acid products are produced in vivo, in organisms transformed with a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention and capable of expressing hydroxylase activity, and grown under conditions sufficient to effect production of the fatty acid products.
  • the fatty acid products are produced in vitro, from either nucleic acid sequences encoding a modified Lesquerella hydroxylase of the present invention or from polypeptides comprising a modified hydroxylase of the present invention and exhibiting fatty acid hydroxylase or desaturase activity.
  • the invention also provides compositions comprising organisms transformed with a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention, and in vitro systems comprising modified Lesquerella hydroxylase polypeptides or coding sequences or both for producing fatty acid products.
  • the fatty acid products are produced in vivo, by providing an organism transformed with a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention and growing the transgenic organism under conditions sufficient to effect production of the fatty acid products.
  • the fatty acid products are produced in vivo by transforming an organism with a heterologous gene encoding a modified hydroxylase of the present invention and growing the transgenic organism under conditions sufficient to effect production of the fatty acid products.
  • Organisms which are transformed with a heterologous gene encoding a modified hydroxylase of the present invention include preferably those which naturally synthesize and store in some manner fatty acids, and those which are commercially feasible to grow and suitable for harvesting the fatty acid products.
  • Such organisms include but are not limited to bacteria, yeast, oleaginous algae, and plants.
  • bacteria include E. coli and related bacteria which can be grown in commercial-scale fermenters.
  • yeast include S. cerevisiae, strains INVSc1 and YPH499, which can also be grown in commercial-scale fermenters.
  • plants include preferably oil-producing plants; examples of such plants include but are not limited to soybean, rapeseed and canola, sunflower, cotton, corn, cocoa, safflower, oil palm, coconut palm, flax, castor, and peanut.
  • Non-commercial cultivars of plants can be transformed, and the trait for expression of a modified Lesquerella hydroxylase of the present invention moved to commercial cultivars by breeding techniques well-known in the art.
  • a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention, including fusion proteins, includes any suitable sequence as described above.
  • the heterologous gene is provided within an expression vector such that transformation with the vector results in expression of the polypeptide; suitable vectors are described above and below.
  • a transgenic organism is grown under conditions sufficient to effect production ofthe fatty acid products.
  • a transgenic organism is supplied with exogenous substrates of the modified hydroxylases.
  • substrates comprise mono-, di-, and poly-unsaturated fatty acids; the chain length of such unsaturated fatty acids is variable, but is preferably 18 carbons in length.
  • the unsaturated fatty acids may also comprise additional functional groups, including but not limited to acetylenic bonds, conjugated acetylenic and ethylenic bonds, allenic groups, cyclopropane, cyclopropene, cyclopentene and furan rings, epoxy-, and keto-groups and double bonds of both the cis and trans configuration and separated by more than one methylene group; two or more of these functional groups may be found in a single fatty acid.
  • the substrates are added or present as the free acids, the ACP and CoA esters, the salts of these acids, the glycerolipid esters (particularly the phospholipid and triacylglycerol esters), the wax esters, and the ether derivatives of these acids.
  • such substrates are selected from the group consisting of: 16:1 d9 (palmitoleate); 18:1 d6 (petroselinate); 18:1 d9 (oleate); 20:1 d11 (gadoleate or eicosenoate); and 22:1 d13 (erucate or docosenoate).
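The fatty acid shorthand used for these substrates gives the chain length, the number of double bonds, and the double bond position(s), as in 18:1 d9 for oleate. A minimal parsing sketch follows; the class and function names are hypothetical, and the comma-separated form for several positions is an assumption made for this example.

```python
import re
from typing import NamedTuple, Tuple


class FattyAcid(NamedTuple):
    carbons: int                 # chain length
    double_bonds: int            # number of double bonds
    positions: Tuple[int, ...]   # delta positions of the double bonds


# Matches forms such as "18:1 d9" or "18:2 d9,12" (the latter form is assumed).
_PATTERN = re.compile(r"^\s*(\d+)\s*:\s*(\d+)\s*d\s*(\d+(?:\s*,\s*\d+)*)\s*$")


def parse_shorthand(text: str) -> FattyAcid:
    """Parse shorthand like '18:1 d9' into its numeric components."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized fatty acid shorthand: {text!r}")
    carbons = int(match.group(1))
    double_bonds = int(match.group(2))
    positions = tuple(int(p) for p in match.group(3).split(","))
    if len(positions) != double_bonds:
        raise ValueError("number of positions must equal number of double bonds")
    return FattyAcid(carbons, double_bonds, positions)


if __name__ == "__main__":
    for name in ("16:1 d9", "18:1 d6", "18:1 d9", "20:1 d11", "22:1 d13"):
        print(name, parse_shorthand(name))
```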
  • Substrates are supplied in various forms as are well known in the art; such forms include aqueous suspensions prepared by sonication, aqueous suspensions prepared with detergents and other surfactants, micellar preparations which include the substrate, dissolution of the substrate into a solvent, and dried powders of substrates. Such forms may be added to organisms or cultured cells or tissues grown in fermenters.
  • a transgenic organism comprises a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention operably linked to an inducible promoter, and is grown either in the presence of an inducing agent, or is grown and then exposed to an inducing agent.
  • a transgenic organism comprises a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention operably linked to a promoter which is either tissue specific or developmentally specific, and is grown to the point at which the tissue is developed or the developmental stage at which the developmentally-specific promoter is activated.
  • a transgenic organism as described above is engineered to produce greater amounts of the fatty acid substrate.
  • Organisms include bacteria, yeast, algae and plants; preferably, the organism is a plant; most preferably, the organism is an oil-producing plant.
  • the methods for producing large quantities of the fatty acid products further comprise collecting the fatty acids produced.
  • Such methods are known generally in the art, and include harvesting the transgenic organisms and extracting the fatty acid products. Extraction procedures preferably include solvent extraction, and typically include disrupting cells, as by chopping, mincing, grinding, and/or sonicating, prior to solvent extraction. Solvent extraction procedures are well known, and have been described.
  • the fatty acid products are further purified, as for example by thin layer liquid chromatography, gas-liquid chromatography, or high pressure liquid chromatography
  • the fatty acid products are produced in transgenic plants; preferably, the fatty acid products are produced in plant seed oils.
  • Plants are transformed with a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention; in some embodiments, plants are transformed with a fusion gene encoding a fusion polypeptide comprising a modified hydroxylase of the present invention. Transformation techniques are well known in the art. It is contemplated that the heterologous genes are utilized to increase the level of the encoded enzyme activities.
  • the methods of the present invention are not limited to any particular plant. Indeed, a variety of plants are contemplated, including but not limited to soybean (Glycine max), rapeseed and canola (including Brassica napus and B. campestris), sunflower (Helianthus annuus), cotton (Gossypium hirsutum), corn (Zea mays), cocoa (Theobroma cacao), safflower (Carthamus tinctorius), oil palm (Elaeis guineensis), coconut palm (Cocos nucifera), flax (Linum usitatissimum), castor (Ricinus communis) and peanut (Arachis hypogaea).
  • the group also includes non-agronomic species which are useful in developing appropriate expression vectors such as tobacco, rapid cycling Brassica species, and Arabidopsis thaliana, and wild species which may be a source of unique fatty acids.
  • heterologous genes intended for expression in plants are first assembled in expression cassettes comprising a promoter.
  • Methods which are well known to those skilled in the art are used to construct expression vectors containing a heterologous gene and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are widely described in the art (See for example, Sambrook. et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview, NY.; and Ausubel, F. M.
  • these vectors comprise a nucleic acid sequence encoding a modified hydroxylase of the present invention (as described above) operably linked to a promoter and other regulatory sequences (for example, enhancers, polyadenylation signals, etc.) required for expression in a plant.
  • Promoters include but are not limited to constitutive promoters, tissue-, organ-, and developmentally-specific promoters, and inducible promoters.
  • Examples of promoters include but are not limited to: constitutive promoter 35S of cauliflower mosaic virus; a wound-inducible promoter from tomato, leucine amino peptidase ("LAP,” Chao et al. (1999) Plant Physiol 120: 979-992); a chemically-inducible promoter from tobacco, Pathogenesis-Related 1 (PR1) (induced by salicylic acid and BTH
  • the expression cassettes may further comprise any sequences required for expression of mRNA.
  • sequences include, but are not limited to, transcription terminators, enhancers such as introns, viral sequences, and sequences intended for the targeting of the gene product to specific organelles and cell compartments.
  • transcriptional terminators are available for use in expression of sequences of the present invention.
  • Transcriptional terminators are responsible for the termination of transcription beyond the transcript and its correct polyadenylation.
  • Appropriate transcriptional terminators and those which are known to function in plants include, but are not limited to, the CaMV 35S terminator, the tml terminator, the pea rbcS E9 terminator, and the nopaline and octopine synthase terminator (See for example, Odell et al. (1985) Nature 313:810; Rosenberg et al. (1987) Gene 56:125; Guerineau et al. (1991) Mol. Gen. Genet. 262:141; Proudfoot (1991) Cell 64:671; Sanfacon et al, Genes Dev. 5:141 ; Mogen et al. (1990) Plant Cell 2:1261;
  • constructs for expression of the heterologous gene of interest include one or more sequences found to enhance gene expression from within the transcriptional unit. These sequences can be used in conjunction with the nucleic acid sequence of interest to increase expression in plants.
  • Various intron sequences have been shown to enhance expression, particularly in monocotyledonous cells.
  • the introns of the maize Adh1 gene have been found to significantly enhance the expression of the wild-type gene under its cognate promoter when introduced into maize cells (Callis et al. (1987) Genes Develop. 1: 1183). Intron sequences have been routinely incorporated into plant transformation vectors, typically within the non-translated leader.
  • the construct for expression of the nucleic acid sequence of interest also includes a regulator such as a nuclear localization signal (Kalderon et al. (1984) Cell 39:499; Lassner et al. (1991) Plant Molecular Biology 17:229), a plant translational consensus sequence (Joshi (1987) Nucleic Acids Research 15:6643), an intron (Luehrsen and Walbot (1991) Mol. Gen. Genet. 225:81), and the like, operably linked to the nucleic acid sequence encoding the modified hydroxylase of the present invention.
  • various DNA fragments can be manipulated, so as to provide for the DNA sequences in the desired orientation (for example, sense or antisense) and, as appropriate, in the desired reading frame.
  • adapters or linkers can be employed to join the DNA fragments or other manipulations can be used to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like.
  • in vitro mutagenesis, primer repair, restriction, annealing, resection, ligation, or the like is preferably employed, where insertions, deletions or substitutions (for example, transitions and transversions) are involved.
  • transformation vectors are available for plant transformation. The selection of a vector for use will depend upon the preferred transformation technique and the target species for transformation. For certain target species, different antibiotic or herbicide selection markers are preferred. Selection markers used routinely in transformation include the nptII gene which confers resistance to kanamycin and related antibiotics (Messing and Vierra (1982) Gene 19: 259; Bevan et al. (1983) Nature 304:184), the bar gene which confers resistance to the herbicide phosphinothricin (White et al. (1990) Nucl. Acids Res. 18:1062; Spencer et al. (1990) Theor. Appl. Genet.
  • the vector is adapted for use in an Agrobacterium mediated transfection process (See for example, U.S. Pat. Nos. 5,981,839; 6,051,757; 5,981,840; 5,824,877; and 4,940,838; all of which are incorporated herein by reference).
  • Construction of recombinant Ti and Ri plasmids in general follows methods typically used with the more common bacterial vectors, such as pBR322. Additional use can be made of accessory genetic elements sometimes found with the native plasmids and sometimes constructed from foreign sequences. These may include but are not limited to structural genes for antibiotic resistance as selection genes.
  • the first system is called the "cointegrate" system.
  • the shuttle vector containing the gene of interest is inserted by genetic recombination into a non-oncogenic Ti plasmid that contains both the cis-acting and trans-acting elements required for plant transformation as, for example, in the pMLJ1 shuttle vector and the non-oncogenic Ti plasmid pGV3850.
  • the second system is called the "binary" system in which two plasmids are used; the gene of interest is inserted into a shuttle vector containing the cis-acting elements required for plant transformation.
  • the other necessary functions are provided in trans by the non-oncogenic Ti plasmid as exemplified by the pBIN19 shuttle vector and the non-oncogenic Ti plasmid pAL4404. Some of these vectors are commercially available.
  • the nucleic acid sequence of interest is targeted to a particular locus on the plant genome.
  • Site-directed integration of the nucleic acid sequence of interest into the plant cell genome may be achieved by, for example, homologous recombination using Agrobacterium-derived sequences.
  • plant cells are incubated with a strain of Agrobacterium which contains a targeting vector in which sequences that are homologous to a DNA sequence inside the target locus are flanked by Agrobacterium transfer-DNA (T-DNA) sequences, as previously described (U.S. Pat. No. 5,501,967).
  • homologous recombination may be achieved using targeting vectors which contain sequences that are homologous to any part of the targeted plant gene, whether belonging to the regulatory elements of the gene, or the coding regions of the gene. Homologous recombination may be achieved at any region of a plant gene so long as the nucleic acid sequence of regions flanking the site to be targeted is known.
  • a nucleic acid of the present invention is utilized to construct vectors derived from plant (+) RNA viruses (for example, brome mosaic virus, tobacco mosaic virus, alfalfa mosaic virus, cucumber mosaic virus, tomato mosaic virus, and combinations and hybrids thereof).
  • the inserted modified hydroxylase encoding polynucleotide can be expressed from these vectors as a fusion protein (for example, coat protein fusion protein) or from its own subgenomic promoter or other promoter.
  • Methods for the construction and use of such viruses are described in U.S. Pat. Nos. 5,846,795; 5,500,360; 5,173,410; 5,965,794; 5,977,438; and 5,866,785, all of which are incorporated herein by reference.
  • nucleic acid sequence of interest is introduced directly into a plant.
  • One vector useful for direct gene transfer techniques in combination with selection by the herbicide Basta (or phosphinothricin) is a modified version of the plasmid pCIB246, with a CaMV 35S promoter in operational fusion to the E. coli GUS gene and the CaMV 35S transcriptional terminator (WO 93/07278).
  • once a nucleic acid sequence encoding a modified Lesquerella hydroxylase of the present invention is operatively linked to an appropriate promoter and inserted into a suitable vector for the particular transformation technique utilized (for example, one of the vectors described above), the recombinant DNA described above can be introduced into the plant cell in a number of art-recognized ways. Those skilled in the art will appreciate that the choice of method depends upon the vector and the type of plant targeted for transformation. In some embodiments, the vector is maintained episomally. In other embodiments, the vector is integrated into the genome.
  • direct transformation in the plastid genome is used to introduce the vector into the plant cell (See for example, U.S. Patent Nos 5,451,513; 5,545,817;
  • the basic technique for chloroplast transformation involves introducing regions of cloned plastid DNA flanking a selectable marker together with the nucleic acid encoding the RNA sequences of interest into a suitable target tissue (for example, using biolistics or protoplast transformation with calcium chloride or PEG).
  • the 1 to 1.5 kb flanking regions, termed targeting sequences, facilitate homologous recombination with the plastid genome and thus allow the replacement or modification of specific regions of the plastome.
  • Substantial increases in transformation frequency are obtained by replacement of the recessive rRNA or r-protein antibiotic resistance genes with a dominant selectable marker, the bacterial aadA gene encoding the spectinomycin-detoxifying enzyme aminoglycoside-3'-adenyltransferase (Svab and Maliga (1993) PNAS 90:913).
  • selectable markers useful for plastid transformation known in the art are encompassed within the scope of the present invention. Plants homoplasmic for plastid genomes containing the two nucleic acid sequences separated by a promoter of the present invention are obtained, and are preferentially capable of high expression of the RNAs encoded by the DNA molecule.
  • vectors useful in the practice of the present invention are microinjected directly into plant cells by use of micropipettes to mechanically transfer the recombinant DNA (Crossway (1985) Mol. Gen. Genet. 202:179).
  • the vector is transferred into the plant cell by using polyethylene glycol (Krens et al (1982) Nature 296:72; Crossway et al. (1986) BioTechniques 4:320); fusion of protoplasts with other entities, either minicells, cells, lysosomes or other fusible lipid-surfaced bodies (Fraley et al. (1982) Proc. Natl. Acad. Sci.
  • the vector may also be introduced into the plant cells by electroporation (Fromm et al. (1985) Proc. Natl. Acad. Sci. USA 82:5824; Riggs et al. (1986) Proc. Natl. Acad. Sci. USA 83:5602).
  • plant protoplasts are electroporated in the presence of plasmids containing the gene construct. Electrical impulses of high field strength reversibly permeabilize biomembranes allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form plant callus.
  • the vector is introduced through ballistic particle acceleration using devices (for example, available from Agracetus, Inc., Madison, Wis. and Dupont, Inc., Wilmington, Del).
  • McCabe et al. (1988) Biotechnology 6:923; see also Weissinger et al. (1988) Annual Rev. Genet. 22:421; Sanford et al. (1987) Particulate Science and Technology 5:27 (onion); Svab et al. (1990) Proc. Natl. Acad. Sci. USA 87:8526 (tobacco chloroplast); Christou et al. (1988) Plant Physiol.
  • the vectors comprising a nucleic acid sequence encoding a modified Lesquerella hydroxylase of the present invention are transferred using Agrobacterium-mediated transformation (Hinchee et al. (1988) Biotechnology 6:915; Ishida et al. (1996) Nature Biotechnology 14:745).
  • Agrobacterium is a representative genus of the gram-negative family Rhizobiaceae. Its species are responsible for plant tumors such as crown gall and hairy root disease.
  • amino acid derivatives known as opines are produced and catabolized.
  • the bacterial genes responsible for expression of opines are a convenient source of control elements for chimeric expression cassettes.
  • Heterologous genetic sequences can be introduced into appropriate plant cells by means of the Ti plasmid of Agrobacterium tumefaciens.
  • the Ti plasmid is transmitted to plant cells on infection by Agrobacterium tumefaciens, and is stably integrated into the plant genome (Schell, 1987, Science, 237: 1176).
  • Species which are susceptible to infection by Agrobacterium may be transformed in vitro.
  • plants may be transformed in vivo, such as by transformation of a whole plant by Agrobacterial infiltration of adult plants, as in a "floral dip" method (Bechtold, N. et al. (1993) C. R. Acad. Sci. III-Vie 316: 1194-1199).
  • embryo formation can be induced from the protoplast suspension. These embryos germinate and form mature plants.
  • the culture media will generally contain various amino acids and hormones, such as auxin and cytokinins. Shoots and roots normally develop simultaneously. Efficient regeneration will depend on the medium, on the genotype, and on the history of the culture. The reproducibility of regeneration depends on the control of these variables.
  • Transgenic lines are established from transgenic plants by tissue culture propagation. Nucleic acid sequences encoding a modified hydroxylase of the present invention or a fusion protein comprising the modified hydroxylase are transferred to related varieties by traditional plant breeding techniques.
  • the fatty acid products are produced in transgenic microorganisms.
  • Microorganisms are transformed with a heterologous gene encoding a modified hydroxylase of the present invention or a gene encoding a fusion polypeptide comprising a modified hydroxylase of the present invention according to procedures well known in the art. It is contemplated that the heterologous genes are utilized to increase the level of the enzyme activities encoded by the heterologous genes.
  • the fatty acid products are produced in transgenic yeast. Wild-type yeast do not accumulate detectable levels of hydroxylated fatty acids.
  • a nucleic acid sequence encoding a modified hydroxylase of the present invention is placed into an expression vector under transcriptional control of a promoter; for example, such a promoter is an inducible GAL promoter. Expression of the enzyme is induced, and the fatty acids are produced as a result of the enzyme activity.
  • nucleic acid sequences encoding modified Lesquerella hydroxylases are cloned into an expression vector under control of an inducible promoter (a non-limiting example is cloning the coding sequence into the pYes-II expression vector behind a GAL-1 promoter as previously described (Broun, P. et al. (1998) Science 282(5392), 1315-1317)), and expressed in yeast YPH499 strain.
  • modified hydroxylase is induced at higher cell densities, at an OD600 of greater than about 1, and preferably greater than about 2, and more preferably about 2.5, and at higher temperatures, where the temperature is above room temperature, preferably greater than about 20 °C, more preferably greater than about 22 °C, yet more preferably greater than about 25 °C, and even more preferably at or above about 30 °C. This results in accumulation of relatively high amounts of hydroxy-fatty acids.
  • fatty acid products are produced in vitro, from either nucleic acid sequences encoding a modified hydroxylase of the present invention or from a polypeptide comprising a modified hydroxylase of the present invention.
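The graded preferences for induction conditions can be written down as simple threshold checks. This is a sketch only: the function names are hypothetical, and because the text qualifies every value with "about", the hard cut-offs used here are an assumption made for illustration.

```python
# Thresholds taken from the stated preferences; treated as hard cut-offs here.
TEMPERATURE_ABOVE_C = (20.0, 22.0, 25.0)  # "greater than about" tiers
TEMPERATURE_AT_OR_ABOVE_C = 30.0          # "at or above about" tier
OD600_ABOVE = (1.0, 2.0)                  # "greater than about" tiers


def temperature_tier(temp_c: float) -> int:
    """Count how many of the temperature preference tiers are met (0 to 4)."""
    tier = sum(temp_c > threshold for threshold in TEMPERATURE_ABOVE_C)
    if temp_c >= TEMPERATURE_AT_OR_ABOVE_C:
        tier += 1
    return tier


def density_tier(od600: float) -> int:
    """Count how many of the cell density preference tiers are met (0 to 2)."""
    return sum(od600 > threshold for threshold in OD600_ABOVE)


if __name__ == "__main__":
    print(temperature_tier(30.0), density_tier(2.5))
```

A higher tier corresponds to a condition described as more preferred; the most preferred density of about 2.5 falls in the top density tier.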
  • methods for producing large quantities of the fatty acid products comprise adding an isolated nucleic acid sequence encoding a modified hydroxylase of the present invention to in vitro expression systems under conditions sufficient to cause production of the modified hydroxylase.
  • the isolated nucleic acid sequence encoding a modified hydroxylase of the present invention is any suitable sequence as described above, and preferably is provided within an expression vector such that addition of the vector to an in vitro transcription/translation system results in expression of the polypeptide.
  • the system further comprises the substrates for the modified hydroxylase, as described above. Alternatively, the system further comprises the means for generating the substrates for the modified hydroxylase.
  • the methods for producing large quantities of the fatty acid products further comprise collecting the fatty acids produced. Such methods are known generally in the art, and have been described above.
  • the fatty acid products are further purified, as for example by thin layer liquid chromatography, gas-liquid chromatography, or high performance liquid chromatography.
  • methods for producing large quantities of fatty acid products comprise incubating a modified hydroxylase of the present invention under conditions sufficient to result in the synthesis of the fatty acid products; generally, such incubation is carried out in a mixture which comprises the modified hydroxylase.
  • a modified hydroxylase of the present invention is obtained by purification of recombinant modified hydroxylase from an organism transformed with a heterologous gene encoding a modified hydroxylase of the present invention, as described above.
  • a source of recombinant modified hydroxylase is either plant, bacterial or other transgenic organisms, transformed with a heterologous gene encoding a modified hydroxylase of the present invention as described above.
  • the modified hydroxylase may further include means for improving purification, as for example a 6x-His tag added to the C-terminus of the protein as described above.
  • a modified hydroxylase of the present invention is chemically synthesized.
  • the incubation mixture further comprises the substrates for the modified hydroxylase, as described above.
  • the mixture further comprises the means for generating the substrates for the modified hydroxylase.
  • the methods for producing large quantities of the fatty acid products further comprise collecting the fatty acids produced; such methods are described above.
  • subtilis, and C-15 for the ω-3 linoleate desaturase from flax suggests that these positions are the sites of initial oxidation of the desaturation reactions, corroborating the kinetic isotope effect studies of Buist and coworkers (Buist, P. H., and Behrouzian, B. (1998) J. Am. Chem. Soc. 120: 871-876; Buist, P. H., and Behrouzian, B. (1996) J. Am. Chem. Soc. 118: 6295-6296; Fauconnot, L., and Buist, P. H. (2001) Bioorg. Med. Lett. 11: 2879-2881; and Savile, C. K. et al. (2001) J. Chem. Soc., Perkin Trans. 1 9: 1116-1121).
  • the Lesquerella enzyme must also share this same stereospecificity as the seed oil-derived lesquerolic acid retains the same optical rotation properties as ricinoleic acid derived from castor seed oil (Smith, C. R., Jr. et al. (1961) J. Org. Chem. 26: 2903-2905).
  • the knowledge that these enzymes are highly homologous (van de Loo, F. J. et al. (1995) Proc. Natl. Acad. Sci. USA 92(15): 6743-6747; and Broun, P. et al. (1997) Plant J. 13: 201-210) and that the enzymes catalyze both desaturation and hydroxylation, just with differing product ratios, implies that these enzymes employ closely-related catalytic mechanisms.
  • the stereospecificity of the variant FAD2 enzymes was determined to gain insight into the cause of bifunctional behavior.
  • Analysis of enzymatically-derived products obtained from yeast cultures expressing active AtFAD2, LFAH, AtFAD2 with 4 amino acid residues incorporated from CFAH (C4M), and AtFAD2 with isoleucine substituted at amino acid position 324 (FAD2-M324I) revealed comparable retention of the 12(S)-hydrogen atom.
  • the variant FAD2 oleate hydroxylases C4M and M324I retain the stereospecificity of the wild-type desaturase and hydroxylase enzymes.
  • the catalytic function of a binuclear iron center might be changed to desaturation through alteration of the chemical nature of the substrate (effected by intermediate stabilization, as for example, a chemical substituent, like a fluorine instead of a hydrogen) or by substrate presentation to the oxidant.
  • Substrate presentation could be affected by substitution of amino acids adjacent to the active site; for example, such substitutions might then allow the substrate to approach the oxidant more closely, or keep the substrate farther away from the oxidant, due to the physical size difference between amino acid side chains.
  • hydrocarbon hydroxylases are capable of controlling the substrate orientation to some degree, small substrate size may also favor hydroxylation (Pikus, J. D. et al. (1997)
  • the present invention further provides a yeast system in which to evaluate the effects of modified hydroxylases and other modified lipid synthetic enzymes.
  • This system avoids the problems associated with utilizing transgenic plants as the means for assessing the effects of modified enzymes.
  • the use of transgenic plants is time-consuming and gives variable results; such variability is thought to be due in part to insertional position effects.
  • many unusual lipid products are toxic to host plants; plants may also be degrading such products.
  • the yeast system of the present invention is an alternative to the use of transgenic plants for the evaluation of modified lipid-synthesizing enzymes; the production of transgenic yeast is quick, and yeast are relatively easy to grow, able to tolerate unusual fatty acid products, and possess a background such that the generation of even very low amounts of the enzyme product can be detected.
  • the present invention provides a system comprising yeast S. cerevisiae strain YPH499 transformed with a nucleic acid sequence encoding a modified Lesquerella hydroxylase enzyme under control of an inducible promoter, where the yeast strain is grown to a high density of about an OD600 of 2.5 and induced at a high temperature of about 30 °C.
  • the present invention provides methods with which to evaluate the effects of modified hydroxylases and other modified lipid synthetic enzymes.
  • a nucleic acid sequence encoding a modified Lesquerella hydroxylase is cloned into an expression vector under control of an inducible promoter and expressed in yeast YPH499 strain. Expression of the modified hydroxylase is induced at higher cell densities, of an OD600 of about 2.5, and at higher temperatures of about 30 °C. This allows accumulation of relatively high amounts of hydroxy-fatty acids, thus enhancing evaluation of the effects of the modified hydroxylase.
  • the modified Lesquerella hydroxylase coding sequence is cloned into the pYes-II expression vector behind a GAL-1 promoter as previously described (Broun, P. et al. (1998) Science 282(5392): 1315-1317).
  • N normal
  • M molar
  • mM millimolar
  • μM micromolar
  • mol molecular weight
  • mmol millimoles
  • μmol micromoles
  • nmol nanomoles
  • pmol picomoles
  • g (grams); mg (milligrams); μg (micrograms); ng (nanograms); l or L (liters); ml (milliliters); μl (microliters); cm (centimeters); mm (millimeters); μm (micrometers); nm (nanometers); °C (degrees Centigrade); PCR (polymerase chain reaction); RT-PCR (reverse-transcriptase-PCR); TAIL-PCR (thermal asymmetric interlaced-PCR); FAD2, oleate Δ12 desaturase; LFAH, Lesquerella fendleri oleate
  • thaliana FAD2 with four substitutions from LFAH (A104G/T148N/S322A/M324I); C4M, A. thaliana FAD2 with four substitutions from CFAH (A104G/T148I/S322A/M324V); BSTFA-TMCS, N,O-Bis(trimethylsilyl)trifluoroacetamide.
  • the fad2 mutants were cloned into the SacI/XmaI sites of the Agrobacterium binary vector DATNAP, a derivative of pRLMlO (Datla, R. S. S. et al. (1992) Gene 122: 383-384), to direct seed-specific expression from the napin promoter (Kridl, J. C. et al. (1991) Seed Science Res. 1: 209-219).
  • the vectors were introduced into Agrobacterium tumefaciens strain GV3101 pMP90 by electroporation and used to transform Arabidopsis thaliana FAD2-deficient plants (Okuley, J. et al.
  • Seeds were methylated (1 ml of 1 N HCl-methanol, Supelco, 80 °C for 1 h), extracted with hexane, and trimethylsilylated (100 μl of BSTFA-TMCS, Supelco, 90 °C for 45 min).
  • the BSTFA-TMCS was removed by evaporation and the sample resuspended in hexane.
  • Yeast pellets were dried by a nitrogen stream prior to methylation and when fatty acids were added during growth, cell pellets were washed with 1% Tergitol and then water before drying.
  • residues 63, 104, 148, 217, 295, 322, and 324 of the Arabidopsis thaliana oleate desaturase differ from the corresponding residues found in the closely related Lesquerella fendleri oleate hydroxylase (LFAH) and Ricinus communis (castor) oleate hydroxylase (CFAH).
  • LFAH Lesquerella fendleri oleate hydroxylase
  • CFAH Ricinus communis
  • Amino acid residue numbering is based on the A. thaliana FAD2 sequence; the corresponding residue and residue position for each enzyme is included in the table.
  • LFAH and CFAH represent the Lesquerella fendleri oleate hydroxylase and the castor oleate hydroxylase, respectively.
  • the FAD2 consensus sequence is conserved among the plant FAD2 sequences available in Genbank.
  • Bold-faced and underlined residue numbers are located within five residues of one of the three His clusters that have been proposed to coordinate the nonheme iron active site.
  • variant fad2 genes containing one, two, or three amino acid substitutions were also constructed.
  • the variant fad2 genes were introduced into Arabidopsis thaliana FAD2-deficient plants by Agrobacterium-mediated transformation. Trimethylsilylated fatty acid methyl esters from seeds harvested from T2 plants (heterozygous) were analyzed by gas chromatography/mass spectrometry in order to determine the overall fatty acid composition. The results are shown in Figure 1. All AtFAD2 variants expressed in Arabidopsis gave detectable amounts of hydroxy fatty acids, ranging from 0.03-22%.
  • Most of the AtFAD2 variants, with the exception of some of the plants expressing C4M, retained sufficient desaturase activity to complement the FAD2-deficient background (resulting in about 25-30% linoleic acid).
  • T148I and M324I caused the most dramatic change in phenotype, producing up to 4.2% and 5.4% hydroxy fatty acids, respectively ( Figure 1).
  • the AtFAD2 variants A104G, S322A, and T148N (a Lesquerella substitution) produced less than 1% hydroxy fatty acids.
  • C2M.1 T148I/M324V
  • C2M.5 T148I/S322A
  • C2M.3 produced higher levels (4.5% and 3.1%) of hydroxy fatty acids than did C2M.2 (A104G/S322A), C2M.3
  • Based on transgenic expression in Arabidopsis, C4M exhibited the lowest desaturation/hydroxylation ratio among all AtFAD2 variants at 1.7 (the average was 1.1 in planta, data from Figure 2). C4M is a more specific hydroxylase than is L4M, and the castor hydroxylase is a more specific hydroxylase than is the Lesquerella hydroxylase. This implies that at least some of the specificity determinants of the castor hydroxylase are contained within the C4M residues. Only two of these residues (148 and 324) are different between L4M (Asn-148 and Ile-324) and C4M (Ile-148 and Val-324).
  • T148I, M324I and M324V dramatically altered the product distribution (20-90 fold).
  • the T148I, M324I, and M324V mutations affect the accumulation of both desaturated and hydroxylated product. For example, by simply substituting Val for Met at residue 324, the amount of desaturation product decreased ~5-fold while the amount of hydroxylation product increased ~20-fold. Analysis of the double mutants clearly supports the significance of T148I in determining product distribution.
  • Figure 3 illustrates the results of the expression of parental enzymes, quadruple mutants L4M and C4M, all possible triple (C3M) and double (C2M) mutant combinations using the C4M residues, and all single mutants that contain Lesquerella or castor substitutions.
  • The data from yeast expression are consistent with the transgenic plant data, suggesting that the information obtained from yeast expression has predictive value regarding relative activity upon expression in A. thaliana. Because both ricinoleic acid and linoleic acid are end products in yeast, whereas they are further metabolized in A. thaliana, a precise product ratio is obtained more readily from expression in yeast than expression in A. thaliana.
  • Chain length affects chemoselectivity.
  • oleate desaturase and hydroxylase enzymes oxidize palmitoleic acid in addition to oleic acid.
  • FAD2 produces roughly 7.5 times as much 18:2 as 16:2
  • LFAH roughly 24 times as much 12-OH 18:1 Δ9 as 12-OH 16:1 Δ9.
  • Values represent the ratio of desaturation product to hydroxylation product for the given substrate. Standard deviations derived from three experiments are shown in parentheses.
  • the catalytic mechanism of the variant AtFAD2 enzymes was investigated through analysis of the oxidation products of stereospecifically labeled stearate and oleate.
  • Yeast cells expressing AtFAD2, LFAH, C4M, or FAD2-M324I were grown and induced in the presence of deuterated stearoyl methyl ester ([12-²H₁](S)-18:0 or [12-²H₁](R)-18:0); the yeast acyl-CoA Δ9 desaturase gene desaturated sufficient quantities of the labeled stearate to provide the necessary labeled oleate for enzymatic desaturation/hydroxylation.
  • Cerulenin was added to the cultures to minimize endogenous fatty acid synthesis, thus preventing dilution of the labeled stearate (Awaya, J. et al. (1975) Biochem. Biophys. Acta 409: 267-273).
  • the cellular fatty acids were analyzed by GC/MS to determine the [²H₁]/[H] ratio (measured as the ratio of the M⁺+1 peak to the M⁺ peak) of the enzymic products (linoleate and ricinoleate). These values were then corrected to account for the contribution of endogenous unlabeled substrate to the peak intensities.
  • The results are shown in Table III.
  • While these growth conditions permitted incorporation of high levels of the labeled stearoyl methyl ester, product accumulation was decreased markedly, preventing analysis of the less active AtFAD2 variants.
  • yeast cells were grown and induced in the presence of labeled oleate ([12-²H₁](S)-18:1 Δ9). The addition of this unsaturated fatty acid partially attenuates endogenous unsaturated fatty acid synthesis and thus cerulenin was not required (Bossie, M. A., and Martin, C. E. (1989) J. Bacteriol. 171: 6409-6413).
  • the relative amount of ricinoleic acid was directly related to the amount of linoleic acid present in the seed oil, as would be expected if the hydroxy fatty acid was produced by a bifunctional enzyme.
  • substantial levels of a novel hydroxylated fatty acid were detected in flax seed oil.
  • this analyte displayed a major ion at 145 m/z which is consistent with a (CH3)3SiOCH(CH2)2CH3 ion.
  • Smaller fragments at 73 m/z (OSi(CH3)3) and 310 m/z (M-(OSi(CH3)3)) are also present. These fragments are consistent with a description of the analyte as 15-hydroxy linoleate; the major fragment would arise from cleavage adjacent to the carbon bearing the oxygen, between C14 and C15 (Broun, P., and
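The assignment of the 145 m/z ion to a (CH3)3SiOCH(CH2)2CH3 fragment can be checked with simple nominal-mass arithmetic. The following is a minimal sketch only; the function name and the element table are illustrative assumptions and are not part of the original disclosure.

```python
# Nominal (integer) masses of the most abundant isotope of each element.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16, "Si": 28}


def nominal_mass(composition):
    """Sum nominal masses for a mapping of element symbol -> atom count."""
    return sum(NOMINAL_MASS[element] * count for element, count in composition.items())


# (CH3)3SiOCH(CH2)2CH3 expands to C7H17OSi:
#   (CH3)3 -> C3H9, CH -> C1H1, (CH2)2 -> C2H4, CH3 -> C1H3, plus one Si and one O.
tms_fragment = {"C": 7, "H": 17, "O": 1, "Si": 1}

print(nominal_mass(tms_fragment))  # 145
```

The fragment contains four carbons of the acyl chain (C15 through C18), which is what cleavage between C14 and C15 of a 15-hydroxy C18 fatty acid would be expected to give.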

Abstract

The present invention provides modified hydroxylase proteins, where the modification results in altered specific activity, as well as genes encoding the modified proteins, and methods of their production. The present invention also provides methods of using the modified hydroxylase genes and proteins, including in their expression in transgenic organisms and in the production of hydroxylated fatty acids.

Description

MODIFIED FATTY ACID HYDROXYLASE PROTEIN AND GENES
This application claims benefit under 35 U.S.C. §119(e) to Provisional Patent Application Serial No. 60/348,557, filed on January 14, 2002, pending, which is herein incorporated by reference in its entirety for all purposes.
This invention was made at least in part with Government support under Contract No. DE-AC02-98CH 10886 awarded by the U.S. Department of Energy. The Government may have certain rights to this invention.
FIELD OF THE INVENTION
The present invention concerns the identification of modified enzymes and encoding sequences, and methods related thereto, and the use of these sequences to produce genetically modified plants for the purpose of altering the composition of plant oils, waxes and related compounds.
BACKGROUND
Plants have the ability to produce a diverse range of structures, including more than 20,000 different terpenoids, flavonoids, alkaloids, and fatty acids. Fatty acids have been extensively exploited for industrial uses in products such as lubricants, plasticizers, and surfactants. In fact, approximately one-third of vegetable oils produced in the world are already used for non-food purposes (Ohlrogge, J (1994) Plant Physiol. 104:821-26).
Most plant fatty acids are obtained from seed oils, which consist primarily of storage oil in the form of triacylglycerols, with minority contributions primarily from membrane lipids, which are predominantly phospholipids. Seed oils from different species of higher plants contain a total of more than 210 naturally occurring fatty acids, which differ by the number and arrangement of double or triple bonds and various functional groups, such as hydroxyls, ketones, epoxys, cyclopentenyl or cyclopropyl groups, furans or halogens (van de Loo et al. (1993) in Lipid Metabolism in Plants (Moore, TS, Jr., ed.; CRC, Boca Raton, FL) pp 91-126). These include at least 33 structurally distinct monohydroxylated plant fatty acids, and 12 different polyhydroxylated fatty acids have been described (reviewed by van de Loo et al. (1993) in Lipid Metabolism in Plants (Moore, TS, Jr., ed.; CRC, Boca Raton, FL) pp 91-126; Smith (1970) in Progress in the Chemistry of Fats and Other Lipids (Holman, RT, ed.) Vol. 11: pp 139-177). The most commonly occurring fatty acids in both membrane and storage lipids are 16- and 18-carbon fatty acids, which may have from zero to three methylene-interrupted unsaturations. These are synthesized from the fully saturated species as the result of a series of sequential desaturations which usually begin at the Δ9 carbon and progress in the direction of the methyl carbon (Browse and Somerville (1991) Science 252: 80-87). Fatty acids which cannot be described by this simple algorithm are generally considered "unusual" even though several, such as lauric (12:0), erucic (22:1) and ricinoleic acid (12D-hydroxyoctadec-cis-9-enoic acid) are of significant commercial importance.
Ricinoleic acid is synthesized by oleate-12-hydroxylase. The hydroxylase enzymes from castor, CFAH (van de Loo, F. J. et al. (1995) Proc. Natl. Acad. Sci. USA 92(15), 6743-6747) and Lesquerella, LFAH (Broun, P. et al. (1997) Plant J. 13, 201-210), are closely related to each other, as well as to the common plant oleate desaturase enzyme, typified by Arabidopsis thaliana fatty acid desaturase 2 (AtFAD2), which converts oleate (18:1 Δ9) into linoleate (18:2 Δ9,12) (In the following fatty acid nomenclature, X:Y indicates that the fatty acid contains X carbon atoms and Y double bonds; Δz indicates that a double bond is positioned at the zth carbon atom from the carboxyl terminus). Indeed, LFAH actually retains both hydroxylase and desaturase activity, indicating that these two oxidation reactions can be catalyzed by the same enzyme. It would be useful to modify the activity of the enzyme, such that the ratio of desaturation to hydroxylation activity could be controlled. This would allow the production of a spectrum of enzymes with varying ratios of hydroxylation to desaturation; preferably, such enzymes are only slightly modified from the wild-type enzyme. Such enzymes could then be used to produce oils of specified fatty acid composition, either in vivo, as in transgenic plants, or in vitro, as in fermentation reactors.
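The fatty acid shorthand defined above (carbon count, double-bond count, and Δ positions counted from the carboxyl terminus) can be illustrated with a short parser. This is a minimal sketch for illustration only; the names `FattyAcid` and `parse_fatty_acid` are hypothetical and do not appear in the original disclosure.

```python
import re
from typing import NamedTuple, Tuple


class FattyAcid(NamedTuple):
    carbons: int                 # X: number of carbon atoms
    double_bonds: int            # Y: number of double bonds
    positions: Tuple[int, ...]   # Δz positions, counted from the carboxyl end


# Accepts labels such as "12:0", "18:1 Δ9" or "18:2 Δ9,12".
_PATTERN = re.compile(r"^\s*(\d+):(\d+)(?:\s*Δ\s*(\d+(?:,\d+)*))?\s*$")


def parse_fatty_acid(label: str) -> FattyAcid:
    match = _PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized fatty acid label: {label!r}")
    carbons = int(match.group(1))
    double_bonds = int(match.group(2))
    positions = tuple(int(p) for p in match.group(3).split(",")) if match.group(3) else ()
    # When positions are given, there must be exactly one per double bond.
    if positions and len(positions) != double_bonds:
        raise ValueError(f"{label!r}: {double_bonds} double bonds but {len(positions)} positions")
    return FattyAcid(carbons, double_bonds, positions)


oleate = parse_fatty_acid("18:1 Δ9")
linoleate = parse_fatty_acid("18:2 Δ9,12")
# The desaturase reaction described above adds one double bond, at position 12.
new_positions = set(linoleate.positions) - set(oleate.positions)
```

Under this reading, the conversion of oleate to linoleate leaves the chain length unchanged and introduces a single new double bond at the Δ12 position.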
During the development of modified enzymes, it is necessary to evaluate the effect of different modifications. The effects can be observed by in vitro assays of the enzymes. However, such assays are often quite difficult, especially when the enzymes are lipid synthesizing enzymes, due to problems associated with providing lipid substrates. Such problems are compounded when the enzymes are associated with or bound to membranes. For these enzymes, it is often most convenient to assess the effects in transgenic hosts; typically, for plant enzymes, plants are the host of choice. However, the use of transgenic plants is time consuming, especially when it is desired to screen many modified enzymes. An additional problem is the variability observed between independent transgenic plant lines; such variability is thought to be due to numerous factors, including insertional position effects. Finally, many unusual fatty acid products are not well tolerated by host cells, as the unusual fatty acid products may be toxic as they accumulate; one defense is for the host to degrade them, thereby effectively preventing their accumulation. Therefore, it would also be useful to develop an alternative to the use of transgenic plants for the evaluation of modified lipid-synthesizing enzymes, especially when the products of such enzymes are unusual fatty acids which may not be well tolerated by a transgenic host. Preferably, such an alternative would be quick, relatively easy to grow, able to tolerate unusual fatty acid products, and possess a background such that the presence of even very low amounts of the enzyme product could be detected.
SUMMARY OF THE INVENTION
The present invention is directed to the biosynthesis of hydroxylated fatty acids, such as ricinoleic acid in castor (Ricinus communis) and Lesquerella fendleri seed, and to the manipulation of the catalytic activity and reaction specificity of the enzymes which synthesize them.
It is an object of the present invention to provide compositions comprising a modified fatty acid hydroxylase or desaturase enzyme, such that the ratio of hydroxylase to desaturase activity of the enzyme differs from the wild-type enzyme, and to provide compositions comprising nucleic acids encoding the same; preferably, the enzyme is a modified Lesquerella hydroxylase enzyme. It is a further object of the present invention to provide methods of using the modified fatty acid hydroxylase to modify oils produced by transgenic organisms. It is yet a further object of the present invention to provide a yeast system in which to evaluate the effects of modified hydroxylases and other modified lipid synthetic enzymes.
Thus, in some embodiments, the present invention provides a composition comprising a modified Lesquerella fatty acid hydroxylase polypeptide, comprising a non-native amino acid at position 149, at position 325, or at both positions, where of amino acids at positions 63, 105, 149, 218, 296, 323 and 325 no more than three are non-native amino acids, and where a reaction specificity of the modified hydroxylase differs from a reaction specificity of the unmodified hydroxylase. In some embodiments, the modified hydroxylase comprises a non-native amino acid at position 149. In further embodiments, the non-native amino acid at position 149 is threonine or isoleucine. In other embodiments, the modified Lesquerella fatty acid hydroxylase polypeptide is a modified Lesquerella fendleri hydroxylase.
In yet other embodiments, the present invention provides a composition comprising a modified Lesquerella fatty acid hydroxylase polypeptide comprising an amino acid sequence shown in SEQ ID NO:1, where the amino acid sequence is modified to comprise a non-native amino acid at position 149, at position 325, or at both positions, where of amino acids at positions 63, 105, 149, 218, 296, 323 and 325 no more than three are non-native amino acids, and where a reaction specificity of the modified hydroxylase differs from a reaction specificity of the unmodified hydroxylase.
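The positional condition recited in these embodiments (a non-native residue at position 149, at position 325, or at both, with no more than three non-native residues among the seven listed positions) can be expressed as a simple check. The sketch below is illustrative only: the function and constant names are hypothetical, and it covers the positional condition alone, not the separate requirement that reaction specificity differ from that of the unmodified hydroxylase, which must be determined experimentally.

```python
# Positions in the Lesquerella hydroxylase numbering used in the text above.
KEY_POSITIONS = (63, 105, 149, 218, 296, 323, 325)
REQUIRED_ANY = (149, 325)   # at least one of these must be non-native
MAX_NON_NATIVE = 3          # "no more than three" of the seven key positions


def satisfies_position_constraint(non_native_positions):
    """Return True if the set of non-native positions meets the positional condition."""
    changed = set(non_native_positions) & set(KEY_POSITIONS)
    has_required = any(position in changed for position in REQUIRED_ANY)
    return has_required and len(changed) <= MAX_NON_NATIVE


print(satisfies_position_constraint({149}))                # True
print(satisfies_position_constraint({105, 149, 325}))      # True
print(satisfies_position_constraint({63, 105, 218}))       # False: neither 149 nor 325
print(satisfies_position_constraint({63, 105, 149, 325}))  # False: four key positions changed
```

Positions outside the seven listed residues are ignored by this check, since the recited limit applies only to those seven.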
In yet other embodiments, the present invention comprises a composition comprising a modified plant fatty acid hydroxylase polypeptide comprising a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, where of amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 no more than three are non-native amino acids, and where a reaction specificity of the modified hydroxylase differs from a reaction specificity of an unmodified hydroxylase.
In yet other embodiments, the present invention provides a composition comprising any of the modified fatty acid hydroxylase polypeptides as described above, where a ratio of hydroxylation to desaturation activity of the modified hydroxylase is decreased relative to a ratio of hydroxylation to desaturation activity of the unmodified hydroxylase. In still other embodiments, the present invention comprises a composition comprising any of the modified fatty acid hydroxylase polypeptides as described above, where a ratio of hydroxylation to desaturation activity of the modified hydroxylase is increased relative to a ratio of hydroxylation to desaturation activity of the unmodified hydroxylase.
In other embodiments, the present invention also provides a composition comprising a modified fatty acid hydroxylase polypeptide, comprising a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, where of amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 no more than three are non-native amino acids, and where a reaction specificity of the modified hydroxylase differs from a reaction specificity of the unmodified hydroxylase. In still other embodiments, the present invention also provides a composition comprising a modified fatty acid hydroxylase polypeptide, where the fatty acid hydroxylase polypeptide is highly homologous to SEQ ID NO:1 and where the modified fatty acid hydroxylase polypeptide comprises a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, wherein of amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 no more than three are non-native amino acids, and wherein a reaction specificity of the modified hydroxylase differs from a reaction specificity of the unmodified hydroxylase.
The present invention also provides a composition comprising a nucleic acid sequence encoding any of the modified fatty acid hydroxylases as described above. In some other embodiments, the present invention also provides a composition comprising a nucleic acid sequence encoding a modified fatty acid hydroxylase polypeptide, wherein the nucleic acid sequence hybridizes to a nucleic acid sequence encoding SEQ ID NO:1, and wherein the modified fatty acid hydroxylase polypeptide comprises a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, wherein of amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 no more than three are non-native amino acids, and wherein a reaction specificity of the modified hydroxylase differs from a reaction specificity of an unmodified hydroxylase.
The present invention provides a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence. In yet further embodiments, the present invention provides an expression vector comprising the recombinant DNA. The present invention also provides an organism transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence. In different embodiments, the organism is a microorganism or a plant.
The present invention also provides a plant transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence. In some embodiments, the plant is selected from the group consisting of soybean (Glycine max), rapeseed and canola (including Brassica napus and B. campestris), sunflower (Helianthus annuus), cotton (Gossypium hirsutum), corn (Zea mays), cocoa (Theobroma cacao), safflower (Carthamus tinctorius), oil palm (Elaeis guineensis), coconut palm (Cocos nucifera), flax (Linum usitatissimum), castor (Ricinus communis) and peanut (Arachis hypogaea).
The present invention also provides a plant cell transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence. In other embodiments, the present invention also provides a plant seed transformed with the recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence. In yet other embodiments, the present invention provides an oil obtained from the transformed plant seed.
The present invention also provides a yeast cell transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence. In some embodiments, the yeast is S. cerevisiae strain YPH499.
The present invention also provides a bacterial cell transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence. In other embodiments, the present invention provides an oil obtained from the transformed bacterial cell.
The present invention also provides a method of producing a modified hydroxylase in a transgenic organism, comprising providing an organism transformed with a recombinant DNA molecule comprising any of the nucleic acid sequences described above, or encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence, and growing the organism under conditions such that a modified hydroxylase encoded by the recombinant DNA molecule is expressed. In some embodiments, the organism is a plant. In further embodiments, the recombinant DNA molecule is integrated into a genome of the plant. The present invention also provides a transgenic plant which produces the modified hydroxylase according to this method.
The present invention also provides a method for altering the phenotype of a plant, comprising providing an expression vector comprising a recombinant DNA molecule comprising any of the nucleic acid sequences encoding any of the modified fatty acid hydroxylases as described above, where the nucleic acid sequence is operably linked to at least one suitable regulatory sequence, and a plant or a plant tissue or a plant cell; and transfecting the plant or plant tissue or plant cell with the vector under conditions such that the protein is expressed in a plant obtained from the plant or plant tissue or plant cell.
The present invention also provides a method for evaluating fatty acid desaturation or hydroxylation activity of an enzyme, comprising transforming a yeast cell of S. cerevisiae strain YPH499 with a nucleic acid sequence encoding the enzyme under control of an inducible promoter; growing the yeast cell to a culture of cells at high density; and inducing expression of the nucleic acid at about 30 degrees centigrade, such that the desaturation or hydroxylation activity of the enzyme can be evaluated.
DESCRIPTION OF THE FIGURES
Figure 1 shows the accumulation of hydroxy fatty acids in seeds of A. thaliana FAD2-deficient plants transformed with FAD2 variants under control of the seed-specific napin promoter. Each circle represents data obtained from a single plant. Variants are indicated by the amino acid of AtFAD2 to be modified, followed by its position number, followed by the substituted amino acid; if more than one amino acid is modified, the substitutions are separated by a slash. Variants with two or more amino acid substitutions are indicated by a shorthand designation, with the letter of the enzyme source of the substituted amino acid (for example "C" for CFAH), a number to indicate the number of amino acids substituted, "M" to indicate mutant, followed by a period and a second number to indicate the particular variant. The variants are identified as follows: L4M, A. thaliana FAD2 with four substitutions from LFAH (A104G/T148N/S322A/M324I); C4M, A. thaliana FAD2 with four substitutions from CFAH (A104G/T148I/S322A/M324V); single amino acid variants of AtFAD2, with one substitution from CFAH, include A104G, T148I, S322A, and M324I (the single amino acid variant of AtFAD2, with one substitution from LFAH, is designated T148N); double amino acid variants of AtFAD2, with two substitutions from CFAH, include C2M.1 (T148I/M324V), C2M.2 (A104G/S322A), C2M.3 (A104G/M324V), C2M.4 (A104G/T148I), C2M.5 (T148I/S322A), and C2M.6 (S322A/M324V); triple amino acid variants of AtFAD2, with three substitutions from CFAH, include C3M.1 (A104G/T148I/M324V), C3M.2 (A104G/T148I/S322A), C3M.3 (A104G/S322A/M324V), and C3M.4 (T148I/S322A/M324V).
Figure 2 shows the relationship of linoleic acid and ricinoleic acid generated by the expression of the quadruple mutants of FAD2 (L4M and C4M) in seeds of FAD2-deficient A. thaliana plants.
Figure 3 shows the percent linoleic and ricinoleic acid products produced in S. cerevisiae expressing LFAH, FAD2, or FAD2 variant. Variant identifications are the same as in Figure 1.
Figure 4 shows the amino acid sequence of Lesquerella fatty acid hydroxylase (SEQ ID NO:1); the seven amino acids at positions 63, 105, 149, 218, 296, 323, and 325 are underlined, and the amino acid at position 149 is also in bold type.
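The variant notation described for Figure 1 lends itself to mechanical parsing. The following is a minimal sketch, assuming single-letter amino acid codes and slash-separated substitutions as in the figure description; the function names and the `SHORTHAND` table are illustrative and are not part of the disclosure. Only a few of the listed shorthand designations are included.

```python
import re

# One substitution such as "T148I": native residue, position number, substituted residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def parse_variant(name):
    """Split a designation like 'A104G/T148I' into (native, position, substituted) tuples."""
    substitutions = []
    for token in name.split("/"):
        match = SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        native, position, substituted = match.groups()
        substitutions.append((native, int(position), substituted))
    return substitutions

# A subset of the shorthand designations, copied from the Figure 1 description.
SHORTHAND = {
    "L4M": "A104G/T148N/S322A/M324I",
    "C4M": "A104G/T148I/S322A/M324V",
    "C2M.1": "T148I/M324V",
    "C3M.4": "T148I/S322A/M324V",
}

def expand(name):
    """Resolve a shorthand designation if known, then parse its substitutions."""
    return parse_variant(SHORTHAND.get(name, name))
```

For example, `expand("C2M.1")` yields the two substitutions T148I and M324V, while `expand("T148N")` parses the single substitution directly.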
DEFINITIONS
To facilitate an understanding of the present invention, a number of terms and phrases as used herein are defined below:
The term "microorganism" is used in its broadest sense. It includes, but is not limited to, microscopic organisms (and taxonomically related macroscopic organisms) within the categories algae, bacteria, fungi (including lichens), protozoa, viruses, and subviral agents.
The term "plant" is used in its broadest sense. It includes, but is not limited to, any species of woody, ornamental or decorative, crop or cereal, fruit or vegetable plant, and photosynthetic green algae (for example, Chlamydomonas reinhardtii). It also refers to a plurality of plant cells which are largely differentiated into a structure that is present at any stage of a plant's development. Such structures include, but are not limited to, a fruit, shoot, stem, leaf, flower petal, etc. The term "plant tissue" includes differentiated and undifferentiated tissues of plants including those present in roots, shoots, leaves, pollen, seeds and tumors, as well as cells in culture (for example, single cells, protoplasts, embryos, callus, etc.). Plant tissue may be in planta, in organ culture, tissue culture, or cell culture. The term "plant part" as used herein refers to a plant structure or a plant tissue.
The term "crop" or "crop plant" is used in its broadest sense. The term includes, but is not limited to, any species of plant or algae edible by humans, used as a feed for animals, or otherwise used or consumed by humans, or any plant or algae used in industry or commerce.
The term "oil-producing species" refers to plant species which produce and store triacylglycerol in specific organs, primarily in seeds. Such species include but are not limited to soybean (Glycine max), rapeseed and canola (including Brassica napus and B. campestris), sunflower (Helianthus annuus), cotton (Gossypium hirsutum), corn (Zea mays), cocoa (Theobroma cacao), safflower (Carthamus tinctorius), oil palm (Elaeis guineensis), coconut palm (Cocos nucifera), flax (Linum usitatissimum), castor (Ricinus communis) and peanut (Arachis hypogaea). The group also includes non-agronomic species which are useful in developing appropriate expression vectors such as tobacco, rapid cycling Brassica species, and Arabidopsis thaliana, and wild species which may be a source of unique fatty acids.
The term plant cell "compartment" or "organelle" is used in its broadest sense. The term includes but is not limited to, the endoplasmic reticulum, Golgi apparatus, trans Golgi network, plastids, sarcoplasmic reticulum, glyoxysomes, mitochondrial, chloroplast, and nuclear membranes, and the like.
The term "hydroxylase" refers to a monooxygenase (also mixed-function oxidase and mixed-function oxygenase), which is an oxygenase which catalyses the incorporation of one atom of molecular oxygen into a substrate molecule, the other oxygen atom being reduced to water; the reducing power needed for monooxygenase activity may be supplied for example by NADH. The term "fatty acid hydroxylase" refers to a sequence of amino acids, such as a protein, polypeptide or peptide fragment, which demonstrates the ability to catalyze the production of a hydroxy-fatty acid from a fatty acid substrate under enzyme reactive conditions; the substrate may be free fatty acid, a fatty acid salt, fatty acyl-CoA, fatty acyl-ACP or a fatty acyl-lipid. By "enzyme reactive conditions" is meant any necessary conditions (for example, such factors as temperature, pH, lack of inhibiting substances) which will permit the enzyme to function. The catalytic activity of the enzyme is referred to as "hydroxylation." In one aspect, the enzyme demonstrates the ability to catalyze the production of hydroxy-oleic acid.
The terms "ricinoleate" or "ricinoleic acid" or "hydroxy-oleate" or "hydroxy-oleic acid" refer to 12D-hydroxyoctadec-cis-9-enoic acid, and include the free acids, the ACP and CoA esters, the salts of these acids, the glycerolipid esters (particularly the triacylglycerol esters), the wax esters, and the ether derivatives of these acids.
The term "desaturase" refers to a monooxygenase (also mixed-function oxidase and mixed-function oxygenase), which is an oxygenase which catalyses the O2-dependent insertion of a double bond between two carbon atoms; the reducing power needed for the monooxygenase activity may be supplied for example by NADH.
The term "fatty acid desaturase" refers to a sequence of amino acids, such as a protein, polypeptide or peptide fragment, which demonstrates the ability to catalyze the production of an unsaturated bond in a fatty acid substrate under enzyme reactive conditions; the substrate may be free fatty acid, a fatty acid salt, fatty acyl-CoA, fatty acyl-ACP or a fatty acyl-lipid, and it may be an unsaturated, monounsaturated or polyunsaturated fatty acid. The catalytic activity of the enzyme is referred to as "desaturation." By "enzyme reactive conditions" is meant any necessary conditions (that is, such factors as temperature, pH, lack of inhibiting substances) which will permit the enzyme to function. In one aspect, the enzyme demonstrates the ability to catalyze the production of linoleic acid from oleic acid.
The term "reaction specificity" refers to the proportion of hydroxylation or desaturation activity relative to the total of hydroxylation and desaturation activity of a hydroxylase or desaturase enzyme under particular conditions. The specificity can be expressed as the ratio of the two activities, preferably desaturation to hydroxylation; this can be measured, for example, as the amount of desaturated fatty acid product divided by the amount of hydroxylated fatty acid product.
The term "increase" when used in reference to reaction specificity, or ratio of hydroxylation to desaturation activity, of a modified hydroxylase relative to the specificity or ratio of an unmodified hydroxylase, refers to an increase of at least about 5 percent, or preferably of at least about 10 percent, or more preferably of at least about 20 percent, or even more preferably of at least about 50 percent, or even more preferably of at least about 75 percent, or even more preferably of at least about 100 percent.
The term "decrease" when used in reference to reaction specificity, or ratio of hydroxylation to desaturation activity, of a modified hydroxylase relative to the specificity or ratio of an unmodified hydroxylase, refers to a decrease of at least about 5 percent, or preferably of at least about 10 percent, or more preferably of at least about 20 percent, or even more preferably of at least about 50 percent, or even more preferably of at least about 75 percent, or even more preferably of at least about 100 percent.
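The reaction specificity ratio and the percent change used in the "increase" and "decrease" definitions can be illustrated with a short calculation. This is a minimal sketch, assuming specificity is expressed as desaturated product divided by hydroxylated product; the function names and the product amounts are hypothetical and are not measured values from the disclosure.

```python
def reaction_specificity(desaturated, hydroxylated):
    """Ratio of desaturated to hydroxylated product (both in the same units)."""
    if hydroxylated <= 0:
        raise ValueError("hydroxylated product amount must be positive")
    return desaturated / hydroxylated

def percent_change(modified, unmodified):
    """Signed percent change of the modified ratio relative to the unmodified ratio."""
    return 100.0 * (modified - unmodified) / unmodified

# Hypothetical product amounts (for example, percent of total fatty acids).
unmodified = reaction_specificity(desaturated=2.0, hydroxylated=8.0)
modified = reaction_specificity(desaturated=6.0, hydroxylated=4.0)
change = percent_change(modified, unmodified)
```

A positive `change` would be compared against the "increase" thresholds (about 5, 10, 20, 50, 75, or 100 percent), and a negative one against the "decrease" thresholds.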
The term "Lesquerella hydroxylase" refers to an enzyme for which the protein or nucleic acid coding sequence occurs naturally in a Lesquerella plant. However, the coding sequence may be subsequently cloned, and expressed, in different organisms.
The term "native" when used in reference to an amino acid in a particular position in a protein means that the amino acid exists in nature in that particular position. When used in reference to a Lesquerella hydroxylase, it means that the amino acid exists in the hydroxylase as it is found in Lesquerella plant cells, or encoded by a Lesquerella plant gene for the hydroxylase. The term "non-native" when used in reference to an amino acid in a particular position in a protein means that the amino acid that exists in nature in that particular position is replaced by another amino acid; thus, a "non-native" amino acid is an amino acid other than one that exists in nature in that particular position.
The term "position corresponding to position X" when used in reference to an amino acid sequence refers to a second position in a second sequence which is the same as the first position X in a first or reference sequence as determined from an alignment of two amino acid sequences based upon sequence homology, where "X" is the position in an identified referent amino acid sequence, even though the exact position of the corresponding second position within any particular second amino acid sequence may vary, due to amino acid modifications, such as additions and deletions.
The term "modified" when used in reference to a hydroxylase or desaturase of the present invention refers to a hydroxylase or desaturase polypeptide comprising a non-native amino acid at positions corresponding to amino acid positions in Lesquerella hydroxylase, where the modified polypeptide comprises a non-native amino acid at a position corresponding to position 149, at a position corresponding to position 325, or at positions corresponding to both positions, where no more than three of the amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 are non-native amino acids; preferably, the reaction specificity of the modified enzyme differs from the reaction specificity of an unmodified enzyme. Preferably, the position corresponding to position 149 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine.
The terms "protein" and "polypeptide" refer to compounds comprising amino acids joined via peptide bonds and are used interchangeably. A "protein" or "polypeptide" encoded by a gene is not limited to the amino acid sequence encoded by the gene, but includes post-translational modifications of the protein. Where the term "amino acid sequence" is recited herein, it refers to an amino acid sequence of a protein molecule; an "amino acid sequence" can be deduced from the nucleic acid sequence encoding the protein.
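The notion of a "position corresponding to position X" can be illustrated by walking through a pairwise alignment. This is a minimal sketch, assuming the two sequences have already been aligned by some external program and are supplied as equal-length strings with "-" marking gaps; the function name and the toy sequences are hypothetical.

```python
def corresponding_position(aligned_ref, aligned_other, ref_position, gap="-"):
    """Map a 1-based position in the reference to the 1-based position in the other sequence.

    Both arguments are rows of the same pairwise alignment (equal length, gaps as '-').
    Returns None when the reference residue is aligned to a gap in the other sequence.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    ref_count = other_count = 0
    for ref_char, other_char in zip(aligned_ref, aligned_other):
        if ref_char != gap:
            ref_count += 1
        if other_char != gap:
            other_count += 1
        if ref_char != gap and ref_count == ref_position:
            return other_count if other_char != gap else None
    raise ValueError("position lies outside the reference sequence")

# Toy alignment: the second sequence lacks two leading residues and has one insertion.
ref = "MKT-LGA"
other = "--TALGA"
```

Here position 4 of the reference (L) corresponds to position 3 of the second sequence, which shows how additions and deletions shift the numbering without changing which residues correspond.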
The term "portion" when used in reference to a protein (as in "a portion of a given protein") refers to fragments of that protein. The fragments may range in size from four amino acid residues to the entire amino acid sequence minus one amino acid.
The term "chimera" or "chimeric" when used in reference to a polypeptide refers to the expression product of two or more coding sequences obtained from different genes, that have been cloned together and that, after translation, act as a single polypeptide sequence. Chimeric polypeptides are also referred to as "hybrid" polypeptides. The coding sequences include those obtained from the same or from different species of organisms.
The term "fusion" when used in reference to a polypeptide refers to a chimeric protein containing a protein of interest joined to an exogenous protein fragment (the fusion partner). The fusion partner may serve various functions, including enhancement of solubility of the polypeptide of interest, as well as providing an "affinity tag" to allow purification of the recombinant fusion polypeptide from a host cell or from a supernatant or from both. If desired, the fusion partner may be removed from the protein of interest after or during purification.
The term "homolog" or "homologous" when used in reference to a polypeptide refers to a high degree of sequence identity between two polypeptides, or to a high degree of similarity between the three-dimensional structure or to a high degree of similarity between the active site and the mechanism of action. In a preferred embodiment, a homolog has a greater than about 60 percent sequence identity, and more preferably greater than about 75 percent sequence identity, and still more preferably greater than about 90 percent sequence identity, and even more preferably greater than about 96 percent sequence identity with a reference sequence.
As applied to polypeptides, the term "substantial identity" means that two peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap weights, share at least about 80 percent sequence identity, preferably at least about 90 percent sequence identity, more preferably at least about 95 percent sequence identity or more (for example, about 99 percent sequence identity). Preferably, residue positions which are not identical differ by conservative amino acid substitutions.
The terms "variant" and "mutant" when used in reference to a polypeptide refer to an amino acid sequence that differs by one or more amino acids from another, usually related polypeptide. The variant may have "conservative" changes, wherein a substituted amino acid has similar structural or chemical properties. One type of conservative amino acid substitution refers to the interchangeability of residues having similar side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. Preferred conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine. More rarely, a variant may have "non-conservative" changes (for example, replacement of a glycine with a tryptophan). Similar minor variations may also include amino acid deletions or insertions (that is, additions), or both.
The term "gene" refers to a nucleic acid (for example, DNA or RNA) sequence that comprises coding sequences necessary for the production of an RNA, or a polypeptide or its precursor (for example, proinsulin). A functional polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence as long as the desired activity or functional properties (for example, enzymatic activity, ligand binding, signal transduction, etc.) of the polypeptide are retained. The term "portion" when used in reference to a gene refers to fragments of that gene. The fragments may range in size from a few nucleotides to the entire gene sequence minus one nucleotide. Thus, "a nucleotide comprising at least a portion of a gene" may comprise fragments of the gene or the entire gene.
The term "gene" also encompasses the coding regions of a structural gene and includes sequences located adjacent to the coding region on both the 5' and 3' ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA. The sequences which are located 5' of the coding region and which are present on the mRNA are referred to as 5' non-translated sequences. The sequences which are located 3' or downstream of the coding region and which are present on the mRNA are referred to as 3' non-translated sequences. The term "gene" encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed "introns" or "intervening regions" or "intervening sequences." Introns are segments of a gene which are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or "spliced out" from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.
In addition to containing introns, genomic forms of a gene may also include sequences located on both the 5' and 3' end of the sequences which are present on the RNA transcript. These sequences are referred to as "flanking" sequences or regions (these flanking sequences are located 5' or 3' to the non-translated sequences present on the mRNA transcript). The 5' flanking region may contain regulatory sequences such as promoters and enhancers which control or influence the transcription of the gene. The 3' flanking region may contain sequences which direct the termination of transcription, posttranscriptional cleavage and polyadenylation.
The term "heterologous" when used in reference to a gene refers to a gene encoding a factor that is not in its natural environment (in other words, has been altered by the hand of man). For example, a heterologous gene includes a gene from one species introduced into another species. A heterologous gene also includes a gene native to an organism that has been altered in some way (for example, mutated, added in multiple copies, linked to a non-native promoter or enhancer sequence, etc.). Heterologous genes may comprise plant gene sequences that comprise cDNA forms of a plant gene; the cDNA sequences may be expressed in either a sense (to produce mRNA) or anti-sense orientation (to produce an anti-sense RNA transcript that is complementary to the mRNA transcript). Heterologous genes are distinguished from endogenous plant genes in that the heterologous gene sequences are typically joined to nucleotide sequences comprising regulatory elements such as promoters that are not found naturally associated with the gene for the protein encoded by the heterologous gene or with plant gene sequences in the chromosome, or are associated with portions of the chromosome not found in nature (for example, genes expressed in loci where the gene is not normally expressed).
The term "nucleotide sequence of interest" or "nucleic acid sequence of interest" refers to any nucleotide sequence (for example, RNA or DNA), the manipulation of which may be deemed desirable for any reason (for example, treat disease, confer improved qualities, etc.), by one of ordinary skill in the art. Such nucleotide sequences include, but are not limited to, coding sequences of structural genes (for example, reporter genes, selection marker genes, oncogenes, drug resistance genes, growth factors, etc.), and non-coding regulatory sequences which do not encode an mRNA or protein product (for example, promoter sequence, polyadenylation sequence, termination sequence, enhancer sequence, etc.).
The term "structural" when used in reference to a gene or to a nucleotide or nucleic acid sequence refers to a gene or a nucleotide or nucleic acid sequence whose ultimate expression product is a protein (such as an enzyme or a structural protein), an rRNA, an sRNA, a tRNA, etc.
The term "oligonucleotide" refers to a molecule comprised of two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and usually more than ten. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide. The oligonucleotide may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, or a combination thereof.
The term "an oligonucleotide having a nucleotide sequence encoding a gene" or "a nucleic acid sequence encoding" a specified polypeptide refers to a nucleic acid sequence comprising the coding region of a gene or in other words the nucleic acid sequence which encodes a gene product. The coding region may be present in either a cDNA, genomic DNA or RNA form. When present in a DNA form, the oligonucleotide may be single-stranded (that is, the sense strand) or double-stranded. Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript. Alternatively, the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.
The term "recombinant" when made in reference to a nucleic acid molecule refers to a nucleic acid molecule which is comprised of segments of nucleic acid joined together by means of molecular biological techniques. The term "recombinant" when made in reference to a protein or a polypeptide refers to a protein molecule which is expressed using a recombinant nucleic acid molecule.
The terms "complementary" and "complementarity" refer to polynucleotides (in other words, a sequence of nucleotides) related by the base-pairing rules. For example, the sequence "A-G-T" is complementary to the sequence "T-C-A." Complementarity may be "partial," in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be "complete" or "total" complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods which depend upon binding between nucleic acids.
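The base-pairing rules and the distinction between partial and complete complementarity can be illustrated as follows. This is a minimal sketch, assuming DNA bases only and assuming the two strands are written so that position i of one pairs with position i of the other (as in the "A-G-T" and "T-C-A" example); the function names are illustrative.

```python
# Watson-Crick pairing rules for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(bases):
    """Base-by-base complement of a DNA string, for example 'AGT' -> 'TCA'."""
    return "".join(PAIRS[base] for base in bases)

def matched_fraction(strand, other):
    """Fraction of aligned positions at which the two strands obey the pairing rules."""
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(strand, other) if PAIRS[a] == b)
    return matches / len(strand)
```

A `matched_fraction` of 1.0 corresponds to "complete" complementarity, and any value between 0 and 1 corresponds to "partial" complementarity.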
The term "homology" when used in relation to nucleic acids refers to a degree of complementarity. There may be partial homology or complete homology (in other words, identity). "Sequence identity" refers to a measure of relatedness between two or more nucleic acids or proteins, and is given as a percentage with reference to the total comparison length. The identity calculation takes into account those nucleotide or amino acid residues that are identical and in the same relative positions in their respective larger sequences. Calculations of identity may be performed by algorithms contained within computer programs such as "GAP" (Genetics Computer Group, Madison, Wis.) and "ALIGN" (DNAStar, Madison, Wis.). A partially complementary sequence that at least partially inhibits (or competes with) a completely complementary sequence from hybridizing to a target nucleic acid is referred to using the functional term "substantially homologous." The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency. A substantially homologous sequence or probe will compete for and inhibit the binding (in other words, the hybridization) of a sequence which is completely homologous to a target under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (in other words, selective) interaction. The absence of non-specific binding may be tested by the use of a second target which lacks even a partial degree of complementarity (for example, less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.
The following terms are used to describe the sequence relationships between two or more polynucleotides: "reference sequence", "sequence identity", "percentage of sequence identity", and "substantial identity". A "reference sequence" is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA sequence given in a sequence listing or may comprise a complete gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length. Since two polynucleotides may each (1) comprise a sequence (in other words, a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) may further comprise a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a "comparison window" to identify and compare local regions of sequence similarity. A "comparison window", as used herein, refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (in other words, gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window may be conducted by the local homology algorithm of Smith and Waterman (Smith and Waterman (1981) Adv. Appl. Math 2:482), by the homology alignment algorithm of Needleman and Wunsch (Needleman and Wunsch (1970) J. Mol. Biol. 48:443), by the search for similarity method of Pearson and Lipman (Pearson and Lipman (1988) Proc. Natl. Acad. Sci. (U.S.A.) 85:2444), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by inspection, and the best alignment (in other words, resulting in the highest percentage of homology over the comparison window) generated by the various methods is selected. The term "sequence identity" means that two polynucleotide sequences are identical (in other words, on a nucleotide-by-nucleotide basis) over the window of comparison. The term "percentage of sequence identity" is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (for example, A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (in other words, the window size), and multiplying the result by 100 to yield the percentage of sequence identity. The term "substantial identity" as used herein denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 85 percent sequence identity, preferably at least 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison window of at least 20 nucleotide positions, frequently over a window of at least 25-50 nucleotides, wherein the percentage of sequence identity is calculated by comparing the reference sequence to the polynucleotide sequence which may include deletions or additions which total 20 percent or less of the reference sequence over the window of comparison.
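The "percentage of sequence identity" calculation described above (matched positions divided by window size, multiplied by 100) can be written out directly. This is a minimal sketch, assuming the two sequences have already been optimally aligned over the comparison window by an external program and are supplied as equal-length strings with "-" for gaps; the function names and default thresholds are illustrative, and the 20 percent limit on gaps is not checked.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Matched positions divided by the window size, multiplied by 100."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matched = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matched / len(aligned_a)

def is_substantially_identical(aligned_a, aligned_b, threshold=85.0, min_window=20):
    """Apply the 85 percent threshold over a comparison window of at least 20 positions."""
    if len(aligned_a) < min_window:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold

# Hypothetical 20-nucleotide window with two mismatches (18 of 20 positions match).
reference = "ACGTACGTACGTACGTACGT"
candidate = "ACGTTCGTACGTACGAACGT"
```

With these hypothetical sequences the identity is 90 percent, which would satisfy the 85 percent threshold for "substantial identity" but not the 99 percent level.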
The term "substantially homologous" when used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone refers to any probe that can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low to high stringency as described above.
The term "substantially homologous" when used in reference to a single-stranded nucleic acid sequence refers to any probe that can hybridize to (that is, it is the complement of) the single-stranded nucleic acid sequence under conditions of low to high stringency as described above.
The term "hybridization" refers to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (in other words, the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acids. A single molecule that contains pairing of complementary nucleic acids within its structure is said to be "self-hybridized."
The term "Tm" refers to the "melting temperature" of a nucleic acid. The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. The equation for calculating the Tm of nucleic acids is well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation: Tm = 81.5 + 0.41 (% G + C), when a nucleic acid is in aqueous solution at 1 M NaCl (see, for example, Anderson and Young (1985) Quantitative Filter Hybridization, in Nucleic Acid Hybridization). Other references include more sophisticated computations that take structural as well as sequence characteristics into account for the calculation of Tm.
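The simple Tm estimate quoted above can be evaluated for a given sequence as follows. This is a minimal sketch of that one equation only, assuming the result is in degrees Celsius and that the nucleic acid is in aqueous solution at 1 M NaCl as stated; the function names and the example sequence are hypothetical, and the more sophisticated computations mentioned in the text are not attempted.

```python
def percent_gc(sequence):
    """Percentage of G and C bases in a nucleic acid sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    gc_count = sequence.count("G") + sequence.count("C")
    return 100.0 * gc_count / len(sequence)

def estimate_tm(sequence):
    """Simple estimate: Tm = 81.5 + 0.41 * (% G + C)."""
    return 81.5 + 0.41 * percent_gc(sequence)

# Hypothetical sequence with 40 percent G + C content.
example_tm = estimate_tm("ATGCATGCAT")
```

For the hypothetical 40 percent G + C sequence the estimate is about 97.9, since 81.5 + 0.41 × 40 = 97.9.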
The term "stringency" refers to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. With "high stringency" conditions, nucleic acid base pairing will occur only between nucleic acid fragments that have a high frequency of complementary base sequences. Thus, conditions of "low" stringency are often required with nucleic acids that are derived from organisms that are genetically diverse, as the frequency of complementary sequences is usually less.
"Low stringency conditions" when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5X Denhardt's reagent (50X Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)) and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 5X SSPE, 0.1% SDS at 42°C when a probe of about 500 nucleotides in length is employed.
"Medium stringency conditions" when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5X Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 1.0X SSPE, 1.0% SDS at 42°C when a probe of about 500 nucleotides in length is employed.
"High stringency conditions" when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5X Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 0.1X SSPE, 1.0% SDS at 42°C when a probe of about 500 nucleotides in length is employed.
It is well known that numerous equivalent conditions may be employed to comprise low stringency conditions; factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (for example, the presence or absence of formamide, dextran sulfate, polyethylene glycol) are considered and the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions. In addition, the art knows conditions that promote hybridization under conditions of high stringency (for example, increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.).
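The three sets of conditions defined above differ mainly in the SDS content of the hybridization solution and in the salt and SDS content of the wash. The following sketch tabulates them for comparison. The class name `Stringency`, its field names, and the function `rank_by_stringency` are hypothetical; the ordering rule (a lower SSPE concentration in the wash at a fixed temperature is taken as more stringent) is an assumption used only for illustration and does not capture the equivalent conditions discussed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    """One hybridization/wash regime for a probe of about 500 nucleotides."""
    name: str
    hyb_sds_percent: float   # SDS in the 5X SSPE hybridization solution
    wash_sspe_x: float       # SSPE concentration of the wash (X)
    wash_sds_percent: float  # SDS in the wash
    temp_c: float = 42.0     # hybridization and wash temperature


# Values transcribed from the low, medium, and high stringency definitions.
CONDITIONS = [
    Stringency("low", hyb_sds_percent=0.1, wash_sspe_x=5.0, wash_sds_percent=0.1),
    Stringency("medium", hyb_sds_percent=0.5, wash_sspe_x=1.0, wash_sds_percent=1.0),
    Stringency("high", hyb_sds_percent=0.5, wash_sspe_x=0.1, wash_sds_percent=1.0),
]


def rank_by_stringency(conditions):
    """Order regimes from least to most stringent wash.

    Assumption for illustration: at equal temperature, a wash with less
    salt (lower SSPE concentration) is treated as more stringent.
    """
    return sorted(conditions, key=lambda c: c.wash_sspe_x, reverse=True)


if __name__ == "__main__":
    for cond in rank_by_stringency(CONDITIONS):
        print(f"{cond.name}: wash {cond.wash_sspe_x}X SSPE, "
              f"{cond.wash_sds_percent}% SDS at {cond.temp_c} C")
```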
The term "amplification" refers to a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (in other words, replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (that is, synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of "target" specificity. Target sequences are "targets" in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out. Template specificity is achieved in most amplification techniques by the choice of enzyme. Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the specific template for the replicase (Kacian et al. (1972) Proc. Natl. Acad. Sci. USA, 69:3038). Other nucleic acid will not be replicated by this amplification enzyme. Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a stringent specificity for its own promoters (Chamberlin et al. (1970) Nature 228:227). In the case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides, where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (Wu and Wallace (1989) Genomics 4:560). Finally, Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (H.A. Erlich (ed.), PCR Technology, Stockton Press (1989)).
The term "amplifiable nucleic acid" refers to nucleic acids that may be amplified by any amplification method. It is contemplated that "amplifiable nucleic acid" will usually comprise "sample template."
The term "sample template" refers to nucleic acid originating from a sample that is analyzed for the presence of "target" (defined below). In contrast, "background template" is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
The term "primer" refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (that is, in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
The term "probe" refers to an oligonucleotide (in other words, a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing to another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention will be labeled with any "reporter molecule," so that it is detectable in any detection system, including, but not limited to enzyme (for example, ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
The term "target," when used in reference to the polymerase chain reaction, refers to the region of nucleic acid bounded by the primers used for polymerase chain reaction. Thus, the "target" is sought to be sorted out from other nucleic acid sequences. A "segment" is defined as a region of nucleic acid within the target sequence.
The term "polymerase chain reaction" ("PCR") refers to the method of K.B. Mullis, U.S. Patent Nos. 4,683,195, 4,683,202, and 4,965,188, which describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (that is, denaturation, annealing and extension constitute one "cycle"; there can be numerous "cycles") to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the "polymerase chain reaction" (hereinafter "PCR"). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be "PCR amplified."
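The effect of repeating the denaturation, annealing, and extension cycle can be illustrated with an idealized calculation. The sketch below is not taken from the cited patents; the function name `theoretical_copies` and the `efficiency` parameter are hypothetical, and the model assumes that each cycle multiplies the number of template copies by a constant factor, which real reactions only approximate before reagents become limiting.

```python
def theoretical_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of template copies after a number of PCR cycles.

    Assumption: every cycle multiplies the copy number by (1 + efficiency),
    where efficiency = 1.0 means perfect doubling per cycle.
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # A single starting copy after 30 cycles of perfect doubling.
    print(theoretical_copies(1, 30))
    # The same reaction at a hypothetical 90% per-cycle efficiency.
    print(theoretical_copies(1, 30, efficiency=0.9))
```

Under perfect doubling a single copy yields 2^30, or about 1.07 × 10^9, copies after 30 cycles, which is consistent with the amplified segment becoming the predominant sequence in the mixture.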
With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (for example, hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
The terms "PCR product," "PCR fragment," and "amplification product" refer to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.
The term "amplification reagents" refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification except for primers, nucleic acid template, and the amplification enzyme. Typically, amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).
The term "reverse-transcriptase" or "RT-PCR" refers to a type of PCR where the starting material is mRNA. The starting mRNA is enzymatically converted to complementary DNA or "cDNA" using a reverse transcriptase enzyme. The cDNA is then used as a "template" for a "PCR" reaction.
The term "gene expression" refers to the process of converting genetic information encoded in a gene into RNA (for example, mRNA, rRNA, tRNA, or snRNA) through "transcription" of the gene (in other words, via the enzymatic action of an RNA polymerase), and into protein, through "translation" of mRNA. Gene expression can be regulated at many stages in the process. "Up-regulation" or "activation" refers to regulation that increases the production of gene expression products (in other words, RNA or protein), while "down-regulation" or "repression" refers to regulation that decreases production. Molecules (for example, transcription factors) that are involved in up-regulation or down-regulation are often called "activators" and "repressors," respectively.
The terms "in operable combination", "in operable order" and "operably linked" refer to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule is produced. The term also refers to the linkage of amino acid sequences in such a manner so that a functional protein is produced.
The term "regulatory element" refers to a genetic element which controls some aspect of the expression of nucleic acid sequences. For example, a promoter is a regulatory element which facilitates the initiation of transcription of an operably linked coding region. Other regulatory elements are splicing signals, polyadenylation signals, termination signals, etc.
Transcriptional control signals in eukaryotes comprise "promoter" and "enhancer" elements. Promoters and enhancers consist of short arrays of DNA sequences that interact specifically with cellular proteins involved in transcription (Maniatis, et al. (1987) Science 236:1237). Promoter and enhancer elements have been isolated from a variety of eukaryotic sources including genes in yeast, insect, mammalian and plant cells. Promoter and enhancer elements have also been isolated from viruses and analogous control elements, such as promoters, are also found in prokaryotes. The selection of a particular promoter and enhancer depends on the cell type used to express the protein of interest. Some eukaryotic promoters and enhancers have a broad host range while others are functional in a limited subset of cell types (for review, see Voss, et al. (1986) Trends Biochem. Sci., 11:287 and Maniatis, et al. (1987) supra).
The terms "promoter element," "promoter," or "promoter sequence" refer to a DNA sequence that is located at the 5' end (in other words, precedes) of the coding region of a DNA polymer. The location of most promoters known in nature precedes the transcribed region. The promoter functions as a switch, activating the expression of a gene. If the gene is activated, it is said to be transcribed, or participating in transcription. Transcription involves the synthesis of mRNA from the gene. The promoter, therefore, serves as a transcriptional regulatory element and also provides a site for initiation of transcription of the gene into mRNA.
The term "regulatory region" refers to a gene's 5' transcribed but untranslated regions, located immediately downstream from the promoter and ending just prior to the translational start of the gene.
The term "promoter region" refers to the region immediately upstream of the coding region of a DNA polymer, and is typically between about 500 bp and 4 kb in length, and is preferably about 1 to 1.5 kb in length.
Promoters may be tissue specific or cell specific. The term "tissue specific" as it applies to a promoter refers to a promoter that is capable of directing selective expression of a nucleotide sequence of interest to a specific type of tissue (for example, seeds) in the relative absence of expression of the same nucleotide sequence of interest in a different type of tissue (for example, leaves). Tissue specificity of a promoter may be evaluated by, for example, operably linking a reporter gene to the promoter sequence to generate a reporter construct, introducing the reporter construct into the genome of a plant such that the reporter construct is integrated into every tissue of the resulting transgenic plant, and detecting the expression of the reporter gene (for example, detecting mRNA, protein, or the activity of a protein encoded by the reporter gene) in different tissues of the transgenic plant. The detection of a greater level of expression of the reporter gene in one or more tissues relative to the level of expression of the reporter gene in other tissues shows that the promoter is specific for the tissues in which greater levels of expression are detected.
The term "cell type specific" as applied to a promoter refers to a promoter which is capable of directing selective expression of a nucleotide sequence of interest in a specific type of cell in the relative absence of expression of the same nucleotide sequence of interest in a different type of cell within the same tissue. The term "cell type specific" when applied to a promoter also means a promoter capable of promoting selective expression of a nucleotide sequence of interest in a region within a single tissue. Cell type specificity of a promoter may be assessed using methods well known in the art, for example, immunohistochemical staining. Briefly, tissue sections are embedded in paraffin, and paraffin sections are reacted with a primary antibody which is specific for the polypeptide product encoded by the nucleotide sequence of interest whose expression is controlled by the promoter. A labeled (for example, peroxidase conjugated) secondary antibody which is specific for the primary antibody is allowed to bind to the sectioned tissue and specific binding detected (for example, with avidin biotin) by microscopy.
Promoters may be constitutive or inducible. The term "constitutive" when made in reference to a promoter means that the promoter is capable of directing transcription of an operably linked nucleic acid sequence in the absence of a stimulus (for example, heat shock, chemicals, light, etc.). Typically, constitutive promoters are capable of directing expression of a transgene in substantially any cell and any tissue. Exemplary constitutive plant promoters include, but are not limited to, 35S Cauliflower Mosaic Virus (CaMV 35S; see for example, U.S. Pat. No. 5,352,605, incorporated herein by reference), mannopine synthase, octopine synthase (ocs), superpromoter (see for example, WO 95/14098), and ubi3 (see for example, Garbarino and Belknap (1994) Plant Mol. Biol. 24:119-127) promoters. Such promoters have been used successfully to direct the expression of heterologous nucleic acid sequences in transformed plant tissue.
In contrast, an "inducible" promoter is one which is capable of directing a level of transcription of an operably linked nucleic acid sequence in the presence of a stimulus (for example, heat shock, chemicals, light, etc.) which is different from the level of transcription of the operably linked nucleic acid sequence in the absence of the stimulus.
The enhancer and/or promoter may be "endogenous" or "exogenous" or "heterologous." An "endogenous" enhancer or promoter is one that is naturally linked with a given gene in the genome. An "exogenous" or "heterologous" enhancer or promoter is one that is placed in juxtaposition to a gene by means of genetic manipulation (in other words, molecular biological techniques) such that transcription of the gene is directed by the linked enhancer or promoter. For example, an endogenous promoter in operable combination with a first gene can be isolated, removed, and placed in operable combination with a second gene, thereby making it a "heterologous promoter" in operable combination with the second gene. A variety of such combinations are contemplated (for example, the first and second genes can be from the same species, or from different species).
The term "naturally linked" or "naturally located" when used in reference to the relative positions of nucleic acid sequences means that the nucleic acid sequences exist in nature in the relative positions.
The presence of "splicing signals" on an expression vector often results in higher levels of expression of the recombinant transcript in eukaryotic host cells. Splicing signals mediate the removal of introns from the primary RNA transcript and consist of a splice donor and acceptor site (Sambrook, et al. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989) pp. 16.7-16.8). A commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40.
Efficient expression of recombinant DNA sequences in eukaryotic cells requires expression of signals directing the efficient termination and polyadenylation of the resulting transcript. Transcription termination signals are generally found downstream of the polyadenylation signal and are a few hundred nucleotides in length. The term "poly(A) site" or "poly(A) sequence" as used herein denotes a DNA sequence which directs both the termination and polyadenylation of the nascent RNA transcript. Efficient polyadenylation of the recombinant transcript is desirable, as transcripts lacking a poly(A) tail are unstable and are rapidly degraded. The poly(A) signal utilized in an expression vector may be "heterologous" or "endogenous." An endogenous poly(A) signal is one that is found naturally at the 3' end of the coding region of a given gene in the genome. A heterologous poly(A) signal is one which has been isolated from one gene and positioned 3' to another gene. A commonly used heterologous poly(A) signal is the SV40 poly(A) signal. The SV40 poly(A) signal is contained on a 237 bp BamHI/BclI restriction fragment and directs both termination and polyadenylation (Sambrook, supra, at 16.6-16.7).
The term "vector" refers to nucleic acid molecules that transfer DNA segment(s) from one cell to another. The term "vehicle" is sometimes used interchangeably with "vector."
The terms "expression vector" or "expression cassette" refer to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.
The term "transfection" refers to the introduction of foreign DNA into cells. Transfection may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, glass beads, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, viral infection, biolistics (that is, particle bombardment) and the like.
The term "stable transfection" or "stably transfected" refers to the introduction and integration of foreign DNA into the genome of the transfected cell. The term "stable transfectant" refers to a cell that has stably integrated foreign DNA into the genomic DNA.
The term "transient transfection" or "transiently transfected" refers to the introduction of foreign DNA into a cell where the foreign DNA fails to integrate into the genome of the transfected cell. The foreign DNA persists in the nucleus of the transfected cell for several days. During this time the foreign DNA is subject to the regulatory controls that govern the expression of endogenous genes in the chromosomes. The term "transient transfectant" refers to cells that have taken up foreign DNA but have failed to integrate this DNA.
The term "calcium phosphate co-precipitation" refers to a technique for the introduction of nucleic acids into a cell. The uptake of nucleic acids by cells is enhanced when the nucleic acid is presented as a calcium phosphate-nucleic acid co-precipitate. The original technique of Graham and van der Eb (Graham and van der Eb, 1973, Virol., 52:456), has been modified by several groups to optimize conditions for particular types of cells. The art is well aware of these numerous modifications.
The terms "infecting" and "infection" when used with a bacterium refer to co-incubation of a target biological sample (for example, cell, tissue, etc.) with the bacterium under conditions such that nucleic acid sequences contained within the bacterium are introduced into one or more cells of the target biological sample.
The term "Agrobacterium" refers to a soil-borne, Gram-negative, rod-shaped phytopathogenic bacterium which causes crown gall. The term "Agrobacterium" includes, but is not limited to, the strains Agrobacterium tumefaciens (which typically causes crown gall in infected plants), and Agrobacterium rhizogenes (which causes hairy root disease in infected host plants). Infection of a plant cell with Agrobacterium generally results in the production of opines (for example, nopaline, agropine, octopine etc.) by the infected cell. Thus, Agrobacterium strains which cause production of nopaline (for example, strain LBA4301, C58, A208, GV3101) are referred to as "nopaline-type" Agrobacteria; Agrobacterium strains which cause production of octopine (for example, strain LBA4404, Ach5, B6) are referred to as "octopine-type" Agrobacteria; and Agrobacterium strains which cause production of agropine (for example, strain EHA105, EHA101, A281) are referred to as "agropine-type" Agrobacteria.
The terms "bombarding," "bombardment," and "biolistic bombardment" refer to the process of accelerating particles towards a target biological sample (for example, cell, tissue, etc.) to effect wounding of the cell membrane of a cell in the target biological sample and/or entry of the particles into the target biological sample. Methods for biolistic bombardment are known in the art (for example, U.S. Patent No. 5,584,807, the contents of which are incorporated herein by reference), and are commercially available (for example, the helium gas-driven microprojectile accelerator (PDS-1000/He, BioRad)).
The term "microwounding" when made in reference to plant tissue refers to the introduction of microscopic wounds in that tissue. Microwounding may be achieved by, for example, particle bombardment as described herein.
The term "transgene" refers to a foreign gene that is placed into an organism by the process of transfection. The term "foreign gene" refers to any nucleic acid (for example, gene sequence) that is introduced into the genome of an organism by experimental manipulations and may include gene sequences found in that organism so long as the introduced gene does not reside in the same location as does the naturally-occurring gene.
The term "transgenic" when used in reference to a plant or fruit or seed (that is, a "transgenic plant" or "transgenic fruit" or a "transgenic seed") refers to a plant or fruit or seed that contains at least one heterologous or foreign gene in one or more of its cells. The term "transgenic plant material" refers broadly to a plant, a plant structure, a plant tissue, a plant seed or a plant cell that contains at least one heterologous gene in one or more of its cells.
The term "host cell" refers to any cell capable of replicating and/or transcribing and/or translating a heterologous gene. Thus, a "host cell" refers to any eukaryotic or prokaryotic cell (for example, bacterial cells such as E. coli, yeast cells, mammalian cells, avian cells, amphibian cells, plant cells, fish cells, and insect cells), whether located in vitro or in vivo. For example, host cells may be located in a transgenic animal. The terms "transformants" or "transformed cells" include the primary transformed cell or tissue, cultures derived from that cell without regard to the number of transfers, and progeny derived from the transformed cell or tissue, such as a transgenic plant or bacteria. All progeny may not be precisely identical in DNA content, due to deliberate or inadvertent mutations. Mutant progeny that have the same functionality as screened for in the originally transformed cell are included in the definition of transformants.
The term "selectable marker" refers to a gene which encodes an enzyme having an activity that confers resistance to an antibiotic or drug upon the cell in which the selectable marker is expressed, or which confers expression of a trait which can be detected (for example, luminescence or fluorescence). Selectable markers may be "positive" or "negative." Examples of positive selectable markers include the neomycin phosphotransferase (NPTII) gene which confers resistance to G418 and to kanamycin, and the bacterial hygromycin phosphotransferase gene (hyg), which confers resistance to the antibiotic hygromycin. Negative selectable markers encode an enzymatic activity whose expression is cytotoxic to the cell when grown in an appropriate selective medium. For example, the HSV-tk gene is commonly used as a negative selectable marker. Expression of the HSV-tk gene in cells grown in the presence of gancyclovir or acyclovir is cytotoxic; thus, growth of cells in selective medium containing gancyclovir or acyclovir selects against cells capable of expressing a functional HSV TK enzyme.
The term "reporter gene" refers to a gene encoding a protein that may be assayed. Examples of reporter genes include, but are not limited to, luciferase (See, for example, deWet et al. (1987) Mol. Cell. Biol. 7:725 and U.S. Pat. Nos. 6,074,859; 5,976,796; 5,674,713; and 5,618,682; all of which are incorporated herein by reference), green fluorescent protein (for example, GenBank Accession Number U43284; a number of GFP variants are commercially available from CLONTECH Laboratories, Palo Alto, CA), chloramphenicol acetyltransferase, β-galactosidase, alkaline phosphatase, and horse radish peroxidase.
The term "wild-type" when made in reference to a gene refers to a gene which has the characteristics of a gene isolated from a naturally occurring source. The term "wild-type" when made in reference to a gene product refers to a gene product which has the characteristics of a gene product isolated from a naturally occurring source. The term "naturally-occurring" as applied to an object refers to the fact that an object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally-occurring. A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the "normal" or "wild-type" form of the gene. In contrast, the term "modified" or "mutant" when made in reference to a gene or to a gene product refers, respectively, to a gene or to a gene product which displays modifications in sequence and/or functional properties (in other words, altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
The term "antisense" refers to a deoxyribonucleotide sequence whose sequence of deoxyribonucleotide residues is in reverse 5' to 3' orientation in relation to the sequence of deoxyribonucleotide residues in a sense strand of a DNA duplex. A "sense strand" of a DNA duplex refers to a strand in a DNA duplex which is transcribed by a cell in its natural state into a "sense mRNA." Thus an "antisense" sequence is a sequence having the same sequence as the non-coding strand in a DNA duplex.
The term "antisense RNA" refers to a RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene by interfering with the processing, transport and/or translation of its primary transcript or mRNA. The complementarity of an antisense RNA may be with any part of the specific gene transcript, in other words, at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. In addition, as used herein, antisense RNA may contain regions of ribozyme sequences that increase the efficacy of antisense RNA to block gene expression. "Ribozyme" refers to a catalytic RNA and includes sequence-specific endoribonucleases. "Antisense inhibition" refers to the production of antisense RNA transcripts capable of preventing the expression of the target protein.
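The relationship between a sense strand and the corresponding antisense sequence can be sketched computationally. The sketch below assumes the common reading in which the antisense sequence is the reverse complement of the sense strand, written 5' to 3'; the function name `reverse_complement` and the example sequence are hypothetical, and only unambiguous DNA bases (A, C, G, T) are handled.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sense: str) -> str:
    """Return the strand complementary to `sense`, written 5' to 3'.

    Assumption: input contains only A, C, G, and T (upper or lower case).
    """
    if set(sense.upper()) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, and T")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return sense.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense_strand = "ATGGCC"  # hypothetical sense-strand fragment, 5' to 3'
    print(reverse_complement(sense_strand))
```

Applying the function twice returns the original sense sequence, reflecting that the two strands of a duplex are mutually complementary.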
The term "overexpression" generally refers to the production of a gene product in transgenic organisms that exceeds levels of production in normal or non-transformed organisms. The term "cosuppression" refers to the expression of a foreign gene which has substantial homology to an endogenous gene resulting in the suppression of expression of both the foreign and the endogenous gene. As used herein, the term "altered levels" refers to the production of gene product(s) in transgenic organisms in amounts or proportions that differ from that of normal or non-transformed organisms.
The terms "overexpression" and "overexpressing" and grammatical equivalents, are used specifically in reference to levels of mRNA to indicate a level of expression approximately 3-fold higher than that typically observed in a given tissue in a control or non-transgenic animal. Levels of mRNA are measured using any of a number of techniques known to those skilled in the art including, but not limited to Northern blot analysis (See, Example 10, for a protocol for performing Northern blot analysis). Appropriate controls are included on the Northern blot to control for differences in the amount of RNA loaded from each tissue analyzed (for example, the amount of 28S rRNA, an abundant RNA transcript present at essentially the same amount in all tissues, present in each sample can be used as a means of normalizing or standardizing the RAD50 mRNA-specific signal observed on Northern blots).
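The normalization described above can be illustrated as a small calculation. In the sketch below, the function names, the signal values, and the use of a hard threshold of 3.0 are hypothetical; the text states only that overexpression is approximately 3-fold higher than the control, so the cutoff is an assumption for illustration.

```python
def normalized_fold_change(target_test: float, loading_test: float,
                           target_control: float, loading_control: float) -> float:
    """Fold change of a target mRNA signal after loading normalization.

    Each target signal is divided by the loading-control signal (for
    example, 28S rRNA) from the same lane before the ratio is taken.
    """
    if min(loading_test, loading_control, target_control) <= 0:
        raise ValueError("control and loading signals must be positive")
    test_ratio = target_test / loading_test
    control_ratio = target_control / loading_control
    return test_ratio / control_ratio


def is_overexpressed(fold_change: float, threshold: float = 3.0) -> bool:
    """Assumed cutoff: treat a fold change at or above `threshold` as overexpression."""
    return fold_change >= threshold


if __name__ == "__main__":
    # Hypothetical band intensities in arbitrary densitometry units.
    fold = normalized_fold_change(target_test=600.0, loading_test=100.0,
                                  target_control=150.0, loading_control=75.0)
    print(fold, is_overexpressed(fold))
```

With these hypothetical values the normalized ratios are 6.0 and 2.0, giving a fold change of 3.0.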
The terms "Southern blot analysis" and "Southern blot" and "Southern" refer to the analysis of DNA on agarose or acrylamide gels in which DNA is separated or fragmented according to size followed by transfer of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized DNA is then exposed to a labeled probe to detect DNA species complementary to the probe used. The DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support. Southern blots are a standard tool of molecular biologists (J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY (1989), pp 9.31-9.58).
The terms "Northern blot analysis" and "Northern blot" and "Northern" refer to the analysis of RNA by electrophoresis of RNA on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized RNA is then probed with a labeled probe to detect RNA species complementary to the probe used. Northern blots are a standard tool of molecular biologists (J. Sambrook et al., supra, pp 7.39-7.52).
The terms "Western blot analysis" and "Western blot" and "Western" refer to the analysis of protein(s) (or polypeptides) immobilized onto a support such as nitrocellulose or a membrane. A mixture comprising at least one protein is first separated on an acrylamide gel, and the separated proteins are then transferred from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized proteins are exposed to at least one antibody with reactivity against at least one antigen of interest. The bound antibodies may be detected by various methods, including the use of radiolabeled antibodies.
The term "antigenic determinant" refers to that portion of an antigen that makes contact with a particular antibody (in other words, an epitope). When a protein or fragment of a protein is used to immunize a host animal, numerous regions of the protein may induce the production of antibodies that bind specifically to a given region or three-dimensional structure on the protein; these regions or structures are referred to as antigenic determinants. An antigenic determinant may compete with the intact antigen (in other words, the "immunogen" used to elicit the immune response) for binding to an antibody.
The term "isolated" when used in relation to a nucleic acid, as in "an isolated oligonucleotide", refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids, such as DNA and RNA, are found in the state they exist in nature. For example, a given DNA sequence (for example, a gene) is found on the host cell chromosome in proximity to neighboring genes; RNA sequences, such as a specific mRNA sequence encoding a specific protein, are found in the cell as a mixture with numerous other mRNAs which encode a multitude of proteins. However, isolated nucleic acid encoding a particular protein includes, by way of example, such nucleic acid in cells ordinarily expressing the protein, where the nucleic acid is in a chromosomal location different from that of natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature. The isolated nucleic acid or oligonucleotide may be present in single-stranded or double-stranded form. When an isolated nucleic acid or oligonucleotide is to be utilized to express a protein, the oligonucleotide will contain at a minimum the sense or coding strand (that is, the oligonucleotide may be single-stranded), but may contain both the sense and antisense strands (that is, the oligonucleotide may be double-stranded).
The term "purified" refers to molecules, either nucleic or amino acid sequences, that are removed from their natural environment, isolated or separated. An "isolated nucleic acid sequence" is therefore a purified nucleic acid sequence. "Substantially purified" molecules are at least 60% free, preferably at least 75% free, and more preferably at least 90% free from other components with which they are naturally associated. As used herein, the term "purified" or "to purify" also refers to the removal of contaminants from a sample. The removal of contaminating proteins results in an increase in the percent of polypeptide of interest in the sample. In another example, recombinant polypeptides are expressed in plant, bacterial, yeast, or mammalian host cells and the polypeptides are purified by the removal of host cell proteins; the percent of recombinant polypeptides is thereby increased in the sample.
The term "sample" is used in its broadest sense. In one sense it can refer to a plant cell or tissue. In another sense, it is meant to include a specimen or culture obtained from any source, as well as biological and environmental samples. Biological samples may be obtained from plants or animals (including humans) and encompass fluids, solids, tissues, and gases. Environmental samples include environmental material such as surface matter, soil, water, and industrial samples. These examples are not to be construed as limiting the sample types applicable to the present invention.
DESCRIPTION OF THE INVENTION
The hydroxylase enzymes from castor, CFAH (van de Loo, F. J. et al. (1995) Proc. Natl. Acad. Sci. USA 92(15), 6743-6747), and Lesquerella, LFAH (Broun, P. et al. (1997) Plant J. 13, 201-210), are closely related to the common plant oleate desaturase enzyme, FAD2, which converts oleate (18:1Δ9) into linoleate (18:2Δ9,12). Indeed, LFAH actually retains both hydroxylase and desaturase activity, indicating that these two oxidation reactions can be catalyzed by the same enzyme. Comparison of the amino acid sequences of these two oleate hydroxylases with several oleate desaturases indicated that there are only seven residues that are conserved in all of the desaturases but different in both of the hydroxylases (Broun, P. et al. (1998) Science 282(5392), 1315-1317); these divergent residues are listed in Table 1 under Example 2. The role of these seven residues in determining desaturase and hydroxylase activity has been previously investigated by amino acid substitution at these positions (Broun, P. et al. (1998) Science 282(5392), 1315-1317).
Thus, it was previously reported (Broun, P. et al. (1998) Science 282(5392), 1315-1317) that substitution of all seven residues in the desaturase (A. thaliana FAD2 residues 63, 104, 148, 217, 295, 322, and 324) with the corresponding residues from a hydroxylase (LFAH) was sufficient to convert the desaturase into a bifunctional desaturase/hydroxylase. A reciprocal experiment, in which the desaturase residues were substituted into the hydroxylase (LFAH), generated an enzyme with increased desaturase activity. These experiments thus confirmed the importance of these seven residues in specifying the catalytic outcome. Additional experiments included introducing each of the seven desaturase residues separately into the hydroxylase (no difference from wild-type hydroxylase activity when expressed in yeast cells) and introducing all combinations of six desaturase residues into the hydroxylase (same activity observed as when all seven residues were introduced when expressed in yeast cells), and led to the conclusions that the observed changes in activity result from the interactions of more than two of the seven residues, and that no single amino acid position plays an essential role in catalytic outcome. Further experiments introduced four residues from the hydroxylase, those which are adjacent to histidine residues previously identified as essential to catalysis, into the desaturase (residues 104, 148, 322, and 324, numbered according to AtFAD2) (similar activity observed as when all seven residues were introduced when expressed in plants), leading to the conclusion that only four amino acid changes are required to convert a strict desaturase into an enzyme with some desaturase activity but which is also an efficient hydroxylase enzyme.
During the development of the present invention, the question of the role of these seven residues was approached from a different perspective: small subsets of these residues from a castor hydroxylase (CFAH) were incorporated into AtFAD2. This approach was based upon the observation that of the seven residues previously identified as important to catalytic outcome, hydroxylase enzymes from Lesquerella and from castor differ from each other only at three positions (see Table 1 in Example 2); these three positions are 63, 148, and 324 (using the AtFAD2 numbering). The effects of these modified enzymes were initially evaluated in a plant host; the host was Arabidopsis plants deficient in desaturase activity. These plants had high levels of oleate (18:1), but because they were deficient in FAD2 activity, did not synthesize linoleate (18:2). Therefore, the effects of heterologous genes encoding modified FAD2 enzymes would be more easily observed than in plants possessing FAD2 activity. The results from this approach are summarized below, and described in more detail in the Examples. It was first discovered that AtFAD2 containing either all seven or the same four residues previously identified as essential to catalysis from the castor hydroxylase (at positions 104, 148, 322, and 324 in AtFAD2; the corresponding residues from CFAH were incorporated into these AtFAD2 positions) generated an enzyme which produced levels of hydroxy fatty acids similar to those previously observed upon expression of the wild type CFAH in Arabidopsis plants deficient in desaturase activity. Moreover, these modified AtFAD2 enzymes produced roughly equal amounts of linoleic acid and ricinoleic acid in the transgenic plants, which indicates that the modified enzymes displayed both desaturase and hydroxylase activities in about equal amounts.
When compared to the results previously observed, when either all seven or the same four residues from LFAH were introduced into AtFAD2, where desaturation dominated hydroxylation, it became clear that the high levels of hydroxy fatty acids observed from the presence of the four amino acid residues from CFAH in AtFAD2 resulted from improved enzyme specificity and not simply optimal transgene context.
The contributions of these four individual CFAH residues in determining product outcome were next assessed by analyzing the results observed from incorporating one and all combinations of two and three residues from CFAH into AtFAD2. The results, in combination with the results described above, indicated that the presence of the amino acid residue at position 148 has a dominant role in determining the desaturation/hydroxylation ratio of the enzymes. This was a surprising discovery, and in contrast to previous reports that no single amino acid position plays an essential role in the catalytic outcome, and that observed changes in enzyme activity result from interactions of more than two of the seven residues (Broun, P. et al. (1998) Science 282(5392), 1315-1317). Thus, from the single amino acid substitutions, the CFAH residues at positions 148 and 324 in AtFAD2 resulted in the production of the highest amounts of hydroxy fatty acids; however, the dramatic effect of the CFAH residue at position 324 was not additive among the two and three introduced residues, whereas the presence of the CFAH residue at position 148 resulted in even higher amounts of hydroxy fatty acids. Furthermore, the castor hydroxylase is a more specific hydroxylase than is the Lesquerella hydroxylase, and the introduction of the four CFAH residues into AtFAD2 resulted in a more specific hydroxylase than did the introduction of the same four residues from LFAH, where specificity is determined by the relative ratio of desaturation/hydroxylation activity, as determined by the relative amounts of fatty acid products in transgenic plants. This implied that at least some of the specificity determinants of the castor hydroxylase were contained within the four CFAH residues. Only two of these residues (148 and 324) are different between LFAH (Asn-148 and Ile-324) and CFAH (Ile-148 and Val-324).
Inspection of the product ratios of the four single mutants, where either CFAH or LFAH single residues were introduced into AtFAD2, demonstrated that there was no difference upon the introduction of CFAH or LFAH residues at position 324, whereas there was an approximate four-fold difference between the product ratios upon introduction of CFAH or LFAH residues at position 148, suggesting that the CFAH residue at position 148 is of primary importance to determining the product distribution of CFAH and of an AtFAD2 containing the four CFAH residues.
This surprising discovery, of a major role for the amino acid residue at position 148 in determining catalytic outcome, was applied to designing a modified Lesquerella hydroxylase enzyme, with the goal of manipulating the relative amounts of both the desaturase and hydroxylase activities. Thus, the specificity of LFAH (as determined by the ratio of desaturation to hydroxylation, that is, amount of desaturated fatty acid product divided by the amount of hydroxylated fatty acid product) was rationally designed to vary, by substituting the equivalent LFAH residue (Asn-149) with the corresponding residue from either AtFAD2 (Thr) or CFAH (Ile). In accordance with the discoveries described above, the reaction specificity of LFAH increased from a value of 0.37 to a value of 0.95 when Asn-149 was substituted with the AtFAD2 equivalent (Thr); thus, the desaturase activity of LFAH was increased. Substitution of Asn-149 with the CFAH equivalent (Ile) created an enzyme with a decreased reaction specificity of 0.21; in this case, the hydroxylase activity of LFAH was increased. Thus, a single amino acid substitution from an enzyme with desaturase activity (AtFAD2) at position 149 decreased the hydroxylation activity ~3-fold, while a single amino acid substitution from an enzyme with hydroxylase activity (CFAH) at the same position increased the hydroxylation activity ~2-fold.
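The reaction specificity defined above is a simple ratio, and the reported values can be compared with a short Python sketch. The function names and dictionary labels are illustrative shorthand rather than terms from the specification; only the three ratios (0.37, 0.95, 0.21) come from the text.

```python
def reaction_specificity(desaturated: float, hydroxylated: float) -> float:
    """Amount of desaturated product divided by amount of hydroxylated product."""
    if hydroxylated <= 0:
        raise ValueError("hydroxylated product amount must be positive")
    return desaturated / hydroxylated


def fold_shift(ratio_a: float, ratio_b: float) -> float:
    """How many times larger ratio_a is than ratio_b."""
    return ratio_a / ratio_b


# Reaction specificities reported in the text (labels are shorthand).
REPORTED = {
    "unmodified LFAH (Asn-149)": 0.37,
    "Asn-149 to Thr (AtFAD2 residue)": 0.95,
    "Asn-149 to Ile (CFAH residue)": 0.21,
}

wild_type = REPORTED["unmodified LFAH (Asn-149)"]
toward_desaturation = fold_shift(REPORTED["Asn-149 to Thr (AtFAD2 residue)"], wild_type)
toward_hydroxylation = fold_shift(wild_type, REPORTED["Asn-149 to Ile (CFAH residue)"])
print(round(toward_desaturation, 2), round(toward_hydroxylation, 2))
```

Under these numbers the threonine substitution shifts the ratio toward desaturation by roughly 2.6-fold and the isoleucine substitution shifts it toward hydroxylation by roughly 1.8-fold, which is consistent with the rounded ~3-fold and ~2-fold figures quoted in the text.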
Therefore, the present invention provides compositions comprising a modified Lesquerella hydroxylase polypeptide or a coding sequence for the modified Lesquerella hydroxylase polypeptide. In particular, the present invention provides a modified Lesquerella hydroxylase polypeptide comprising a non-native amino acid at position 149, at position 325, or at both positions, where no more than three of the amino acids at positions 63, 105, 149, 218, 296, 323 and 325 are non-native amino acids. Preferably, the reaction specificity of the modified hydroxylase differs from the reaction specificity of an unmodified hydroxylase. Preferably, position 149 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine. Preferably, the unmodified Lesquerella hydroxylase is SEQ ID NO: 1 as shown in Figure 4, and is modified as described above.
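The positional rule stated above (a non-native residue at position 149, 325, or both, with no more than three of the seven listed positions altered) can be expressed as a small check. This is a minimal Python sketch; the function name and its input format (a collection of substituted position numbers in LFAH numbering) are assumptions made for illustration, and substitutions outside the seven listed positions are simply ignored.

```python
from typing import Iterable

# The seven positions named in the text, in LFAH numbering.
KEY_POSITIONS = frozenset({63, 105, 149, 218, 296, 323, 325})
REQUIRED_ANY = frozenset({149, 325})
MAX_NON_NATIVE = 3


def satisfies_position_rule(substituted_positions: Iterable[int]) -> bool:
    """Return True if the substitution pattern fits the rule in the text.

    Only the seven key positions are considered here.
    """
    changed = KEY_POSITIONS & set(substituted_positions)
    has_required = bool(changed & REQUIRED_ANY)
    return has_required and len(changed) <= MAX_NON_NATIVE


print(satisfies_position_rule({149}))
print(satisfies_position_rule({63, 105, 149, 218}))
```

A single substitution at position 149 passes, while a variant altered at four of the seven positions does not, even though it includes position 149.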
It is contemplated that other hydroxylase and desaturase enzymes can also be modified, based upon the rational design principles elucidated and described above and applied to the Lesquerella hydroxylase.
During the development of the rational design of fatty acid hydroxylases and desaturases, the effects of modified enzymes were evaluated by transforming nucleic acid sequences encoding the modified enzymes into a plant host, and observing the resulting fatty acid products; typically, the host was an Arabidopsis fad2 mutant, which is a line deficient in FAD2 activity (which is an oleate desaturase activity). However, it was noted that variability among independent transgenic plants expressing the same construct made it difficult to evaluate the effects of different constructs. This variability is believed to be due to insertional position effects. Therefore, the possibility of overcoming this problem by analyzing the constructs in yeast was explored. The utility of S. cerevisiae as a host for the functional heterologous expression of plant fatty acid desaturases was well established (Broun, P. et al. (1998) Science 282(5392), 1315-1317; Brown, A. P. et al. (1998) J. Am. Oil. Chem. Soc. 75, 77-82; and Covello, P. S., and Reed, D. W. (1996) Plant Physiol. 111(1), 223-236). For example, expression of FAD2 desaturases resulted in linoleic acid accumulation of up to ~40%. However, expression of CFAH and LFAH has been less successful, with accumulation of no more than 2% hydroxy fatty acids (Broun, P. et al. (1998) Science 282(5392), 1315-1317; and Smith, M. et al. (2000) Biochem. Soc. Trans. 28: 947-950). Surprisingly, it was discovered that by using an appropriate host strain (S. cerevisiae YPH499) and expression conditions (30°C induction at high cell density), it was possible to optimize hydroxy fatty acid accumulation (up to 27% ricinoleic acid) upon expression of LFAH.
The ability to accumulate high levels of hydroxy fatty acids in yeast permitted a comparative analysis of the AtFAD2 variants, and it is contemplated that use of the strain under the conditions described provides a convenient system in which to screen for additional fatty acid hydroxylases, as well as modified fatty acid desaturase and hydroxylase enzymes. Thus, in another aspect, the present invention also provides a system and a method for screening fatty acid hydroxylase enzyme activities. The system and method utilize as a host strain S. cerevisiae YPH499 and as expression conditions induction at 30°C at high cell density.
The present invention also provides methods for using modified Lesquerella hydroxylase peptides and coding sequences; such methods include but are not limited to using the modified Lesquerella hydroxylase peptides and coding sequences in the production of hydroxylated fatty acids. The description below provides specific, but not limiting, illustrative examples of embodiments of the present invention.
I. Coding Sequences and Polypeptides
A. Coding Sequences
The present invention provides compositions comprising an isolated nucleic acid sequence encoding a modified Lesquerella hydroxylase polypeptide, where the polypeptide comprises a non-native amino acid at position 149, at position 325, or at both positions, where no more than three of the amino acids at positions 63, 105, 149, 218, 296, 323 and 325 are non-native amino acids; preferably, the reaction specificity of the modified hydroxylase differs from the reaction specificity of an unmodified hydroxylase. Preferably, position 149 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine. An unmodified Lesquerella hydroxylase polypeptide is preferably one which occurs naturally in Lesquerella fendleri, and more preferably an unmodified hydroxylase polypeptide comprising the amino acid sequence SEQ ID NO:1 as shown in Figure 4, where such polypeptides are modified according to the present invention.
The present invention also provides compositions comprising a nucleic acid sequence encoding a modified fatty acid hydroxylase polypeptide, where the nucleic acid sequence hybridizes to a nucleic acid sequence encoding SEQ ID NO:1, and where the modified fatty acid hydroxylase polypeptide comprises a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, and no more than three of the amino acids at positions 63, 105, 149, 218, 296, 323 and 325 are non-native amino acids, and where the reaction specificity of the modified hydroxylase differs from a reaction specificity of an unmodified hydroxylase. The nucleic acid sequence encoding a modified fatty acid hydroxylase polypeptide hybridizes to a nucleic acid sequence encoding SEQ ID NO:1 under conditions from low to high stringency; preferably, hybridization is under conditions of high stringency.
Although particular embodiments of the present invention are compositions comprising an isolated nucleic acid sequence encoding a modified Lesquerella hydroxylase polypeptide, it is contemplated that other hydroxylase and desaturase enzymes can also be modified, based upon the rational design principles elucidated and described above and exemplified as applied to the Lesquerella hydroxylase. Therefore, the present invention provides compositions comprising nucleic acid sequences encoding a modified hydroxylase or desaturase polypeptide, where the hydroxylase or desaturase polypeptide is highly homologous to the Lesquerella hydroxylase (at least about 80% homology to SEQ ID NO:1). These modified polypeptides comprise a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, where no more than three of the amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 are non-native amino acids; preferably, the reaction specificity of the modified hydroxylase or desaturase differs from the reaction specificity of an unmodified hydroxylase or desaturase. Preferably, the position corresponding to position 149 of SEQ ID NO:1 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine. The nucleic acid sequence can be oriented to produce sense or antisense transcripts, depending on the desired use. A nucleic acid sequence according to the present invention includes sequences engineered in order to alter a sequence encoding a modified hydroxylase or desaturase, including a modified Lesquerella hydroxylase, for a variety of reasons, including but not limited to alterations that modify the cloning, processing and/or expression of the gene product (such alterations include inserting new restriction sites, altering glycosylation patterns, and changing codon preference).
Although various embodiments of the present invention directed to nucleic acid sequences encoding a modified Lesquerella hydroxylase are described below, it is understood that similar embodiments are directed to nucleic acid sequences encoding other modified hydroxylases and desaturases, where such modifications are described above. Nucleic acid sequences of the present invention are obtained by methods well known in the art. In one embodiment, a nucleic acid sequence of the present invention is obtained by modification of a sequence encoding amino acid sequence SEQ ID NO:1 as shown in Figure 4, such that a modified sequence encodes a modified hydroxylase of the present invention. Methods of preparing modified sequences are well known, and include those described in the Examples. In another embodiment of the invention, the coding sequence for a modified Lesquerella hydroxylase is synthesized, in whole or in part, using chemical methods well known in the art (see, for example, Caruthers et al. (1980) Nucl. Acids Res. Symp. Ser., 7:215-233; Crea and Horn (1980) Nucl. Acids Res., 9:2331; Matteucci and Caruthers (1980) Tetrahedron Lett., 21:719; and Chow and Kempe (1981) Nucl. Acids Res., 9:2807-2817).
In other aspects, the present invention provides vectors comprising a nucleic acid sequence of the present invention as described. The vectors include cloning vectors and expression vectors; both types of vectors are well known in the art, and are described further below.
B. Polypeptides
The present invention also provides compositions comprising a modified Lesquerella fatty acid hydroxylase polypeptide comprising a non-native amino acid at position 149, at position 325, or at both positions, where no more than three of the amino acids at positions 63, 105, 149, 218, 296, 323 and 325 are non-native amino acids; preferably, the reaction specificity of the modified hydroxylase differs from the reaction specificity of an unmodified hydroxylase. Preferably, position 149 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine. An unmodified Lesquerella hydroxylase polypeptide is preferably obtained from Lesquerella fendleri, and more preferably an unmodified hydroxylase polypeptide comprises the amino acid sequence of SEQ ID NO:1 as shown in Figure 4, where such polypeptides are modified according to the present invention. Modified hydroxylases and desaturases may be identified by first modifying in accordance with the present invention either a nucleic acid encoding an enzyme such that it encodes a modified enzyme or the amino acid sequence of an enzyme, and then screening the modified enzyme for activity as described below. Although particular embodiments of the present invention are compositions comprising a modified Lesquerella hydroxylase polypeptide, it is contemplated that other hydroxylase and desaturase enzymes can also be modified, based upon the rational design principles elucidated and described above and applied to the Lesquerella hydroxylase. Therefore, the present invention provides compositions comprising a modified hydroxylase or desaturase polypeptide, where the hydroxylase or desaturase polypeptide is highly homologous to the Lesquerella hydroxylase (at least about 80% homology to SEQ ID NO:1).
These modified polypeptides comprise a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, where no more than three of the amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 are non-native amino acids; preferably, the reaction specificity of the modified hydroxylase or desaturase differs from the reaction specificity of an unmodified hydroxylase or desaturase. Preferably, the position corresponding to position 149 of SEQ ID NO:1 is occupied by a non-native amino acid; more preferably, the non-native amino acid is either threonine or isoleucine.
Although various embodiments of the present invention directed to modifications of the Lesquerella hydroxylase are described below, it is understood that similar embodiments are directed to modifications of other hydroxylases and desaturases, where such modifications of these hydroxylases and desaturases are described above. The present invention also provides fusion proteins of the modified Lesquerella hydroxylase polypeptide. In some embodiments of the present invention, the polypeptide is a purified product, while in other embodiments it is a product of chemical synthetic procedures. In still other embodiments it is produced by recombinant techniques using a prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and mammalian cells in culture). In some embodiments, depending upon the host employed in a recombinant production procedure, the polypeptide of the present invention is glycosylated or is non-glycosylated. In other embodiments, the polypeptides of the invention may also include an initial methionine amino acid residue.
A modified Lesquerella hydroxylase of the present invention is a sequence of amino acids, such as a protein, polypeptide or peptide fragment, as described above and which has the ability to catalyze either the production of hydroxy-fatty acid or the production of a polyunsaturated fatty acid or both from appropriate fatty acyl substrates, under enzyme reactive conditions. By "enzyme reactive conditions" is meant any necessary conditions (for example, factors such as temperature, pH, lack of inhibiting substances) which will permit the enzyme to function. Appropriate enzyme reactive conditions may occur in vivo or in vitro. An appropriate fatty acid substrate may be free fatty acid, a fatty acid salt, fatty acyl-CoA, fatty acyl-ACP or a fatty acyl-lipid, where the fatty acid is at least a monounsaturated fatty acid.
References to fatty acid substrates and products, such as oleate or oleic acid, or ricinoleate or ricinoleic acid, are intended to include the free acids, the ACP and CoA esters, the salts of these acids, the glycerolipid esters (particularly the phospholipid and triacylglycerol esters), the wax esters, and the ether derivatives of these acids. Fatty acids may be indicated by the number of carbon atoms, with the number of double bonds following a colon; the location of the double bond is indicated by a superscript numeral following a delta. The common name, when included, follows in parentheses. For example, a fatty acid with 18 carbon atoms and one double bond between carbons 9 and 10 (numbering from the carboxyl end) is indicated as 18:1Δ9 (oleate). The presence and location of a hydroxyl group is indicated by an OH following a number which indicates the carbon atom to which the hydroxyl is attached. For example, a fatty acid with 18 carbon atoms and one double bond between carbons 9 and 10 (numbering from the carboxyl end) and with a hydroxyl group at carbon 12 is indicated as 12-OH, 18:1Δ9 (ricinoleate). In one aspect, it is contemplated that a modified Lesquerella hydroxylase of the present invention is used for production of hydroxylated or polyunsaturated fatty acids, by expression of the enzyme either in vitro or in vivo in transgenic organisms, such as plants, which produce the appropriate precursors. In some embodiments, products of the modified Lesquerella hydroxylase include but are not limited to: 12-OH, 16:1Δ9; 9-OH, 18:1Δ6; 12-OH, 18:1Δ9 (ricinoleate); 14-OH, 20:1Δ11 (lesqueroleate); and 16-OH, 22:1Δ13.
In another aspect, it is contemplated that the modified Lesquerella hydroxylase of the present invention is used for the production of additionally modified fatty acids that result from desaturation or elongation of hydroxylated fatty acids; such products include but are not limited to: 12-OH, 18:2Δ9,15 (densipoleate); 14-OH, 20:2Δ11,17 (auricoleate); 14-OH, 20:1Δ11 (lesqueroleate); and 16-OH, 22:1Δ13. Substrates of the modified Lesquerella hydroxylase include but are not limited to: 16:1Δ9 (palmitoleate); 18:1Δ6 (petroselinate); 18:1Δ9 (oleate); 20:1 (gadoleate or eicosenoate); and 22:1Δ13 (erucate or docosenoate).
A modified Lesquerella hydroxylase of this invention displays activity toward fatty acyl substrates. During biosynthesis of lipids in a plant cell, fatty acids are typically covalently bound to acyl carrier protein (ACP), coenzyme A (CoA) or various cellular lipids. Thus, in various aspects, it is contemplated that a modified Lesquerella hydroxylase of the present invention utilizes a fatty acyl substrate which is esterified to ACP, to CoA, or to a glycerolipid backbone. Alternatively, the fatty acid substrate is a free fatty acid. Preferably, the substrate is an esterified fatty acyl substrate; more preferably, the substrate is a fatty acyl glycerolipid; even more preferably, the substrate is fatty acyl phosphatidylcholine.
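The shorthand described above (carbon count, number of double bonds after the separator, double-bond positions after a delta, and an optional leading hydroxyl position) is regular enough to parse mechanically. The following Python sketch is illustrative only: the `FattyAcid` type, the `parse_fatty_acid` function, and the choice to write positions inline as `Δ9` or `Δ9,12` rather than as superscripts are assumptions, not part of the specification.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Accepts labels such as "18:1Δ9", "18:2Δ9,12", "12-OH, 18:1Δ9" and "20:1".
_LABEL = re.compile(
    r"^(?:(?P<oh>\d+)-OH,\s*)?"      # optional hydroxyl position
    r"(?P<carbons>\d+):(?P<bonds>\d+)"  # carbon count and double-bond count
    r"(?:Δ(?P<pos>\d+(?:,\d+)*))?$"  # optional double-bond positions
)


class FattyAcid(NamedTuple):
    carbons: int
    double_bonds: int
    bond_positions: Tuple[int, ...]
    hydroxyl_carbon: Optional[int]


def parse_fatty_acid(label: str) -> FattyAcid:
    """Parse a shorthand fatty acid label into its numeric parts."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized fatty acid label: {label!r}")
    positions = match.group("pos")
    hydroxyl = match.group("oh")
    return FattyAcid(
        carbons=int(match.group("carbons")),
        double_bonds=int(match.group("bonds")),
        bond_positions=tuple(int(p) for p in positions.split(",")) if positions else (),
        hydroxyl_carbon=int(hydroxyl) if hydroxyl else None,
    )


print(parse_fatty_acid("12-OH, 18:1Δ9"))
```

Parsing ricinoleate this way yields 18 carbons, one double bond at position 9, and a hydroxyl at carbon 12, matching the worked example in the text.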
1. Assay of modified Lesquerella hydroxylase
A modified Lesquerella hydroxylase catalyzes fatty acid hydroxylation, desaturation, or both reactions. An example of these reactions, where oleate is the substrate, follows:
18:1Δ9 + 2e- + O2 → 18:1Δ9,12-OH + H2O
18:1Δ9 + 2e- + O2 → 18:2Δ9,12 + 2 H2O
The enzyme in situ is believed to act on a fatty acid esterified to a lipid, and requires cytochrome b5 reductase and cytochrome b5 for activity. Moreover, the enzyme may utilize different substrates under different conditions to differing degrees of activity.
The activity of modified Lesquerella hydroxylase may be assayed in a number of ways. In one aspect, the activity is determined by expressing a nucleic acid sequence encoding the modified hydroxylase in a transgenic organism, as described below and in the Examples, and then analyzing the composition ofthe total fatty acids. Thus, the activity is measured as the presence of or increase in the amount of endogenous hydroxy-oleate and other hydroxy fatty acids, or as the presence of or increase in the amount of endogenous linolineate or other di- or polyunsaturated fatty acids (collectively referred to as the fatty acid products) in a transgenic organism which comprises a heterologous or exogenous nucleic acid sequence encoding a modified Lesquerella hydroxylase ofthe present invention; such transgenic organisms are obtained as described below. The amount of fatty acid product in a transgenic organism is then determined. In some experiments, the amount of fatty acid product in a transgenic organism is compared to that present in a non-transgenic organism. The fatty acids are analyzed from lipids extracted from samples of a transgenic organism, or from fatty acids extracted directly from such samples; for example, the samples are homogenized in methanol/chloroform (2:1, v/v) and the lipids extracted as described by Bligh and Dyer (1959) (Can. J. Biochem. Physiol. 37: 911).
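The comparison described above, in which the fatty acid products of a transgenic organism are measured against those of a non-transgenic organism, can be sketched as a percent-composition calculation. The function names and the example quantities below are hypothetical and serve only to illustrate the arithmetic; they are not data from the Examples.

```python
from typing import Dict


def composition_percent(amounts: Dict[str, float]) -> Dict[str, float]:
    """Convert per-species amounts (any consistent unit) to percent of total fatty acids."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total fatty acid amount must be positive")
    return {name: 100.0 * value / total for name, value in amounts.items()}


def product_increase(transgenic: Dict[str, float],
                     control: Dict[str, float],
                     product: str) -> float:
    """Percentage-point difference in one product between transgenic and control samples.

    A product absent from a sample is treated as 0 percent.
    """
    t = composition_percent(transgenic).get(product, 0.0)
    c = composition_percent(control).get(product, 0.0)
    return t - c


# Hypothetical amounts, arbitrary units.
control_sample = {"18:1Δ9": 90.0, "18:2Δ9,12": 10.0}
transgenic_sample = {"18:1Δ9": 60.0, "18:2Δ9,12": 20.0, "12-OH, 18:1Δ9": 20.0}
print(product_increase(transgenic_sample, control_sample, "12-OH, 18:1Δ9"))
```

Expressing each species as a share of total fatty acids makes samples of different size comparable before the transgenic and control values are subtracted.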
In another aspect, the enzyme activity is determined in tissue samples obtained from a transgenic organism as described below and in the Examples. For example, in plants, tissue samples include but are not limited to leaf samples (such as discs), stem and root samples, and developing and mature seed embryonic or endosperm tissue. Typically, tissue samples are incubated with either precursors of fatty acid synthesis, such as 14C-acetate, or with fatty acids, such as ammonium salts of 14C-fatty acids, which can be taken up and incorporated into tissue lipids. Additional co-factors for lipid synthesis, as required, are present during the incubation; such co-factors include but are not limited to ATP, CoA, MgCl2, and lyso-phospholipids, such as lysoPC. Incubations generally proceed at room temperature in a buffered solution, such as 0.1 M potassium phosphate at pH 7.2, for a suitable period of time. The samples are then washed in buffer, and the amount of labeled fatty acid product is determined. In some experiments, the amount of labeled fatty acid produced in a transgenic organism is compared to that produced in a non-transgenic organism. The fatty acids are analyzed from lipids extracted from the tissue samples, or from fatty acids extracted directly from such samples; for example, the tissue samples are homogenized in methanol/chloroform (2:1, v/v) and the lipids extracted as described by Bligh and Dyer (1959) (Can. J. Biochem. Physiol. 37: 911).
In another aspect, the enzyme activity is determined in a subcellular fraction obtained from a transgenic organism as described below. For example, in plants, subcellular fractions may be obtained from any of the types of tissues described above, and include whole cell and microsomal membranes, plastids, and plastidial membrane fractions. Preparation of such fractions is well known in the art. The subcellular fraction is then incubated with fatty acids, such as ammonium salts of 14C-fatty acids, which can be taken up and incorporated into tissue lipids. Additional co-factors for lipid synthesis, as required, are present during the incubation; such co-factors include but are not limited to ATP, CoA, MgCl2, and lyso-phospholipids, such as lysoPC. The samples are incubated and the lipids extracted as described above.
In another aspect, the enzyme activity is determined from an in vitro nucleic acid expression system, in which a nucleic acid sequence encoding a modified Lesquerella hydroxylase of the present invention is added and the encoded enzyme expressed. Such expression systems are well known in the art and include, for example, reticulocyte lysate or wheat germ homogenates. Because the enzyme is likely to be an integral membrane protein, it may be necessary to include micellar or membrane structures into which the enzyme may be incorporated during or after protein synthesis. Moreover, because an unmodified enzyme in situ is believed to act on a fatty acid esterified to a lipid, and to require cytochrome b5 reductase and cytochrome b5 for activity, it is preferable that such micellar structures are obtained from sources which contain related lipid synthetic capabilities such as lysophospholipid acyl transferase as well as cytochrome b5 reductase and cytochrome b5, but not the lipid synthetic capability under investigation; an example of a micellar source is a plant tissue where the plant does not contain an endogenous fatty acid hydroxylase. Direct and quantitative measurements of the activity of the modified enzyme require the incorporation of labeled lipids into the micellar or membrane structures and the assurance that cytochrome b5 and cytochrome b5 reductase are not limiting. The newly expressed enzyme is then analyzed as described above for subcellular fractions. The extracted lipid products of the modified hydroxylase are analyzed by methods well known in the art (as described in the Examples; see also Engeseth et al. (1996) Planta 198: 238-245; Broun et al. (1997) Plant Physiol. 113: 933-942). For example, fatty acid methyl esters are prepared from an aliquot of an extracted lipid fraction by evaporating the solvent from the aliquot under N2 and resuspending and heating the lipids in 4% methanolic HCl (w/w).
The fatty acid methyl esters are then separated, and for radioactive samples the radioactivity in each separated fraction determined, by radio gas-liquid chromatography (GLC) and radio-HPLC (Engeseth et al. (1996) Planta 198: 238-245). Alternatively, fatty acid methyl esters are prepared, derivatized with bis(trimethylsilyl)trifluoroacetamide:trimethylchlorosilane to obtain TMS fatty acid methyl esters of hydroxylated fatty acids, and analyzed by GC (Broun et al. (1997) Plant Physiol. 113: 933-942).
2. Chemical synthesis of modified Lesquerella hydroxylase
In other embodiments of the present invention, the protein itself is produced using chemical methods to synthesize either an entire modified Lesquerella hydroxylase amino acid sequence or a portion thereof. For example, peptides are synthesized by solid phase techniques, cleaved from the resin, and purified by preparative high performance liquid chromatography (See for example, Creighton (1983) Proteins Structures And Molecular Principles, W H Freeman and Co, New York N.Y.). In other embodiments of the present invention, the composition of the synthetic peptides is confirmed by amino acid analysis or sequencing (See for example, Creighton, supra).
Direct peptide synthesis can be performed using various solid-phase techniques (Roberge et al. (1995) Science, 269:202-204) and automated synthesis may be achieved, for example, using an ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with the instructions provided by the manufacturer. Additionally, the amino acid sequence of modified Lesquerella hydroxylase, or any part thereof, may be altered during direct synthesis and/or combined using chemical methods with other sequences to produce a variant polypeptide. Because the enzyme is thought to be an integral membrane protein, it may be necessary to include micellar or membrane structures into which the enzyme may be incorporated during or after protein synthesis.
3. Purification of modified Lesquerella hydroxylase
In some embodiments of the present invention, modified Lesquerella hydroxylase polypeptides purified from recombinant organisms as described below are provided. In other embodiments, modified Lesquerella hydroxylase polypeptides purified from in vitro transcription/translation expression systems as described above are provided. The present invention thus provides purified modified Lesquerella hydroxylase polypeptides.
The present invention also provides methods for recovering and purifying modified Lesquerella hydroxylase. Purification typically begins by disruption of the cells, and preparation of cell fractions with the highest specific activity of the modified hydroxylase. Because a modified Lesquerella hydroxylase is likely to be a membrane-bound enzyme, it is contemplated that microsomal preparations contain the highest specific activity of the enzyme. Further purification of the modified Lesquerella hydroxylase is then accomplished by detergent solubilization of the enzyme, followed by column chromatography. Purification schemes have been developed for related enzymes, such as plastidial oleate desaturase (Schmidt et al. (1994) Plant Molecular Biology 26: 631-642). Because a modified Lesquerella hydroxylase is contemplated to exhibit a high degree of similarity to oleate desaturase, both in amino acid sequence and in the reaction catalyzed, a modified Lesquerella hydroxylase is purified by a scheme similar to that reported for the desaturase. Alternative purification steps include but are not limited to ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography, and size exclusion chromatography.
The present invention further provides nucleic acid sequences encoding a modified Lesquerella hydroxylase fused in frame to a marker sequence that allows for expression alone or both expression and purification of the polypeptide of the present invention. A non-limiting example of a marker sequence is a hexahistidine tag that may be supplied by a vector, for example, a pQE-30 vector which adds a hexahistidine tag to the N terminus of a modified hydroxylase of the present invention and which results in expression of the polypeptide in the case of a bacterial host, and more preferably by vector PT-23B, which adds a hexahistidine tag to the C terminus of a modified hydroxylase of the present invention and which results in improved ease of purification of the polypeptide fused to the marker in the case of a bacterial host. Another non-limiting example is the fusion of glutathione S-transferase to the enzyme, resulting in the expression of (GST)-modified hydroxylase, such as was used by Maeng, CY et al. (2001, Biochem Biophys Res Commun 282(3):787-92). Yet other non-limiting examples include a hemagglutinin (HA) tag as a marker sequence when a mammalian host is used. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson et al. (1984) Cell, 37:767).
C. Generation of modified Lesquerella hydroxylase antibodies
In some embodiments of the present invention, antibodies are generated to allow for the detection and characterization of a modified Lesquerella hydroxylase protein. The antibodies may be prepared using various immunogens. In one embodiment, the immunogen is a modified Lesquerella hydroxylase peptide with either threonine or isoleucine at amino acid position 149 to generate antibodies that recognize the modified hydroxylase. Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab fragments, and Fab expression libraries. Various procedures known in the art may be used for the production of polyclonal antibodies directed against the modified hydroxylase of the present invention. For the production of antibody, various host animals, including but not limited to rabbits, mice, rats, sheep, and goats, can be immunized by injection with the peptide corresponding to the modified hydroxylase epitope. In a preferred embodiment, the peptide is conjugated to an immunogenic carrier (for example, diphtheria toxoid, bovine serum albumin (BSA), or keyhole limpet hemocyanin (KLH)). Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels (for example, aluminum hydroxide), surface active substances (for example, lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (Bacille Calmette-Guerin) and Corynebacterium parvum).
For preparation of monoclonal antibodies directed toward modified hydroxylase of the present invention, it is contemplated that any technique that provides for the production of antibody molecules by continuous cell lines in culture finds use with the present invention (See for example, Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY). These include but are not limited to the hybridoma technique originally developed by Köhler and Milstein (Köhler and Milstein (1975) Nature, 256:495-497), as well as the trioma technique, the human B-cell hybridoma technique (See for example, Kozbor et al. (1983) Immunol. Tod., 4:72), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al. (1985) in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. pp. 77-96).
In the production of antibodies, it is contemplated that screening for the desired antibody is accomplished by techniques known in the art (for example, radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (for example, using colloidal gold, enzyme or radioisotope labels), Western blots, precipitation reactions, agglutination assays (for example, gel agglutination assays and hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays).
In one embodiment, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labeled. Many methods are known in the art for detecting binding in an immunoassay and are within the scope of the present invention. As is well known in the art, the immunogenic peptide should be provided free of the carrier molecule used in any immunization protocol. For example, if the peptide was conjugated to KLH, it may be conjugated to BSA, or used directly, in a screening assay.
In some embodiments of the present invention, the foregoing antibodies are used in methods known in the art relating to the expression of the modified hydroxylase (for example, for Western blotting), measuring levels thereof in appropriate biological samples, etc. The antibodies can be used to detect the modified hydroxylase in a biological sample from a plant. The biological sample can be an extract of a tissue, or a sample fixed for microscopic examination.
The biological samples are then tested directly for the presence of modified hydroxylase using an appropriate strategy (for example, ELISA or radioimmunoassay) and format (for example, microwells, dipstick (for example, as described in International Patent Publication WO 93/03367), etc.). Alternatively, proteins in the sample can be size separated (for example, by polyacrylamide gel electrophoresis (PAGE), in the presence or absence of sodium dodecyl sulfate (SDS)), and the presence of the modified hydroxylase detected by immunoblotting (Western blotting). Immunoblotting techniques are generally more effective with antibodies generated against a peptide corresponding to an epitope of a protein, and hence are particularly suited to the present invention.
II. Variants of modified Lesquerella hydroxylase
In other embodiments, the present invention provides compositions comprising variants of the modified Lesquerella hydroxylase, where the amino acid sequence of the modified Lesquerella hydroxylase, other than residues 63, 105, 149, 218, 296, 323, and 325, which are modified according to the invention as described above, may be varied; these variants include mutants, fragments, fusion proteins, or functional equivalents of the modified Lesquerella hydroxylases, provided that the activity of the variants of the modified Lesquerella hydroxylase is essentially unchanged by such sequence variations. In other words, any variant generated according to the guidelines outlined below can be evaluated in order to determine whether it is a member of the genus of variant modified Lesquerella hydroxylase of the present invention as defined functionally, rather than structurally. In preferred embodiments, the activity of variant modified Lesquerella hydroxylase is evaluated by the methods described above and in the Examples. In yet other embodiments, the present invention provides compositions comprising nucleic acid sequences encoding variant modified Lesquerella hydroxylases. Nucleic acid sequences of the present invention are engineered in order to alter a modified Lesquerella hydroxylase coding sequence for a variety of reasons, including but not limited to alterations that modify the cloning, processing and/or expression of the gene product (such alterations include inserting new restriction sites, altering glycosylation patterns, and changing codon preferences).
A. Mutant modified Lesquerella hydroxylases
Mutant modified Lesquerella hydroxylases of the present invention can be generated according to the following guidelines. For example, it is contemplated that isolated replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar replacement of an amino acid with a structurally related amino acid (in other words, conservative mutations) will not have a major effect on the biological activity of the resulting molecule. Accordingly, some embodiments of the present invention provide variants of modified Lesquerella hydroxylases disclosed herein containing conservative replacements. Conservative replacements are those that take place within a family of amino acids that are related in their side chains. Genetically encoded amino acids can be divided into four families: (1) acidic (aspartate, glutamate); (2) basic (lysine, arginine, histidine); (3) nonpolar (alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan); and (4) uncharged polar (glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine). Phenylalanine, tryptophan, and tyrosine are sometimes classified jointly as aromatic amino acids. In similar fashion, the amino acid repertoire can be grouped as (1) acidic (aspartate, glutamate); (2) basic (lysine, arginine, histidine); (3) aliphatic (glycine, alanine, valine, leucine, isoleucine, serine, threonine), with serine and threonine optionally being grouped separately as aliphatic-hydroxyl; (4) aromatic (phenylalanine, tyrosine, tryptophan); (5) amide (asparagine, glutamine); and (6) sulfur-containing (cysteine and methionine) (for example, Stryer ed. (1981) Biochemistry, pg. 17-21, 2nd ed, WH Freeman and Co.). Whether a change in the amino acid sequence of a peptide results in a functional homolog can be readily determined by assessing the ability of the variant peptide to function in a fashion similar to the modified protein.
Peptides having more than one replacement can readily be tested in the same manner. More rarely, a mutant includes "nonconservative" changes (for example, replacement of a glycine with a tryptophan). Analogous minor variations can also include amino acid deletions or insertions, or both. Guidance in determining which amino acid residues can be substituted, inserted, or deleted without abolishing biological activity can be found using computer programs (for example, LASERGENE software, DNASTAR Inc., Madison, Wis.). Mutants of modified Lesquerella hydroxylase can be generated by any suitable method well known in the art, including but not limited to site-directed mutagenesis, randomized "point" mutagenesis, and domain-swap mutagenesis.
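The four-family grouping described above lends itself to a simple lookup. The following is a minimal sketch, assuming one-letter amino acid codes; the names FOUR_FAMILIES, family_of, and is_conservative are hypothetical and introduced only for illustration. A substitution flagged as conservative by this lookup is merely a candidate; as stated above, function must still be confirmed by assay.

```python
# The four side-chain families named in the text, in one-letter codes.
FOUR_FAMILIES = {
    "acidic": "DE",                # aspartate, glutamate
    "basic": "KRH",                # lysine, arginine, histidine
    "nonpolar": "AVLIPFMW",        # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged_polar": "GNQCSTY",  # Gly, Asn, Gln, Cys, Ser, Thr, Tyr
}


def family_of(residue, families=FOUR_FAMILIES):
    """Return the name of the family that contains a one-letter residue."""
    for name, members in families.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, replacement, families=FOUR_FAMILIES):
    """True when both residues fall within the same side-chain family."""
    return family_of(original, families) == family_of(replacement, families)


leu_to_ile = is_conservative("L", "I")  # leucine -> isoleucine
gly_to_trp = is_conservative("G", "W")  # glycine -> tryptophan
```

Passing a different dictionary as `families` would apply the alternative six-group classification in the same way.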
Mutant modified Lesquerella hydroxylases may also be produced by methods such as directed evolution or other techniques for producing combinatorial libraries of variants. Thus, the present invention further contemplates a method of generating sets of combinatorial mutants of the present modified Lesquerella hydroxylase proteins, as well as truncation mutants, and is especially useful for identifying potential variant sequences (in other words, homologs) that possess the biological activity of the modified Lesquerella hydroxylase.
It is contemplated that the nucleic acids encoding a modified Lesquerella hydroxylase can be utilized as starting nucleic acids for directed evolution. These techniques can be utilized to develop modified Lesquerella hydroxylase variants having desirable properties.
In some embodiments, artificial evolution is performed by random mutagenesis (for example, by utilizing error-prone PCR to introduce random mutations into a given coding sequence). This method requires that the frequency of mutation be finely tuned: as a general rule, beneficial mutations are rare while deleterious mutations are common, and the combination of a deleterious mutation and a beneficial mutation often results in an inactive enzyme. The ideal number of base substitutions for a targeted gene is usually between 1.5 and 5 (Moore and Arnold (1996) Nat. Biotech., 14, 458-67; Leung et al. (1989) Technique, 1:11-15; Eckert and Kunkel (1991) PCR Methods Appl., 1:17-24; Caldwell and Joyce (1992) PCR Methods Appl., 2:28-33; and Zhao and Arnold (1997) Nuc. Acids. Res., 25:1307-08). After mutagenesis, the resulting clones are selected for desirable activity (for example, screened for modified Lesquerella hydroxylase activity as described above and in the Examples).
Successive rounds of mutagenesis and selection are often necessary to develop enzymes with desirable properties. It should be noted that only the useful mutations are carried over to the next round of mutagenesis.
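The need to tune the mutation frequency can be illustrated numerically. The sketch below assumes, purely for illustration, that the number of base substitutions per clone follows a Poisson distribution whose mean is the average number of substitutions per gene; this statistical model and the function names are assumptions and are not taken from the cited references.

```python
import math


def poisson_pmf(k, mean):
    """Probability that a clone carries exactly k substitutions."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_with_at_most(k_max, mean):
    """Fraction of clones carrying k_max or fewer substitutions."""
    return sum(poisson_pmf(k, mean) for k in range(k_max + 1))


# Compare the two ends of the 1.5 to 5 substitutions-per-gene range.
for mean in (1.5, 5.0):
    unmutated = poisson_pmf(0, mean)
    single = poisson_pmf(1, mean)
    print(f"mean={mean}: unmutated={unmutated:.3f}, single={single:.3f}")
```

Under this assumption, a low mean leaves many clones unmutated, while a high mean makes it more likely that a rare beneficial substitution is accompanied by a deleterious one in the same clone.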
In other embodiments of the present invention, the polynucleotides of the present invention are used in gene shuffling or sexual PCR procedures (for example, Smith (1994) Nature, 370:324-25; U.S. Pat. Nos. 5,837,458; 5,830,721; 5,811,238; 5,733,731). Gene shuffling involves random fragmentation of several mutant DNAs followed by their reassembly by PCR into full length molecules. Examples of various gene shuffling procedures include, but are not limited to, assembly following DNase treatment, the staggered extension process (STEP), and random priming in vitro recombination. In the DNase mediated method, DNA segments isolated from a pool of positive mutants are cleaved into random fragments with DNase I and subjected to multiple rounds of PCR with no added primer. The lengths of random fragments approach that of the uncleaved segment as the PCR cycles proceed, resulting in mutations present in different clones becoming mixed and accumulating in some of the resulting sequences. Multiple cycles of selection and shuffling have led to the functional enhancement of several enzymes (Stemmer (1994) Nature, 370:389-91; Stemmer (1994) Proc. Natl. Acad. Sci. USA, 91, 10747-51; Crameri et al. (1996) Nat. Biotech., 14:315-19; Zhang et al. (1997) Proc. Natl. Acad. Sci. USA, 94:4504-09; and Crameri et al. (1997) Nat. Biotech., 15:436-38). Variants produced by directed evolution can be screened for modified Lesquerella hydroxylase activity by the methods described (see for example above and the Examples).
B. Homologs
Still other embodiments of the present invention provide compositions comprising modified Lesquerella hydroxylase homologs, and compositions comprising nucleic acid sequences encoding the same. Some homologs of modified Lesquerella hydroxylase have intracellular half-lives dramatically different from that of the corresponding wild-type protein. For example, the altered protein is rendered either more stable or less stable to proteolytic degradation or other cellular processes that result in destruction of, or otherwise inactivate, modified Lesquerella hydroxylase. Such homologs, and the genes that encode them, can be utilized to alter the activity of modified Lesquerella hydroxylase by modulating the half-life of the protein. For instance, a short half-life can give rise to more transient hydroxylase biological effects. Other homologs have characteristics which are either similar to modified Lesquerella hydroxylase, or which differ in one or more respects from modified Lesquerella hydroxylase.
In some embodiments of the combinatorial mutagenesis approach of the present invention, the amino acid sequences for a population of modified Lesquerella hydroxylase homologs are aligned, preferably to promote the highest homology possible. Such a population of variants can include, for example, Lesquerella hydroxylase homologs from one or more species, or from the same species but which differ due to mutation, outside of amino acid residues 63, 105, 149, 218, 296, 323, and 325 modified in accordance with the present invention. Amino acids that appear at each position of the aligned sequences are selected to create a degenerate set of combinatorial sequences.
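The selection of the amino acids that appear at each aligned position can be sketched as follows. The example assumes gap-free aligned sequences of equal length given in one-letter codes; the function names, the 0-based `fixed` mapping used to hold modified positions constant, and the toy sequences are all hypothetical and serve only to illustrate the idea.

```python
from math import prod


def residues_per_position(aligned, fixed=None):
    """Collect the residues observed in each alignment column.

    `fixed` optionally maps a 0-based column index to the single residue
    that must be kept there (for example, a position modified in
    accordance with the invention).
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    columns = [sorted(set(col)) for col in zip(*aligned)]
    for index, residue in (fixed or {}).items():
        columns[index] = [residue]
    return columns


def library_size(columns):
    """Number of distinct sequences in the degenerate combinatorial set."""
    return prod(len(col) for col in columns)


# Toy alignment, for illustration only.
aligned = ["MKLT", "MRLS", "MKIT"]
free_columns = residues_per_position(aligned)
held_columns = residues_per_position(aligned, fixed={3: "T"})
```

In this toy case the unconstrained set contains 1 x 2 x 2 x 2 = 8 sequences, and holding the last column at threonine halves it to 4, which shows how quickly the degenerate set grows with the number of variable positions.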
In a preferred embodiment of the present invention, the combinatorial library is produced by way of a degenerate library of genes encoding a library of polypeptides that each include at least a portion of candidate protein sequences, including residues 63, 105, 149, 218, 296, 323, and 325 modified in accordance with the present invention. For example, a mixture of synthetic oligonucleotides is enzymatically ligated into gene sequences such that the degenerate set of candidate sequences are expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (for example, for phage display) containing the set of hydroxylase sequences therein.
There are many ways by which the library of potential hydroxylase homologs can be generated from a degenerate oligonucleotide sequence. In some embodiments, chemical synthesis of a degenerate gene sequence is carried out in an automatic DNA synthesizer, and the synthetic genes are ligated into an appropriate gene for expression. The purpose of a degenerate set of genes is to provide, in one mixture, all of the sequences encoding the desired set of potential hydroxylase sequences. The synthesis of degenerate oligonucleotides is well known in the art (See for example, Narang (1983) Tetrahedron Lett., 39:3 9; Itakura et al. (1981) Recombinant DNA, in Walton (ed.), Proceedings of the 3rd Cleveland Symposium on Macromolecules, Elsevier, Amsterdam, pp 273-289; Itakura et al. (1984) Annu. Rev. Biochem., 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucl. Acid Res., 11:477). Such techniques have been employed in the directed evolution of other proteins (See for example, Scott et al. (1990) Science, 249:386-390; Roberts et al. (1992) Proc. Natl. Acad. Sci. USA, 89:2429-2433; Devlin et al. (1990) Science, 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6378-6382; as well as U.S. Pat. Nos. 5,223,409, 5,198,346, and 5,096,815).
C. Truncation Mutants of modified Lesquerella hydroxylase
In addition, the present invention provides compositions comprising fragments (or in other words truncation mutants) of a modified Lesquerella hydroxylase which comprises amino acids at original positions 105, 149, 218, 296, 323, and 325, modified in accordance with the present invention, and compositions comprising nucleic acid sequences encoding them. In preferred embodiments, a modified Lesquerella hydroxylase fragment is biologically active. In some embodiments of the present invention, when expression of a portion of the modified hydroxylase protein is desired, it may be necessary to add a start codon (ATG) to the oligonucleotide fragment containing the desired sequence to be expressed. It is well known in the art that a methionine at the N-terminal position can be enzymatically cleaved by the use of the enzyme methionine aminopeptidase (MAP). MAP has been cloned from E. coli (Ben-Bassat et al. (1987) J. Bacteriol. 169:751-757) and Salmonella typhimurium, and its in vitro activity has been demonstrated on recombinant proteins (Miller et al. (1990) Proc. Natl. Acad. Sci. USA, 84:2718-2722). Therefore, removal of an N-terminal methionine, if desired, can be achieved either in vivo by expressing such recombinant polypeptides in a host that produces MAP (for example, E. coli or CM89 or S. cerevisiae), or in vitro by use of purified MAP.
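The addition of a start codon to a fragment to be expressed can be sketched in a few lines. This is a minimal illustration that assumes the fragment is supplied as a DNA string beginning at the first codon of the desired reading frame; the function name is hypothetical.

```python
def ensure_start_codon(fragment):
    """Return the coding fragment in upper case, prefixed with ATG if absent.

    The fragment is assumed to start at a codon boundary, so its length
    must be a multiple of three.
    """
    seq = fragment.upper()
    if len(seq) % 3 != 0:
        raise ValueError("fragment length must be a multiple of three")
    if set(seq) - set("ACGT"):
        raise ValueError("fragment must contain only A, C, G, and T")
    return seq if seq.startswith("ATG") else "ATG" + seq


with_added_start = ensure_start_codon("gctaaa")
already_started = ensure_start_codon("ATGGCT")
```

Because exactly three nucleotides are prepended, the downstream reading frame is preserved, and the added N-terminal methionine can later be removed by MAP if desired.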
D. Fusion Proteins Containing Modified Lesquerella Hydroxylase
The present invention also provides compositions comprising fusion proteins incorporating all or part of a modified Lesquerella hydroxylase comprising amino acids at original positions 105, 149, 218, 296, 323, and 325, modified in accordance with the present invention, and compositions comprising the nucleic acid sequences encoding such fusion proteins. In some embodiments, the fusion proteins have a modified Lesquerella hydroxylase functional domain with a fusion partner. Accordingly, in some embodiments of the present invention, the coding sequences for the polypeptide (for example, a modified Lesquerella hydroxylase functional domain) are incorporated as a part of a fusion gene including a nucleotide sequence encoding a different polypeptide. In some embodiments of the present invention, chimeric constructs code for a fusion protein comprising all or part of the modified hydroxylase of the present invention, and all or a part of a cytochrome b5, such that each protein is active. Similar fusion proteins have been reported, as for example a cDNA isolated from ripening sunflower embryos which encoded a fusion protein, of which one portion was highly homologous to a membrane bound desaturase and the other, N-terminal portion was highly homologous to cytochrome b5 (Sperling et al. (1995) Eur. J. Biochem. 232: 798-805). Such a fusion between the electron donor and its acceptor was speculated to increase the efficiency of the electron transport required for desaturation (Sperling et al. (1995) Eur. J. Biochem. 232: 798-805). In a similar fashion, a fusion between a modified Lesquerella hydroxylase of the present invention and a cytochrome b5 protein is contemplated to increase the efficiency of the electron transport required for the hydroxylation and desaturation reactions, as well as to decrease the reliance of the enzyme upon added, exogenous, or endogenous cytochrome b5 for its activity.
In other embodiments of the present invention, chimeric constructs code for fusion proteins containing a modified hydroxylase of the present invention and at least a portion of another gene. Preferably, the fusion proteins have biological activity similar to a modified Lesquerella hydroxylase of the present invention (for example, they have at least one desired biological activity of the modified hydroxylase).
In yet other embodiments, chimeric constructs code for fusion proteins containing a modified Lesquerella hydroxylase of the present invention and a leader sequence. Such leader sequences are well known in the art, and function to direct the protein to targeted cellular locations. Exemplary leader sequences include but are not limited to those disclosed in Gavel Y, and von Heijne G (1990, FEBS Lett 261(2):45), and Emanuelsson O et al. (2000, J Mol Biol 300:1005-16).
In addition to utilizing fusion proteins to alter biological activity, it is widely appreciated that fusion proteins can also facilitate the expression and/or purification of proteins, such as a modified Lesquerella hydroxylase protein of the present invention. Accordingly, in some embodiments of the present invention, a modified hydroxylase is generated as a glutathione-S-transferase fusion (in other words, a GST fusion protein). It is contemplated that such GST fusion proteins facilitate purification of modified hydroxylase, such as by the use of glutathione-derivatized matrices (See for example, Ausubel et al., eds. (1991) Current Protocols in Molecular Biology, John Wiley & Sons, NY).
In another embodiment of the present invention, a fusion gene coding for a purification leader sequence, such as a poly-(His)/enterokinase cleavage site sequence at the N-terminus of the desired portion of modified hydroxylase, allows purification of the expressed modified hydroxylase fusion protein by affinity chromatography using a Ni2+ metal resin. In still another embodiment of the present invention, the purification leader sequence is subsequently removed by treatment with enterokinase (See for example, Hochuli et al. (1987) J. Chromatogr. 411:177; and Janknecht et al., Proc. Natl. Acad. Sci. USA 88:8972). In yet other embodiments of the present invention, a fusion protein comprising a purification sequence appended to either the N or the C terminus allows for affinity purification; one example is addition of a hexahistidine tag to the carboxy terminus of modified Lesquerella hydroxylase, which may result in improved affinity purification.
Techniques for making fusion genes are well known. Essentially, the joining of various nucleic acid fragments coding for different polypeptide sequences is performed in accordance with conventional techniques, employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment of the present invention, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, in other embodiments of the present invention, PCR amplification of gene fragments is carried out using anchor primers that give rise to complementary overhangs between two consecutive gene fragments that can subsequently be annealed to generate a chimeric gene sequence (See for example, Current Protocols in Molecular Biology, supra).
E. Screening Gene Products
A wide range of techniques are known in the art for screening gene products of combinatorial libraries made by point mutations, and for screening cDNA libraries for gene products having a certain property. Such techniques are generally adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis of hydroxylase homologs. The most widely used techniques for screening large gene libraries typically comprise cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates relatively easy isolation of the vector encoding the gene whose product was detected. Each of the illustrative assays described below is amenable to high-throughput analysis as necessary to screen large numbers of degenerate sequences created by combinatorial mutagenesis techniques. Accordingly, in one embodiment of the present invention, the candidate hydroxylase gene products are displayed on the surface of a cell or viral particle, and the ability of particular cells or viral particles to catalyze the reaction resulting from the modification of the hydroxylase is assayed using the techniques described above and in the Examples. In other embodiments of the present invention, the gene library is cloned into the gene for a surface membrane protein of a bacterial cell, and the resulting fusion protein detected by panning (WO 88/06630; Fuchs et al. (1991) BioTechnol. 9:1370-1371; and Goward et al. (1992) TIBS 18:136-140). In other embodiments of the present invention, fluorescently labeled molecules that bind the modified Lesquerella hydroxylase can be used to score for potentially functional hydroxylase homologs. Cells are visually inspected and separated under a fluorescence microscope, or, where the morphology of the cell permits, separated by a fluorescence-activated cell sorter.
In an alternate embodiment of the present invention, the gene library is expressed as a fusion protein on the surface of a viral particle. For example, foreign peptide sequences are expressed on the surface of infectious phage in the filamentous phage system, thereby conferring two significant benefits. First, since these phage can be applied to affinity matrices at very high concentrations, a large number of phage can be screened at one time. Second, since each infectious phage displays the combinatorial gene product on its surface, if a particular phage is recovered from an affinity matrix in low yield, the phage can be amplified by another round of infection. The group of almost identical E. coli filamentous phages M13, fd, and f1 are most often used in phage display libraries, as either of the phage gIII or gVIII coat proteins can be used to generate fusion proteins without disrupting the ultimate packaging of the viral particle (See for example, WO 90/02909; WO 92/09690; Marks et al. (1992) J. Biol. Chem. 267:16007-16010; Griffiths et al. (1993) EMBO J. 12:725-734; Clackson et al. (1991) Nature 352:624-628; and Barbas et al. (1992) Proc. Natl. Acad. Sci. 89:4457-4461).
In another embodiment of the present invention, the recombinant phage antibody system (for example, RPAS, Pharmacia Catalog number 27-9400-01) is modified for use in expressing and screening of hydroxylase combinatorial libraries. The pCANTAB 5 phagemid of the RPAS kit contains the gene that encodes the phage gIII coat protein. In some embodiments of the present invention, the modified hydroxylase combinatorial gene library is cloned into the phagemid adjacent to the gIII signal sequence such that it is expressed as a gIII fusion protein. In other embodiments of the present invention, the phagemid is used to transform competent E. coli TG1 cells after ligation. In still other embodiments of the present invention, transformed cells are subsequently infected with M13KO7 helper phage to rescue the phagemid and its candidate hydroxylase gene insert. The resulting recombinant phage contain phagemid DNA encoding a specific candidate modified hydroxylase protein and display one or more copies of the corresponding fusion coat protein. In some embodiments of the present invention, the phage-displayed candidate proteins that are capable of, for example, metabolizing a hydroperoxide, are selected or enriched by panning. The bound phage are then isolated, and if the recombinant phage express at least one copy of the wild type gIII coat protein, they will retain their ability to infect E. coli. Thus, successive rounds of reinfection of E. coli and panning will greatly enrich for modified hydroxylase homologs, which can then be screened for further biological activities in order to differentiate agonists and antagonists.
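The effect of successive panning rounds can be illustrated with a simple arithmetic model. This is only a sketch under stated assumptions: each round is assumed to recover desired phage with a fixed probability and background phage with a much lower fixed probability, and amplification by reinfection is assumed not to change their ratio. The function name and all numerical values below are hypothetical and are not taken from the present disclosure.

```python
def enrich(fraction: float, capture_active: float,
           capture_background: float, rounds: int) -> list:
    """Return the fraction of desired clones after each panning round.

    Illustrative model only: recovery probabilities are assumed constant,
    and reinfection is assumed to amplify all recovered phage equally.
    """
    history = [fraction]
    for _ in range(rounds):
        kept_active = fraction * capture_active
        kept_background = (1.0 - fraction) * capture_background
        fraction = kept_active / (kept_active + kept_background)
        history.append(fraction)
    return history


# Hypothetical inputs: one desired clone per million at the start,
# 10% recovery of desired phage, 0.01% carry-over of background phage.
history = enrich(1e-6, 0.1, 1e-4, 3)
```

Under these assumptions the odds of a recovered phage being a desired clone are multiplied by the ratio of the two recovery probabilities in every round, which is why a few rounds can suffice even for rare clones.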
In light of the present disclosure, other forms of mutagenesis generally applicable will be apparent to those skilled in the art in addition to the aforementioned rational mutagenesis based on conserved versus non-conserved residues. For example, modified Lesquerella hydroxylase homologs can be generated and screened using, for example, alanine scanning mutagenesis and the like (Ruf et al. (1994) Biochem. 33:1565-1572; Wang et al. (1994) J. Biol. Chem. 269:3095-3099; Balint (1993) Gene 137:109-118; Grodberg et al. (1993) Eur. J. Biochem. 218:597-601; Nagashima et al. (1993) J. Biol. Chem. 268:2888-2892; Lowman et al. (1991) Biochem. 30:10832-10838; and Cunningham et al. (1989) Science 244:1081-1085), by linker scanning mutagenesis (Gustin et al. (1993) Virol. 193:653-660; Brown et al. (1992) Mol. Cell. Biol. 12:2644-2652; McKnight et al. (1982) Science 217:316); or by saturation mutagenesis (Meyers et al. (1986) Science 232:613).
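As an illustration of alanine scanning mutagenesis, the following minimal sketch enumerates the single-residue alanine substitutions for a protein given in one-letter code. The function name, the numbering offset, and the four-residue example sequence are hypothetical; the sequence is not a hydroxylase sequence. Positions that are already alanine are simply skipped here, which is an assumption of this sketch rather than a requirement of the technique.

```python
def alanine_scan(protein: str, start: int = 1) -> list:
    """List (label, mutant_sequence) pairs for each single alanine substitution."""
    variants = []
    for offset, residue in enumerate(protein):
        if residue == "A":
            continue  # already alanine; nothing to substitute in this sketch
        position = start + offset
        mutant = protein[:offset] + "A" + protein[offset + 1:]
        variants.append((f"{residue}{position}A", mutant))
    return variants


# Hypothetical four-residue example.
variants = alanine_scan("MAHK")
```

Each label follows the common convention of original residue, position, and new residue, so a variant list of this kind can be carried directly into primer design or screening records.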
III. Expression of cloned modified Lesquerella hydroxylase
In other embodiments of the present invention, a nucleic acid sequence encoding a modified Lesquerella hydroxylase according to the present invention is used to generate a recombinant DNA molecule that directs the expression of the encoded protein product in appropriate host cells. In yet other embodiments, a nucleic acid sequence corresponding to the antisense sequence of a modified Lesquerella hydroxylase is used. As will be understood by those of skill in the art, it may be advantageous to produce modified hydroxylase-encoding nucleotide sequences possessing non-naturally occurring codons. Therefore, in some preferred embodiments, codons preferred by a particular prokaryotic or eukaryotic host (Murray et al. (1989) Nucl. Acids Res. 17) can be selected, for example, to increase the rate of hydroxylase expression or to produce recombinant RNA transcripts having desirable properties, such as a longer half-life, than transcripts produced from naturally occurring sequence.
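Selecting host-preferred codons can be sketched as a back-translation of the protein sequence through a per-host preference table. In the minimal example below, the `PREFERRED` table is a placeholder chosen for illustration; it is not measured codon-usage data for any particular host, and a real table would be taken from usage statistics for the chosen organism. The example peptide and function names are likewise hypothetical. The sketch also translates the result with the standard genetic code to confirm that the encoded protein is unchanged.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Placeholder preference table (assumption): one codon per residue.
PREFERRED = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
    "*": "TAA",
}


def recode(protein: str) -> str:
    """Back-translate a protein using the preferred codon for each residue."""
    return "".join(PREFERRED[aa] for aa in protein)


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence with the standard genetic code."""
    return "".join(STANDARD_CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical peptide ending in a stop codon.
recoded = recode("MKH*")
```

Because only synonymous codons are substituted, translating the recoded sequence should return the original protein, which gives a simple consistency check on any preference table.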
A. Vectors for production of a modified Lesquerella hydroxylase
The nucleic acid sequences of the present invention may be employed for producing polypeptides by recombinant techniques. Thus, for example, the nucleic acid sequence may be included in any one of a variety of expression vectors for expressing a polypeptide. In some embodiments of the present invention, vectors include, but are not limited to, chromosomal, nonchromosomal and synthetic DNA sequences (for example, derivatives of SV40, bacterial plasmids, phage DNA; baculovirus, yeast plasmids, vectors derived from combinations of plasmids and phage DNA, and viral DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies). It is contemplated that any vector may be used as long as it is replicable and viable in the host.
In particular, some embodiments of the present invention provide recombinant constructs comprising one or more of the nucleic acid sequences of the present invention as described above. In some embodiments of the present invention, the constructs comprise a vector, such as a plasmid or viral vector, into which a nucleic acid sequence of the invention has been inserted, in a forward or reverse orientation. In preferred embodiments of the present invention, the appropriate nucleic acid sequence is inserted into the vector using any of a variety of procedures. In general, the nucleic acid sequence is inserted into an appropriate restriction endonuclease site(s) by procedures known in the art. Large numbers of suitable vectors are known to those of skill in the art, and are commercially available. Such vectors include, but are not limited to, the following vectors: 1) Bacterial: pQE70, pQE60, pQE-9 (Qiagen); pBS, pD10, phagescript, psiX174, pBluescript SK, pBSKS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); and 2) Eukaryotic: pWLNEO, pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, and pSVL (Pharmacia). Any other plasmid or vector may be used as long as it is replicable and viable in the host. In some preferred embodiments of the present invention, plant expression vectors comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation sites, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences. In other embodiments, DNA sequences derived from the SV40 splice and polyadenylation sites may be used to provide the required nontranscribed genetic elements.
In certain embodiments of the present invention, the nucleic acid sequence in the expression vector is operatively linked to at least one suitable regulatory sequence. Such sequences include, but are not limited to, an appropriate expression control sequence(s) (promoter) to direct mRNA synthesis. Promoters useful in the present invention include, but are not limited to, the LTR or SV40 promoter, the E. coli lac or trp, the phage lambda PL and PR, T3 and T7 promoters, and the cytomegalovirus (CMV) immediate early, herpes simplex virus (HSV) thymidine kinase, and mouse metallothionein-I promoters and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses. Exemplary plant promoters include but are not limited to SD Cauliflower Mosaic Virus (CaMV SD; see for example, U.S. Pat. No. 5,352,605, incorporated herein by reference), mannopine synthase, octopine synthase (ocs), superpromoter (see for example, WO 95/14098), and ubi3 (see for example, Garbarino and Belknap (1994) Plant Mol. Biol. 24:119-127) promoters. In other embodiments of the present invention, recombinant expression vectors include origins of replication and selectable markers permitting transformation of the host cell (for example, dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or tetracycline or ampicillin resistance in E. coli).
In some embodiments of the present invention, transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act on a promoter to increase its transcription. Enhancers useful in the present invention include, but are not limited to, the SV40 enhancer on the late side of the replication origin (bp 100 to 270), a cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers. In other embodiments, the expression vector also contains a ribosome binding site for translation initiation and a transcription terminator. In still other embodiments of the present invention, the vector may also include appropriate sequences for amplifying expression.
B. Host cells for production of modified hydroxylase
In a further embodiment, the present invention provides host cells containing the above-described constructs. In some embodiments of the present invention, the host cell is a higher eukaryotic cell (for example, a plant cell). In other embodiments of the present invention, the host cell is a lower eukaryotic cell (for example, a yeast cell). In still other embodiments of the present invention, the host cell can be a prokaryotic cell (for example, a bacterial cell). Specific examples of host cells include, but are not limited to, Escherichia coli, Salmonella typhimurium, Bacillus subtilis, and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, as well as Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila S2 cells, Spodoptera Sf9 cells, Chinese hamster ovary (CHO) cells, COS-7 lines of monkey kidney fibroblasts (Gluzman (1981) Cell 23:175), 293T, C127, 3T3, HeLa and BHK cell lines, NT-1 (tobacco cell culture line), root cell and cultured roots in rhizosecretion (Gleba et al. (1999) Proc Natl Acad Sci USA 96: 5973-5977) and other plant cells, which can be cultivated in fermenters or which can be regenerated into entire plants.
The constructs in host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence. In some embodiments, introduction of the construct into the host cell can be accomplished by calcium phosphate transfection, DEAE-Dextran mediated transfection, or electroporation (See for example, Davis et al. (1986) Basic Methods in Molecular Biology).
Proteins can be expressed in eukaryotic cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, NY.
In some embodiments of the present invention, following transformation of a suitable host cell and growth of the host cell to an appropriate cell density, the selected promoter is induced by appropriate means (for example, temperature shift or chemical induction) and cells are cultured for an additional period. In other embodiments of the present invention, cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification. In still other embodiments of the present invention, microbial cells and other cultured cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents.
IV. Production of Large Quantities of Hydroxy or Desaturated Fatty Acids
In one aspect of the present invention, methods are provided for producing large quantities of hydroxy fatty acids, or di- or polyunsaturated fatty acids, or both (collectively, the fatty acid products), depending upon the catalytic specificity of the modified Lesquerella hydroxylase. In some embodiments, the fatty acid products are produced in vivo, in organisms transformed with a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention and capable of expressing hydroxylase activity, and grown under conditions sufficient to effect production of the fatty acid products. In other embodiments, the fatty acid products are produced in vitro, from either nucleic acid sequences encoding a modified Lesquerella hydroxylase of the present invention or from polypeptides comprising a modified hydroxylase of the present invention and exhibiting fatty acid hydroxylase or desaturase activity. The invention also provides compositions comprising organisms transformed with a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention, and in vitro systems comprising modified Lesquerella hydroxylase polypeptides or coding sequences or both for producing fatty acid products.
A. In vivo in Transgenic Organisms
In some embodiments of the present invention, the fatty acid products are produced in vivo, by providing an organism transformed with a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention and growing the transgenic organism under conditions sufficient to effect production of the fatty acid products. In other embodiments of the present invention, the fatty acid products are produced in vivo by transforming an organism with a heterologous gene encoding a modified hydroxylase of the present invention and growing the transgenic organism under conditions sufficient to effect production of the fatty acid products.
Organisms which are transformed with a heterologous gene encoding a modified hydroxylase of the present invention include preferably those which naturally synthesize and store fatty acids in some manner, and those which are commercially feasible to grow and suitable for harvesting the fatty acid products. Such organisms include but are not limited to bacteria, yeast, oleaginous algae, and plants. Examples of bacteria include E. coli and related bacteria which can be grown in commercial-scale fermenters. Examples of yeast include S. cerevisiae, strains INVSc1 and YPH499, which can also be grown in commercial-scale fermenters. Examples of plants include preferably oil-producing plants; examples of such plants include but are not limited to soybean, rapeseed and canola, sunflower, cotton, corn, cocoa, safflower, oil palm, coconut palm, flax, castor, and peanut. Non-commercial cultivars of plants can be transformed, and the trait for expression of a modified Lesquerella hydroxylase of the present invention moved to commercial cultivars by breeding techniques well-known in the art.
A heterologous gene encoding a modified Lesquerella hydroxylase of the present invention, including fusion proteins, includes any suitable sequence as described above. Preferably, the heterologous gene is provided within an expression vector such that transformation with the vector results in expression of the polypeptide; suitable vectors are described above and below. A transgenic organism is grown under conditions sufficient to effect production of the fatty acid products. In some embodiments of the present invention, a transgenic organism is supplied with exogenous substrates of the modified hydroxylases. Such substrates comprise mono-, di-, and poly-unsaturated fatty acids; the chain length of such unsaturated fatty acids is variable, but is preferably 18 carbons in length. The unsaturated fatty acids may also comprise additional functional groups, including but not limited to acetylenic bonds, conjugated acetylenic and ethylenic bonds, allenic groups, cyclopropane, cyclopropene, cyclopentene and furan rings, epoxy-, and keto-groups and double bonds of both the cis and trans configuration and separated by more than one methylene group; two or more of these functional groups may be found in a single fatty acid. The substrates are added or present as the free acids, the ACP and CoA esters, the salts of these acids, the glycerolipid esters (particularly the phospholipid and triacylglycerol esters), the wax esters, and the ether derivatives of these acids. Most preferably, such substrates are selected from the group consisting of: 16:1d9 (palmitoleate); 18:1d6 (petroselenate); 18:1d9 (oleate); 20:1d11 (gladoleate or eicosenoate); and 22:1d13 (erucate or docosenoate).
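The shorthand used for the preferred substrates above (for example, oleate written as 18 carbons, one double bond, position 9) can be read programmatically. The minimal sketch below assumes the pattern carbons:double-bonds followed by "d" or "Δ" and a comma-separated list of double-bond positions; that reading of the notation, together with the class and function names, is an assumption made for illustration.

```python
import re
from dataclasses import dataclass

# carbons ":" double bonds, then optional "d" or "Δ" with comma-separated positions
SHORTHAND = re.compile(r"^(\d+):(\d+)(?:[dΔ](\d+(?:,\d+)*))?$")


@dataclass(frozen=True)
class FattyAcid:
    carbons: int
    double_bonds: int
    positions: tuple


def parse_shorthand(text: str) -> FattyAcid:
    """Parse shorthand such as '18:1d9' into chain length, bond count, positions."""
    match = SHORTHAND.match(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognized fatty acid shorthand: {text!r}")
    carbons, bonds, positions = match.groups()
    parsed = tuple(int(p) for p in positions.split(",")) if positions else ()
    return FattyAcid(int(carbons), int(bonds), parsed)


oleate = parse_shorthand("18:1d9")
```

A parsed record of this kind makes it straightforward to filter candidate substrates, for example by selecting those with the preferred chain length of 18 carbons.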
Substrates are supplied in various forms as are well known in the art; such forms include aqueous suspensions prepared by sonication, aqueous suspensions prepared with detergents and other surfactants, micellar preparations which include the substrate, dissolution of the substrate into a solvent, and dried powders of substrates. Such forms may be added to organisms or cultured cells or tissues grown in fermenters. In yet other embodiments of the present invention, a transgenic organism comprises a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention operably linked to an inducible promoter, and is grown either in the presence of an inducing agent, or is grown and then exposed to an inducing agent. In still other embodiments of the present invention, a transgenic organism comprises a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention operably linked to a promoter which is either tissue specific or developmentally specific, and is grown to the point at which the tissue is developed or the developmental stage at which the developmentally-specific promoter is activated.
In alternative embodiments, a transgenic organism as described above is engineered to produce greater amounts of the fatty acid substrate. Organisms include bacteria, yeast, algae and plants; preferably, the organism is a plant; most preferably, the organism is an oil-producing plant.
In other embodiments of the present invention, the methods for producing large quantities of the fatty acid products further comprise collecting the fatty acids produced. Such methods are known generally in the art, and include harvesting the transgenic organisms and extracting the fatty acid products. Extraction procedures preferably include solvent extraction, and typically include disrupting cells, as by chopping, mincing, grinding, and/or sonicating, prior to solvent extraction. Solvent extraction procedures are well known, and have been described. In yet other embodiments of the present invention, the fatty acid products are further purified, as for example by thin layer liquid chromatography, gas-liquid chromatography, or high pressure liquid chromatography.
1. Transgenic Plants, Seeds, and Plant Parts
In one preferred aspect of the invention, the fatty acid products are produced in transgenic plants; preferably, the fatty acid products are produced in plant seed oils. Plants are transformed with a heterologous gene encoding a modified Lesquerella hydroxylase of the present invention; in some embodiments, plants are transformed with a fusion gene encoding a fusion polypeptide comprising a modified hydroxylase of the present invention. Transformation techniques are well known in the art. It is contemplated that the heterologous genes are utilized to increase the level of the encoded enzyme activities.
Plants
The methods of the present invention are not limited to any particular plant. Indeed, a variety of plants are contemplated, including but not limited to soybean (Glycine max), rapeseed and canola (including Brassica napus and B. campestris), sunflower (Helianthus annuus), cotton (Gossypium hirsutum), corn (Zea mays), cocoa (Theobroma cacao), safflower (Carthamus tinctorius), oil palm (Elaeis guineensis), coconut palm (Cocos nucifera), flax (Linum usitatissimum), castor (Ricinus communis) and peanut (Arachis hypogaea). The group also includes non-agronomic species which are useful in developing appropriate expression vectors such as tobacco, rapid cycling Brassica species, and Arabidopsis thaliana, and wild species which may be a source of unique fatty acids.
Vectors
The methods of the present invention contemplate the use of a heterologous gene encoding a modified hydroxylase of the present invention, as described above. Heterologous genes intended for expression in plants are first assembled in expression cassettes comprising a promoter. Methods which are well known to those skilled in the art are used to construct expression vectors containing a heterologous gene and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are widely described in the art (See for example, Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview, NY; and Ausubel, F. M. et al. (1989) Current Protocols in Molecular Biology, John Wiley & Sons, New York, NY). In general, these vectors comprise a nucleic acid sequence encoding a modified hydroxylase of the present invention (as described above) operably linked to a promoter and other regulatory sequences (for example, enhancers, polyadenylation signals, etc.) required for expression in a plant.
Promoters include but are not limited to constitutive promoters, tissue-, organ-, and developmentally-specific promoters, and inducible promoters. Examples of promoters include but are not limited to: constitutive promoter 35S of cauliflower mosaic virus; a wound-inducible promoter from tomato, leucine amino peptidase ("LAP," Chao et al. (1999) Plant Physiol 120: 979-992); a chemically-inducible promoter from tobacco, Pathogenesis-Related 1 (PR1) (induced by salicylic acid and BTH (benzothiadiazole-7-carbothioic acid S-methyl ester)); a tomato proteinase inhibitor II promoter (PIN2) or LAP promoter (both inducible with methyl jasmonate); a heat shock promoter (US Pat 5,187,267); a tetracycline-inducible promoter (US Pat 5,057,422); and seed-specific promoters, such as those for seed storage proteins (for example, phaseolin, napin, oleosin, and a promoter for soybean beta-conglycinin (Beachy et al. (1985) EMBO J. 4: 3047-3053)). All references cited herein are incorporated in their entirety.
The expression cassettes may further comprise any sequences required for expression of mRNA. Such sequences include, but are not limited to, transcription terminators, enhancers such as introns, viral sequences, and sequences intended for the targeting of the gene product to specific organelles and cell compartments.
A variety of transcriptional terminators are available for use in expression of sequences of the present invention. Transcriptional terminators are responsible for the termination of transcription beyond the transcript and its correct polyadenylation. Appropriate transcriptional terminators and those which are known to function in plants include, but are not limited to, the CaMV 35S terminator, the tml terminator, the pea rbcS E9 terminator, and the nopaline and octopine synthase terminators (See for example, Odell et al. (1985) Nature 313:810; Rosenberg et al. (1987) Gene 56:125; Guerineau et al. (1991) Mol. Gen. Genet. 262:141; Proudfoot (1991) Cell 64:671; Sanfacon et al. (1991) Genes Dev. 5:141; Mogen et al. (1990) Plant Cell 2:1261; Munroe et al. (1990) Gene 91:151; Ballas et al. (1989) Nucleic Acids Res. 17:7891; Joshi et al. (1987) Nucleic Acid Res. 15:9627).
In addition, in some embodiments, constructs for expression of the heterologous gene of interest include one or more sequences found to enhance gene expression from within the transcriptional unit. These sequences can be used in conjunction with the nucleic acid sequence of interest to increase expression in plants. Various intron sequences have been shown to enhance expression, particularly in monocotyledonous cells. For example, the introns of the maize Adh1 gene have been found to significantly enhance the expression of the wild-type gene under its cognate promoter when introduced into maize cells (Callis et al. (1987) Genes Develop. 1: 1183). Intron sequences have been routinely incorporated into plant transformation vectors, typically within the non-translated leader.
In some embodiments of the present invention, the construct for expression of the nucleic acid sequence of interest also includes a regulator such as a nuclear localization signal (Kalderon et al. (1984) Cell 39:499; Lassner et al. (1991) Plant Molecular Biology 17:229), a plant translational consensus sequence (Joshi (1987) Nucleic Acids Research 15:6643), an intron (Luehrsen and Walbot (1991) Mol. Gen. Genet. 225:81), and the like, operably linked to the nucleic acid sequence encoding the modified hydroxylase of the present invention.
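The arrangement of elements in an expression cassette, as described in the preceding paragraphs, can be summarized with a small data structure. This is a sketch only: the class name, field names, and element labels are hypothetical, and the 5' to 3' order shown (promoter, elements in the non-translated leader, coding sequence, terminator) is one common arrangement assumed for illustration rather than a layout prescribed by the present disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Cassette:
    promoter: str
    coding_sequence: str
    terminator: str
    # Optional elements placed in the non-translated leader, such as an intron.
    leader_elements: list = field(default_factory=list)

    def layout(self) -> list:
        """Return element names in assumed 5' to 3' order."""
        return [self.promoter, *self.leader_elements,
                self.coding_sequence, self.terminator]


# Hypothetical cassette built from element types named in the text.
cassette = Cassette(
    promoter="CaMV 35S promoter",
    coding_sequence="modified hydroxylase coding sequence",
    terminator="nopaline synthase terminator",
    leader_elements=["maize Adh1 intron"],
)
order = cassette.layout()
```

Keeping the cassette as an ordered record of named elements makes it easy to swap a constitutive promoter for a seed-specific or inducible one without touching the rest of the design.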
In preparing the construct comprising the nucleic acid sequence encoding a modified hydroxylase of the present invention, various DNA fragments can be manipulated, so as to provide for the DNA sequences in the desired orientation (for example, sense or antisense) and, as appropriate, in the desired reading frame. For example, adapters or linkers can be employed to join the DNA fragments or other manipulations can be used to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resection, ligation, or the like is preferably employed, where insertions, deletions or substitutions (for example, transitions and transversions) are involved.
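Two of the checks implied above, orientation and reading frame, can be sketched in a few lines. The sketch assumes DNA sequences written 5' to 3' and assumes that a downstream coding fragment stays in frame when the combined length of the upstream coding region and any linker is a multiple of three. The function names and example values are hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the antisense-orientation insert."""
    return seq.translate(COMPLEMENT)[::-1]


def keeps_frame(upstream_coding_length: int, linker_length: int) -> bool:
    """True if the downstream fragment begins on a codon boundary."""
    return (upstream_coding_length + linker_length) % 3 == 0


# Hypothetical examples.
antisense = reverse_complement("ATGC")
in_frame = keeps_frame(9, 6)
out_of_frame = keeps_frame(9, 4)
```

Applying `reverse_complement` twice returns the original sequence, which provides a quick sanity check when inserts are prepared in both sense and antisense orientations.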
Numerous transformation vectors are available for plant transformation. The selection of a vector for use will depend upon the preferred transformation technique and the target species for transformation. For certain target species, different antibiotic or herbicide selection markers are preferred. Selection markers used routinely in transformation include the nptII gene which confers resistance to kanamycin and related antibiotics (Messing and Vieira (1982) Gene 19: 259; Bevan et al. (1983) Nature 304:184), the bar gene which confers resistance to the herbicide phosphinothricin (White et al. (1990) Nucl Acids Res. 18:1062; Spencer et al. (1990) Theor. Appl. Genet. 79: 625), the hph gene which confers resistance to the antibiotic hygromycin (Blochlinger and Diggelmann (1984) Mol. Cell. Biol. 4:2929), and the dhfr gene, which confers resistance to methotrexate (Bourouis et al. (1983) EMBO J., 2:1099).
In some preferred embodiments, the vector is adapted for use in an Agrobacterium-mediated transfection process (See for example, U.S. Pat. Nos. 5,981,839; 6,051,757; 5,981,840; 5,824,877; and 4,940,838; all of which are incorporated herein by reference). Construction of recombinant Ti and Ri plasmids in general follows methods typically used with the more common bacterial vectors, such as pBR322. Additional use can be made of accessory genetic elements sometimes found with the native plasmids and sometimes constructed from foreign sequences. These may include but are not limited to structural genes for antibiotic resistance as selection genes.
Two recombinant Ti and Ri plasmid vector systems are in general use. The first system is called the "cointegrate" system. In this system, the shuttle vector containing the gene of interest is inserted by genetic recombination into a non-oncogenic Ti plasmid that contains both the cis-acting and trans-acting elements required for plant transformation as, for example, in the pMLJ1 shuttle vector and the non-oncogenic Ti plasmid pGV3850. The second system is called the "binary" system, in which two plasmids are used; the gene of interest is inserted into a shuttle vector containing the cis-acting elements required for plant transformation. The other necessary functions are provided in trans by the non-oncogenic Ti plasmid, as exemplified by the pBIN19 shuttle vector and the non-oncogenic Ti plasmid pAL4404. Some of these vectors are commercially available.
In other embodiments of the invention, the nucleic acid sequence of interest is targeted to a particular locus on the plant genome. Site-directed integration of the nucleic acid sequence of interest into the plant cell genome may be achieved by, for example, homologous recombination using Agrobacterium-derived sequences. Generally, plant cells are incubated with a strain of Agrobacterium which contains a targeting vector in which sequences that are homologous to a DNA sequence inside the target locus are flanked by Agrobacterium transfer-DNA (T-DNA) sequences, as previously described (U.S. Pat. No. 5,501,967). One of skill in the art knows that homologous recombination may be achieved using targeting vectors which contain sequences that are homologous to any part of the targeted plant gene, whether belonging to the regulatory elements of the gene, or the coding regions of the gene. Homologous recombination may be achieved at any region of a plant gene so long as the nucleic acid sequence of regions flanking the site to be targeted is known. In yet other embodiments, a nucleic acid of the present invention is utilized to construct vectors derived from plant (+) RNA viruses (for example, brome mosaic virus, tobacco mosaic virus, alfalfa mosaic virus, cucumber mosaic virus, tomato mosaic virus, and combinations and hybrids thereof). Generally, the inserted modified hydroxylase encoding polynucleotide can be expressed from these vectors as a fusion protein (for example, coat protein fusion protein) or from its own subgenomic promoter or other promoter. Methods for the construction and use of such viruses are described in U.S. Pat. Nos. 5,846,795; 5,500,360; 5,173,410; 5,965,794; 5,977,438; and 5,866,785, all of which are incorporated herein by reference.
In some embodiments of the present invention, the nucleic acid sequence of interest is introduced directly into a plant. One vector useful for direct gene transfer techniques in combination with selection by the herbicide Basta (or phosphinothricin) is a modified version of the plasmid pCIB246, with a CaMV 35S promoter in operational fusion to the E. coli GUS gene and the CaMV 35S transcriptional terminator (WO 93/07278).
Transformation Techniques
Once a nucleic acid sequence encoding a modified Lesquerella hydroxylase of the present invention is operatively linked to an appropriate promoter and inserted into a suitable vector for the particular transformation technique utilized (for example, one of the vectors described above), the recombinant DNA described above can be introduced into the plant cell in a number of art-recognized ways. Those skilled in the art will appreciate that the choice of method depends upon the vector and the type of plant targeted for transformation. In some embodiments, the vector is maintained episomally. In other embodiments, the vector is integrated into the genome.
In some embodiments, direct transformation in the plastid genome is used to introduce the vector into the plant cell (See for example, U.S. Patent Nos. 5,451,513; 5,545,817; 5,545,818; PCT application WO 95/16783). The basic technique for chloroplast transformation involves introducing regions of cloned plastid DNA flanking a selectable marker together with the nucleic acid encoding the RNA sequences of interest into a suitable target tissue (for example, using biolistics or protoplast transformation with calcium chloride or PEG). The 1 to 1.5 kb flanking regions, termed targeting sequences, facilitate homologous recombination with the plastid genome and thus allow the replacement or modification of specific regions of the plastome. Initially, point mutations in the chloroplast 16S rRNA and rps12 genes conferring resistance to spectinomycin and/or streptomycin were utilized as selectable markers for transformation (Svab et al. (1990) PNAS 87:8526; Staub and Maliga (1992) Plant Cell 4:39). The presence of cloning sites between these markers allowed creation of a plastid targeting vector for introduction of foreign DNA molecules (Staub and Maliga (1993) EMBO J. 12:601). Substantial increases in transformation frequency are obtained by replacement of the recessive rRNA or r-protein antibiotic resistance genes with a dominant selectable marker, the bacterial aadA gene encoding the spectinomycin-detoxifying enzyme aminoglycoside-3'-adenyltransferase (Svab and Maliga (1993) PNAS 90:913). Other selectable markers useful for plastid transformation known in the art are encompassed within the scope of the present invention. Plants homoplasmic for plastid genomes containing the two nucleic acid sequences separated by a promoter of the present invention are obtained, and are preferentially capable of high expression of the RNAs encoded by the DNA molecule.
In other embodiments, vectors useful in the practice of the present invention are microinjected directly into plant cells by use of micropipettes to mechanically transfer the recombinant DNA (Crossway (1985) Mol. Gen. Genet. 202:179).
In still other embodiments, the vector is transferred into the plant cell by using polyethylene glycol (Krens et al. (1982) Nature 296:72; Crossway et al. (1986) BioTechniques 4:320); fusion of protoplasts with other entities, either minicells, cells, lysosomes or other fusible lipid-surfaced bodies (Fraley et al. (1982) Proc. Natl. Acad. Sci. USA 79:1859); protoplast transformation (EP 0 292 435); or direct gene transfer (Paszkowski et al. (1984) EMBO J. 3:2717; Hayashimoto et al. (1990) Plant Physiol. 93:857).
In still further embodiments, the vector may also be introduced into the plant cells by electroporation (Fromm et al. (1985) Proc. Natl. Acad. Sci. USA 82:5824; Riggs et al. (1986) Proc. Natl. Acad. Sci. USA 83:5602). In this technique, plant protoplasts are electroporated in the presence of plasmids containing the gene construct. Electrical impulses of high field strength reversibly permeabilize biomembranes, allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form plant callus.
In yet other embodiments, the vector is introduced through ballistic particle acceleration using devices (for example, available from Agracetus, Inc., Madison, Wis. and Dupont, Inc., Wilmington, Del.). (See for example, U.S. Pat. No. 4,945,050; and McCabe et al. (1988) Biotechnology 6:923). See also, Weissinger et al. (1988) Annual Rev. Genet. 22:421; Sanford et al. (1987) Particulate Science and Technology 5:27 (onion); Svab et al. (1990) Proc. Natl. Acad. Sci. USA 87:8526 (tobacco chloroplast); Christou et al. (1988) Plant Physiol. 87:671 (soybean); McCabe et al. (1988) Bio/Technology 6:923 (soybean); Klein et al. (1988) Proc. Natl. Acad. Sci. USA 85:4305 (maize); Klein et al. (1988) Bio/Technology 6:559 (maize); Klein et al. (1988) Plant Physiol. 91:4404 (maize); Fromm et al. (1990) Bio/Technology 8:833; and Gordon-Kamm et al. (1990) Plant Cell 2:603 (maize); Koziel et al. (1993) Biotechnology 11:194 (maize); Hill et al. (1995) Euphytica 85:119 and Koziel et al. (1996) Annals of the New York Academy of Sciences 792:164; Shimamoto et al. (1989) Nature 338:274 (rice); Christou et al. (1991) Biotechnology 9:957 (rice); Datta et al. (1990) Bio/Technology 8:736 (rice); European Patent Application EP 0 332 581 (orchardgrass and other Pooideae); Vasil et al. (1993) Biotechnology 11:1553 (wheat); Weeks et al. (1993) Plant Physiol. 102:1077 (wheat); Wan et al. (1994) Plant Physiol. 104:37 (barley); Jahne et al. (1994) Theor. Appl. Genet. 89:525 (barley); Knudsen and Muller (1991) Planta 185:330 (barley); Umbeck et al. (1987) Bio/Technology 5:263 (cotton); Casas et al. (1993) Proc. Natl. Acad. Sci. USA 90:11212 (sorghum); Somers et al. (1992) Bio/Technology 10:1589 (oat); Torbert et al. (1995) Plant Cell Reports 14:635 (oat); Weeks et al. (1993) Plant Physiol. 102:1077 (wheat); Chang et al., WO 94/13822 (wheat) and Nehra et al. (1994) The Plant Journal 5:285 (wheat).
In addition to direct transformation, in some embodiments, the vectors comprising a nucleic acid sequence encoding a modified Lesquerella hydroxylase of the present invention are transferred using Agrobacterium-mediated transformation (Hinchee et al. (1988) Biotechnology 6:915; Ishida et al. (1996) Nature Biotechnology 14:745). Agrobacterium is a representative genus of the gram-negative family Rhizobiaceae. Its species are responsible for plant tumors such as crown gall and hairy root disease. In the dedifferentiated tissue characteristic of the tumors, amino acid derivatives known as opines are produced and catabolized. The bacterial genes responsible for expression of opines are a convenient source of control elements for chimeric expression cassettes. Heterologous genetic sequences (for example, nucleic acid sequences operatively linked to a promoter of the present invention) can be introduced into appropriate plant cells by means of the Ti plasmid of Agrobacterium tumefaciens. The Ti plasmid is transmitted to plant cells on infection by Agrobacterium tumefaciens, and is stably integrated into the plant genome (Schell (1987) Science 237:1176). Species which are susceptible to infection by Agrobacterium may be transformed in vitro. Alternatively, plants may be transformed in vivo, such as by transformation of a whole plant by Agrobacterial infiltration of adult plants, as in a "floral dip" method (Bechtold, N. et al. (1993) C. R. Acad. Sci. III-Vie 316: 1194-1199).
Regeneration
After selecting for transformed plant material which can express the heterologous gene encoding a modified hydroxylase of the present invention, whole plants are regenerated. Plant regeneration from cultured protoplasts is described in Evans et al. (1983) Handbook of Plant Cell Cultures, Vol. 1 (MacMillan Publishing Co., New York); and Vasil I. R. (ed.) (1984) Cell Culture and Somatic Cell Genetics of Plants, Acad. Press, Orlando, Vol. I, 1984, and Vol. III. It is known that many plants can be regenerated from cultured cells or tissues, including but not limited to all major species of sugarcane, sugar beet, cotton, fruit and other trees, legumes and vegetables, and monocots (for example, the plants described above). Means for regeneration vary from species to species of plants, but generally a suspension of transformed protoplasts containing copies of the heterologous gene is first provided. Callus tissue is formed and shoots may be induced from callus and subsequently rooted.
Alternatively, embryo formation can be induced from the protoplast suspension. These embryos germinate and form mature plants. The culture media will generally contain various amino acids and hormones, such as auxin and cytokinins. Shoots and roots normally develop simultaneously. Efficient regeneration will depend on the medium, on the genotype, and on the history of the culture. The reproducibility of regeneration depends on the control of these variables.
Generation of Transgenic lines
Transgenic lines are established from transgenic plants by tissue culture propagation. Nucleic acid sequences encoding a modified hydroxylase of the present invention, or a fusion protein comprising the modified hydroxylase, are transferred to related varieties by traditional plant breeding techniques.
2. Micro-organisms
In another preferred aspect of the invention, the fatty acid products are produced in transgenic microorganisms. Microorganisms are transformed with a heterologous gene encoding a modified hydroxylase of the present invention or a gene encoding a fusion polypeptide comprising a modified hydroxylase of the present invention according to procedures well known in the art. It is contemplated that the heterologous genes are utilized to increase the level of the enzyme activities encoded by the heterologous genes.
Yeast
In one embodiment, the fatty acid products are produced in transgenic yeast. Wild-type yeast do not accumulate detectable levels of hydroxylated fatty acids. A nucleic acid sequence encoding a modified hydroxylase of the present invention is placed into an expression vector under transcriptional control of a promoter; for example, such a promoter is an inducible GAL promoter. Expression of the enzyme is induced, and the fatty acids are produced as a result of the enzyme activity. In preferred embodiments, nucleic acid sequences encoding modified Lesquerella hydroxylases are cloned into an expression vector under control of an inducible promoter (a non-limiting example is cloning the coding sequence into the pYes-II expression vector behind a GAL-1 promoter as previously described (Broun, P. et al. (1998) Science 282(5392): 1315-1317)), and expressed in yeast YPH499 strain. Expression of the modified hydroxylase is induced at higher cell densities, of at least an OD600 greater than about 1, and preferably greater than about 2, and more preferably about 2.5, and at higher temperatures, where the temperature is above room temperature, preferably greater than about 20 °C, more preferably greater than about 22 °C, yet more preferably greater than about 25 °C, and even more preferably at or above about 30 °C. This results in accumulation of relatively high amounts of hydroxy-fatty acids.
B. In vitro Systems
In other embodiments of the present invention, fatty acid products are produced in vitro, from either nucleic acid sequences encoding a modified hydroxylase of the present invention or from a polypeptide comprising a modified hydroxylase of the present invention.
Using nucleic acid sequences encoding a modified hydroxylase
In some embodiments of the present invention, methods for producing large quantities of the fatty acid products comprise adding an isolated nucleic acid sequence encoding a modified hydroxylase of the present invention to in vitro expression systems under conditions sufficient to cause production of the modified hydroxylase. The isolated nucleic acid sequence encoding a modified hydroxylase of the present invention is any suitable sequence as described above, and preferably is provided within an expression vector such that addition of the vector to an in vitro transcription/translation system results in expression of the polypeptide. The system further comprises the substrates for the modified hydroxylase, as described above. Alternatively, the system further comprises the means for generating the substrates for the modified hydroxylase.
In other embodiments of the present invention, the methods for producing large quantities of the fatty acid products further comprise collecting the fatty acids produced. Such methods are known generally in the art, and have been described above. In yet other embodiments of the present invention, the fatty acid products are further purified, as for example by thin layer liquid chromatography, gas-liquid chromatography, or high performance liquid chromatography.
Using modified hydroxylase of the present invention
In some embodiments of the present invention, methods for producing large quantities of fatty acid products comprise incubating a modified hydroxylase of the present invention under conditions sufficient to result in the synthesis of the fatty acid products; generally, such incubation is carried out in a mixture which comprises the modified hydroxylase.
A modified hydroxylase of the present invention, as described above, is obtained by purification of recombinant modified hydroxylase from an organism transformed with a heterologous gene encoding a modified hydroxylase of the present invention, as described above. A source of recombinant modified hydroxylase is either a plant, bacterial, or other transgenic organism transformed with a heterologous gene encoding a modified hydroxylase of the present invention as described above. The modified hydroxylase may further include means for improving purification, as for example a 6x-His tag added to the C-terminus of the protein as described above. Alternatively, a modified hydroxylase of the present invention is chemically synthesized.
The incubation mixture further comprises the substrates for the modified hydroxylase, as described above. Alternatively, the mixture further comprises the means for generating the substrates for the modified hydroxylase.
In other embodiments of the present invention, the methods for producing large quantities of the fatty acid products further comprise collecting the fatty acids produced; such methods are described above.
IV. Manipulating reaction specificities of other desaturases and hydroxylases
Both LFAH (Broun, P. et al. (1997) Plant J. 13: 201-210) and CFAH (Smith, M. et al. (2000) Biochem. Soc. Trans. 28: 947-950) enzymes catalyze desaturation in addition to hydroxylation, suggesting that these enzymes utilize specialized variations of the common desaturase mechanism. As shown in the Examples below, FAD2 also catalyzes oleate hydroxylation, further supporting the relationship between the oleate desaturases and hydroxylases. The presence of trace amounts of ricinoleic acid in the seed oils of soybean (family Fabaceae), olive (family Oleaceae), and flax (family Linaceae), as shown in the Examples, points to a general bifunctional nature of the plant FAD2 enzymes.
The ability to convert the oleate desaturase into a bifunctional desaturase/hydroxylase with as few as one base substitution (Arabidopsis thaliana Fad2 M324V (bases 970-972, ATG to GTG)) validates the notion that evolution of the modified hydroxylase could have progressed incrementally via gene duplication and mutagenesis. The ease of this conversion is reflected by the independent evolution of 12-hydroxylase activity at least several times (van de Loo, F. J. et al. (1995) Proc. Natl. Acad. Sci. USA 92(15): 6743-6747; Broun, P. et al. (1997) Plant J. 13: 201-210). As described in the Examples, highly divergent desaturases with different regiospecificities retain a similar ability to form small quantities of hydroxylated fatty acids. Despite their limited sequence homology, these classes of membrane-bound diiron desaturases share the canonical histidine boxes that likely act as the iron ligands as well as predicted transmembrane segments (Shanklin, J. et al. (1994) Biochemistry 33(43): 12787-12794). The identification of residues adjacent to these conserved histidine clusters that are critical determinants of hydroxylation function supports a hypothesis that desaturases with different regiospecificities (Girke, T. et al. (1998) in Adv. Plant Lipid Res. (Sanchez, J., Cerda-Olmedo, E., and Martinez-Force, E., eds), pp. 103-109, Universidad de Sevilla, Spain) could become hydroxylases through a similar evolutionary process. The intrinsic hydroxylation activity of desaturases with different regiospecificities makes these enzymes suitable targets for directed evolution. Furthermore, the observation of low-level hydroxylation activity at C-12 for the Δ12 desaturase from A. thaliana, C-9 for the Δ9 desaturase of S. cerevisiae, C-5 for the Δ5 desaturase of B. subtilis, and C-15 for the ω-3 linoleate desaturase from flax suggests that these positions are the sites of initial oxidation of the desaturation reactions, corroborating the kinetic isotope effect studies of Buist and coworkers (Buist, P. H., and Behrouzian, B. (1998) J. Am. Chem. Soc. 120: 871-876; Buist, P. H., and Behrouzian, B. (1996) J. Am. Chem. Soc. 118: 6295-6296; Fauconnot, L., and Buist, P. H. (2001) Bioorg. Med. Lett. 11: 2879-2881; and Savile, C. K. et al. (2001) J. Chem. Soc., Perkin Trans. 1 9: 1116-1121).
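The correspondence between residue 324 and bases 970-972, and the single-base nature of the ATG to GTG change, can be checked with a short sketch. The function names below are illustrative and are not part of the original disclosure; positions are assumed to be 1-based and counted from the first base of the start codon.

```python
# Illustrative helpers (hypothetical names, not from the source document).

def codon_bases(residue_number):
    """Return the 1-based (first, last) base positions of a residue's codon,
    assuming numbering starts at the first base of the coding sequence."""
    first = 3 * (residue_number - 1) + 1
    return first, first + 2


def base_changes(codon_a, codon_b):
    """Count the positions at which two codons of equal length differ."""
    if len(codon_a) != len(codon_b):
        raise ValueError("codons must have the same length")
    return sum(1 for a, b in zip(codon_a, codon_b) if a != b)


# Residue 324 of AtFAD2 and the Met (ATG) to Val (GTG) change cited in the text.
m324_positions = codon_bases(324)
m324v_changes = base_changes("ATG", "GTG")
```

Under these assumptions, residue 324 maps to bases 970-972 and the Met to Val change requires one base substitution, in agreement with the text.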
There are many examples of seed oils that contain hydroxy fatty acids other than ricinoleic acid or its derivatives (Smith, C. R. J. (1970) in Progress in the Chemistry of Fats and other Lipids (Holman, R. T., ed) Vol. 11, pp. 139-177). Given the possible inherent hydroxylation function of a diverse group of fatty acid desaturases, from mammals, insects, fungi, bacteria, and plants, the evolution of fatty acid hydroxylases of varying regiospecificities from these ancestral desaturases certainly seems plausible.
Studies by Morris (Morris, L. J. (1967) Biochem. Biophys. Res. Comm. 29(3): 311-315) and Buist (Buist, P. H., and Behrouzian, B. (1998) J. Am. Chem. Soc. 120: 871-876; and Behrouzian, B. et al. (2001) Eur. J. Biochem. 268: 3545-3549) have shown that the Δ12 desaturases and hydroxylases are mechanistically similar; both enzymes specifically remove the pro-R hydrogen from C-12 of oleate en route to product formation. The Lesquerella enzyme must also share this same stereospecificity as the seed oil-derived lesquerolic acid retains the same optical rotation properties as ricinoleic acid derived from castor seed oil (Smith, C. R., Jr. et al. (1961) J. Org. Chem. 26: 2903-2905). The knowledge that these enzymes are highly homologous (van de Loo, F. J. et al. (1995) Proc. Natl. Acad. Sci. USA 92(15): 6743-6747; and Broun, P. et al. (1997) Plant J. 13: 201-210) and that the enzymes catalyze both desaturation and hydroxylation, just with differing product ratios, implies that these enzymes employ closely-related catalytic mechanisms.
The stereospecificity of the variant FAD2 enzymes was determined to gain insight into the cause of bifunctional behavior. Analysis of enzymatically-derived products obtained from yeast cultures expressing active AtFAD2, LFAH, AtFAD2 with 4 amino acid residues incorporated from CFAH (C4M), and AtFAD2 with isoleucine substituted at amino acid position 324 (FAD2-M324I) revealed comparable retention of the 12(S)-hydrogen atom. Thus, the variant FAD2 oleate hydroxylases C4M and M324I retain the stereospecificity of the wild-type desaturase and hydroxylase enzymes. The retention of stereospecificity throughout the oleate desaturases and hydroxylases, including both wild-type and variant enzymes, is consistent with tight control of substrate binding conformation. However, the ability to form distinct products from one enzyme indicates that some flexibility of substrate binding modes, whether static or dynamic, may still exist. The subtlety of the changes necessary to alter the reaction outcome presented here (for example Met to Ile at position 324), as well as the large number of different changes that can affect reaction outcome, is consistent with a hypothesis that minor alterations in the geometry of the active site can explain the change in function (Broun, P. et al. (1998) Science 282(5392): 1315-1317). Recent studies have shown that the chemical nature of the substrate can also influence reaction partitioning of binuclear iron hydroxylases (Jin, Y., and Lipscomb, J. D. (2001) J. Biol. Inorg. Chem. 6: 717-725; Kim, C. et al. (1997) J. Am. Chem. Soc. 119: 3635-3636; and Gherman, B. F. et al. (2001) J. Am. Chem. Soc. 123: 3836-3837). Taken together, these studies imply that the default activity of an activated binuclear iron center towards an unactivated hydrocarbon substrate is hydroxylation.
The catalytic function of a binuclear iron center might be changed to desaturation through alteration of the chemical nature of the substrate (effected by intermediate stabilization, as for example, a chemical substituent, like a fluorine instead of a hydrogen) or by substrate presentation to the oxidant. Substrate presentation could be affected by substitution of amino acids adjacent to the active site; for example, such substitutions might then allow the substrate to approach the oxidant closer, or keep the substrate farther away from the oxidant, due to the physical size difference between amino acid side chains.
While hydrocarbon hydroxylases are capable of controlling the substrate orientation to some degree, small substrate size may also favor hydroxylation (Pikus, J. D. et al. (1997) Biochemistry 36(31): 9283-9289). The unactivated nature of the desaturase substrates suggests that these enzymes do not use intramolecular intermediate stabilization as a means of achieving desaturation. Perhaps the large size of the fatty acid substrate of the soluble and membrane-bound desaturases would permit these enzymes the control, as mediated through extensive protein-substrate interactions, necessary to avoid hydroxylation and instead catalyze desaturation. In fact, the ability of the LFAH enzyme to link substrate identity (16:1Δ9 vs. 18:1Δ9) to functional outcome, as described above where the desaturation to hydroxylation ratio for LFAH is 0.37 for 18:1Δ9 and 3.5 for 16:1Δ9 (as illustrated in Table II), which is approximately a 10-fold difference, further supports the notion that presentation of the substrate to the oxidant is a critical factor in specifying hydroxylation or desaturation.
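The approximately 10-fold figure follows directly from the two desaturation to hydroxylation ratios quoted above. A minimal sketch of the arithmetic is given here; the function and constant names are illustrative only, and the two ratios are the values stated in the text for LFAH.

```python
# Illustrative arithmetic check (hypothetical names, values taken from the text).

LFAH_RATIO_18_1 = 0.37  # desaturation:hydroxylation with 18:1 delta-9 substrate
LFAH_RATIO_16_1 = 3.5   # desaturation:hydroxylation with 16:1 delta-9 substrate


def fold_difference(ratio_a, ratio_b):
    """Return how many times larger the greater ratio is than the smaller one."""
    if ratio_a <= 0 or ratio_b <= 0:
        raise ValueError("ratios must be positive")
    return max(ratio_a, ratio_b) / min(ratio_a, ratio_b)


lfah_fold = fold_difference(LFAH_RATIO_18_1, LFAH_RATIO_16_1)
```

The computed value is a little under 9.5, which the text rounds to approximately 10-fold.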
V. Method and System for Examining Enzyme Variants
The present invention further provides a yeast system in which to evaluate the effects of modified hydroxylases and other modified lipid synthetic enzymes. This system avoids the problems associated with utilizing transgenic plants as the means for assessing the effects of modified enzymes. The use of transgenic plants is time-consuming and gives variable results; such variability is thought to be due in part to insertional positional effects. Moreover, many unusual lipid products are toxic to host plants; plants may also be degrading such products. The yeast system of the present invention is an alternative to the use of transgenic plants for the evaluation of modified lipid-synthesizing enzymes; the production of transgenic yeast is quick, and yeast are relatively easy to grow, able to tolerate unusual fatty acid products, and possess a background such that the generation of even very low amounts of the enzyme product can be detected.
In one embodiment, the present invention provides a system comprising yeast S. cerevisiae strain YPH499 transformed with a nucleic acid sequence encoding a modified Lesquerella hydroxylase enzyme under control of an inducible promoter, where the yeast strain is grown to a high density of about an OD600 of 2.5 and induced at a high temperature of about 30 °C.
In other embodiments, the present invention provides methods with which to evaluate the effects of modified hydroxylases and other modified lipid synthetic enzymes. In some embodiments, a nucleic acid sequence encoding a modified Lesquerella hydroxylase is cloned into an expression vector under control of an inducible promoter and expressed in yeast YPH499 strain. Expression of the modified hydroxylase is induced at higher cell densities, of an OD600 of about 2.5, and at higher temperatures of about 30 °C. This allows accumulation of relatively high amounts of hydroxy-fatty acids, thus enhancing evaluation of the effects of the modified hydroxylase. In one embodiment, the modified Lesquerella hydroxylase coding sequence is cloned into the pYes-II expression vector behind a GAL-1 promoter as previously described (Broun, P. et al. (1998) Science 282(5392): 1315-1317).
EXAMPLES
The following examples are provided in order to demonstrate and further illustrate certain preferred embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof.
In the experimental disclosure which follows, the following abbreviations apply: N (normal); M (molar); mM (millimolar); μM (micromolar); mol (moles); mmol (millimoles); μmol (micromoles); nmol (nanomoles); pmol (picomoles); g (grams); mg (milligrams); μg (micrograms); ng (nanograms); l or L (liters); ml (milliliters); μl (microliters); cm (centimeters); mm (millimeters); μm (micrometers); nm (nanometers); °C (degrees Centigrade); PCR (polymerase chain reaction); RT-PCR (reverse-transcriptase-PCR); TAIL-PCR (thermal asymmetric interlaced-PCR); FAD2, oleate Δ12 desaturase; LFAH, Lesquerella fendleri oleate Δ12 hydroxylase/desaturase; CFAH, Ricinus communis oleate Δ12 hydroxylase; GC/MS, gas chromatography/mass spectrometry; L4M, A. thaliana FAD2 with four substitutions from LFAH (A104G/T148N/S322A/M324I); C4M, A. thaliana FAD2 with four substitutions from CFAH (A104G/T148I/S322A/M324V); BSTFA-TMCS, N,O-Bis(trimethylsilyl)trifluoroacetamide.
EXAMPLE 1
Materials and Methods
Site-directed mutagenesis. Mutations were introduced into the A. thaliana oleate-12 desaturase gene (fad2) through the use of overlap extension polymerase chain reaction as previously described (Broun, P. et al. (1998) Science 282(5392): 1315-1317). The oligonucleotides (Life Technologies, Gaithersburg, Maryland) used were D5', D3', mD2f, mD2r, mD4f, mD4r, mD5f, mD5r, mD6f, and mD6r (Broun, P. et al. (1998) Science 282(5392): 1315-1317) as well as mDC1f gacatcattatatcctcatgcttctact, mDC1r agtagaagcatgaggatataatgatgtc, mDC3f caccattccaacattggatccctcgaa, mDC3r ttcgagggatccaatgttggaatggtg, mDC7f cacctgttctcgacagtgccgcattataacgc, mDC7r gcgttataatgcggcactgtcgagaacaggtg, mDC67f cacctgttcgcgacagtgccgcattataacgc, and mDC67r gcgttataatgcggcactgtcgcgaacaggtg. Mutations were introduced into the coding sequence for Lesquerella fendleri oleate-12 hydroxylase (LFAH) using the same overlap extension strategy with the following oligonucleotide pairs: for N149I, LesHF, gatcaagcttatgggtgctggtggaagaataatg, and Les1R, ctcgagagatcctatgttggaatggtg, and Les1F, caccattccaacataggatctctcgag, and LesER, gatcgaattctcataacttattgttgtaatagta; for N149T, LesHF and Les2R, ctcgagagatcctgtgttggaatggtg, and Les2F, caccattccaacacaggatctctcgag, and LesER. The bold and underlined letters indicate altered nucleotides and codons, respectively.
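In overlap extension PCR, the internal mutagenic primers are typically designed as complementary pairs so that the two first-round products overlap at the mutation site. The following sketch, which is offered only as an illustration and is not part of the original methods, checks that property for several of the primer pairs listed above. The function name and dictionary layout are assumptions; the sequences are copied from the text.

```python
# Illustrative check that forward/reverse mutagenic primers are reverse complements.
# Names such as reverse_complement and PRIMER_PAIRS are hypothetical.

COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# (forward, reverse) sequences as given in the text.
PRIMER_PAIRS = {
    "mDC1": ("gacatcattatatcctcatgcttctact", "agtagaagcatgaggatataatgatgtc"),
    "mDC3": ("caccattccaacattggatccctcgaa", "ttcgagggatccaatgttggaatggtg"),
    "mDC7": ("cacctgttctcgacagtgccgcattataacgc", "gcgttataatgcggcactgtcgagaacaggtg"),
    "Les1": ("caccattccaacataggatctctcgag", "ctcgagagatcctatgttggaatggtg"),
}

# Map each pair name to True when the reverse primer is the exact
# reverse complement of the forward primer.
pair_is_complementary = {
    name: reverse_complement(fwd) == rev
    for name, (fwd, rev) in PRIMER_PAIRS.items()
}
```

Each of the pairs checked here is fully complementary, as expected for primers that define the overlap region.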
Expression of variants in Arabidopsis. The fad2 mutants were cloned into the SacI/XmaI sites of the Agrobacterium binary vector DATNAP, a derivative of pRLMlO (Datla, R. S. S. et al. (1992) Gene 122: 383-384), to direct seed-specific expression from the napin promoter (Kridl, J. C. et al. (1991) Seed Science Res. 1: 209-219). The vectors were introduced into Agrobacterium tumefaciens strain GV3101 pMP90 by electroporation and used to transform Arabidopsis thaliana FAD2-deficient plants (Okuley, J. et al. (1994) Plant Cell 6(1): 147-158) by the floral dip method (Clough, S. J., and Bent, A. F. (1998) Plant J. 16: 735-743).
Yeast expression conditions. Genes were cloned into pYes-II for expression in S. cerevisiae YPH499 (ATCC, Manassas, Virginia) or INVSc1 (Invitrogen, Carlsbad, California) using HindIII/EcoRI sites for the Lesquerella mutants or Acc65I/EcoRI sites for the fad2 mutants. Upon transformation of the yeast with a lithium acetate method (Agatep, R. et al. (1998) Technical Tips Online), cultures were initially grown on SC-URA media (yeast synthetic complete media devoid of uracil, Sigma) supplemented with 1% casamino acids and 2% glucose. Once cells reached an optical density (600 nm) of ~2, the cells were washed with glucose-free media, resuspended in SC-URA containing 1% casamino acids and 2% galactose and grown at 30 °C for 48 h. For stereochemistry studies, [12-2H1](R)-stearoyl methyl ester, [12-2H1](S)-stearoyl methyl ester, or [12-2H1](S)-oleoyl methyl ester (the generous gift of Chris Savile and Dr. Peter Buist, Carleton University, Ottawa) was added as an ethanolic solution to a sterile glass tube. When using the labeled methyl stearate, cerulenin (10 μg, Sigma, St. Louis, MO) and myristic acid (20 μg) were added to the tubes and the ethanolic solvent was removed by evaporation; we found that the presence of ethanol (0.5-1%) substantially decreased hydroxy fatty acid accumulation. SC-URA media (1 ml) containing 1% casamino acids, 2% glucose (or galactose for induction) and 0.5% Tergitol NP-40 was added to the dry tubes prior to inoculation.
Fatty acid analysis. Seeds were methylated (1 ml of 1 N HCl-methanol, Supelco, 80 °C for 1 h), extracted with hexane, and trimethylsilylated (100 μl of BSTFA-TMCS, Supelco, 90 °C for 45 min). The BSTFA-TMCS was removed by evaporation and the sample resuspended in hexane. Yeast pellets were dried by a nitrogen stream prior to methylation and when fatty acids were added during growth, cell pellets were washed with 1% Tergitol and then water before drying.
Samples were analyzed on a Hewlett-Packard 6890 gas chromatograph equipped with a 5973 mass selective detector (GC/MS) and a J&W DB-23 capillary column (60 m x 250 μm x 0.25 μm). The injector was held at 225 °C, the oven temperature was varied (100 to 160 °C at 25 °C/min, then 10 °C/min to 240 °C) and a helium flow of 1.1 ml/min was maintained.
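The duration of the oven temperature ramps can be estimated from the stated program. The sketch below is illustrative only: it assumes no isothermal hold times, since none are given in the text, and the names used are hypothetical.

```python
# Illustrative estimate of GC oven ramp time (hypothetical names).
# Assumption: no initial, intermediate, or final hold times, because the
# text specifies only the ramps.

# Each segment: (start temperature in C, end temperature in C, rate in C/min)
OVEN_PROGRAM = [
    (100.0, 160.0, 25.0),
    (160.0, 240.0, 10.0),
]


def ramp_minutes(program):
    """Return the total time in minutes spent ramping through all segments."""
    total = 0.0
    for start, end, rate in program:
        if rate <= 0:
            raise ValueError("ramp rate must be positive")
        total += abs(end - start) / rate
    return total


total_ramp_time = ramp_minutes(OVEN_PROGRAM)
```

With these values the first ramp takes 2.4 min and the second 8 min, for about 10.4 min of ramp time in total; any hold periods used in practice would add to this figure.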
EXAMPLE 2
Analysis of Variant fad2 Genes.
As illustrated in Table I, residues 63, 104, 148, 217, 295, 322, and 324 of the Arabidopsis thaliana oleate desaturase (AtFAD2) differ from the corresponding residues found in the closely related Lesquerella fendleri oleate hydroxylase (LFAH) and Ricinus communis (castor) oleate hydroxylase (CFAH).
Table I. Amino acid comparison of residues that differ between the oleate desaturases and hydroxylases.
Residue   63      104      148      217      295      322      324
FAD2      Ala63   Ala104   Thr148   Tyr217   Ala295   Ser322   Met324
LFAH      Val63   Gly105   Asn149   Phe218   Val296   Ala323   Ile325
CFAH      Ser67   Gly108   Ile152   Phe221   Val299   Ala326   Val328
Amino acid residue numbering is based on the A. thaliana FAD2 sequence; the corresponding residue and residue position for each enzyme is included in the table. LFAH and CFAH represent the Lesquerella fendleri oleate hydroxylase and the castor oleate hydroxylase, respectively. The FAD2 consensus sequence is conserved among the plant FAD2 sequences available in Genbank. Bold-faced and underlined residue numbers are located within five residues of one of the three His clusters that have been proposed to coordinate the nonheme iron active site.
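The residue identities in Table I can be compared programmatically to find the positions at which the two hydroxylases differ from each other. The following is a minimal sketch; the dictionary names are illustrative, the keys use A. thaliana FAD2 numbering, and the residue identities are as read from Table I.

```python
# Illustrative comparison of Table I entries (hypothetical names).
# Keys are A. thaliana FAD2 residue positions; values are three-letter codes.

FAD2 = {63: "Ala", 104: "Ala", 148: "Thr", 217: "Tyr", 295: "Ala", 322: "Ser", 324: "Met"}
LFAH = {63: "Val", 104: "Gly", 148: "Asn", 217: "Phe", 295: "Val", 322: "Ala", 324: "Ile"}
CFAH = {63: "Ser", 104: "Gly", 148: "Ile", 217: "Phe", 295: "Val", 322: "Ala", 324: "Val"}


def differing_positions(enzyme_a, enzyme_b):
    """Return the sorted positions at which two enzymes carry different residues."""
    return sorted(pos for pos in enzyme_a if enzyme_a[pos] != enzyme_b[pos])


hydroxylase_differences = differing_positions(LFAH, CFAH)
desaturase_vs_lfah = differing_positions(FAD2, LFAH)
```

On this reading of the table, the two hydroxylases differ from each other at positions 63, 148, and 324, while FAD2 differs from LFAH at all seven listed positions.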
The replacement of all seven (L7M) or a subset of four (104, 148, 322 and 324, L4M) residues of AtFAD2 with the corresponding residues from LFAH gives rise to variant AtFAD2 enzymes that catalyze both desaturation and hydroxylation (Broun, P. et al. (1998) Science 282(5392): 1315-1317). Because the residues corresponding to 63, 148, and 324 of the Lesquerella and castor oleate hydroxylases (A. thaliana FAD2 numbering) differ from each other, variant fad2 genes were also constructed incorporating either seven or four of these substitutions using the corresponding residues of the castor oleate hydroxylase. In an effort to dissect the contribution of individual amino acids in determining product distribution, a set of variant fad2 genes containing one, two, or three amino acid substitutions was also constructed. The variant fad2 genes were introduced into Arabidopsis thaliana FAD2-deficient plants by Agrobacterium-mediated transformation. Trimethylsilylated fatty acid methyl esters from seeds harvested from T2 plants (heterozygous) were analyzed by gas chromatography/mass spectrometry in order to determine the overall fatty acid composition. The results are shown in Figure 1. All AtFAD2 variants expressed in Arabidopsis gave detectable amounts of hydroxy fatty acids, ranging from 0.03-22%. Furthermore, similar distribution patterns of the hydroxy fatty acids (18:1-OH>18:2-OH>20:1-OH»20:2-OH) were observed in all plants. Interestingly, the wild-type AtFAD2 enzyme also produced detectable levels of ricinoleic acid (average of 0.03% of total fatty acids). Because ricinoleic acid was detected in seeds of both wild-type and FAD2-deficient plants transformed with fad2 but not in untransformed A. thaliana FAD2-deficient plants, the hydroxylation activity must be associated with FAD2.
Analysis of the transgenic plants expressing AtFAD2 variants that incorporated substitutions corresponding to their equivalent residues from the castor hydroxylase enzyme revealed different phenotypes than were observed with the corresponding Lesquerella enzyme substitutions. The introduction of four amino acid substitutions (A104G/T148I/S322A/M324V, or C4M) generated an enzyme that produced levels of hydroxy fatty acids (up to 22%) similar to those previously obtained upon expression of the wild-type CFAH in Arabidopsis FAD2-deficient plants (also under control of the napin promoter) (Broun, P., and Somerville, C. (1997) Plant Physiol. 113: 933-942). A large range of hydroxy fatty acid product was observed among independent lines transformed with the same AtFAD2 variant, presumably due to differences in the context of the insertion site or the gene copy number. In addition to demonstrating substantial hydroxylase activity, C4M (and C7M) produced roughly equivalent amounts of linoleic acid and ricinoleic acid in transgenic plants (Figure 2). When compared with the results obtained upon expression of L4M (or L7M), where desaturation dominates hydroxylation, it is clear that the high levels of hydroxy fatty acids observed in several lines of A. thaliana FAD2-deficient/C4M plants result from an improved enzyme specificity, and not simply from optimal transgene context.
Most of the AtFAD2 variants, with the exception of some of the plants expressing C4M, retained sufficient desaturase activity to complement the FAD2-deficient background (resulting in about 25-30% linoleic acid). Among the AtFAD2 variants with single amino acid changes, T148I and M324I caused the most dramatic change in phenotype, producing up to 4.2% and 5.4% hydroxy fatty acids, respectively (Figure 1). The AtFAD2 variants A104G, S322A, and T148N (a Lesquerella substitution) produced less than 1% hydroxy fatty acids. Among the double mutants analyzed, C2M.1 (T148I/M324V) and C2M.5 (T148I/S322A) produced higher levels (4.5% and 3.1%) of hydroxy fatty acids than did C2M.2 (A104G/S322A), C2M.3 (A104G/M324V), and C2M.6 (S322A/M324V), which produced less than 0.2%. The triple mutants C3M.1 (A104G/T148I/M324V), C3M.2 (A104G/T148I/S322A), and C3M.4 (T148I/S322A/M324V) generated far more hydroxylated fatty acids (9-16%) than did C3M.3 (A104G/S322A/M324V) (0.7%). Taken together, these data reveal a dominant role of Ile at position 148 in specifying hydroxylation, whereas the dramatic effect of the M324I variant is not additive among the double and triple mutants.
Based on transgenic expression in Arabidopsis, C4M exhibited the lowest desaturation-to-hydroxylation ratio among all AtFAD2 variants at 1.7 (the average was 1.1 in planta, data from Figure 2). C4M is a more specific hydroxylase than is L4M, and the castor hydroxylase is a more specific hydroxylase than is the Lesquerella hydroxylase. This implies that at least some of the specificity determinants of the castor hydroxylase are contained within the C4M residues. Only two of these residues (148 and 324) differ between L4M (Asn-148 and Ile-324) and C4M (Ile-148 and Val-324). Inspection of the product ratios of these four single mutants in yeast (as described in Example 3 below; Figure 3) demonstrates that there is no difference in the product ratios between substituting the castor residue (M324V) and the Lesquerella residue (M324I) at position 324, whereas there is an approximately four-fold difference in the product ratios between substituting the castor residue (T148I) and the Lesquerella residue (T148N) at position 148, suggesting that the T148I mutation is key to determining the product distribution of C4M and of the castor hydroxylase. The results from the single mutants clearly demonstrate that each of the single mutations alone is sufficient to increase the hydroxylation activity of the AtFAD2 desaturase. Whereas T148N, A104G, and S322A resulted in modest changes in the product distribution (2- to 5-fold increase in hydroxylation), T148I, M324I and M324V dramatically altered the product distribution (20- to 90-fold). The T148I, M324I, and M324V mutations affect the accumulation of both desaturated and hydroxylated product. For example, simply substituting Val for Met at residue 324 decreased the amount of desaturation product ~5-fold while increasing the amount of hydroxylation product ~20-fold. Analysis of the double mutants clearly supports the significance of T148I in determining product distribution.
All three double mutants that contain T148I exhibit decreased desaturation and increased hydroxylation (desaturation-to-hydroxylation ratio of 5.2-6.3), whereas those that do not contain T148I display reduced desaturation but very little change in hydroxylation (desaturation-to-hydroxylation ratio of 55-74). Interestingly, the effects of M324V appear to be largely masked when combined with all other mutations; only the T148I/M324V double mutant retains high hydroxylation activity, and this may be attributed to the contribution of T148I. The triple mutants further confirm the importance of T148I in the product distribution, as those triple mutants that include T148I have low desaturation-to-hydroxylation ratios (3.4-5.7) whereas the one that does not contain T148I has a high ratio.
EXAMPLE 3 Yeast as a Complementary Host System for Examining Enzyme Variants
Given the variability observed between independent transgenic plant lines (Figure 1), the amount of time required to generate transgenic plants, and the objective of comparing the specificity of numerous FAD2 variant enzymes, the use of Saccharomyces cerevisiae as a complementary host system was explored. The various enzymes were cloned into the pYes-II expression vector behind a GAL-1 promoter as previously described (Broun, P. et al. (1998) Science 282(5392): 1315-1317). Expression of AtFAD2 and LFAH enzymes was tested in two different yeast strains, INVSc1 and YPH499 (a strain in which Hills et al. had reported accumulation of high levels of 18:2 upon expression of a FAD2 enzyme (Brown, A. P. et al. (1998) J. Am. Oil. Chem. Soc. 75: 77-82)), and at multiple temperatures (15, 22, and 30°C). The highest product accumulation was obtained with the YPH499 strain induced at 30°C (Brown, A. P. et al. (1998) J. Am. Oil. Chem. Soc. 75: 77-82). It was also observed that induction at high starting cell densities led to higher product accumulation; inducing at an OD600 of 2.5 resulted in the accumulation of 30% diene (from FAD2 expression) or 27% hydroxylated fatty acids (from LFAH expression), as compared to 17% and 18%, respectively, when the cultures were induced at an OD600 of 0.2.
Expression of the castor oleate hydroxylase under all conditions resulted in the cessation of yeast growth, and the failure of cultures to accumulate detectable product.
Because ricinoleic acid accumulation of up to ~25% was observed upon expression of LFAH in transgenic yeast, the toxicity upon expression of the castor oleate hydroxylase cannot be attributed to the accumulation of ricinoleic acid. Figure 3 illustrates the results of the expression of parental enzymes, quadruple mutants L4M and C4M, all possible triple (C3M) and double (C2M) mutant combinations using the C4M residues, and all single mutants that contain Lesquerella or castor substitutions. Although all the enzymes acted upon palmitoleate in addition to oleate, only the data from oleate oxidation are shown because the higher specificity towards oleate permitted more precise measurements of the products obtained from oleate oxidation at lower concentrations. As observed upon their transgenic expression in plants, LFAH and AtFAD2 produce both desaturation and hydroxylation products. A recent publication by Smith reported that CFAH also produces linoleic acid when expressed in yeast, albeit to a lesser extent than LFAH (Smith, M. et al. (2000) Biochem. Soc. Trans. 28: 947-950). The collection of variant AtFAD2 enzymes included in Figure 3 contains enzymes exhibiting product ratios intermediate between those of the parental enzymes. The data from yeast expression are consistent with the transgenic plant data, suggesting that the information obtained from yeast expression has predictive value regarding relative activity upon expression in A. thaliana. Because both ricinoleic acid and linoleic acid are end products in yeast, whereas they are further metabolized in A. thaliana, a precise product ratio is obtained more readily from expression in yeast than from expression in A. thaliana.
EXAMPLE 4 Rational Improvement of LFAH
The product distribution variability observed when position 148 of AtFAD2 was changed to the corresponding residue of CFAH or LFAH suggested the importance of the identity of the amino acid at position 148 in determining enzyme specificity. Therefore, attempts were made to alter the specificity of LFAH by substituting the equivalent residue (Asn-149) with either the AtFAD2 (Thr) or CFAH (Ile) residue. The engineered enzymes were then expressed in yeast cells. The wild-type reaction specificity (desaturation to hydroxylation) of LFAH was about 0.37; that is, desaturation activity is roughly one-third of hydroxylation activity. Substitution of Asn-149 with the AtFAD2 residue Thr created an enzyme with a reaction specificity of about 0.95; thus, desaturation activity was increased to about the same level as hydroxylation. Substitution of Asn-149 with the CFAH equivalent (Ile) created an enzyme with a specificity of 0.21; thus, hydroxylation activity increased to about 80% of the total activity.
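The reaction specificities quoted in this example are desaturation-to-hydroxylation ratios, and each can be converted into the share of total activity that goes to hydroxylation. The following is a minimal sketch in Python, assuming that desaturation and hydroxylation are the only two products counted; the function name `hydroxylation_fraction` and the dictionary layout are illustrative and are not part of the original disclosure.

```python
def hydroxylation_fraction(desat_to_hydrox_ratio: float) -> float:
    """Return the hydroxylated share of total (desaturated + hydroxylated) product.

    If D/H = r, then H / (D + H) = 1 / (1 + r).
    Assumption: only these two products contribute to total activity.
    """
    if desat_to_hydrox_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return 1.0 / (1.0 + desat_to_hydrox_ratio)


# Desaturation-to-hydroxylation ratios stated in Example 4 (yeast, oleate substrate).
reported_ratios = {"LFAH": 0.37, "LFAH-N149T": 0.95, "LFAH-N149I": 0.21}

fractions = {name: hydroxylation_fraction(r) for name, r in reported_ratios.items()}

if __name__ == "__main__":
    for name, frac in fractions.items():
        print(f"{name}: {frac:.0%} of total activity is hydroxylation")
```

Under this two-product assumption, a ratio of 0.21 corresponds to a hydroxylation share of roughly 83%, in line with the "about 80%" stated for the Ile substitution, and a ratio of 0.95 corresponds to a near-even split.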
EXAMPLE 5 Catalytic Mechanism of AtFAD2
Chain length affects chemoselectivity.
As previously reported, the oleate desaturase and hydroxylase enzymes oxidize palmitoleic acid in addition to oleic acid. In yeast cells that normally contain approximately equal quantities of the two monounsaturated fatty acids, FAD2 produces roughly 7.5 times as much 18:2 as 16:2 and LFAH roughly 24 times as much 12-OH 18:1 Δ9 as 12-OH 16:1 Δ9.
What is interesting is that the desaturation-to-hydroxylation ratio for LFAH is quite different for the two substrates, as illustrated in Table II; it is 0.37 for 18:1 Δ9 and 3.5 for 16:1 Δ9, approximately a 10-fold difference. While this bias could be attributed to differential metabolism of the hydroxy fatty acids, the fact that both of these ratios follow the same trend within the series of Lesquerella variants (more hydroxylation for LFAH-N149I, less hydroxylation for LFAH-N149T, data in Table II) suggests that the bias is associated with the mechanism of the partitioning between desaturation and hydroxylation.
Table II. Chemoselectivity of LFAH, LFAH-N149T, and LFAH-N149I as a function of the substrate.
Substrate | LFAH        | LFAH-N149T  | LFAH-N149I
18:1 Δ9   | 0.37 (0.05) | 0.95 (0.19) | 0.21 (0.06)
16:1 Δ9   | 3.5 (0.4)   | 5.6 (2.9)   | 1.7 (0.37)
Values represent the ratio of desaturation product to hydroxylation product for the given substrate. Standard deviations derived from three experiments are shown in parentheses.
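The substrate bias discussed above can be checked directly from the mean values in Table II. The sketch below, in Python, computes the fold difference between the 16:1 Δ9 and 18:1 Δ9 ratios for each enzyme and tests whether the three enzymes rank in the same order on both substrates. The names `table_ii`, `substrate_bias`, and `same_trend` are illustrative, and the standard deviations are ignored for simplicity, so this is a rough consistency check rather than a statistical analysis.

```python
# Mean desaturation-to-hydroxylation ratios from Table II (standard deviations omitted).
table_ii = {
    "LFAH":       {"18:1": 0.37, "16:1": 3.5},
    "LFAH-N149T": {"18:1": 0.95, "16:1": 5.6},
    "LFAH-N149I": {"18:1": 0.21, "16:1": 1.7},
}


def substrate_bias(ratios: dict) -> float:
    """Fold difference between the 16:1 ratio and the 18:1 ratio for one enzyme."""
    return ratios["16:1"] / ratios["18:1"]


def same_trend(table: dict, substrates=("18:1", "16:1")) -> bool:
    """True if the enzymes rank in the same order (by ratio) on every substrate."""
    orders = [sorted(table, key=lambda enzyme: table[enzyme][s]) for s in substrates]
    return all(order == orders[0] for order in orders)


bias = {enzyme: substrate_bias(r) for enzyme, r in table_ii.items()}

if __name__ == "__main__":
    for enzyme, fold in bias.items():
        print(f"{enzyme}: 16:1 ratio is {fold:.1f}-fold the 18:1 ratio")
    print("Same ranking on both substrates:", same_trend(table_ii))
```

For wild-type LFAH the fold difference comes out at about 9.5, matching the "approximately 10-fold" figure, and the ranking N149I < wild type < N149T holds for both substrates.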
Stereospecificity of Variant Enzymes.
The catalytic mechanism of the variant AtFAD2 enzymes was investigated through analysis of the oxidation products of stereospecifically labeled stearate and oleate. Yeast cells expressing AtFAD2, LFAH, C4M, or FAD2-M324I were grown and induced in the presence of deuterated stearoyl methyl ester ([12-2H1](S)-18:0 or [12-2H1](R)-18:0); the yeast acyl-CoA Δ9 desaturase desaturated sufficient quantities of the labeled stearate to provide the necessary labeled oleate for enzymatic desaturation/hydroxylation. Cerulenin was added to the cultures to minimize endogenous fatty acid synthesis, thus preventing dilution of the labeled stearate (Awaya, J. et al. (1975) Biochim. Biophys. Acta 409: 267-273). After 48 h of induction, the cellular fatty acids were analyzed by GC/MS to determine the [2H]/[H] ratio (measured as the ratio of the M+1 peak to the M+ peak) of the enzymic products (linoleate and ricinoleate). These values were then corrected to account for the contribution of endogenous unlabeled substrate to the peak intensities. The results are shown in Table III.
Table III. Stereospecificity of FAD2, LFAH, FAD2-C4M, and FAD2-M324I as determined by GC/MS analysis of the products formed when using labeled oleate ([12-2H](R)-18:1Δ9 or [12-2H](S)-18:1Δ9) as substrate.
Enzyme        | FAD2             | FAD2             | LFAH             | LFAH             | LFAH              | C4M               | M324I
Substrate     | [12-2H](R)-18:0* | [12-2H](S)-18:0* | [12-2H](R)-18:0* | [12-2H](S)-18:0* | [12-2H](S)-18:1Δ9 | [12-2H](S)-18:1Δ9 | [12-2H](S)-18:1Δ9
18:2Δ9,12     | 0.08             | 0.77             | 0.06             | 0.84             | 0.77              | 0.66              | 0.67
12-OH 18:1Δ9  | nd               | nd               | 0.03             | 1.08             | 0.63              | 0.73              | 0.54
The values in the table represent the ratio of the M+1 peaks (presence of 2H) to the M+ peaks (loss of 2H) for the enzymatic products and are corrected for the endogenous unlabeled oleate substrate. *Exogenous 18:0 was converted to 18:1Δ9 through the action of endogenous acyl-CoA desaturase.
From the data shown in Table III, it is clear that FAD2 and LFAH specifically remove the [12-2H1](R) hydrogen while they retain the [12-2H1](S) hydrogen; this result is consistent with the known stereochemistry of FAD2 (Buist, P. H., and Behrouzian, B. (1998) J. Am. Chem. Soc. 120: 871-876), LFAH (Smith, C. R. et al. (1961) J. Org. Chem. 26: 2903-2905) and CFAH (Morris, L. J. (1967) Biochem. Biophys. Res. Comm. 29(3): 311-315). While these growth conditions permitted incorporation of high levels of the labeled stearoyl methyl ester, product accumulation was decreased markedly, preventing analysis of the less active AtFAD2 variants. In order to characterize the specificity of the AtFAD2 variants C4M and M324I, yeast cells were grown and induced in the presence of labeled oleate ([12-2H1](S)-18:1Δ9). The addition of this unsaturated fatty acid partially attenuates endogenous unsaturated fatty acid synthesis and thus cerulenin was not required (Bossie, M. A., and Martin, C. E. (1989) J. Bacteriol. 171: 6409-6413). Again, GC/MS was employed to determine the [2H]/[H] ratio of the enzymic products (linoleate and ricinoleate). Furthermore, the [2H]/[H] data for the ricinoleic acid products have been confirmed using matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry (Kinumi, T. et al. (2000) J. Mass Spect. 35: 417-422). The values presented in Table III demonstrate that LFAH, FAD2-C4M, and FAD2-M324I retain the 12(S)-hydrogen upon formation of both 18:2Δ9,12 and 12-OH 18:1Δ9. At this time we cannot formally rule out the possibility that an intramolecular isotope effect could have biased the reaction in favor of the observed results. However, this seems unlikely because the oxidation of [12-2H1](R)-18:1Δ9 and [12-2H1](S)-18:1Δ9 by LFAH gave the expected products. Nonetheless, the available data indicate that the stereospecificity of the AtFAD2 variants is consistent with that of the characterized AtFAD2 (Buist, P. H., and Behrouzian, B. (1998) J. Am. Chem. Soc. 120: 871-876), LFAH (Smith, C. R. et al. (1961) J. Org. Chem. 26: 2903-2905), and CFAH (Morris, L. J. (1967) Biochem. Biophys. Res. Comm. 29(3): 311-315).
EXAMPLE 6 Generality of Bifunctional Activity
The surprising finding that the wild-type A. thaliana oleate desaturase (AtFAD2) had detectable hydroxylase activity prompted an investigation of the generality of bifunctional activity among other membrane-bound fatty acid desaturases. As a first step, the seed oil of a number of plants was examined to determine if ricinoleic acid accumulated as a result of a bifunctional FAD2 enzyme. Arabidopsis oil (0.03%), olive oil (0.01%), linseed oil (0.015%) and soybean oil (0.15%) all contained measurable amounts of ricinoleic acid. The relative amount of ricinoleic acid was directly related to the amount of linoleic acid present in the seed oil, as would be expected if the hydroxy fatty acid was produced by a bifunctional enzyme. In addition to ricinoleic acid, substantial levels of a novel hydroxylated fatty acid were detected in flax seed oil. Using GC/MS detection, this analyte displayed a major ion at 145 m/z, which is consistent with a (CH3)3SiOCH(CH2)2CH3 ion. Smaller fragments at 73 m/z (OSi(CH3)3) and 310 m/z (M-(OSi(CH3)3)) are also present. These fragments are consistent with a description of the analyte as 15-hydroxy linoleate; the major fragment would arise from cleavage adjacent to the carbon bearing the oxygen, between C14 and C15 (Broun, P., and Somerville, C. (1997) Plant Physiol. 113: 933-942). This analyte eluted from the GC column at a time consistent with such an assignment. The retention time of ricinoleate (12-hydroxyoleate) is approximately 0.3 min greater than that of linoleate, whereas the novel analyte eluted ~0.3 min after linolenate. It is contemplated that this fatty acid arises through the action of a bifunctional linoleate desaturase (for example FAD3). The observation of the presence of 15-hydroxylinoleate in flax seed is probably due to the fact that flax accumulates high levels of linolenate. Thus, the activity of this enzyme, or at least the flux through the linoleate desaturase, is sufficient to allow detectable quantities of this unusual fatty acid to accumulate.
In the next step, the product of S. cerevisiae Δ9 acyl-CoA desaturase was examined; this enzyme shares a histidine sequence motif and predicted membrane topology, but only about 25% sequence identity, with the A. thaliana FAD2 (Shanklin, J. et al. (1994) Biochemistry 33(43): 12787-12794). GC/MS analysis of the trimethylsilyl derivatives of fatty acid methyl esters from wild-type yeast (strains INVSc1 and YPH499) revealed the presence of small quantities (0.2-1% of total fatty acids) of 9-hydroxypalmitate (MS fragment ions at 201 and 259 m/z) and 9-hydroxystearate (MS fragment ions at 229 and 259 m/z). Analysis of the fatty acids from a culture of L814C, a desaturase null strain that is an unsaturated fatty acid auxotroph (Resnick, M. A., and Mortimer, R. K. (1966) J. Bacteriol. 92: 597-600), grown in the presence of palmitoleate and oleate revealed no detectable hydroxy fatty acids. Transformation of this strain with a vector containing the gene for the stearoyl-CoA Δ9 desaturase from rat (Strittmatter, P. et al. (1974) Proc. Natl. Acad. Sci. USA 71(11): 4565-4569) or fruit fly (Drosophila melanogaster CS strain, des 1 gene) (Dallerac, R. et al. (2000) Proc. Natl. Acad. Sci. USA 97: 9449-9454) complemented the unsaturated fatty acid auxotrophy of the strain. In addition to producing the expected palmitoleate and oleate, these enzymes produced detectable levels of 9-hydroxypalmitate and 9-hydroxystearate similar to those found in wild-type yeast. Finally, the Bacillus subtilis Δ5-desaturase (Aguilar, P. S. et al. (1998) J. Bacteriol. 180(8): 2194-2200) was expressed in Escherichia coli BL21(DE3) and found to produce trace quantities of 5-hydroxypalmitate (MS fragment ions at 203 and 257 m/z) in addition to its primary desaturation product, 16:1 Δ5. These fragment ions were not detected from E. coli BL21(DE3) cells containing a pET-3a vector lacking a desaturase. No hydroxy fatty acids were identified in the linoleic acid-producing yeasts (Kaneko, H. et al. (1976) Lipids 11: 837-844) Rhodotorula glutinis or Cryptococcus laurentii, indicating either that the oleate desaturase from these organisms does not have any hydroxylase activity or that these particular yeasts have an efficient mechanism for metabolizing hydroxylated fatty acids.
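The diagnostic fragment ions quoted in this example follow a simple pattern if one assumes that the trimethylsilyl (TMS) ether of a hydroxy fatty acid methyl ester cleaves on either side of the carbon bearing the OTMS group. The Python sketch below computes the two expected nominal m/z values under that assumption, using integer masses (C = 12, H = 1, O = 16, Si = 28). The function name `alpha_cleavage_ions` and its parameters are hypothetical, introduced here only for illustration; the calculation is a back-of-the-envelope check, not the assignment procedure used in the original work.

```python
# Nominal (integer) group masses used in the sketch.
TMS_O = 89    # (CH3)3Si-O
CH = 13       # carbon bearing the OTMS group, with its hydrogen
CH2 = 14      # methylene unit
CH3 = 15      # terminal methyl
COOCH3 = 59   # methyl ester terminus (C1 plus OCH3)


def alpha_cleavage_ions(chain_length: int, hydroxyl_position: int,
                        double_bonds_ester_side: int = 0) -> tuple:
    """Return (methyl_end_ion, ester_end_ion) nominal m/z values.

    Assumptions: cleavage occurs on either side of the OTMS-bearing carbon,
    the methyl-end fragment is saturated, and each double bond on the ester
    side removes two hydrogens from that fragment.
    """
    if not 2 <= hydroxyl_position < chain_length:
        raise ValueError("hydroxyl position must lie within the chain interior")
    tail_carbons = chain_length - hydroxyl_position   # carbons beyond the OTMS carbon
    methyl_end = TMS_O + CH + CH2 * (tail_carbons - 1) + CH3
    ester_end = (COOCH3 + CH2 * (hydroxyl_position - 2) + CH + TMS_O
                 - 2 * double_bonds_ester_side)
    return methyl_end, ester_end


if __name__ == "__main__":
    print("9-hydroxypalmitate:", alpha_cleavage_ions(16, 9))
    print("9-hydroxystearate: ", alpha_cleavage_ions(18, 9))
    print("5-hydroxypalmitate:", alpha_cleavage_ions(16, 5))
    print("15-hydroxylinoleate (methyl-end ion):",
          alpha_cleavage_ions(18, 15, double_bonds_ester_side=2)[0])
```

Under these assumptions the sketch gives 201 and 259 for 9-hydroxypalmitate, 229 and 259 for 9-hydroxystearate, 257 and 203 for 5-hydroxypalmitate, and 145 for the methyl-end ion of 15-hydroxylinoleate, which agrees with the fragment ions reported above.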
All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in material science, chemistry, and molecular biology or related fields are intended to be within the scope of the following claims.

Claims

CLAIMS
What is claimed is:
1. A composition comprising a modified Lesquerella fatty acid hydroxylase polypeptide, comprising a non-native amino acid at position 149, at position 325, or at both positions, wherein of amino acids at positions 63, 105, 149, 218, 296, 323 and 325 no more than three are non-native amino acids, and wherein a reaction specificity of the modified hydroxylase differs from a reaction specificity of an unmodified hydroxylase.
2. A composition of Claim 1, wherein the modified hydroxylase comprises a non-native amino acid at position 149.
3. A composition of Claim 2, wherein the non-native amino acid at position 149 is threonine or isoleucine.
4. A composition of Claim 1, wherein the Lesquerella is Lesquerella fendleri.
5. A composition comprising a modified Lesquerella fatty acid hydroxylase polypeptide comprising an amino acid sequence shown in SEQ ID NO:1, wherein the amino acid sequence is modified to comprise a non-native amino acid at position 149, at position 325, or at both positions, wherein of amino acids at positions 63, 105, 149, 218, 296, 323 and 325 no more than three are non-native amino acids, and wherein a reaction specificity of the modified hydroxylase differs from a reaction specificity of an unmodified hydroxylase.
6. A composition comprising a modified plant fatty acid hydroxylase polypeptide, comprising a non-native amino acid at a position corresponding to position 149 of SEQ ID NO:1, at a position corresponding to position 325 of SEQ ID NO:1, or at both positions, wherein of amino acids at positions corresponding to positions 63, 105, 149, 218, 296, 323 and 325 of SEQ ID NO:1 no more than three are non-native amino acids, and wherein a reaction specificity of the modified hydroxylase differs from a reaction specificity of an unmodified hydroxylase.
7. The composition of Claim 1, wherein a ratio of hydroxylation to desaturation activity of the modified hydroxylase is decreased relative to a ratio of hydroxylation to desaturation activity of an unmodified hydroxylase.
8. The composition of Claim 1, wherein a ratio of hydroxylation to desaturation activity of the modified hydroxylase is increased relative to a ratio of hydroxylation to desaturation activity of an unmodified hydroxylase.
9. A composition comprising a nucleic acid sequence encoding the modified Lesquerella fatty acid hydroxylase of Claim 1.
10. A recombinant DNA molecule comprising the nucleic acid sequence of Claim 9 operably linked to at least one suitable regulatory sequence.
11. An expression vector comprising the recombinant DNA molecule of Claim 10.
12. An organism transformed with the recombinant DNA molecule of Claim 11.
13. The organism of Claim 12, wherein the organism is selected from the group consisting of microorganisms and plants.
14. A plant transformed with the recombinant DNA molecule of Claim 10.
15. The transgenic plant of Claim 14, wherein the plant is selected from the group consisting of soybean (Glycine max), rapeseed and canola (including Brassica napus and B. campestris), sunflower (Helianthus annuus), cotton (Gossypium hirsutum), corn (Zea mays), cocoa (Theobroma cacao), safflower (Carthamus tinctorius), oil palm (Elaeis guineensis), coconut palm (Cocos nucifera), flax (Linum usitatissimum), castor (Ricinus communis) and peanut (Arachis hypogaea).
16. A plant cell transformed with the recombinant DNA molecule of Claim 10.
17. A plant seed transformed with the recombinant DNA molecule of Claim 10.
18. An oil obtained from the plant seed of Claim 17.
19. A yeast cell transformed with the recombinant DNA molecule of Claim 10.
20. The yeast cell of claim 19, wherein the yeast is S. cerevisiae strain YPH499.
21. A bacterial cell transformed with the recombinant DNA molecule of Claim 10.
22. An oil obtained from the bacterial cell of Claim 21.
23. A method of producing a modified hydroxylase in a transgenic organism, comprising: a) providing an organism transformed with the recombinant DNA molecule of Claim 10; b) growing the organism under conditions such that a modified hydroxylase encoded by the recombinant DNA molecule is expressed.
24. The method of Claim 23, wherein the organism is a plant.
25. A method for altering the phenotype of a plant comprising: a) providing: i) the expression vector of Claim 11; and ii) a plant or a plant tissue or a plant cell; b) transfecting the plant or plant tissue or plant cell with the vector under conditions such that the protein is expressed in a plant obtained from the plant or plant tissue or plant cell.
26. A method for evaluating fatty acid desaturation or hydroxylation activity of an enzyme, comprising: a) transforming a yeast cell of S. cerevisiae strain YPH499 with a nucleic acid sequence encoding the enzyme under control of an inducible promoter; b) growing the yeast cell to a culture of cells at high density; and c) inducing expression of the nucleic acid at about 30 degrees centigrade, such that the desaturation or hydroxylation activity of the enzyme can be evaluated.
PCT/US2003/000341 2002-01-14 2003-01-07 Modified fatty acid hydroxylase protein and genes WO2003060092A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2003235589A AU2003235589A1 (en) 2002-01-14 2003-01-07 Modified fatty acid hydroxylase protein and genes

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US34855702P 2002-01-14 2002-01-14
US60/348,557 2002-01-14

Publications (2)

Publication Number Publication Date
WO2003060092A2 true WO2003060092A2 (en) 2003-07-24
WO2003060092A3 WO2003060092A3 (en) 2004-02-19

Family

ID=23368536

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2003/000341 WO2003060092A2 (en) 2002-01-14 2003-01-07 Modified fatty acid hydroxylase protein and genes

Country Status (3)

Country Link
AR (1) AR038137A1 (en)
AU (1) AU2003235589A1 (en)
WO (1) WO2003060092A2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6310194B1 (en) * 1994-09-26 2001-10-30 Carnegie Institution Of Washington Plant fatty acid hydroxylases

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BROUN ET AL.: 'Catalytic plasticity on fatty acid modification enzymes underlying chemical diversity of plant lipids' SCIENCE vol. 282, 13 November 1998, pages 131 - 133, XP002124106 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7939713B2 (en) 2003-02-05 2011-05-10 Divergence, Inc. Method for screening transgenic plants for anthelmintic activity
US7919680B2 (en) 2004-02-04 2011-04-05 Divergence, Inc. Nucleic acids encoding anthelmintic agents and plants made therefrom
US7932435B2 (en) 2004-02-04 2011-04-26 Divergence, Inc. Method of screening transgenic plants for anthelmintic activity
EP2182062A2 (en) * 2004-08-04 2010-05-05 Divergence, Inc. Codon-optimized epoxygenases / hydroxylases
EP2182062A3 (en) * 2004-08-04 2010-10-06 Divergence, Inc. Codon-optimized epoxygenases / hydroxylases
CN110684806A (en) * 2012-09-07 2020-01-14 美国陶氏益农公司 FAD2 performance loci and corresponding target site specific binding proteins capable of inducing targeted breaks
CN108753803A (en) * 2018-07-16 2018-11-06 河北省农林科学院粮油作物研究所 A kind of high oleic acid peanut mutator AhFAD2B-814 and application
CN108753803B (en) * 2018-07-16 2019-04-23 河北省农林科学院粮油作物研究所 A kind of high oleic acid peanut mutated gene AhFAD2B-814 and application

Also Published As

Publication number Publication date
AU2003235589A1 (en) 2003-07-30
AR038137A1 (en) 2004-12-29
WO2003060092A3 (en) 2004-02-19
AU2003235589A8 (en) 2003-07-30

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PH PL PT RO RU SD SE SG SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP