TITLE
GLYCEROL-3-PHOSPHATE O-ACYLTRANSFERASE PROMOTER FOR GENE EXPRESSION IN OLEAGINOUS YEAST
This application claims the benefit of U.S. Provisional Application No. 60/610060, filed September 15, 2004.
FIELD OF THE INVENTION
This invention is in the field of biotechnology. More specifically, this invention pertains to a promoter region isolated from Yarrowia lipolytica that is useful for gene expression in oleaginous yeast.
BACKGROUND OF THE INVENTION
Oleaginous yeast are defined as those organisms that are naturally capable of oil synthesis and accumulation, wherein oil accumulation ranges from at least about 25% up to about 80% of the cellular dry weight. Genera typically identified as oleaginous yeast include, but are not limited to: Yarrowia, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Trichosporon and Lipomyces. More specifically, illustrative oil-synthesizing yeast include: Rhodosporidium toruloides, Lipomyces starkeyii, L. lipoferus, Candida revkaufi, C. pulcherrima, C. tropicalis, C. utilis, Trichosporon pullans, T. cutaneum, Rhodotorula glutinis, R. graminis and Yarrowia lipolytica (formerly classified as Candida lipolytica).
The technology for growing oleaginous yeast with high oil content is well developed (for example, see EP 0 005 277B1; Ratledge, C., Prog. Ind. Microbiol. 16:119-206 (1982)), and these organisms have been used commercially for a variety of purposes in the past. For example, various strains of Yarrowia lipolytica have historically been used for the manufacture and production of: isocitrate lyase; lipases; polyhydroxyalkanoates; citric acid; erythritol; 2-oxoglutaric acid; γ-decalactone; γ-dodecalactone; and pyruvic acid. More recently, however, the natural abilities of oleaginous yeast have been enhanced by advances in genetic engineering, resulting in organisms capable of producing polyunsaturated fatty acids ("PUFAs"). Specifically, Picataggio et al. have demonstrated that Y. lipolytica can be engineered for production of ω-3 and ω-6 fatty acids, by introducing and expressing genes encoding the ω-3/ω-6 biosynthetic pathway (see WO 2004/101757).
Recombinant production of any heterologous protein is generally accomplished by constructing an expression cassette in which the DNA coding for the protein of interest is placed under the control of appropriate regulatory sequences (i.e., promoters) suitable for the host cell. The expression cassette is then introduced into the host cell (usually by plasmid-mediated transformation or targeted integration into the host genome) and production of the heterologous protein is achieved by culturing the transformed host cell under conditions necessary for the proper function of the promoter contained within the expression cassette. Thus, the development of new host cells (e.g., oleaginous yeast) for recombinant production of proteins generally requires the availability of promoters that are suitable for controlling the expression of a protein of interest in the host cell.
A variety of strong promoters have been isolated from Yarrowia lipolytica that are useful for heterologous gene expression in yeast. For example, U.S. 4,937,189 and EP220864 (Davidow et al.) disclose the sequence of the XPR2 gene (which encodes an inducible alkaline extracellular protease) and upstream promoter region for use in expression of heterologous proteins. U.S. 6,265,185 (Muller et al.) describes promoters for the translation elongation factor EF1-α (TEF) protein and ribosomal protein S7 that are suitable for expression cloning in yeast and heterologous expression of proteins. These promoters were improved relative to the XPR2 promoter, when tested for yeast promoter activity on growth plates (Example 9, U.S. 6,265,185) and based on their activity in the pH range of 4-11. WO 2005/003310 and commonly owned co-pending U.S. Patent Application No. 11/183664 describe regulatory sequences (e.g., promoters, introns) of the glyceraldehyde-3-phosphate dehydrogenase (gpd) and phosphoglycerate mutase (gpm) genes; and WO 2005/049805 describes regulatory sequences (e.g., promoters, introns) of the fructose-bisphosphate aldolase (fba) gene. Similarly, Juretzek et al. (Biotech. Bioprocess Eng., 5:320-326 (2000)) compare the glycerol-3-phosphate dehydrogenase (G3P), isocitrate lyase (ICL1), 3-oxo-acyl-CoA thiolase (POT1) and acyl-CoA oxidase (POX1, POX2 and POX5) promoters with respect to their regulation and activities during growth on different carbon sources.
Despite the utility of these known promoters, however, there is a need for new improved yeast promoters for metabolic engineering of yeast (oleaginous and non-oleaginous) and for controlling the expression of heterologous genes in yeast. Furthermore, possession of a suite of promoters that are regulatable under a variety of natural growth and induction conditions in yeast will play an important role in industrial settings, wherein it is desirable to express heterologous polypeptides in commercial quantities in said hosts for economical production of those polypeptides. Thus, it is an object of the present invention to provide such promoters that will be useful for gene expression in a variety of yeast cultures, and preferably in Yarrowia sp. cultures and other oleaginous yeast.
Applicants have solved the stated problem by identifying the gene (gpat) encoding glycerol-3-phosphate O-acyltransferase (GPAT) from Yarrowia lipolytica and the promoter responsible for driving expression of this native gene. The promoter is useful for expression of heterologous genes in Yarrowia and has improved activity with respect to the TEF promoter.
SUMMARY OF THE INVENTION
The present invention relates to the isolation of a gene encoding a glycerol-3-phosphate O-acyltransferase (GPAT) enzyme from Yarrowia and methods for the expression of a coding region of interest in a transformed yeast, using a promoter of the glycerol-3-phosphate O-acyltransferase (gpat) gene.
Accordingly the invention provides a method for the expression of a coding region of interest in a transformed yeast comprising:
a) providing a transformed yeast having a chimeric gene comprising:
(i) a promoter region of a Yarrowia gpat gene; and,
(ii) a coding region of interest expressible in the yeast;
wherein the promoter region is operably linked to the coding region of interest; and,
b) growing the transformed yeast of step (a) under conditions whereby the chimeric gene of step (a) is expressed.
In similar fashion the invention provides mutant gpat promoter regions having enhanced promoter activity relative to the wild type promoter.
In a preferred embodiment the invention provides a method for the production of an ω-3 or an ω-6 fatty acid comprising:
a) providing a transformed oleaginous yeast comprising a chimeric gene, comprising:
(i) a promoter region of a Yarrowia gpat gene; and,
(ii) a coding region encoding at least one enzyme of the ω-3/ω-6 fatty acid biosynthetic pathway;
wherein the promoter region and coding region are operably linked;
b) culturing the transformed oleaginous yeast of step (a) under conditions whereby the at least one enzyme of the ω-3/ω-6 fatty acid biosynthetic pathway is expressed and an ω-3 or ω-6 fatty acid is produced; and,
c) optionally recovering the ω-3 or ω-6 fatty acid;
wherein the preferred ω-3/ω-6 fatty acid biosynthetic pathway enzymes include, but are not limited to: Δ9 desaturase, Δ12 desaturase, Δ6 desaturase, Δ5 desaturase, Δ17 desaturase, Δ15 desaturase, Δ8 desaturase and Δ4 desaturase; and wherein the preferred ω-3 or ω-6 fatty acid includes, but is not limited to: linoleic acid, α-linolenic acid, γ-linolenic acid, stearidonic acid, dihomo-γ-linoleic acid, eicosatetraenoic acid, arachidonic acid, eicosapentaenoic acid, docosapentaenoic acid, docosahexaenoic acid, eicosadienoic acid and eicosatrienoic acid.
Additionally the invention provides an isolated nucleic acid molecule comprising a gpat promoter selected from the group consisting of SEQ ID NOs:13 and 17.
BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE DESCRIPTIONS
Figure 1 graphically represents the relationship between SEQ ID NOs:3, 4, 13, 14 and 17, each of which relates to glycerol-3-phosphate O-acyltransferase (GPAT) in Y. lipolytica.
Figure 2 provides plasmid maps for the following: (A) pY5-30; (B) pDMW214; and (C) pYGPAT-GUS.
Figure 3A diagrams the development of Y. lipolytica ATCC #20362 derivative pDMW236-#18. Figure 3B provides a plasmid map for pKUNF12T6E; and Figure 3C provides a plasmid map for pDMW236.
Figure 4 illustrates the relative promoter activities of TEF, GPAT and FBAIN in Y. lipolytica as determined by histochemical staining.
Figure 5 provides plasmid maps for the following: (A) pY5-13; (B) pY25-d12d; and (C) pZGP6B.
Figure 6 illustrates the ω-3/ω-6 fatty acid biosynthetic pathway.
The invention can be more fully understood from the following detailed description and the accompanying sequence descriptions, which form a part of this application.
The following sequences comply with 37 C.F.R. §1.821-1.825 ("Requirements for Patent Applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures - the Sequence Rules") and are consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (1998) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the Administrative Instructions). The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.
SEQ ID NOs:1-4, 14-17, 20-30, 36, 45, 48-49 and 56 correspond to ORFs (i.e., encoding genes or proteins), promoters, terminators and plasmids, as identified in Table 1.
Table 1 Summary Of Nucleotide And Protein SEQ ID Numbers
SEQ ID NOs:5 and 6 correspond to the degenerate primers YGPAT-F and YGPAT-R, respectively, used for amplifying the Yarrowia lipolytica ORF YALI-CDS 1055.1.
SEQ ID NOs:7 and 8 correspond to the Genome Walker adaptor used to isolate the GPAT promoter region by genome-walking.
SEQ ID NOs:9-12 correspond to the PCR primers used in genome-walking: Adaptor-1, YGPAT-5R-1, Nested Adaptor Primer 2 and YGPAT-5R-2, respectively.
SEQ ID NO:13 corresponds to a 1781 bp fragment contained within plasmid pEcoRV-G-5, the fragment containing a 1678 bp region upstream of the translation initiation codon 'ATG' of the gpat gene (wherein the 'A' position of the 'ATG' translation initiation codon is designated as +1).
SEQ ID NOs:18 and 19 correspond to primers GPAT-5-1 and GPAT-5-2, respectively, used to amplify "GPATPro".
SEQ ID NOs:31-33 correspond to BD-Clontech Creator Smart® cDNA library kit primers SMART IV oligonucleotide, CDSIII/3' PCR primer and 5'-PCR primer, respectively.
SEQ ID NOs:34 and 35 correspond to primers YL421 and YL422, respectively, used to amplify the Mortierella alpina Δ6B desaturase.
SEQ ID NOs:37-42 correspond to primers YL475, YL476, YL477, YL478, YL479 and YL480, respectively, used for in vitro mutagenesis within the M. alpina Δ6B desaturase.
SEQ ID NOs:43 and 44 correspond to primers YL497 and YL498, respectively, used to amplify GPATPro.
SEQ ID NOs:46 and 47 correspond to primers YL259 and YL260, respectively, used to amplify the Yarrowia Pex20 terminator.
SEQ ID NOs:50 and 51 correspond to primers P147 and P148, used to amplify the Y. lipolytica Δ12 desaturase.
SEQ ID NOs:52-55 correspond to primers YL242, YL243, YL226 and YL227, respectively, used for site-directed mutagenesis during generation of plasmid pY25-d12d-PS.
DETAILED DESCRIPTION OF THE INVENTION
All patents, patent applications, and publications cited herein are incorporated by reference in their entirety, including, but not limited to, the following commonly owned copending applications: U.S. Patent Application No. 10/840478 (filed May 6, 2004), U.S. Patent Application No. 10/840579 (filed May 6, 2004), U.S. Patent Application No. 10/840325 (filed May 6, 2004), U.S. Patent Application No. 10/869630 (filed June 16, 2004), U.S. Patent Application No. 10/987548 (filed November 12, 2004), U.S. Patent Application No. 60/624812 (filed November 4, 2004) and U.S. Patent Application No. 11/183664 (filed July 18, 2005).
Applicants describe herein the isolation and characterization of a promoter and gene from an oleaginous yeast, Yarrowia lipolytica. This promoter region, isolated upstream of the glycerol-3-phosphate O-acyltransferase (gpat) gene, is useful for genetic engineering in Y. lipolytica and other yeasts for the production of heterologous polypeptides.
Preferred heterologous polypeptides of the present invention are those that are involved in the synthesis of microbial oils and particularly polyunsaturated fatty acids (PUFAs). PUFAs, or derivatives thereof, made by the methodology disclosed herein can be used in many applications. For example, the PUFAs can be used as dietary substitutes, or supplements, particularly infant formulas, for patients undergoing intravenous feeding or for preventing or treating malnutrition. Alternatively, the purified PUFAs (or derivatives thereof) may be incorporated into cooking oils, fats or margarines formulated so that in normal use the recipient would receive the desired amount for dietary supplementation. The PUFAs may also be incorporated into infant formulas, nutritional supplements or other food products, and may find use as anti-inflammatory or cholesterol lowering agents. Optionally, the compositions may be used for pharmaceutical use (human or veterinary). In this case, the PUFAs are generally administered orally but can be administered by any route by which they may be successfully absorbed, e.g., parenterally (e.g., subcutaneously, intramuscularly or intravenously), rectally, vaginally or topically (e.g., as a skin ointment or lotion).
Thus, the present invention advances the art by providing methods for the expression of a coding region of interest in a transformed yeast comprising: a) providing a transformed yeast having a chimeric gene comprising (i) a promoter region of a Yarrowia gpat gene; and, (ii) a coding region of interest expressible in the yeast, wherein the promoter region is operably linked to the coding region of interest; b) growing the transformed yeast of step (a) under conditions whereby the chimeric gene is expressed; and, c) optionally isolating the gene product from the cultivation medium. In preferred embodiments, the GPAT promoter region comprises a sequence selected from the group consisting of SEQ ID NOs:13 and 17.
Definitions
In this disclosure, a number of terms and abbreviations are used. The following definitions are provided.
"Glycerol-3-phosphate O-acyltransferase" is abbreviated GPAT. "Open reading frame" is abbreviated ORF. "Polymerase chain reaction" is abbreviated PCR.
"Polyunsaturated fatty acid(s)" is abbreviated PUFA(s).
The term "oleaginous" refers to those organisms that tend to store their energy source in the form of lipid (Weete, In: Fungal Lipid Biochemistry, 2nd Ed., Plenum, 1980). Generally, the cellular PUFA content of these microorganisms follows a sigmoid curve, wherein the concentration of lipid increases until it reaches a maximum at the late logarithmic or early stationary growth phase and then gradually decreases during the late stationary and death phases (Yongmanitchai and Ward, Appl. Environ. Microbiol. 57:419-25 (1991)).
The term "oleaginous yeast" refers to those microorganisms classified as yeast that can accumulate at least 25% of their dry cell weight as oil. Examples of oleaginous yeast include (but are by no means limited to) the following genera: Yarrowia, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Trichosporon and Lipomyces.
The term "GPAT" refers to a glycerol-3-phosphate O-acyltransferase enzyme (E.C. 2.3.1.15) encoded by the gpat gene and which converts acyl-CoA and sn-glycerol 3-phosphate to CoA and 1-acyl-sn-glycerol 3-phosphate (the first step of phospholipid biosynthesis). Two representative gpat genes from Saccharomyces cerevisiae are GenBank Accession No. AJ314608 (Gat2(SCT1); SEQ ID NO:1) and GenBank Accession No. AJ311354 (Gat1; SEQ ID NO:2) (Zheng, Z. and J. Zou. J. Biol. Chem. 276(45):41710-41716 (2001)). A gpat gene isolated from Yarrowia lipolytica is provided as SEQ ID NO:3, while the corresponding amino acid sequence is provided as SEQ ID NO:4.
The term "GPAT promoter" or "GPAT promoter region" refers to the 5' upstream untranslated region in front of the 'ATG' translation initiation codon of gpat and that is necessary for expression. Examples of suitable GPAT promoter regions are provided as SEQ ID NOs: 13 and 17, but these are not intended to be limiting in nature. One skilled in the art will recognize that since the exact boundaries of the GPAT promoter sequence have not been completely defined, DNA fragments of increased or diminished length may have identical promoter activity.
The term "GPD" refers to a glyceraldehyde-3-phosphate dehydrogenase enzyme (E.C. 1.2.1.12) encoded by the gpd gene and which converts D-glyceraldehyde 3-phosphate to 3-phospho-D-glyceroyl phosphate during glycolysis. The term "GPD promoter" or "GPD promoter region" refers to the 5' upstream untranslated region in front of the 'ATG' translation initiation codon of gpd and that is necessary for expression. Examples of suitable Yarrowia lipolytica GPD promoter regions are described in U.S. Patent Application No. 10/869630.
The term "GPM" refers to a phosphoglycerate mutase enzyme (E.C. 5.4.2.1) encoded by the gpm gene and which is responsible for the interconversion of 3-phosphoglycerate and 2-phosphoglycerate during glycolysis. The term "GPM promoter" or "GPM promoter region" refers to the 5' upstream untranslated region in front of the 'ATG' translation initiation codon of gpm and that is necessary for expression. Examples of suitable Yarrowia lipolytica GPM promoter regions are described in U.S. Patent Application No. 10/869630.
The term "FBA1" refers to a fructose-bisphosphate aldolase enzyme (E.C. 4.1.2.13) encoded by the fba1 gene and which converts D-fructose 1,6-bisphosphate into glycerone phosphate and D-glyceraldehyde 3-phosphate. The term "FBA promoter" or "FBA promoter region" refers to the 5' upstream untranslated region in front of the 'ATG' translation initiation codon of fba1 and that is necessary for expression. An example of a suitable FBA promoter region is provided as SEQ ID NO:25, but this is not intended to be limiting in nature (see WO 2005/049805).
The term "FBAIN promoter" or "FBAIN promoter region" refers to the 5' upstream untranslated region in front of the 'ATG' translation initiation codon of fba1 and that is necessary for expression, plus a portion of 5' coding region comprising an intron of the fba1 gene. An example of a suitable FBAIN promoter region is provided as SEQ ID NO:16, but this is not intended to be limiting in nature (see WO 2005/049805).
The term "promoter activity" will refer to an assessment of the transcriptional efficiency of a promoter. This may, for instance, be determined directly by measurement of the amount of mRNA transcription from the promoter (e.g., by Northern blotting or primer extension methods) or indirectly by measuring the amount of gene product expressed from the promoter.
As used herein, an "isolated nucleic acid fragment" is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.
A nucleic acid molecule is "hybridizable" to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA molecule, when a single-stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory: Cold Spring Harbor, NY (1989), particularly Chapter 11 and Table 11.1 therein (entirely incorporated herein by reference). The conditions of temperature and ionic strength determine the "stringency" of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments (such as homologous sequences from distantly related organisms), to highly similar fragments (such as genes that duplicate functional enzymes from closely related organisms). Post-hybridization washes determine stringency conditions. One set of preferred conditions uses a series of washes starting with 6X SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2X SSC, 0.5% SDS at 45 °C for 30 min, and then repeated twice with 0.2X SSC, 0.5% SDS at 50 °C for 30 min. A more preferred set of stringent conditions uses higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2X SSC, 0.5% SDS is increased to 60 °C. Another preferred set of highly stringent conditions uses two final washes in 0.1X SSC, 0.1% SDS at 65 °C. An additional set of stringent conditions includes hybridization at 0.1X SSC, 0.1% SDS, 65 °C and washes with 2X SSC, 0.1% SDS followed by 0.1X SSC, 0.1% SDS, for example. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible.
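The first wash series described above can be tabulated to show how ionic strength falls as temperature rises. The following is a minimal sketch, not part of the disclosed method. It assumes the conventional composition of 1X SSC (0.15 M NaCl plus 0.015 M trisodium citrate) and takes room temperature as 25 °C; the names NA_PER_1X_SSC, WASH_SERIES, sodium_molarity and describe are hypothetical.

```python
# Assumption: 1X SSC is 0.15 M NaCl plus 0.015 M trisodium citrate,
# so each 1X contributes about 0.195 M Na+ (citrate carries three Na+).
NA_PER_1X_SSC = 0.15 + 3 * 0.015


def sodium_molarity(ssc_fold):
    """Approximate Na+ molarity of a wash buffer at the given SSC fold."""
    return ssc_fold * NA_PER_1X_SSC


# (label, SSC fold, % SDS, temperature in deg C, minutes), from the text.
# Room temperature is taken as 25 deg C, which is an assumption.
WASH_SERIES = [
    ("initial wash", 6.0, 0.5, 25, 15),
    ("second wash", 2.0, 0.5, 45, 30),
    ("final washes (performed twice)", 0.2, 0.5, 50, 30),
]


def describe(series):
    """Return (label, approximate Na+ molarity, temperature) per wash."""
    return [
        (label, round(sodium_molarity(fold), 3), temp)
        for label, fold, _sds, temp, _minutes in series
    ]


if __name__ == "__main__":
    for label, na, temp in describe(WASH_SERIES):
        print(f"{label}: about {na} M Na+ at {temp} C")
```

Each successive wash lowers the salt concentration and raises the temperature, which is what makes the later washes more stringent.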
The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementarity, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). In one embodiment the length for a hybridizable nucleic acid is at least about 10 nucleotides. Preferably a minimum length for a hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least about 20 nucleotides; and most preferably the length is at least about 30 nucleotides. Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.
A "substantial portion" of an amino acid or nucleotide sequence is that portion comprising enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to putatively identify that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., J. Mol. Biol. 215:403-410 (1990)). In general, a sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to identify putatively a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene-specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers.
Accordingly, a "substantial portion" of a nucleotide sequence comprises enough of the sequence to specifically identify and/or isolate a nucleic acid fragment comprising the sequence. The instant specification teaches partial or complete amino acid and nucleotide sequences encoding one or more particular microbial proteins and promoters. The skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in this art. Accordingly, the instant invention comprises the complete sequences as reported in the accompanying Sequence Listing, as well as substantial portions of those sequences as defined above.
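For the Tm relationship noted earlier for hybrids of greater than 100 nucleotides, the following sketch evaluates one widely quoted empirical approximation for DNA:DNA hybrids. The specification only cites Sambrook et al. and does not reproduce an equation, so the choice of this particular formula is an assumption, and the function name tm_long_hybrid and its parameters are hypothetical.

```python
import math


def tm_long_hybrid(na_molar, gc_percent, length_nt, formamide_percent=0.0):
    """Estimate Tm (deg C) of a long DNA:DNA hybrid.

    Assumed empirical form (not stated in the source text):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.63*(%formamide) - 600/L
    """
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    if length_nt <= 0:
        raise ValueError("hybrid length must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length_nt
    )


if __name__ == "__main__":
    # Illustrative inputs only: a 600 nt hybrid with 50% G+C.
    for na in (1.0, 0.195, 0.0195):
        print(f"[Na+] = {na} M: Tm is roughly {tm_long_hybrid(na, 50, 600):.1f} C")
```

Under this approximation, lowering the salt concentration of a wash lowers the estimated Tm, so a fixed wash temperature becomes more stringent as SSC is diluted.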
The term "oligonucleotide" refers to a nucleic acid, generally of at least 18 nucleotides, that is hybridizable to a genomic DNA molecule, a cDNA molecule, or an mRNA molecule. In one embodiment, a labeled oligonucleotide can be used as a "probe" to detect the presence of a nucleic acid according to the invention. Thus, the term "probe" refers to a single-stranded nucleic acid molecule that can base pair with a complementary single-stranded target nucleic acid to form a double- stranded molecule. The term "label" will refer to any conventional molecule which can be readily attached to mRNA or DNA and which can produce a detectable signal, the intensity of which indicates the relative amount of hybridization of the labeled probe to the DNA fragment.
The term "complementary" is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the instant invention also includes isolated nucleic acid fragments that are complementary to the complete sequences as reported in the accompanying Sequence Listing, as well as those substantially similar nucleic acid sequences.
The term "percent identity", as known in the art, is a relationship between two or more polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the sequences. In the art, "identity" also means the degree of sequence relatedness between polypeptide or polynucleotide sequences, as the case may be, as determined by the match between strings of such sequences. "Identity" and "similarity" can be readily calculated by known methods, including but not limited to those described in: 1.) Computational Molecular Biology (Lesk, A. M., Ed.) Oxford University: NY (1988); 2.) Biocomputing: Informatics and Genome Projects (Smith, D. W., Ed.) Academic: NY (1993); 3.) Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., Eds.) Humana: NJ (1994); 4.) Sequence Analysis in Molecular Biology (von Heijne, G., Ed.) Academic (1987); and 5.) Sequence Analysis Primer (Gribskov, M. and Devereux, J., Eds.) Stockton: NY (1991). Preferred methods to determine identity are designed to give the best match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. Sequence alignments and percent identity calculations may be performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, WI). Multiple alignment of the sequences is performed using the Clustal method of alignment (Higgins and Sharp, CABIOS. 5:151-153 (1989)) with default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method are: KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5.
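To make the notion of percent identity concrete, the sketch below counts matching positions in two sequences that have already been aligned. It does not perform the Clustal alignment and is not the Megalign calculation; the function name percent_identity is hypothetical, and the convention used here (a gap opposite a residue counts as a mismatch, while columns that are gaps in both sequences are skipped) is an assumption, since programs differ in how they choose the denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a gap facing a
    residue is counted as a compared, non-matching column (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, for illustration only.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 2))
```

A threshold such as "at least about 70% identical" can then be checked by comparing the returned value against 70.0, bearing in mind that the result depends on the alignment supplied.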
Suitable nucleic acid fragments (isolated polynucleotides of the present invention) encode polypeptides that are at least about 70% identical, preferably at least about 75% identical, and more preferably at least about 80% identical to the amino acid sequences reported herein. Preferred nucleic acid fragments encode amino acid sequences that are about 85% identical to the amino acid sequences reported herein. More preferred nucleic acid fragments encode amino acid sequences that are at least about 90% identical to the amino acid sequences reported herein. Most preferred are nucleic acid fragments that encode amino acid sequences that are at least about 95% identical to the amino acid sequences reported herein. Suitable nucleic acid fragments not only have the above homologies but typically encode a polypeptide having at least 50 amino acids, preferably at least 100 amino acids, more preferably at least 150 amino acids, still more preferably at least 200 amino acids, and most preferably at least 250 amino acids.
Likewise, suitable promoter regions (isolated polynucleotides of the present invention) are at least about 70% identical, preferably at least about 75% identical, and more preferably at least about 80% identical to the nucleotide sequences reported herein. Preferred nucleic acid fragments are about 85% identical to the nucleotide sequences reported herein, more preferred nucleic acid fragments are at least about 90% identical, and most preferred are nucleic acid fragments at least about 95% identical to the nucleotide sequences reported herein. Suitable promoter regions not only have the above homologies but typically are at least 50 nucleotides in length, more preferably at least 100 nucleotides in length, more preferably at least 250 nucleotides in length, and more preferably at least 500 nucleotides in length.
"Codon degeneracy" refers to the nature in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment that encodes all or a substantial portion of the amino acid sequence encoding the instant microbial polypeptide as set forth in SEQ ID NO:4. The skilled artisan is well aware of the "codon-bias" exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
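The idea of host codon bias can be illustrated by tallying how often each synonymous codon is used across a set of coding sequences from the host. The sketch below is illustrative only: the function names codon_counts and relative_usage are hypothetical, the input sequence is a toy example rather than Yarrowia data, and only the four alanine codons of the standard genetic code are examined.

```python
from collections import Counter

# The four alanine codons of the standard genetic code.
ALANINE_CODONS = ("GCT", "GCC", "GCA", "GCG")


def codon_counts(coding_sequences):
    """Count codons across in-frame coding sequences (length divisible by 3)."""
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper()
        if len(seq) % 3 != 0:
            raise ValueError("coding sequence length must be a multiple of 3")
        counts.update(seq[i:i + 3] for i in range(0, len(seq), 3))
    return counts


def relative_usage(counts, synonymous_codons):
    """Fraction of use of each codon within one synonymous family."""
    total = sum(counts[codon] for codon in synonymous_codons)
    if total == 0:
        return {codon: 0.0 for codon in synonymous_codons}
    return {codon: counts[codon] / total for codon in synonymous_codons}


if __name__ == "__main__":
    # Toy sequence for illustration only: ATG GCC GCC GCT TAA
    usage = relative_usage(codon_counts(["ATGGCCGCCGCTTAA"]), ALANINE_CODONS)
    for codon, fraction in usage.items():
        print(codon, round(fraction, 2))
```

In practice the survey would cover many host genes, and a synthetic gene would be designed so that its codon choices for each amino acid approach the fractions observed in the host.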
"Chemically synthesized", as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well-established procedures; or, automated chemical synthesis can be performed using one of a number of commercially available machines. "Synthetic genes" can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments that are then enzymatically assembled to construct the entire gene. Accordingly, the genes can be tailored for optimal gene expression based on optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell, where sequence information is available.
"Gene" refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences. "Chimeric gene" refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. Chimeric genes of the present invention will typically comprise a GPAT promoter region operably linked to a coding region of interest. "Endogenous gene" refers to a native gene in its natural location in the genome of an organism.
A "foreign" gene refers to a gene not normally
found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A "transgene" is a gene that has been introduced into the genome by a transformation procedure. A "codon-optimized gene" is a gene having its frequency of codon usage designed to mimic the frequency of preferred codon usage of the host cell.
"Coding sequence" refers to a DNA sequence that codes for a specific amino acid sequence.
"Suitable regulatory sequences" refer to transcriptional and translational nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (31 non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, polyadenylation recognition sequences, RNA processing sites, effector binding sites and stem-loop structures.
"Promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3" to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters that cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.
The term "mutant promoter" is defined herein as a promoter having a nucleotide sequence comprising a substitution, deletion, and/or insertion of one or more nucleotides relative to the parent promoter, wherein the mutant promoter has more or less promoter activity than the corresponding parent promoter. The term "mutant promoter" will encompass natural variants and in vitro generated variants obtained using methods well known in the art (e.g., classical mutagenesis, site-directed mutagenesis and "DNA shuffling").
The term "3' non-coding sequences" or "transcription terminator" refers to DNA sequences located downstream of a coding sequence. This includes polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. The 3' region can influence the transcription, RNA processing or stability, or translation of the associated coding sequence.
"RNA transcript" refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript or it may be a RNA sequence derived from post-transcriptional processing of the primary transcript and is referred to as the mature RNA. "Messenger RNA" or "mRNA" refers to the RNA that is without introns and that can be translated into protein by the cell. "cDNA" refers to a double-stranded DNA that is complementary to, and derived from, mRNA. "Sense" RNA refers to RNA transcript that includes the mRNA and so can be translated into protein by the cell.
"Antisense RNA" refers to a RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (U.S. 5,107,065; WO 99/28508). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that is not translated and yet has an effect on cellular processes.
The term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.
The term "expression", as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from a coding sequence. Expression may also refer to translation of mRNA into a polypeptide.
"Introns" are sequences of non-coding DNA found in gene sequences (either in the coding region, 5' non-coding region, or 3' non-coding region) in most eukaryotes. Their full function is not known; however, some enhancers are located in the introns (Giacopelli F. et al., Gene Expr. 11:95-104 (2003)). These intron sequences are transcribed, but removed from within the pre-mRNA transcript before the mRNA is translated into a protein. This process of intron removal occurs by self-splicing of the sequences (exons) on either side of the intron.
"Transformation" refers to the transfer of a nucleic acid molecule into a host organism, resulting in genetically stable inheritance. The nucleic acid molecule may be a plasmid that replicates autonomously, for example; or, it may integrate into the genome of the host organism. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" or "recombinant" or "transformed" organisms.
The terms "plasmid", "vector" and "cassette" refer to an extrachromosomal element often carrying genes that are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA fragments. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3' untranslated sequence into a cell. "Expression cassette" refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.
The term "sequence analysis software" refers to any computer algorithm or software program that is useful for the analysis of nucleotide or amino acid sequences. "Sequence analysis software" may be commercially available or independently developed. Typical sequence analysis software will include, but is not limited to: 1.) the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, WI); 2.) BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol. Biol. 215:403-410 (1990)); 3.) DNASTAR (DNASTAR, Inc., Madison, WI); and 4.) the FASTA program incorporating the Smith-Waterman algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Suhai, Sandor, Ed. Plenum: New York, NY). Within the context of this application it will be understood that where sequence analysis software is used for analysis, that the results of the analysis will be based on the "default values" of the program referenced, unless otherwise specified. As used herein "default values" will mean any set of values or parameters that originally load with the software when first initialized.
Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory: Cold Spring Harbor, NY (1989) (hereinafter "Maniatis"); by Silhavy, T. J., Berman, M. L. and Enquist, L. W., Experiments with Gene Fusions, Cold Spring Harbor Laboratory: Cold Spring Harbor, NY (1984); and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience (1987).
Identification Of GPAT In Yarrowia lipolytica
GPAT encodes a glycerol-3-phosphate O-acyltransferase that is responsible for carrying out the initial step of phospholipid biosynthesis in yeast. Specifically, the enzyme acylates glycerol 3-phosphate (G-3-P) and dihydroxyacetone phosphate at the sn-1 position; this permits formation of lysophosphatidic acid (LPA) and acyl-dihydroxyacetone (acyl-DHAP). LPA acyltransferase then catalyzes the acylation of LPA at the sn-2 position to generate phosphatidic acid, which serves as a general precursor for all glycerophospholipids (e.g., triacylglycerols). Although it was widely accepted that multiple isoforms of G-3-P acyltransferases were present in yeast, it was not until the work of Zheng and Zou (J. Biol. Chem. 276(45):41710-41716 (2001); WO 02/08391 A2) that two genes encoding this enzyme were identified and sequenced from Saccharomyces cerevisiae (i.e., Gat1 [GenBank Accession No. AJ311354] and Gat2(SCT1) [GenBank Accession No. AJ314608]).
The present invention identifies the complete nucleotide sequence encoding a Yarrowia lipolytica glycerol-3-phosphate O-acyltransferase (GPAT) contained within ORF YALI-CDS1055.1 (Genolevures project, sponsored by the Center for Bioinformatics, LaBRI, batiment A30, Universite Bordeaux 1, 351, cours de la Liberation, 33405 Talence Cedex, France; see also GenBank Accession No. CAG81570). The amino acid sequence of this ORF was publicly available prior to the Applicants' invention and the ORF was annotated as having similarity to the Saccharomyces cerevisiae GPATs. Based on sequence comparison to the S. cerevisiae Gat1 and Gat2(SCT1) genes (supra), the Applicants hypothesized that ORF YALI-CDS1055.1 likely encoded the Y. lipolytica GPAT. Subsequently, sequencing of the Y. lipolytica ORF confirmed the Applicants' deduction and permitted annotation of the gene within ORF YALI-CDS1055.1 (also GenBank Accession No. CAG81570) as the Y. lipolytica GPAT.
As expected, comparison of the gpat nucleotide base (SEQ ID NO:3) and deduced amino acid sequence to the Genolevures database of Y. lipolytica ORFs reveals that the amino acid sequence of gpat reported herein over a length of 727 amino acids has 100% identity to the Y. lipolytica ORF identified as YALI-CDS1055.1 (SEQ ID NO:4).
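As a rough illustration of what a percent-identity figure such as the one above expresses, the following minimal sketch counts matching positions between two already-aligned sequences of equal length. The function name and the short toy peptides are hypothetical and are not taken from SEQ ID NO:4; the sequence analysis software named in this application performs the alignment itself and may compute identity differently.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal (aligned) length")
    # Count positions at which both sequences carry the same residue.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Toy peptides used only for illustration (not real GPAT sequence).
print(percent_identity("MKTAY", "MKTAY"))
print(percent_identity("MKTAY", "MKSAY"))
```

Under this simplified definition, two identical sequences score 100%, and each mismatched position lowers the score by 100 divided by the alignment length.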
Identification Of The GPAT Promoter Region In Yarrowia lipolytica
Although numerous studies have examined GPAT and its ability to affect triacylglyceride and phospholipid synthesis (e.g., WO 00/78974 A2; WO 02/08391 A2; Mishra, S. and Kamisaka, Y. Biochem. J. 355:315-322 (2001)), few have investigated the GPAT promoter. One exception is the work of Jerkins et al. (J. Biol. Chem. 270(3):1416-1421 (1995)), wherein the murine mitochondrial GPAT promoter (GenBank Accession No. U11680) was characterized.
In the present invention, it was desirable to identify the putative promoter region that naturally regulates gpat in the oleaginous yeast, Yarrowia lipolytica, following isolation of the gene encoding GPAT. And, based on the work described herein, this putative promoter region has been identified as useful for driving expression of any suitable coding region of interest in a transformed yeast.
In general, a promoter useful in an oleaginous yeast should meet the following criteria:
1.) Strength. A strong yeast promoter is a necessary premise for a high expression level, and the low copy number of the ars18 (Fournier, P. et al., Yeast 7:25-36 (1991)) based expression vectors or chimeric genes integrated into the genome makes this demand even more important when Y. lipolytica is used as the host organism.
2.) Activity in a medium suitable for expression of the coding region of interest, and high enzymatic activity of that coding region of interest.
3.) pH Tolerance. If the coding region of interest is known to be produced only in e.g., an acidic environment, then the promoter operably linked to said coding region of interest must function at the appropriate pH. pH tolerance is of course limited by the tolerance of the host organism.
4.) Inducibility. A tightly regulated yeast promoter makes it possible to separate the growth stage from the expression stage, thereby enabling expression of products that are known to inhibit cell growth.
5.) Activity in the stationary phase of growth in oleaginous yeast hosts for accumulation of PUFAs.
Additionally, it is preferable for novel yeast promoters to possess differences in activity with respect to the known Y. lipolytica TEF (U.S. 6,265,185), XPR2 (U.S. 4,937,189; EP220864; EP832258), GPD (WO 2005/003310), GPDIN (U.S. Patent Application No. 11/183664), GPM (WO 2005/003310), FBA (WO 2005/049805) and FBAIN (WO 2005/049805) promoters and/or the G3P, ICL1, POT1, POX1, POX2 and POX5 promoters (Juretzek et al., Biotech. Bioprocess Eng., 5:320-326 (2000)). A comparative study of the TEF and FBAIN promoters and the GPAT promoter of the instant invention is provided in Example 7. It is shown that the yeast promoter of the present invention has improved activity compared to the TEF promoter, and diminished activity with respect to FBAIN.
An example of a suitable GPAT promoter region is provided as SEQ ID NO:17 (comprising the -1130 to -1 region of the Y. lipolytica gpat gene (wherein the 'A' position of the 'ATG' translation initiation codon is designated as +1)), but this is not intended to be limiting in nature. One skilled in the art will recognize that since the exact boundaries of the GPAT promoter sequence have not been completely defined, DNA fragments of increased or diminished length may have identical promoter activity. For example, in an alternate embodiment, the GPAT promoter will comprise nucleotides -500 to -1 of SEQ ID NO:17, thereby permitting relatively strong promoter activity; in another embodiment, the -100 to -1 region of SEQ ID NO:17 should be sufficient for basal activity of the promoter. Likewise, the promoter region of the invention may comprise additional nucleotides to those specified above. For example, the promoter sequences of the invention may be constructed on the basis of the -1678 to +1 region of the gpat gene (based on SEQ ID NO:13).
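The coordinate convention used above (the 'A' of the 'ATG' is +1, so the nucleotide immediately upstream is -1) can be illustrated with a short sketch. It assumes a hypothetical string holding only upstream sequence whose last character is position -1; the function name and the toy sequence are illustrative and do not reproduce SEQ ID NO:17.

```python
def upstream_region(upstream, start, end=-1):
    """Return the nucleotides from position `start` to `end` (inclusive).

    `upstream` is assumed to end at position -1, i.e. immediately before
    the 'A' of the 'ATG' start codon (+1). Positions are negative integers.
    """
    n = len(upstream)
    if not (-n <= start <= end <= -1):
        raise ValueError("positions must satisfy -len(upstream) <= start <= end <= -1")
    # Position p (negative) corresponds to string index n + p.
    return upstream[n + start:n + end + 1]


# Toy 12-nt upstream sequence used only for illustration.
toy_upstream = "ACGTACGTACGT"
print(upstream_region(toy_upstream, -4))      # the 4 nt immediately upstream of the ATG
print(upstream_region(toy_upstream, -6, -3))  # an internal window of 4 nt
```

With a real 1130-nt upstream sequence, a call such as `upstream_region(seq, -500)` would return the 500 nucleotides closest to the start codon, corresponding to the -500 to -1 embodiment described above.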
In alternate embodiments mutant promoters may be constructed, wherein the DNA sequence of the promoter has one or more nucleotide modifications (i.e., deletions, insertions, substitutions, or addition of one or more nucleotides in the sequence) which do not affect (in particular, do not impair) the yeast promoter activity. Regions that can be modified without significantly affecting the yeast promoter activity can be identified by deletion studies. A mutant promoter of the present invention has a promoter activity that is at least about 20%, preferably at least about 40%, more preferably at least about 60%, more preferably at least about 80%, more preferably at least about 90%, more preferably at least about 100%, more preferably at least about 200%, more preferably at least about 300% and most preferably at least about 400% greater than the promoter activity of the wildtype GPAT promoter region described herein as SEQ ID NO:17.
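One way to read the percentage thresholds above is as a percent change in measured activity relative to the wildtype promoter assayed under similar conditions. The sketch below assumes that interpretation and assumes activity is available as a single positive number (for example, a reporter reading); the function name and the values are hypothetical.

```python
def percent_change_vs_wild_type(mutant_activity, wild_type_activity):
    """Percent by which mutant promoter activity exceeds (or falls below) wild type."""
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    return 100.0 * (mutant_activity - wild_type_activity) / wild_type_activity


# Made-up reporter readings in arbitrary units, for illustration only.
print(percent_change_vs_wild_type(150.0, 100.0))  # mutant above wild type
print(percent_change_vs_wild_type(80.0, 100.0))   # mutant below wild type
```

A positive result indicates greater activity than the wildtype promoter; a negative result indicates diminished activity.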
Methods for mutagenesis are well known in the art and suitable for the generation of mutant promoters. For example, in vitro mutagenesis and selection, PCR based random mutagenesis, site-directed mutagenesis or other means can be employed to obtain mutations of the naturally occurring promoter or gene of the instant invention (wherein such mutations may include deletions, insertions and point mutations, or combinations thereof). This would permit production of a putative promoter having a more desirable level of promoter activity in the host cell. Or, if desired, the regions of a nucleotide of interest important for promoter activity can be determined through routine mutagenesis, expression of the resulting mutant promoters and determination of their activities. An overview of these techniques is described in WO 2005/003310. All such mutant promoters that are derived from the instant GPAT promoter described herein are within the scope of the present invention.
Promoter activity is typically measured against the activity of the wild type promoter under similar conditions. Promoter activity is generally measured as a function of gene expression and may be determined in a variety of ways including gene expression profiling, measurement of the level of protein expression by SDS gel or other means, or the measurement of reporter activity where reporter gene fusions have been created.
Isolation Of Homologs of the GPAT Putative Promoter Region
It will be appreciated by a person of skill in the art that the promoter regions and gene of the present invention have homologs in a variety of yeast species; and, the use of the promoters and genes for heterologous gene expression is not limited to those promoters and genes derived from Y. lipolytica, but extends to homologs in other yeast species. For example, the invention encompasses homologs derived from oleaginous genera including, but not limited to: Yarrowia, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Trichosporon and Lipomyces; examples of preferred species within these genera include: Rhodosporidium toruloides, Lipomyces starkeyii, L. lipoferus, Candida revkaufi, C. pulcherrima, C. tropicalis, C. utilis, Trichosporon pullans, T. cutaneum, Rhodotorula glutinus and R. graminis.
Homology typically is measured using sequence analysis software, wherein the term "sequence analysis software" refers to any computer algorithm or software program (commercially available or independently developed) that is useful for the analysis of nucleotide or amino acid sequences. In general, such computer software matches similar sequences by assigning degrees of homology to various substitutions, deletions and other modifications.
As is well known in the art, isolation of homologous promoter regions or genes using sequence-dependent protocols is readily possible using various techniques; and, these techniques can rely on either the direct identification of a promoter having homology to the GPAT promoter of the invention or the indirect identification of a promoter by initial identification of a gene having significant homology to the gpat gene and then analysis of the 5' upstream sequence of the homologous gene. Examples of sequence-dependent protocols include, but are not limited to: 1.) methods of nucleic acid hybridization; 2.) methods of DNA and RNA amplification, as exemplified by various uses of nucleic acid amplification technologies [e.g., polymerase chain reaction (PCR), Mullis et al., U.S. Patent 4,683,202; ligase chain reaction (LCR), Tabor, S. et al., Proc. Natl. Acad. Sci. USA 82:1074 (1985); or strand displacement amplification (SDA), Walker, et al., Proc. Natl. Acad. Sci. U.S.A., 89:392 (1992)]; and 3.) methods of library construction and screening by complementation.
For example, putative promoter regions or genes encoding similar proteins or polypeptides to those of the instant invention could be isolated by using all or a portion of the instant nucleic acid fragments as DNA hybridization probes to screen libraries from any desired microbe using methodology well known to those skilled in the art. Specific oligonucleotide probes based upon the instant nucleic acid sequences can be designed and synthesized by methods known in the art (Sambrook, supra). Moreover, the entire sequences can be used directly to synthesize DNA probes by methods known to the skilled artisan (e.g., random primers DNA labeling, nick translation, or end-labeling techniques), or RNA probes using available in vitro transcription systems. In addition, specific primers can be designed and used to amplify a part of (or full-length of) the instant sequences. The resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full-length DNA fragments under conditions of appropriate stringency.
Typically, in PCR-type amplification techniques, the primers have different sequences and are not complementary to each other. Depending on the desired test conditions, the sequences of the primers should be designed to provide for both efficient and faithful replication of the target nucleic acid. Methods of PCR primer design are common and well known in the art (Thein and Wallace, "The use of oligonucleotides as specific hybridization probes in the Diagnosis of Genetic Disorders", in Human Genetic Diseases: A Practical Approach, K. E. Davis (Ed.), (1986) pp 33-50 IRL: Herndon, VA; and Rychlik, W., In Methods in Molecular Biology, White, B. A. (Ed.), (1993) Vol. 15, pp 31-39, PCR Protocols: Current Methods and Applications. Humana: Totowa, NJ). Generally two short segments of the instant sequences may be used in PCR protocols to amplify longer nucleic acid fragments encoding homologous polynucleotides from DNA or RNA. The PCR may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts at the 3' end of the mRNA precursor encoding microbial genes.
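The requirement that the two primers not be complementary to each other can be checked, in a simplified way, by testing whether their 3' ends could base-pair. The sketch below is an assumption-laden illustration: the function names, the default window size `k`, and the toy primers are hypothetical, and real primer-design tools apply additional criteria such as melting temperature and secondary structure.

```python
# Translation table for Watson-Crick complements of the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def three_prime_ends_pair(primer_a, primer_b, k=4):
    """True if the last k bases of the two primers are perfectly complementary.

    Two strands anneal antiparallel, so the 3' end of primer_a pairs with the
    3' end of primer_b when one equals the reverse complement of the other.
    """
    return primer_a.upper()[-k:] == reverse_complement(primer_b[-k:])


# Toy primers used only for illustration.
print(three_prime_ends_pair("AAAGGATCC", "TTTGGATCC", k=6))  # palindromic 3' ends
print(three_prime_ends_pair("AAAGGG", "TTTAAA", k=3))
```

A primer pair for which this check returns True would be a candidate for redesign, since complementary 3' ends favor primer-primer annealing over annealing to the target.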
Alternatively, the instant sequences may be employed as hybridization reagents for the identification of homologs. The basic components of a nucleic acid hybridization test include a probe, a sample suspected of containing the nucleotide sequence of interest, and a specific hybridization method. Probes of the present invention are typically single-stranded nucleic acid sequences that are complementary to the nucleic acid sequences to be detected. Probes are "hybridizable" to the nucleic acid sequence to be detected. The probe length can vary from 5 bases to tens of thousands of bases, and will depend upon the specific test to be done. Typically a probe length of about 15 bases to about 30 bases is suitable. Only part of the probe molecule need be complementary to the nucleic acid sequence to be detected. In addition, the complementarity between the probe and the target sequence need not be perfect. Hybridization does occur between imperfectly complementary molecules with the result that a certain fraction of the bases in the hybridized region are not paired with the proper complementary base.
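The notion of imperfect complementarity described above can be expressed as the fraction of positions in the hybridized region at which the probe does not pair with the proper complementary base. The following sketch assumes a probe and a target site of equal length, both written 5' to 3'; the function names and the toy sequences are hypothetical.

```python
# Translation table for Watson-Crick complements of the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def mismatch_fraction(probe, target_site):
    """Fraction of positions at which the probe fails to pair with the target site."""
    expected = reverse_complement(probe)  # the site a perfect probe would bind
    if not expected or len(expected) != len(target_site):
        raise ValueError("probe and target site must be non-empty and of equal length")
    mismatches = sum(e != t for e, t in zip(expected, target_site.upper()))
    return mismatches / len(expected)


# Toy sequences used only for illustration.
print(mismatch_fraction("ATGC", "GCAT"))  # perfectly complementary
print(mismatch_fraction("ATGC", "GCAA"))  # one of four positions unpaired
```

How large a mismatch fraction still permits hybridization depends on the stringency of the conditions, which this sketch does not model.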
Hybridization methods are well defined. Typically the probe and sample must be mixed under conditions that will permit nucleic acid hybridization. This involves contacting the probe and sample in the presence of an inorganic or organic salt under the proper concentration and temperature conditions. The probe and sample nucleic acids must be in contact for a long enough time that any possible hybridization between the probe and sample nucleic acid may occur. The concentration of probe or target in the mixture will determine the time necessary for hybridization to occur. The higher the probe or target concentration, the shorter the hybridization incubation time needed. Optionally, a chaotropic agent may be added. The chaotropic agent stabilizes nucleic acids by inhibiting nuclease activity. Furthermore, the chaotropic agent allows sensitive and stringent hybridization of short oligonucleotide probes at room temperature (Van Ness and Chen, Nucl. Acids Res. 19:5143-5151 (1991)). Suitable chaotropic agents include guanidinium chloride, guanidinium thiocyanate, sodium thiocyanate, lithium tetrachloroacetate, sodium perchlorate, rubidium tetrachloroacetate, potassium iodide and cesium trifluoroacetate, among others. Typically, the chaotropic agent will be present at a final concentration of about 3 M. If desired, one can add formamide to the hybridization mixture, typically 30-50% (v/v).
Various hybridization solutions can be employed. Typically, these comprise from about 20 to 60% volume, preferably 30%, of a polar organic solvent. A common hybridization solution employs about 30-50% v/v formamide, about 0.15 to 1 M sodium chloride, about 0.05 to 0.1 M buffers (e.g., sodium citrate, Tris-HCl, PIPES or HEPES (pH range about 6-9)), about 0.05 to 0.2% detergent (e.g., sodium dodecylsulfate), or between 0.5-20 mM EDTA, FICOLL (Pharmacia Inc.) (about 300-500 kdal), polyvinylpyrrolidone (about 250-500 kdal) and serum albumin. Also included in the typical hybridization solution will be unlabeled carrier nucleic acids from about 0.1 to 5 mg/mL, fragmented nucleic DNA (e.g., calf thymus or salmon sperm DNA, or yeast RNA), and optionally from about 0.5 to 2% wt/vol glycine. Other additives may also be included, such as volume exclusion agents that include a variety of polar water-soluble or swellable agents (e.g., polyethylene glycol), anionic polymers (e.g., polyacrylate or polymethylacrylate) and anionic saccharide polymers (e.g., dextran sulfate).
Recombinant Expression In Yeast
Initiation control regions or promoter regions that are useful to drive expression of a coding gene of interest in the desired host cell are selected from those derived from the upstream portion of the gpat gene (SEQ ID NO:3). The promoter regions may be identified from the upstream sequences of gpat and its homologs and isolated according to common methods (Maniatis, supra). Once the promoter regions are identified and isolated (e.g., SEQ ID NOs:13 and 17), they may be operably linked to a coding region of interest to be expressed in a suitable expression vector. These chimeric genes may then be expressed in natural host cells and heterologous host cells, particularly in the cells of oleaginous yeast hosts. Thus, one aspect of the present invention provides a recombinant expression vector comprising a yeast promoter of the invention.
In a further aspect, the invention provides a method of expressing a coding region of interest in a transformed yeast, wherein a transformed yeast is provided having a chimeric gene comprising: (i) a promoter region of a Yarrowia gpat gene; and, (ii) a coding region of interest expressible in the yeast, wherein the promoter region is operably linked to the coding region of interest; and the transformed yeast is grown under conditions wherein the chimeric gene is expressed. The polypeptide so produced can optionally be recovered from the culture.
Microbial expression systems and expression vectors are well known to those skilled in the art. Any of these could be used to construct chimeric genes comprising a promoter region derived from the gpat gene for production of any specific coding region of interest suitable for expression in a desirable yeast host cell. These chimeric genes could then be introduced into appropriate microorganisms by integration via transformation to provide high-level expression of the enzymes upon induction. Alternatively, the promoters can be cloned into a plasmid that is capable of transforming and replicating itself in the preferred yeast. The coding region of interest to be expressed can then be cloned downstream from the promoter. Once the recombinant host is established, gene expression can be accomplished by growing the cells under suitable conditions (infra).
Suitable Coding Regions Of Interest
Useful chimeric genes will include the promoter region of the gpat gene as defined herein or a mutant promoter thereof, operably linked to a suitable coding region of interest to be expressed in a preferred host cell. Coding regions of interest to be expressed in the recombinant yeast host may be either endogenous to the host or heterologous and must be compatible with the host organism. Genes encoding proteins of commercial value are particularly suitable for expression. For example, suitable coding regions of interest may include (but are not limited to) those encoding viral, bacterial, fungal, plant, insect or vertebrate coding regions of interest, including mammalian polypeptides. Further, these coding regions of interest may be, for example, structural proteins, enzymes (e.g., oxidoreductases, transferases, hydrolases, lyases, isomerases, ligases), or peptides. A non-limiting list includes genes encoding enzymes such as acyltransferases, aminopeptidases, amylases, carbohydrases, carboxypeptidases, catalases, cellulases, chitinases, cutinases, cyclodextrin glycosyltransferases, deoxyribonucleases, esterases, α-galactosidases, β-glucanases, β-galactosidases, glucoamylases, α-glucosidases, β-glucosidases, invertases, laccases, lipases, mannosidases, mutanases, oxidases, pectinolytic enzymes, peroxidases, phospholipases, phytases, polyphenoloxidases, proteolytic enzymes, ribonucleases, transglutaminases or xylanases.
Preferred in the present invention in some embodiments are coding regions of the enzymes involved in the production of microbial oils, including ω-6 and ω-3 fatty acids. These coding regions include desaturases and elongases (e.g., see WO 2004/101757 for a partial review of available genes in GenBank and/or the patent literature and considerations for choosing a specific polypeptide having desaturase or elongase activity).
Components Of Vectors/DNA Cassettes
Vectors or DNA cassettes useful for the transformation of suitable host cells are well known in the art. The specific choice of sequences present in the construct is dependent upon the desired expression products (supra), the nature of the host cell, and the proposed means of separating transformed cells versus non-transformed cells. Typically, however, the vector or cassette contains sequences directing transcription and translation of the relevant gene(s), a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5' of the gene that controls transcriptional initiation and a region 3' of the DNA fragment that controls transcriptional termination. It is most preferred when both control regions are derived from genes from the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host.
Nucleotide sequences surrounding the translational initiation codon 'ATG' have been found to affect expression in yeast cells. If the desired polypeptide is poorly expressed in yeast, the nucleotide sequences of exogenous genes can be modified to include an efficient yeast translation initiation sequence motif to obtain optimal gene expression. For expression in yeast, this can be done by site-directed mutagenesis of an inefficiently expressed gene to include the favored translation initiation motif.
The termination region can be derived from the 3' region of the gene from which the initiation region was obtained or from a different gene. A large number of termination regions are known and function satisfactorily in a variety of hosts (when utilized both in the same and different genera and species from where they were derived). The termination region usually is selected more as a matter of convenience rather than because of any particular property. Preferably, the termination region is derived from a yeast gene, particularly Saccharomyces, Schizosaccharomyces, Candida, Yarrowia or Kluyveromyces. The 3' regions of mammalian genes encoding γ-interferon and α-2 interferon are also known to function in yeast. Termination control regions may also be derived from various genes native to the preferred hosts. Optionally, a termination site may be unnecessary; however, it is most preferred if included.
As one of skill in the art is aware, merely inserting a chimeric gene into a cloning vector does not ensure that it will be successfully expressed at the level needed. In response to needs for high expression rates, many specialized expression vectors have been created by manipulating a number of different genetic elements that control aspects of transcription, translation, protein stability, oxygen limitation and secretion from the host cell. More specifically, some of the molecular features that have been manipulated to control gene expression include: 1.) the nature of the relevant transcriptional promoter and terminator sequences; 2.) whether the gene is plasmid-borne or integrated into the genome of the host cell and the number of copies of the cloned gene [e.g., additional copies of a particular coding region of interest (operably linked to the promoter of the instant invention) may be introduced into the host to increase expression]; 3.) the final cellular location of the synthesized foreign protein; 4.) the efficiency of translation in the host organism; 5.) the intrinsic stability of the cloned gene protein within the host cell [e.g., expression of the coding region of interest can be increased by removing/deleting destabilizing sequences from either the mRNA or the encoded protein or by adding stabilizing sequences to the mRNA (U.S. 4,910,141)]; and 6.) the codon usage within the cloned gene, such that its frequency approaches the frequency of preferred codon usage of the host cell [e.g., translational efficiency of the encoded mRNAs can be increased by replacement of codons in the native gene with those for optimal gene expression in the selected host microorganism, to thereby substantially enhance the expression of the foreign gene encoding the polypeptide].
Each of these types of modifications is encompassed in the present invention, as a means to further optimize expression of a chimeric gene comprising a promoter region of the gpat gene as defined herein or a mutant promoter thereof, operably linked to a suitable coding region of interest.
Transformation Of Yeast Cells
Once an appropriate chimeric gene has been constructed that is suitable for high-level expression in a yeast cell, it is placed in a plasmid vector capable of autonomous replication in a host cell or it is directly integrated into the genome of the host cell. Integration of expression cassettes can occur randomly within the host genome or can be targeted through the use of constructs containing regions of homology with the host genome sufficient to target recombination with the host locus. Where constructs are targeted to an endogenous locus, all or some of the transcriptional and translational regulatory regions can be provided by the endogenous locus.
Where two or more genes are expressed from separate replicating vectors, it is desirable that each vector have a different means of selection and lack homology to the other constructs, to maintain stable expression and prevent reassortment of elements among constructs. Judicious choice of regulatory regions, selection means and method of propagation of the introduced construct can be experimentally determined so that all introduced genes are expressed at the necessary levels to provide for synthesis of the desired products. Constructs comprising a coding region of interest may be introduced into a host cell by any standard technique. These techniques include transformation (e.g., lithium acetate transformation [Methods in Enzymology, 194:186-187 (1991)]), protoplast fusion, biolistic impact, electroporation, microinjection, or any other method that introduces the gene of interest into the host cell. More specific teachings applicable for oleaginous yeast (i.e., Yarrowia lipolytica) include U.S. Patents No. 4,880,741 and No. 5,071,764 and Chen, D. C. et al. (Appl. Microbiol. Biotechnol. 48(2):232-235 (1997)).
For convenience, a host cell that has been manipulated by any method to take up a DNA sequence (e.g., an expression cassette) will be referred to as "transformed" or "recombinant" herein. The transformed host will have at least one copy of the expression construct and may have two or more, depending upon whether the gene is integrated into the genome, amplified, or is present on an extrachromosomal element having multiple copy numbers. The transformed host cell can be identified by various selection techniques, as described in WO 2004/101757 and WO 2005/003310.
Preferred Hosts
Preferred host cells for expression of the instant gene and coding regions of interest operably linked to the instant promoter fragments herein are yeast cells (oleaginous yeast being most preferred where the desired use is the production of microbial oils, infra). Oleaginous yeast are naturally capable of oil synthesis and accumulation, wherein the oil can comprise greater than about 25% of the cellular dry weight, more preferably greater than about 30% of the cellular dry weight, and most preferably greater than about 40% of the cellular dry weight. Genera typically identified as oleaginous yeast include, but are not limited to: Yarrowia, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Trichosporon and Lipomyces. More specifically, illustrative oil-synthesizing yeast include: Rhodosporidium toruloides, Lipomyces starkeyii, L. lipoferus, Candida revkaufi, C. pulcherrima, C. tropicalis, C. utilis, Trichosporon pullans, T. cutaneum, Rhodotorula glutinus, R. graminis and Yarrowia lipolytica (formerly classified as Candida lipolytica). Most preferred is the oleaginous yeast Yarrowia lipolytica; and, in a further embodiment, most preferred are the Y. lipolytica strains designated as ATCC #20362, ATCC #8862, ATCC #18944, ATCC #76982 and/or LGAM S(7)1 (Papanikolaou S., and Aggelis G., Bioresour. Technol. 82(1):43-9 (2002)). The Y. lipolytica strain designated as ATCC #20362 was the particular strain from which the GPAT promoter and gene were isolated.
Industrial Production Using Transformed Yeast Expressing A Suitable Coding Region Of Interest
In general, media conditions that may be optimized for expression of a particular coding region of interest include the type and amount of carbon source, the type and amount of nitrogen source, the carbon-to-nitrogen ratio, the oxygen level, growth temperature, pH, length of the biomass production phase and the time of cell harvest. Microorganisms of interest, such as oleaginous yeast, are grown in complex media (e.g., yeast extract-peptone-dextrose broth (YPD)) or a defined minimal media that lacks a component necessary for growth and thereby forces selection of the desired expression cassettes (e.g., Yeast Nitrogen Base (DIFCO Laboratories, Detroit, MI)). Fermentation media in the present invention must contain a suitable carbon source. Suitable carbon sources may include, but are not limited to: monosaccharides (e.g., glucose, fructose), disaccharides (e.g., lactose,
sucrose), oligosaccharides, polysaccharides (e.g., starch, cellulose or mixtures thereof), sugar alcohols (e.g., glycerol) or mixtures from renewable feedstocks (e.g., cheese whey permeate, cornsteep liquor, sugar beet molasses, barley malt). Additionally, carbon sources may include alkanes, fatty acids, esters of fatty acids, monoglycerides, diglycerides, triglycerides, phospholipids and various commercial sources of fatty acids including vegetable oils (e.g., soybean oil) and animal fats. Additionally, the carbon source may include one-carbon sources (e.g., carbon dioxide, methanol, formaldehyde, formate, carbon-containing amines) for which metabolic conversion into key biochemical intermediates has been demonstrated. Hence it is contemplated that the source of carbon utilized in the present invention may encompass a wide variety of carbon-containing sources and will only be limited by the choice of the host organism. Although all of the above mentioned carbon sources and mixtures thereof are expected to be suitable in the present invention, preferred carbon sources are sugars and/or fatty acids. Most preferred is glucose and/or fatty acids containing between 10-22 carbons.
Nitrogen may be supplied from an inorganic (e.g., (NH4)2SO4) or organic (e.g., urea or glutamate) source. In addition to appropriate carbon and nitrogen sources, the fermentation media must also contain suitable minerals, salts, cofactors, buffers, vitamins and other components known to those skilled in the art suitable for the growth of the microorganism.
Preferred growth media in the present invention are common commercially prepared media, such as Yeast Nitrogen Base (DIFCO Laboratories, Detroit, MI). Other defined or synthetic growth media may also be used and the appropriate medium for growth of the particular microorganism will be known by one skilled in the art of microbiology or fermentation science. A suitable pH range for the fermentation is typically between about pH 4.0 and pH 8.0, wherein pH 5.5 to pH 7.0 is preferred as the range for the initial growth conditions. The fermentation may be conducted under aerobic or anaerobic conditions, wherein microaerobic conditions are preferred.
Host cells comprising a suitable coding region of interest operably linked to the promoters of the present invention may be cultured using methods known in the art. For example, the cell may be cultivated by shake flask cultivation, small-scale or large-scale fermentation in laboratory or industrial fermentors performed in a suitable medium and
under conditions allowing expression of the coding region of interest. Furthermore, where commercial production of a product that relies on the instant genetic chimera is desired, a variety of culture methodologies may be applied. For example, large-scale production of a specific gene product over-expressed from a recombinant host may be produced by a batch, fed-batch or continuous fermentation process, as is well known in the art (see, e.g., Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, 2nd ed., (1989) Sinauer Associates: Sunderland, MA; or Deshpande, Mukund V., Appl. Biochem. Biotechnol., 36:227 (1992), each herein incorporated by reference).
DESCRIPTION OF PREFERRED EMBODIMENTS
Although the promoters of the present invention will be suitable for expression of any suitable coding region of interest in an oleaginous yeast, in a preferred embodiment the promoters will be utilized in the development of an oleaginous yeast that accumulates high levels of oils enriched in PUFAs. Toward this end, it is necessary to introduce and express, e.g., desaturases and elongases that allow for the synthesis and accumulation of ω-3 and/or ω-6 fatty acids.
The term "fatty acids" refers to long chain aliphatic acids (alkanoic acids) of varying chain lengths, from about C12 to C22 (although both longer and shorter chain-length acids are known). The predominant chain lengths are between C16 and C22. The structure of a fatty acid is represented by a simple notation system of "X:Y", where X is the total number of carbon (C) atoms and Y is the number of double bonds. Additional details concerning the differentiation between "saturated fatty acids" versus "unsaturated fatty acids", "monounsaturated fatty acids" versus "polyunsaturated fatty acids" (or "PUFAs"), and "ω-6 fatty acids" (ω-6 or n-6) versus "ω-3 fatty acids" (ω-3 or n-3) are provided in WO 2004/101757. Nomenclature used to describe PUFAs in the present disclosure is shown below in Table 2. In the column titled "Shorthand Notation", the omega-reference system is used to indicate the number of carbons, the number of double bonds and the position of the double bond closest to the omega carbon, counting from the omega carbon (which is numbered 1 for this purpose). The remainder of the Table summarizes the common names of ω-3 and ω-6 fatty acids, the abbreviations that will be used throughout the remainder of the specification, and each compound's chemical name.
Table 2 Nomenclature Of Polyunsaturated Fatty Acids
Microbial Biosynthesis Of Omega-3 And Omega-6 Fatty Acids
The process of de novo synthesis of palmitate (16:0) in oleaginous microorganisms is described in WO 2004/101757. This fatty acid is the precursor of longer-chain saturated and unsaturated fatty acid derivatives, which are formed through the action of elongases and desaturases. For example, palmitate is converted to its unsaturated derivative [palmitoleic acid (16:1)] by the action of a Δ9 desaturase; similarly, palmitate is elongated to form stearic acid (18:0), which can be converted to its unsaturated derivative by a Δ9 desaturase to thereby yield oleic (18:1) acid.
The metabolic process that converts LA to GLA, DGLA and ARA (the ω-6 pathway) and ALA to STA, ETA, EPA and DHA (the ω-3 pathway) is well described in the literature and is schematically depicted in Figure 6 (see also WO 2004/101757 and WO 2005/003310). Simplistically, this process involves elongation of the carbon chain through the addition of carbon atoms and desaturation of the molecule through the addition of double bonds, via a series of special desaturation and elongation enzymes present in the endoplasmic reticulum membrane (and hereinafter referred to as "PUFA biosynthetic pathway enzymes"). More specifically, "PUFA biosynthetic pathway enzymes" or "ω-3/ω-6 biosynthetic pathway enzymes" will refer to any of the following enzymes (and genes which encode said enzymes) associated with the biosynthesis of a PUFA, including: a Δ4 desaturase, a Δ5 desaturase, a Δ6 desaturase, a Δ12 desaturase, a Δ15 desaturase, a Δ17 desaturase, a Δ9 desaturase, a Δ8 desaturase and/or an elongase(s). For further clarity within the present disclosure, the term "desaturase" refers to a polypeptide that can desaturate one or more fatty acids to produce a mono- or polyunsaturated fatty acid or precursor of interest. Thus, despite use of the omega-reference system to refer to specific fatty acids, it is more convenient to indicate the activity of a desaturase by counting from the carboxyl end of the source using the delta-system. For example, a Δ17 desaturase will desaturate a fatty acid between the 17th and 18th carbon atom numbered from the carboxyl-terminal end of the molecule and can, for example, catalyze the conversion of ARA to EPA and/or DGLA to ETA. In contrast, the term "elongase" refers to a polypeptide that can elongate a fatty acid carbon chain to produce a mono- or polyunsaturated fatty acid that is 2 carbons longer than the fatty acid source that the elongase acts upon.
This process of elongation occurs in a multi-step mechanism in association with fatty acid synthase, whereby CoA is the acyl carrier (Lassner et al., The Plant Cell 8:281-292 (1996)).
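The relationship between the "X:Y" shorthand, the omega-reference system and the delta-system can be illustrated with a minimal Python sketch. The class and function names below are hypothetical and are not part of the disclosure; the sketch assumes only that a double bond introduced between carbons Δ and Δ+1 (counted from the carboxyl end) of a chain with X carbons lies at position ω-(X − Δ), and that elongation adds 2 carbons at the carboxyl end so that the omega position is unchanged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FattyAcid:
    carbons: int            # X in the "X:Y" notation
    double_bonds: int       # Y in the "X:Y" notation
    omega: Optional[int]    # double bond nearest the omega carbon; None if saturated

    def shorthand(self) -> str:
        base = f"{self.carbons}:{self.double_bonds}"
        return base if self.omega is None else f"{base} (omega-{self.omega})"


def delta_to_omega(carbons: int, delta: int) -> int:
    """Omega position of a bond between carbons delta and delta+1."""
    return carbons - delta


def desaturate(fa: FattyAcid, delta: int) -> FattyAcid:
    """Add one double bond at the given delta position."""
    new_omega = delta_to_omega(fa.carbons, delta)
    omega = new_omega if fa.omega is None else min(fa.omega, new_omega)
    return FattyAcid(fa.carbons, fa.double_bonds + 1, omega)


def elongate(fa: FattyAcid) -> FattyAcid:
    """Add 2 carbons at the carboxyl end; the omega position is unchanged."""
    return FattyAcid(fa.carbons + 2, fa.double_bonds, fa.omega)


palmitate = FattyAcid(16, 0, None)
palmitoleate = desaturate(palmitate, 9)        # 16:1
oleate = desaturate(elongate(palmitate), 9)    # 16:0 -> 18:0 -> 18:1
ara = FattyAcid(20, 4, 6)                      # ARA, an omega-6 fatty acid
epa = desaturate(ara, 17)                      # delta-17 desaturation gives an omega-3 product

for fa in (palmitoleate, oleate, epa):
    print(fa.shorthand())
```

In this sketch the Δ17 desaturation of a 20-carbon ω-6 substrate yields a 20:5 ω-3 product, consistent with the ARA to EPA conversion described above.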
As will be understood by one skilled in the art, the particular functionalities required to be introduced into a host organism for production of a particular PUFA final product will depend on the host cell (and its native PUFA profile and/or desaturase/elongase profile), the
availability of substrate and the desired end product(s). As shown in Figure 6, LA, GLA, EDA, DGLA, ARA, ALA, STA, ETrA, ETA, EPA, DPA and DHA may all be produced in oleaginous yeast, by introducing various combinations of the following PUFA enzyme functionalities: a Δ4 desaturase, a Δ5 desaturase, a Δ6 desaturase, a Δ12 desaturase, a Δ15 desaturase, a Δ17 desaturase, a Δ9 desaturase, a Δ8 desaturase and/or an elongase(s). One skilled in the art will be able to identify various candidate genes encoding each of the above enzymes, according to publicly available literature (e.g., GenBank), the patent literature, and experimental analysis of microorganisms having the ability to produce PUFAs. Thus, a variety of desaturases and elongases are suitable as coding regions of interest in the present invention. These coding regions of interest could be operably linked to the GPAT promoters of the present invention or mutant promoters thereof, and used as chimeric genes for expression of various ω-6 and ω-3 fatty acids, using techniques well known to those skilled in the art (e.g., see WO 2004/101757). As such, the invention provides a method for the production of ω-3 and/or ω-6 fatty acids comprising:
a) providing a transformed oleaginous yeast comprising a chimeric gene, said gene comprising:
1) a promoter region of a Yarrowia gpat gene; and,
2) a coding region of interest encoding at least one enzyme of the ω-3/ω-6 fatty acid biosynthetic pathway; wherein the promoter region and coding region are operably linked;
b) culturing the transformed oleaginous yeast of step (a) under conditions whereby the at least one enzyme of the ω-3/ω-6 fatty acid biosynthetic pathway is expressed and a ω-3 or ω-6 fatty acid is produced; and,
c) optionally recovering the ω-3 or ω-6 fatty acid.
In preferred embodiments, the nucleic acid sequence of the promoter region is selected from the group consisting of: SEQ ID NOs: 13 and 17, and subsequences and mutant promoters thereof; and the coding region of interest is any desaturase or elongase suitable for expression in the oleaginous yeast for the production of ω-3 or ω-6 fatty acids.
For production of the greatest and the most economical yield of PUFAs, the transformed oleaginous yeast host cell is grown under conditions that optimize desaturase and elongase activities by optimizing expression of the chimeric genes of the present invention, wherein these chimeric genes comprise a promoter region of a gpat gene and a coding region of interest encoding a PUFA biosynthetic pathway enzyme.
Typically, accumulation of high levels of PUFAs in oleaginous yeast cells requires a two-stage process, since the metabolic state must be "balanced" between growth and synthesis/storage of fats. Thus, most preferably, a two-stage fermentation process is necessary for the production of PUFAs in oleaginous yeast. In this approach, the first stage of the fermentation is dedicated to the generation and accumulation of cell mass and is characterized by rapid cell growth and cell division. In the second stage of the fermentation, it is preferable to establish conditions of nitrogen deprivation in the culture to promote high levels of lipid accumulation. The effect of this nitrogen deprivation is to reduce the effective concentration of AMP in the cells, thereby reducing the activity of the NAD-dependent isocitrate dehydrogenase of mitochondria. When this occurs, citric acid will accumulate, thus forming abundant pools of acetyl-CoA in the cytoplasm and priming fatty acid synthesis. Thus, this phase is characterized by the cessation of cell division followed by the synthesis of fatty acids and accumulation of oil. Although cells are typically grown at about 30 °C, some studies have shown increased synthesis of unsaturated fatty acids at lower temperatures (Yongmanitchai and Ward, Appl. Environ. Microbiol. 57:419-25 (1991)). Based on process economics, this temperature shift should likely occur after the first phase of the two-stage fermentation, when the bulk of the organisms' growth has occurred. Additionally, particular attention is given to several metal ions (e.g.,
Mn+2, Co+2, Zn+2, Mg+2) that promote synthesis of lipids and PUFAs in the fermentation media (Nakahara, T. et al. Ind. Appl. Single Cell Oils, D. J. Kyle and R. Colin, eds. pp 61-97 (1992)).
Purification Of PUFAs
The PUFAs produced in a host microorganism as described herein may be found as free fatty acids or in esterified forms such as acylglycerols, phospholipids, sulfolipids or glycolipids, and may be
extracted from the host cell through a variety of means well-known in the art. One review of extraction techniques, quality analysis and acceptability standards for yeast lipids is that of Z. Jacobs (Critical Reviews in Biotechnology 12(5/6):463-491 (1992)). A brief review of downstream processing is also available by A. Singh and O. Ward (Adv. Appl. Microbiol. 45:271-312 (1997)).
In general, means for the purification of fatty acids (including PUFAs) may include extraction with organic solvents, sonication, supercritical fluid extraction (e.g., using carbon dioxide), saponification and physical means such as presses, or combinations thereof. One is referred to the teachings of WO 2004/101757.
EXAMPLES
The present invention is further defined in the following Examples. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions.
GENERAL METHODS
Standard recombinant DNA and molecular cloning techniques used in the Examples are well known in the art and are described by: 1.) Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory: Cold Spring Harbor, NY (1989) (hereinafter "Maniatis"); 2.) T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions; Cold Spring Harbor Laboratory: Cold Spring Harbor, NY (1984); and 3.) Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience (1987).
Materials and methods suitable for the maintenance and growth of microbial cultures are well known in the art. Techniques suitable for use in the following examples may be found as set out in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, Eds), American Society for Microbiology: Washington, D.C. (1994); or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, 2nd ed., Sinauer Associates: Sunderland, MA (1989). All reagents, restriction enzymes and materials used for the growth and maintenance of microbial cells were obtained from Aldrich Chemicals (Milwaukee, WI), DIFCO Laboratories (Detroit, MI), GIBCO/BRL (Gaithersburg, MD) or Sigma Chemical Company (St. Louis, MO), unless otherwise specified.
General molecular cloning was performed according to standard methods (Sambrook et al., supra). Oligonucleotides were synthesized by Sigma-Genosys (Spring, TX). Site-directed mutagenesis was performed using Stratagene's QuikChange™ Site-Directed Mutagenesis kit (San Diego, CA), per the manufacturer's instructions. When polymerase chain reaction (PCR) or site-directed mutagenesis was involved in subcloning, the constructs were sequenced to confirm that no errors had been introduced to the sequence. PCR products were cloned into Promega's pGEM-T-easy vector (Madison, WI). Manipulations of genetic sequences were accomplished using the suite of programs available from the Genetics Computer Group Inc. (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, WI). The GCG program "Pileup" was used with the gap creation default value of 12, and the gap extension default value of 4. The GCG "Gap" or "Bestfit" programs were used with the default gap creation penalty of 50 and the default gap extension penalty of 3. Unless otherwise stated, in all other cases GCG program default parameters were used.
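To make the role of the two gap parameters concrete, the following is a minimal sketch that assumes the common affine form, in which a gap of length L is charged the creation penalty once plus the extension penalty for each gapped position. This form is an assumption made for illustration; the exact convention used by a given GCG program should be taken from its own documentation, and the function name is hypothetical.

```python
def affine_gap_penalty(gap_length: int, creation: int, extension: int) -> int:
    """Penalty for one gap under the assumed form: creation + extension * length."""
    if gap_length <= 0:
        return 0
    return creation + extension * gap_length


# Parameter pairs quoted in the text: (creation, extension)
PILEUP_DEFAULTS = (12, 4)
GAP_BESTFIT_DEFAULTS = (50, 3)

for name, (creation, extension) in (("Pileup", PILEUP_DEFAULTS),
                                    ("Gap/Bestfit", GAP_BESTFIT_DEFAULTS)):
    costs = [affine_gap_penalty(n, creation, extension) for n in (1, 5, 10)]
    print(name, costs)
```

Under this assumption, opening a gap is far more costly than lengthening one, so alignments favor a few long gaps over many short ones.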
The meaning of abbreviations is as follows: "sec" means second(s), "min" means minute(s), "h" means hour(s), "d" means day(s), "μL" means microliter(s), "mL" means milliliter(s), "L" means liter(s), "μM" means micromolar, "mM" means millimolar, "M" means molar, "mmol" means millimole(s), "μmole" means micromole(s), "g" means gram(s), "μg" means microgram(s), "ng" means nanogram(s), "U" means unit(s), "bp" means base pair(s) and "kB" means kilobase(s).
Transformation And Cultivation Of Yarrowia lipolytica
Yarrowia lipolytica strains ATCC #20362 and ATCC #76982 were purchased from the American Type Culture Collection (Rockville, MD). Strains were usually grown at 28 °C on YPD agar (1% yeast extract, 2% bactopeptone, 2% glucose, 2% agar) or in YPD liquid medium (2% bacto-yeast extract, 3% bactopeptone, 2% glucose).
Transformation of Y. lipolytica was performed according to the method of Chen, D. C. et al. (Appl. Microbiol. Biotechnol. 48(2):232-235 (1997)), unless otherwise noted. Briefly, Yarrowia was streaked onto a YPD plate and grown at 30 °C for approximately 18 hr. Several large loopfuls of cells were scraped from the plate and resuspended in 1 mL of transformation buffer containing: 2.25 mL of 50% PEG, average MW 3350; 0.125 mL of 2 M Li acetate, pH 6.0; 0.125 mL of 2 M DTT; and 50 μg sheared salmon sperm DNA. Then, approximately 500 ng of linearized plasmid DNA was incubated in 100 μL of resuspended cells, and maintained at 39 °C for 1 hr with vortex mixing at 15 min intervals. The cells were plated onto selection media plates and maintained at 30 °C for 2 to 3 days.
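The final concentrations implied by the transformation buffer recipe can be worked out with a short Python sketch. It assumes that the three listed liquids make up the whole batch (2.5 mL in total) and that the sheared salmon sperm DNA contributes negligible volume; the function and variable names are hypothetical.

```python
def final_concentrations(components):
    """Return (total volume in mL, {name: (final concentration, unit)}).

    components maps a name to (volume_mL, stock_concentration, unit).
    """
    total_ml = sum(volume for volume, _, _ in components.values())
    finals = {
        name: (stock * volume / total_ml, unit)
        for name, (volume, stock, unit) in components.items()
    }
    return total_ml, finals


# Volumes and stock concentrations as listed in the recipe above
buffer_recipe = {
    "PEG 3350": (2.25, 50.0, "%"),
    "Li acetate": (0.125, 2.0, "M"),
    "DTT": (0.125, 2.0, "M"),
}

total_ml, finals = final_concentrations(buffer_recipe)
print(f"total volume: {total_ml} mL")
for name, (conc, unit) in finals.items():
    print(f"{name}: {conc:.3g} {unit}")
```

Under these assumptions the batch works out to 45% PEG, 0.1 M Li acetate and 0.1 M DTT, of which 1 mL is used to resuspend the cells.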
For selection of transformants, minimal medium ("MM") was generally used; the composition of MM is as follows: 0.17% yeast nitrogen base (DIFCO Laboratories, Detroit, MI) without ammonium sulfate or amino acids, 2% glucose, 0.1% proline, pH 6.1 and 20 g/L agar. Supplements of uracil were added as appropriate to a final concentration of 0.01% (thereby producing "MMU" selection media).
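Because the MM recipe mixes percentages with g/L values, a small conversion sketch may be helpful when scaling a batch. It assumes that each percentage is weight per volume (1% = 1 g per 100 mL = 10 g/L), which is the usual convention for media recipes but is not stated explicitly above; the names used are hypothetical.

```python
def percent_wv_to_g_per_l(percent: float) -> float:
    """Convert % (w/v) to g/L, assuming 1% w/v = 10 g/L."""
    return percent * 10.0


def grams_needed(percent: float, volume_l: float) -> float:
    """Mass of a component required for a batch of the given volume."""
    return percent_wv_to_g_per_l(percent) * volume_l


# Percentages from the MM / MMU recipe above
MM_PERCENT_WV = {
    "yeast nitrogen base (no ammonium sulfate, no amino acids)": 0.17,
    "glucose": 2.0,
    "proline": 0.1,
    "uracil (MMU only)": 0.01,
}

batch_volume_l = 0.5  # hypothetical batch size
for component, percent in MM_PERCENT_WV.items():
    print(f"{component}: {grams_needed(percent, batch_volume_l):.3f} g")
```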
Alternatively, transformants were selected on 5-fluoroorotic acid ("FOA"; also 5-fluorouracil-6-carboxylic acid monohydrate) selection media, comprising: 0.17% yeast nitrogen base (DIFCO Laboratories) without ammonium sulfate or amino acids, 2% glucose, 0.1% proline, 75 mg/L uracil, 75 mg/L uridine, 900 mg/L FOA (Zymo Research Corp., Orange, CA) and 20 g/L agar.
"SD" media comprises: 0.67% yeast nitrogen base with ammonium sulfate, without amino acids and 2% glucose. And finally, to promote conditions of oleaginy, High Glucose Media ("HGM") was prepared as follows: 14 g/L KH2PO4, 4 g/L K2HPO4, 2 g/L MgSO4·7H2O, 80 g/L glucose (pH 6.5).
Fatty Acid Analysis Of Yarrowia lipolytica
For fatty acid analysis, cells were collected by centrifugation and lipids were extracted as described in Bligh, E. G. & Dyer, W. J. (Can. J. Biochem. Physiol. 37:911-917 (1959)). Fatty acid methyl esters were prepared by transesterification of the lipid extract with sodium methoxide (Roughan, G., and Nishida I. Arch Biochem Biophys. 276(1):38-46 (1990)) and subsequently analyzed with a Hewlett-Packard 6890 GC fitted with a 30 m × 0.25 mm (i.d.) HP-INNOWAX (Hewlett-Packard) column. The oven temperature was from 170 °C (25 min hold) to 185 °C at 3.5 °C/min.
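The duration of the oven program can be estimated with a brief Python sketch. It assumes that the 25 min hold at 170 °C precedes a single linear ramp and that there is no hold at the final temperature, since none is stated; the function name is hypothetical.

```python
def gc_run_time_min(start_c: float, end_c: float, ramp_c_per_min: float,
                    initial_hold_min: float = 0.0,
                    final_hold_min: float = 0.0) -> float:
    """Total oven program time: initial hold + linear ramp + final hold."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


total_min = gc_run_time_min(170.0, 185.0, 3.5, initial_hold_min=25.0)
print(f"estimated oven program time: {total_min:.1f} min")
```

Under these assumptions the ramp takes a little over 4 min, so each injection occupies the oven for roughly half an hour before cool-down.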
For direct base transesterification, Yarrowia culture (3 mL) was harvested, washed once in distilled water, and dried under vacuum in a Speed-Vac for 5-10 min. Sodium methoxide (100 μL of 1%) was added to the sample, and then the sample was vortexed and rocked for 20 min. After adding 3 drops of 1 M NaCl and 400 μL hexane, the sample was vortexed and spun. The upper layer was removed and analyzed by GC as described above.
EXAMPLE 1 Isolation Of The Yarrowia lipolytica Gene Encoding GPAT
The present Example describes work performed to determine the nucleotide sequence (SEQ ID NO:3) of the Yarrowia lipolytica gene encoding GPAT (SEQ ID NO:4). This was possible by identifying an ORF in the Genolevures database of Y. lipolytica ORFs (sponsored by the Center for Bioinformatics, LaBRI, batiment A30, Universite Bordeaux 1, 351, cours de la Liberation, 33405 Talence Cedex, France) and then designing degenerate primers to amplify the putative gene.
Identification Of A Putative Yarrowia lipolytica GPAT
Based on the gene sequences encoding two isozymes of GPAT in Saccharomyces cerevisiae (GAT1 [GenBank Accession No. AJ311354; SEQ ID NO:2] and GAT2(SCT1) [GenBank Accession No. AJ314608; SEQ ID NO:1]; see also Zheng and Zou. J. Biol. Chem. 276(45):41710-41716 (2001) and WO 02/08391 A2), BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., J. Mol. Biol. 215:403-410 (1993)) searches were conducted against the Genolevures genome database of Y. lipolytica ORFs (supra) to identify any similar sequences contained therein. The results of the BLAST comparisons identified one homolog (Yarrowia lipolytica ORF YALI-CDS1055.1, a protein of 727 amino acids; SEQ ID NO:4; also GenBank Accession No. CAG81570) annotated as a protein having similarity to S. cerevisiae GPATs.
Following the tentative identification of ORF YALI-CDS1055.1, this amino acid sequence (SEQ ID NO:4) was BLASTed against the S. cerevisiae GPAT isozymes. The results of these BLAST comparisons are shown below and are reported according to the % identity, % similarity, and Expectation value.
Table 3
Comparison Of Yarrowia lipolytica ORF YALI-CDS1055.1 To Saccharomyces cerevisiae GAT1 And GAT2(SCT1)
a % Identity is defined as percentage of amino acids that are identical between the two proteins.
b % Similarity is defined as percentage of amino acids that are identical or conserved between the two proteins.
c Expect value. The Expect value estimates the statistical significance of the match, specifying the number of matches, with a given score, that are expected in a search of a database of this size absolutely by chance.
Thus, it was hypothesized that ORF YALI-CDS1055.1 encoded a Y. lipolytica GPAT.
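The % identity and % similarity definitions given in the table footnotes can be illustrated on a pair of pre-aligned sequences with the minimal Python sketch below. The grouping of "conserved" residues is a simplified, hypothetical stand-in (BLAST itself counts positions with a positive substitution-matrix score), gapped columns are excluded from the denominator as an assumption, and the toy sequences are invented for illustration only.

```python
# Illustrative groups of chemically similar residues (an assumption, not BLAST's rule)
CONSERVED_GROUPS = [set("ILVM"), set("FYW"), set("KRH"),
                    set("DE"), set("ST"), set("NQ")]


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Return (% identity, % similarity) over ungapped aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in CONSERVED_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy alignment: 6 ungapped columns, 3 identical, 3 conservative substitutions
pct_id, pct_sim = identity_and_similarity("MKLV-DE", "MRLIADD")
print(f"identity {pct_id:.0f}%, similarity {pct_sim:.0f}%")
```

As the footnotes indicate, similarity is always at least as large as identity because identical positions are counted in both.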
Amplification And Sequencing Of The Putative Yarrowia lipolytica GPAT
Degenerate oligonucleotides, as shown below, were designed to amplify the entire coding region of ORF YALI-CDS1055.1.
Degenerate oligonucleotide YGPAT-F (SEQ ID NO:5)
ATGTCNGAGACYGACCAYCTNCTN
Degenerate oligonucleotide YGPAT-R (SEQ ID NO:6)
YTCYTCRTCYTGYTCTCGYCGYTT
[Note: The nucleic acid degeneracy code used for SEQ ID NOs: 5 and 6 was as follows: R=A/G; Y=C/T; and N=A/C/T/G.]
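The degeneracy of each primer, i.e., the number of distinct sequences represented by the pool, follows directly from the code in the note above. The Python sketch below uses only the three ambiguity symbols defined there (R, Y and N); the function names are hypothetical.

```python
from itertools import product
from math import prod

# Degeneracy code as given in the note: R=A/G; Y=C/T; N=A/C/T/G
CODE = {"A": "A", "C": "C", "G": "G", "T": "T",
        "R": "AG", "Y": "CT", "N": "ACGT"}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(CODE[base]) for base in primer.upper())


def expand(primer: str):
    """Yield every concrete sequence represented by a degenerate primer."""
    for combo in product(*(CODE[base] for base in primer.upper())):
        yield "".join(combo)


YGPAT_F = "ATGTCNGAGACYGACCAYCTNCTN"   # SEQ ID NO:5
YGPAT_R = "YTCYTCRTCYTGYTCTCGYCGYTT"   # SEQ ID NO:6

for name, seq in (("YGPAT-F", YGPAT_F), ("YGPAT-R", YGPAT_R)):
    print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

Both primers are 24 nucleotides long; YGPAT-F contains three N and two Y positions (256-fold degenerate) and YGPAT-R contains six Y and one R position (128-fold degenerate).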
The PCR amplification was carried out in a 50 μL total volume using a 1:1 dilution of a premixed 2X PCR solution (TaKaRa Bio Inc., Otsu, Shiga, 520-2193, Japan). The final composition contained 25 mM TAPS, pH 9.3, 50 mM KCl, 2 mM MgCl2, 1 mM 2-mercaptoethanol, 200 μM each deoxyribonucleotide triphosphate, 10 pmole of each primer, 50 ng genomic DNA of Y. lipolytica (ATCC #20362) and 1.25 units of TaKaRa Ex Taq™ DNA polymerase (Takara Mirus Bio, Madison, WI). The thermocycler conditions were set for 30 cycles at 94 °C for 2.5 min, 55 °C for 30 sec and 72 °C for 2.5 min, followed by a final extension at 72 °C for 6 min.
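A few quantities implied by the reaction setup can be derived with the short Python sketch below. It assumes the cycling steps are exactly as stated, ignores thermocycler ramp times between temperatures, and uses hypothetical function names.

```python
def premix_volume_ul(total_ul: float, premix_fold: float = 2.0) -> float:
    """Volume of an N-fold premix needed to reach 1X in the final volume."""
    return total_ul / premix_fold


def primer_concentration_um(pmol: float, total_ul: float) -> float:
    """Final primer concentration; pmol per microliter equals micromolar."""
    return pmol / total_ul


def cycling_time_min(cycles: int, step_minutes, final_extension_min: float) -> float:
    """Programmed time at temperature, excluding ramps between steps."""
    return cycles * sum(step_minutes) + final_extension_min


print(premix_volume_ul(50.0), "uL of 2X premix")
print(primer_concentration_um(10.0, 50.0), "uM each primer")
print(cycling_time_min(30, [2.5, 0.5, 2.5], 6.0), "min programmed")
```

On these assumptions the reaction uses 25 μL of the 2X premix, each primer is present at 0.2 μM, and the program runs for 171 min at temperature.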
The PCR products were separated by gel electrophoresis in 1% (w/v) agarose. A 2.2 kB DNA fragment was excised and purified using a QiaexII gel purification kit (Qiagen, Valencia, CA). Subsequently, the purified 2.2 kB DNA fragment was cloned into the pGEM-T-easy vector (Promega, Madison, WI). The ligated DNA was used to transform cells of E. coli Top10 and transformants were selected on LB (1% bacto-tryptone, 0.5% bacto-yeast extract and 1% NaCl) agar containing ampicillin (100 μg/mL). Analysis of the plasmid DNA from one transformant confirmed the presence of a plasmid of the expected size, designated as "pGPAT-1". Sequence analyses of pGPAT-1 showed that it contained a 2184 bp fragment of Y. lipolytica DNA encoding GPAT (SEQ ID NO:3). Specifically, identity of this gene sequence was determined by conducting BLAST searches for similarity to sequences contained in the BLAST "nr" database (comprising all non-redundant GenBank CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data Bank, the SWISS-PROT protein sequence database, EMBL and DDBJ databases). The sequence was analyzed for similarity to all publicly available DNA sequences contained in the "nr" database using the BLASTN algorithm provided by the National Center for Biotechnology Information (NCBI). The DNA sequence was translated in all reading frames and compared for similarity to all publicly available protein sequences contained in the "nr" database, using the BLASTX algorithm (Gish, W. and States, D. J. Nature Genetics 3:266-272 (1993)) provided by the NCBI. The BLAST searches revealed that the translated product of SEQ ID NO:3 (comprising the putative gpat gene) had the highest BLAST hits to annotated GPATs from: (1) Saccharomyces cerevisiae (Swiss-Prot P36148, Entrez CAA82146): 43% identity, 59% similarity, E-146; and (2) Schizosaccharomyces pombe SCT1(GAT2) homolog (GPAT): 48% identity, 66% similarity, E-148.
Furthermore, the translated product of SEQ ID NO:3 was 100% identical to the amino acid sequence of ORF YAL1-CDS1055.1 (SEQ ID NO:4; also GenBank Accession No. CAG81570).
EXAMPLE 2
Isolation Of The 5' Upstream Region Of GPAT From Yarrowia lipolytica
To isolate the GPAT promoter region upstream of the gene identified in Example 1, a genome-walking technique (Universal GenomeWalker, ClonTech, CA) was utilized, following the manufacturer's protocol.
Briefly, genomic DNA of Y. lipolytica was digested with DraI, EcoRV, PvuII or StuI individually, and the digested DNA samples were ligated with GenomeWalker adaptor (SEQ ID NOs:7 [top strand] and 8 [bottom strand]), as shown below:
5'-GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGGT-3'
                                    3'-H2N-CCCGACCA-5'
PCR reactions were then carried out using the ligation products as templates and Adaptor-1 and YGPAT-5R-1 (SEQ ID NOs:9 and 10) as primers. The PCR amplification was carried out in a 50 μl total volume using the components and conditions described in Example 1, with the exception that the template used was 1 μl of ligation product (versus 50 ng genomic DNA). Second PCR reactions were then carried out using 1 μl of 1:50 diluted first PCR product as template, and Nested Adaptor Primer 2 and YGPAT-5R-2 (SEQ ID NOs:11 and 12) as primers. The PCR amplifications were carried out as described above. A 1.7 kB DNA fragment, amplified from the EcoRV digested sample, was purified using a Qiagen PCR purification kit and cloned into the pGEM-T-easy vector (Promega, Madison, WI). The ligated DNA was used to transform E. coli Top10 and transformants were selected on LB agar containing ampicillin (100 μg/mL). Analysis of the plasmid DNA from one transformant confirmed the presence of the expected plasmid, designated "pEcoRV-G-5". Sequence analyses showed that pEcoRV-G-5 contained a fragment of 1781 bp (SEQ ID NO:13), which included 1678 bp of 5' upstream sequence from the nucleotide 'A' (designated as +1) of the translation initiation codon 'ATG' of the GPAT gene. Assembly of DNA corresponding to overlapping SEQ ID NOs:3 and 13 yielded a single contig of DNA represented as SEQ ID NO:14 (Figure 1; 3862 bp total length). This contig therefore contained the -1678 to +2181 region of the GPAT gene, wherein the 'A' position of the 'ATG' translation initiation codon was designated as +1.
EXAMPLE 3
Synthesis Of pY5-30 And pDMW214
Two plasmids were created, each comprising a different chimeric gene consisting of either the native Y. lipolytica TEF or FBAIN promoter and the "GUS" reporter gene, wherein "GUS" corresponds to the E. coli gene encoding β-glucuronidase (Jefferson, R.A. Nature. 342(6251):837-838 (1989)). These plasmids were required for comparative studies investigating the promoter activity of TEF, FBAIN and GPAT, as described in Example 7.
Synthesis Of Plasmid pY5-30 (TEF::GUS::XPR)
The synthesis of plasmid pY5-30, comprising a TEF::GUS::XPR chimeric gene, is described in WO2005/003310. More specifically, plasmid pY5-30 (Figure 2A; SEQ ID NO:15) contained: a Yarrowia autonomous replication sequence (ARS18); a ColE1 plasmid origin of replication; an ampicillin-resistance gene (AmpR) for selection in E. coli; a Yarrowia LEU2 gene for selection in Yarrowia; and the chimeric TEF::GUS::XPR gene.
Synthesis Of Plasmid pDMW214 (FBAIN::GUS::XPR)
The synthesis of plasmid pDMW214, comprising a FBAIN::GUS::XPR chimeric gene, is described in WO 2005/049805. Briefly, however, the FBAIN promoter region (SEQ ID NO:16; which includes both an upstream DNA sequence and a downstream sequence from the putative 'ATG' translation initiation codon of the fructose-bisphosphate aldolase (fba1) gene [wherein the downstream region comprises an intron]) was amplified by PCR, digested with NcoI and SalI, and then purified following gel electrophoresis. The NcoI/SalI-digested PCR products were ligated to NcoI/SalI digested pY5-30 vector to produce plasmid "pDMW214" (Figure 2B).
EXAMPLE 4
Synthesis Of pYGPAT-GUS
The present Example describes the synthesis of pYGPAT-GUS (comprising a GPAT::GUS::XPR chimeric gene). Synthesis of this plasmid first required amplification of the putative GPAT promoter region. Then, the putative promoter region was cloned into a derivative of pY5-30 (Example 3).
Identification And Amplification Of The GPAT Putative Promoter Region
The region upstream of the gpat gene's 'ATG' start site was considered to represent the putative GPAT promoter region. This corresponded to the nucleotide region between the -1130 position and the 'ATG' translation initiation site of the gpat gene (wherein the 'A' nucleotide of the 'ATG' translation initiation codon was designated as +1). This promoter region is provided as SEQ ID NO:17 and was designated herein as "GPAT-Pro".
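The coordinate convention used above (the 'A' of 'ATG' is +1, upstream bases are numbered -1, -2, and so on, with no position 0) can be expressed as a short slicing routine. This is an illustrative sketch only: the helper name is hypothetical, the sequences are placeholders rather than SEQ ID NO:14, and it assumes that GPAT-Pro spans -1130 through -1 inclusive.

```python
# Sketch: extract an upstream region using ATG-relative coordinates.
# `upstream_region` is a hypothetical helper; sequences are placeholders.

def upstream_region(contig, upstream_len, start, end=-1):
    """Return bases start..end (negative, inclusive) relative to the 'A' of ATG.

    contig       -- sequence whose first `upstream_len` bases precede the ATG
    upstream_len -- number of bases 5' of the ATG (1678 for the contig above)
    """
    if not (start >= -upstream_len and end >= start and -1 >= end):
        raise ValueError("coordinates must satisfy -upstream_len..start..end..-1")
    atg_index = upstream_len  # 0-based index of the 'A' of ATG
    return contig[atg_index + start: atg_index + end + 1]


# Toy contig: 10 upstream bases followed by an ORF start.
toy = "ACGTACGTAC" + "ATGAAA"
toy_promoter = upstream_region(toy, 10, -4)

# Placeholder of the same length as the 3862 bp contig (not real sequence).
placeholder = "N" * 3862
assumed_gpat_pro = upstream_region(placeholder, 1678, -1130)
print(toy_promoter, len(assumed_gpat_pro))
```

Under the stated assumption, the putative promoter corresponds to 0-based positions 548 through 1677 of a contig that carries 1678 bp of upstream sequence.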
GPAT-Pro was amplified by PCR using primers GPAT-5-1 and GPAT-5-2 (SEQ ID NOs:18 and 19), and pEcoRV-G-5 (Example 2) as template. The PCR amplification was carried out as described in Example 1. The PCR product was then purified using a Qiagen PCR purification kit and was completely digested with SalI and NcoI. The digestion product was again purified with a Qiagen PCR purification kit and ligated to NcoI/SalI digested pY5-30 vector (Example 3, wherein the NcoI/SalI digestion had excised the TEF promoter from the pY5-30 vector backbone). Ligated DNA was then used to individually transform E. coli Top10. Transformants were selected on LB agar containing ampicillin (100 μg/mL).
Analysis of the plasmid DNA from one transformant containing GPAT-Pro confirmed the presence of the expected plasmid, designated "pYGPAT-GUS". Thus, this plasmid contained a chimeric gene comprising a GPAT promoter, GUS reporter gene and the 3' region of the XPR gene (Figure 2C).
EXAMPLE 5
Generation Of Yarrowia lipolytica ATCC #20362 Derivative pDMW236-#18
The present Example describes the construction of Y. lipolytica ATCC #20362 derivative pDMW236-#18. Although originally intended to enable high production of EPA relative to the total lipids, this strain possessed a "dead Δ17 desaturase chimeric gene" that inhibited this conversion. However, the strain was useful for the purposes described herein, as a result of the engineered Leu- marker. The TEF, GPAT and FBAIN promoters were compared in this strain based on analysis of GUS expression, as described in Example 7 (infra). The development of Y. lipolytica ATCC #20362 derivative pDMW236-#18 required the construction of strain M4 (producing 8% DGLA), followed by transformation with plasmid pDMW236 (Figure 3A).
Construction Of Strain M4 Producing 8% DGLA
Construct pKUNF12T6E (Figure 3B; SEQ ID NO:20) was generated to integrate four chimeric genes (comprising a Δ12 desaturase, a Δ6 desaturase and 2 elongases) into the Ura3 loci of wild type Yarrowia strain ATCC #20362, to thereby enable production of DGLA. The pKUNF12T6E plasmid contained the following components:
Table 4
Description of Plasmid pKUNF12T6E (SEQ ID NO:20)
ClaI/PacI (1459-1): TEF::EL2S::XPR, comprising:
• TEF: TEF promoter (GenBank Accession No. AF054508)
• EL2S: codon-optimized elongase gene (SEQ ID NO:28), derived from Thraustochytrium aureum (U.S. 6,677,145)
• XPR: 100 bp of the 3' region of the Yarrowia Xpr gene (GenBank Accession No. M17741)
The pKUNF12T6E plasmid was digested with AscI/SphI, and then used for transformation of wild type Y. lipolytica ATCC #20362 according to the General Methods. The transformant cells were plated onto FOA selection media plates and maintained at 30 °C for 2 to 3 days. The FOA resistant colonies were picked and streaked onto MM and MMU selection plates. The colonies that could grow on MMU plates but not on MM plates were selected as Ura- strains. Single colonies of Ura- strains were then inoculated into liquid MMU at 30 °C and shaken at 250 rpm for 2 days. The cells were collected by centrifugation, lipids were extracted, and fatty acid methyl esters were prepared by trans-esterification and subsequently analyzed with a Hewlett-Packard 6890 GC.
GC analyses showed the presence of DGLA in the transformants containing the 4 chimeric genes of pKUNF12T6E (Figure 3B), but not in the wild type Yarrowia control strain. Most of the selected 32 Ura- strains produced about 6% DGLA of total lipids. There were 2 strains (i.e., strains M4 and 13-8) that produced about 8% DGLA of total lipids.
Transformation With Plasmid pDMW236
Construct pDMW236 (SEQ ID NO:30) is shown in Figure 3C. In a manner similar to that described above, the vector was synthesized, transformed into strain M4 (supra) according to the General Methods and individual colonies were selected and grown. The cells were collected by centrifugation, lipids were extracted, and fatty acid methyl esters were prepared by trans-esterification and subsequently analyzed with a Hewlett-Packard 6890 GC. GC analyses showed no EPA in the total lipids. One clone was designated as Y. lipolytica ATCC #20362 derivative pDMW236-#18.
EXAMPLE 6
Transformation Of Y. lipolytica With pY5-30, pYGPAT-GUS, and pDMW214
The plasmids pY5-30 (Example 3; comprising a TEF::GUS::XPR chimeric gene), pYGPAT-GUS (Example 4; comprising a GPAT::GUS::XPR chimeric gene) and pDMW214 (Example 3; comprising a FBAIN::GUS::XPR chimeric gene) were transformed separately into Y. lipolytica ATCC #20362 derivative pDMW236-#18, according to the General Methods. Selection was performed on minimal media plates lacking leucine and maintained at 30 °C for 2 to 3 days. Using this technique, transformants were obtained that contained pY5-30, pYGPAT-GUS and pDMW214, respectively.
EXAMPLE 7
Comparative Analysis Of The TEF, GPAT and FBAIN Promoter Activities In Yarrowia lipolytica, As Determined By Histochemical Assay
The activity of the TEF, GPAT and FBAIN promoters was determined in Y. lipolytica containing the pY5-30, pYGPAT-GUS and pDMW214 constructs, each of which possessed a GUS reporter gene and the 3' region of the Yarrowia Xpr gene (from Example 6). GUS activity in each expressed construct was measured by histochemical assays (Jefferson, R. A. Plant Mol. Biol. Reporter 5:387-405 (1987)).
Specifically, Y. lipolytica strains containing plasmids pY5-30, pYGPAT-GUS and pDMW214, respectively, were grown from single colonies in 3 mL MM with 0.1 g/L L-adenine and 0.1 g/L L-lysine at 30 °C to an OD600 ~1.0. Then, 100 μl of cells were collected by centrifugation, resuspended in 100 μl of histochemical staining buffer and incubated at 30 °C. [Staining buffer prepared by dissolving 5 mg of 5-bromo-4-chloro-3-indolyl glucuronide (X-Gluc) in 50 μl dimethyl formamide, followed by addition of 5 mL 50 mM NaPO4, pH 7.0.]
The results of histochemical staining showed that the GPAT promoter in construct pYGPAT-GUS was active. Comparatively, the GPAT promoter appeared to be much stronger than the TEF promoter (Figure 4) but weaker than the FBAIN promoter.
EXAMPLE 8
Comparative Analysis Of The TEF, FBAIN And GPAT Promoter Activities In Yarrowia lipolytica, As Determined By Fluorometric Assay
A variety of methods are available to compare the activity of various promoters, to thereby facilitate determination of each promoter's strength for use in future applications wherein a suite of promoters would be necessary to construct chimeric genes. Thus, although it may be useful to indirectly quantitate promoter activity based on reporter gene expression using histochemical staining (Example 7), quantification of GUS expression using more quantitative means may be desirable. One suitable method to assay GUS activity is by fluorometric determination of the production of 4-methylumbelliferone (4-MU) from the corresponding substrate, β-glucuronide (4-MUG; see Jefferson, R. A., Plant Mol. Biol. Reporter 5:387-405 (1987)).
Y. lipolytica strain Y2034 containing plasmids pY5-30, pYGPAT-GUS and pDMW214, respectively (from Example 6), were grown from single colonies in 10 mL SD medium at 30 °C for 48 hrs to an OD600 ~5.0. Two mL of each culture was collected for GUS activity assays, as described below, while 5 mL of each culture was switched into HGM.
Specifically, cells from the 5 mL aliquot were collected by centrifugation, washed once with 5 mL of HGM and resuspended in HGM. The cultures in HGM were then grown in a shaking incubator at 30 °C for 24 hrs. Two mL of each HGM culture were collected for GUS activity assay, while the remaining culture was allowed to grow for an additional 96 hrs before collecting an additional 2 mL of each culture for the assay.
Each 2 mL culture sample in SD medium was resuspended in 1 mL of 0.5X cell culture lysis reagent (Promega). Resuspended cells were mixed with 0.6 mL of glass beads (0.5 mm diameter) in a 2.0 mL screw cap tube with a rubber O-ring. The cells were then homogenized in a Biospec mini beadbeater (Bartlesville, OK) at the highest setting for 90 sec. The homogenization mixtures were centrifuged for 2 min at 14,000 rpm in an Eppendorf centrifuge to remove cell debris and beads. The supernatant was used for GUS assay and protein determination.
For each fluorometric assay, 200 μl of extract was added to 800 μl of GUS assay buffer (2 mM 4-methylumbelliferyl-β-D-glucuronide ("MUG") in extraction buffer) and placed at 37 °C. Aliquots of 100 μl were taken at 0, 30 and 60 min time points and added to 900 μl of stop buffer (1 M Na2CO3). Each time point was read using a Fluorimeter (CytoFluor® Series 4000, Framingham, MA) set to an excitation wavelength of 360 nm and an emission wavelength of 455 nm. Total protein concentration of each sample was determined using 20 μl of extract and 980 μl of BioRad Bradford reagent (Bradford, M. M. Anal. Biochem. 72:248-254 (1976)). GUS activity is expressed as nmoles of 4-MU per minute per mg of total protein.
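The conversion from raw fluorescence readings to the reported units (nmoles of 4-MU per minute per mg of total protein) can be sketched as follows. This is an illustration under stated assumptions, not the calculation actually used: the calibration factor (fluorescence units per nmole of 4-MU in a stopped 1 mL sample), the protein concentration and the readings are all hypothetical inputs, and a simple least-squares slope over the 0, 30 and 60 min time points is assumed.

```python
# Sketch: GUS specific activity from a three-point fluorometric time course.
# Calibration factor, protein value and readings below are hypothetical.

def least_squares_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def gus_specific_activity(times_min, readings_fu, fu_per_nmol_4mu,
                          protein_mg_per_ml, extract_ul=200.0,
                          reaction_ul=1000.0, aliquot_ul=100.0):
    """Return nmol 4-MU per min per mg protein.

    fu_per_nmol_4mu   -- assumed standard-curve factor for a stopped sample
    protein_mg_per_ml -- protein concentration of the undiluted extract
    """
    slope_fu_per_min = least_squares_slope(times_min, readings_fu)
    nmol_per_min_aliquot = slope_fu_per_min / fu_per_nmol_4mu
    # Each 100 ul aliquot is one tenth of the 1000 ul reaction.
    nmol_per_min_reaction = nmol_per_min_aliquot * (reaction_ul / aliquot_ul)
    protein_mg = protein_mg_per_ml * extract_ul / 1000.0
    return nmol_per_min_reaction / protein_mg


activity = gus_specific_activity(
    [0, 30, 60], [100.0, 400.0, 700.0],
    fu_per_nmol_4mu=50.0, protein_mg_per_ml=1.0)
print(activity)
```

Fitting a slope rather than using a single end point makes the estimate less sensitive to background fluorescence at time zero.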
As shown in the Table below, the activity of the GPAT promoter was significantly higher than that of the TEF promoter but lower than that of the FBAIN promoter under all the conditions tested.
Table 5
Comparison of TEF, FBAIN, And GPAT Promoter Activity Under Various Growth Conditions
EXAMPLE 9
Use Of The GPAT Promoter For Δ6 Desaturase Expression In Yarrowia lipolytica
The present Example describes the construction of a chimeric gene comprising a GPAT promoter, fungal Δ6 desaturase and the Pex20 terminator, and the expression of this chimeric gene in Y. lipolytica. Since transformed host cells were able to produce γ-linolenic acid (while wildtype Y. lipolytica does not possess any Δ6 desaturase activity), this confirmed the ability of the GPAT promoter to drive expression of heterologous PUFA biosynthetic pathway enzymes in oleaginous yeast such as Y. lipolytica.
Construction Of Plasmid pZGP6B, Comprising A GPAT::Δ6B1::Pex20 Chimeric Gene
Synthesis Of M. alpina cDNA
M. alpina cDNA was synthesized using the BD-Clontech Creator Smart® cDNA library kit (Mississauga, ON, Canada), according to the manufacturer's protocol.
Specifically, M. alpina was grown in 60 mL YPD medium (2% Bacto-yeast extract, 3% Bacto-peptone, 2% glucose) for 3 days at 23 °C. Cells were pelleted by centrifugation at 3750 rpm in a Beckman GH3.8 rotor for 10 min and resuspended in 6 x 0.6 mL Trizol reagent (Invitrogen). Resuspended cells were transferred to six 2 mL screw cap tubes each containing 0.6 mL of 0.5 mm glass beads. The cells were homogenized at the HOMOGENIZE setting on a Biospec (Bartlesville, OK) mini bead beater for 2 min. The tubes were briefly spun to settle the beads. Liquid was transferred to 4 fresh 1.5 mL microfuge tubes and 0.2 mL chloroform/isoamyl alcohol (24:1) was added to each tube. The tubes were shaken by hand for 1 min and let stand for 3 min. The tubes were then spun at 14,000 rpm for 10 min at 4 °C. The upper layer was transferred to 4 new tubes. Isopropyl alcohol (0.5 mL) was added to each tube. Tubes were incubated at room temperature for 15 min, followed by centrifugation at 14,000 rpm and 4 °C for 10 min. The pellets were washed with 1 mL each of 75% ethanol (made with RNase-free water) and air-dried. The total RNA sample was then redissolved in 500 μl of water, and the amount of RNA was measured by A260 nm using 1:50 diluted RNA sample. A total of 3.14 mg RNA was obtained.
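The A260-based quantitation mentioned above follows a simple proportionality. The sketch below assumes the conventional factor of 40 μg/mL of single-stranded RNA per A260 unit at a 1 cm path length; the absorbance reading used in the usage line is hypothetical, since the measured value is not reported here.

```python
# Sketch: total RNA yield from an A260 reading of a diluted sample.
# The 40 ug/mL-per-A260 factor is the conventional value for RNA (1 cm path);
# the reading of 0.25 below is a made-up example, not a reported measurement.

RNA_UG_PER_ML_PER_A260 = 40.0


def rna_yield_mg(a260, dilution_factor, volume_ul):
    """Return total RNA (mg) in `volume_ul` of undiluted sample."""
    conc_ug_per_ml = a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor
    total_ug = conc_ug_per_ml * (volume_ul / 1000.0)
    return total_ug / 1000.0


example_mg = rna_yield_mg(a260=0.25, dilution_factor=50, volume_ul=500.0)
print(example_mg)
```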
This total RNA sample was further purified with the Qiagen RNeasy total RNA Midi kit following the manufacturer's protocol. Thus, the total RNA sample was diluted to 2 mL and mixed with 8 mL of buffer RLT with 80 μl of β-mercaptoethanol and 5.6 mL 100% ethanol. The sample was divided into 4 portions and loaded onto 4 RNeasy midi columns. The columns were then centrifuged for 5 min at 4500 x g. To wash the columns, 2 mL of buffer RPE were loaded and the columns centrifuged for 2 min at 4500 x g. The washing step was repeated once, except that the centrifugation time was extended to 5 min. Total RNA was eluted by applying 250 μl of RNase free water to each column, waiting for 1 min and centrifuging at 4500 x g for 3 min.
PolyA(+)RNA was then isolated from the above total RNA sample, following the protocol of Amersham Biosciences' mRNA Purification Kit. Briefly, 2 oligo-dT-cellulose columns were used. The columns were washed twice with 1 mL each of high salt buffer. The total RNA sample from the previous step was diluted to 2 mL total volume and adjusted to 10 mM Tris/HCl, pH 8.0, 1 mM EDTA. The sample was heated at 65 °C for 5 min, then placed on ice. Sample buffer (0.4 mL) was added and the sample was then loaded onto the two oligo-dT-cellulose columns under gravity feed. The columns were centrifuged at 350 x g for 2 min, washed 2X with 0.25 mL each of high salt buffer, each time followed by centrifugation at 350 x g for 2 min. The columns were further washed 3 times with low salt buffer, following the same centrifugation routine. Poly(A)+RNA was eluted by washing the column 4 times with 0.25 mL each of elution buffer preheated to 65 °C, followed by the same centrifugation procedure. The entire purification process was repeated once. Purified poly(A)+RNA was obtained with a concentration of 30.4 ng/μl.
cDNA was generated, using the LD-PCR method specified by BD-Clontech and 0.1 μg of polyA(+) RNA sample. Specifically, for 1st strand cDNA synthesis, 3 μl of the poly(A)+RNA sample was mixed with 1 μl of SMART IV oligo nucleotide (SEQ ID NO:31) and 1 μl of CDSIII/3' PCR primer (SEQ ID NO:32). The mixture was heated at 72 °C for 2 min and cooled on ice for 2 min. To the tube was added the following: 2 μl first strand buffer, 1 μl 20 mM DTT, 1 μl 10 mM dNTP mix and 1 μl Powerscript reverse transcriptase. The mixture was incubated at 42 °C for 1 hr and cooled on ice.
The 1st strand cDNA synthesis mixture was used as template for the PCR reaction. The reaction mixture contained the following: 2 μl of the 1st strand cDNA mixture, 2 μl 5'-PCR primer (SEQ ID NO:33), 2 μl CDSIII/3'-PCR primer (SEQ ID NO:32), 80 μl water, 10 μl 10X Advantage 2 PCR buffer, 2 μl 50X dNTP mix and 2 μl 50X Advantage 2 polymerase mix. The thermocycler conditions were set for 95 °C for 20 sec, followed by 20 cycles of 95 °C for 5 sec and 68 °C for 6 min on a GeneAmp 9600 instrument. PCR product was quantitated by agarose gel electrophoresis and ethidium bromide staining.
Cloning A Mortierella alpina Δ6 Desaturase
An M. alpina Δ6 desaturase gene (referred to herein as "Δ6B") was identified in GenBank (Accession No. AB070555). The Δ6B gene was PCR amplified using the oligonucleotides described below in Table 6 as primers and the cDNA pool of M. alpina as template.
Table 6 Primers Used For Amplification Of The M. alpina Δ6 Desaturase
Optimization was according to Yarrowia codon usage, as described in U.S. Patent Application No. 10/840478.
The PCR amplification was carried out in 50 μl total volume containing: 10 ng cDNA of M. alpina, PCR buffer containing 10 mM KCl, 10 mM (NH4)2SO4, 20 mM Tris-HCl (pH 8.75), 2 mM MgSO4, 0.1% Triton X-100, 100 μg/mL BSA (final concentration), 200 μM each deoxyribonucleotide triphosphate, 10 pmole of each primer (supra) and 1 μl of PfuTurbo DNA polymerase (Stratagene, San Diego, CA). The thermocycler conditions were set for 35 cycles at 95 °C for 1 min, 56 °C for 30 sec, and 72 °C for 1 min, followed by a final extension at 72 °C for 10 min. The PCR products were purified using a Qiagen PCR purification kit (Valencia, CA), and then further purified following gel electrophoresis in 1% (w/v) agarose. Subsequently, the PCR products were cloned into the pGEM-T-easy vector (Promega, Madison, WI). The ligated DNA was used to transform cells of E. coli DH5α and transformants were selected on LB agar containing ampicillin (100 μg/mL). Analysis of the plasmid DNA from one transformant confirmed the presence of a plasmid of the expected size. The plasmid was designated as "pT-6Bc".
Sequence analysis showed that pT-6Bc contained a Δ6 coding region sequence (SEQ ID NO:36) that was similar to GenBank Accession No. AB070555 (~90% identity at the DNA sequence level and 97% identity at the amino acid level). It was assumed that the differences in the DNA and amino acid sequence came from variations of the same gene in different strains of M. alpina. Additionally, the Δ6B desaturase within pT-6Bc contained the codon-optimized base pairs that were present at the N- and C-terminal end, according to the preferred codon usage in Yarrowia.
Using plasmid pT-6BC as template and oligonucleotides YL475 and YL476 (SEQ ID NOs:37 and 38) as primers, the internal NcoI site of Δ6B was eliminated by in vitro mutagenesis (Stratagene, San Diego, CA) to produce pT-6BC-N. Using pT-6BC-N as template and oligonucleotides YL477 and YL478 (SEQ ID NOs:39 and 40) as primers, the internal SphI site of Δ6B was eliminated by in vitro mutagenesis to produce pT-6BC-NS. Finally, using pT-6BC-NS as template and oligonucleotides YL479 and YL480 (SEQ ID NOs:41 and 42) as primers, the internal ClaI site of Δ6B was eliminated by in vitro mutagenesis to produce pT-6BC-NSC. The elimination of these three internal sites did not change the amino acid sequence of the Δ6B gene.
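The two properties relied on above (removal of the internal NcoI, SphI and ClaI sites, and an unchanged amino acid sequence) can be checked computationally. The following is a minimal sketch using toy sequences, not SEQ ID NO:36 or the actual mutagenic primers; it relies only on the recognition sequences of the three enzymes (CCATGG, GCATGC and ATCGAT, all palindromic, so a single-strand scan suffices) and the standard genetic code.

```python
# Sketch: confirm that silent mutations remove restriction sites without
# changing the encoded protein. Sequences here are toy examples only.
from itertools import product

SITES = {"NcoI": "CCATGG", "SphI": "GCATGC", "ClaI": "ATCGAT"}

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each recognition site."""
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        i = seq.find(motif)
        while i != -1:
            positions.append(i)
            i = seq.find(motif, i + 1)
        hits[name] = positions
    return hits


def translate(orf):
    """Translate complete codons of an in-frame ORF ('*' marks a stop)."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


# Toy ORF containing one copy of each site, and a silently mutated version.
original = "ATGGCCATGGCATGCATCGATTAA"   # ATG GCC ATG GCA TGC ATC GAT TAA
mutated = "ATGGCTATGGCTTGCATTGATTAA"    # GCC>GCT, GCA>GCT, ATC>ATT

print(find_sites(original), find_sites(mutated))
print(translate(original) == translate(mutated))
```

A check of this kind is typically run on the final sequenced clone, since each mutagenesis round could in principle introduce unintended changes elsewhere in the coding region.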
PCR Amplification Of Yarrowia GPAT Promoter
Using plasmid pYGPAT-GUS (Example 4) as template and oligonucleotides YL497 (SEQ ID NO:43, containing a SwaI site) and YL498 (SEQ ID NO:44, containing a NcoI site) as primers, the GPAT promoter was amplified by PCR. Specifically, the PCR amplification was carried out in a 50 μl total volume using the components and conditions described above, with the exception that 10 ng plasmid DNA was used as template.
The PCR products were purified using a Qiagen PCR purification kit (Valencia, CA), digested with SwaI/NcoI, and then purified following gel electrophoresis in 1% (w/v) agarose.
PCR Amplification Of The Yarrowia Pex20 Terminator
The Yarrowia PEX20 terminator (SEQ ID NO:45) of the gene encoding peroxin (GenBank Accession No. AF054613) was amplified from Y. lipolytica genomic DNA using YL259 (SEQ ID NO:46, containing a NotI site) and YL260 (SEQ ID NO:47, containing a BsiWI site) as primers. The 324 bp PCR product was digested with NotI and BsiWI and gel purified.
Construction of the pY25-d12d-PS plasmid
The synthesis of pY5-13 is described in WO2005/003310 and is illustrated in Figure 5A.
The ORF encoding the Y. lipolytica Δ12 desaturase (SEQ ID NOs:48 and 49) was PCR amplified using upper primer P147 (SEQ ID NO:50) and lower primer P148 (SEQ ID NO:51) from the genomic DNA of Y. lipolytica ATCC #76982 (WO2004/104167). The correct sized (1260 bp) fragment was isolated, purified, digested with NcoI and NotI and cloned into NcoI/NotI-cut pY5-13 vector (supra), such that the gene was under the control of the TEF promoter. Correct transformants were confirmed by miniprep analysis and the resultant plasmid was designated "pY25-d12d" (Figure 5B). Using oligonucleotides YL242 and YL243 (SEQ ID NOs:52 and 53) as primers and pY25-d12d as template, a PmeI site was introduced into pY25-d12d by site-directed mutagenesis to generate pY25-d12d-P. A SwaI site was introduced into pY25-d12d-P by in vitro mutagenesis using YL226 and YL227 (SEQ ID NOs:54 and 55) as primers to generate plasmid pY25-d12d-PS.
Construction Of pZGP6B, Comprising A GPAT::Δ6B1::Pex20 Chimeric Gene
Plasmid pY25-d12d-PS was digested with SwaI/BsiWI, and the large fragment was used as vector. The SwaI/BsiWI digested large fragment of plasmid pY25-d12d-PS, the SwaI/NcoI digested GPAT promoter DNA fragment, the NcoI/NotI digested Δ6B gene DNA fragment and the NotI/BsiWI digested Pex20 terminator were directionally ligated together. The ligated DNA was used to transform cells of E. coli DH5α and transformants were selected on LB agar containing ampicillin (100 μg/mL). Analysis of the plasmid DNA from one transformant confirmed the presence of a plasmid of the expected size. The plasmid was designated "pZGP6B" and comprised a GPAT::Δ6B::Pex20 terminator chimeric gene (SEQ ID NO:56).
Expression of Plasmid pZGP6B (GPAT::Δ6B::Pex20) in Yarrowia lipolytica
Plasmid pZGP6B (Figure 5C) was transformed into wild type (WT) Y. lipolytica ATCC #76892 according to the methodology described above in the General Methods. Transformant cells were plated onto MM plates lacking leucine and maintained at 30 °C for 2 to 3 days. Using this technique, transformants were obtained that contained pZGP6B.
Single colonies of wild type and transformant cells were each grown in 3 mL MM with 0.1 g L-adenine and 0.1 g L-lysine at 30 °C to an OD600 ~1.0. The cells were harvested, washed in distilled water, speed vacuum dried and subjected to direct trans-esterification and GC analysis (according to the methodology of the General Methods).
The fatty acid profiles of wildtype Yarrowia and the transformant containing pZGP6B are shown below in Table 7. Fatty acids are identified as 16:0 (palmitate), 16:1 (palmitoleic acid), 18:0, 18:1 (oleic acid), 18:2 (LA) and 18:3 (GLA) and the composition of each is presented as a % of the total fatty acids.
Table 7
Expression of GPAT::Δ6B::Pex20 In Yarrowia lipolytica
The results above demonstrated that the GPAT promoter is suitable to drive expression of the Δ6 desaturase, leading to production of GLA in Yarrowia.