EP3684896A1 - Heterologous production of 10-methylstearic acid by cells expressing recombinant methyltransferase - Google Patents

Heterologous production of 10-methylstearic acid by cells expressing recombinant methyltransferase

Info

Publication number
EP3684896A1
Authority
EP
European Patent Office
Prior art keywords
seq
cell
gene
protein
methyltransferase
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP18788933.2A
Other languages
German (de)
French (fr)
Inventor
Arthur J. Shaw
Hannah BLITZBLAU
Donald V. Crabtree
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ginkgo Bioworks Inc
Original Assignee
Novogy Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Novogy Inc
Publication of EP3684896A1


Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004: Oxidoreductases (1.)
    • C12N9/0069: Oxidoreductases (1.) acting on single donors with incorporation of molecular oxygen, i.e. oxygenases (1.13)
    • C: CHEMISTRY; METALLURGY
    • C11: ANIMAL OR VEGETABLE OILS, FATS, FATTY SUBSTANCES OR WAXES; FATTY ACIDS THEREFROM; DETERGENTS; CANDLES
    • C11B: PRODUCING, e.g. BY PRESSING RAW MATERIALS OR BY EXTRACTION FROM WASTE MATERIALS, REFINING OR PRESERVING FATS, FATTY SUBSTANCES, e.g. LANOLIN, FATTY OILS OR WAXES; ESSENTIAL OILS; PERFUMES
    • C11B1/00: Production of fats or fatty oils from raw materials
    • C11B1/10: Production of fats or fatty oils from raw materials by extracting
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004: Oxidoreductases (1.)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004: Oxidoreductases (1.)
    • C12N9/001: Oxidoreductases (1.) acting on the CH-CH group of donors (1.3)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10: Transferases (2.)
    • C12N9/1003: Transferases (2.) transferring one-carbon groups (2.1)
    • C12N9/1007: Methyltransferases (general) (2.1.1.)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00: Preparation of oxygen-containing organic compounds
    • C12P7/64: Fats; Fatty oils; Ester-type waxes; Higher fatty acids, i.e. having at least seven carbon atoms in an unbroken chain bound to a carboxyl group; Oxidised oils or fats
    • C12P7/6409: Fatty acids
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00: Preparation of oxygen-containing organic compounds
    • C12P7/64: Fats; Fatty oils; Ester-type waxes; Higher fatty acids, i.e. having at least seven carbon atoms in an unbroken chain bound to a carboxyl group; Oxidised oils or fats
    • C12P7/6436: Fatty acid esters
    • C12P7/6445: Glycerides
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00: Preparation of oxygen-containing organic compounds
    • C12P7/64: Fats; Fatty oils; Ester-type waxes; Higher fatty acids, i.e. having at least seven carbon atoms in an unbroken chain bound to a carboxyl group; Oxidised oils or fats
    • C12P7/6436: Fatty acid esters
    • C12P7/6445: Glycerides
    • C12P7/6463: Glycerides obtained from glyceride producing microorganisms, e.g. single cell oil

Definitions

  • the invention generally concerns production of branched (methyl)lipids by cells expressing recombinant methyltransferases and/or reductases derived from Gammaproteobacteria.
  • Fatty acids derived from agricultural plant and animal oils find use as industrial lubricants, hydraulic fluids, greases, and other specialty fluids in addition to oleochemical feedstocks for processing.
  • the physical and chemical properties of these fatty acids result in large part from their carbon chain length and number of unsaturated double bonds.
  • Fatty acids are typically 16:0 (sixteen carbons, zero double bonds), 16:1 (sixteen carbons, 1 double bond), 18:0, 18:1, 18:2, or 18:3.
  • fatty acids with no double bonds (saturated) have high oxidative stability, but they solidify at low temperature. Double bonds improve low- temperature fluidity, but decrease oxidative stability.
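The carbons:double-bonds shorthand used above (for example 16:0 or 18:1) can be read mechanically. The following is a minimal Python sketch of such a parser; the names `FattyAcid` and `parse_fatty_acid` are illustrative assumptions and do not appear in the specification.

```python
import re
from typing import NamedTuple


class FattyAcid(NamedTuple):
    carbons: int
    double_bonds: int

    @property
    def saturated(self) -> bool:
        # A saturated fatty acid has no carbon-carbon double bonds.
        return self.double_bonds == 0


def parse_fatty_acid(shorthand: str) -> FattyAcid:
    """Parse 'carbons:double bonds' shorthand such as '18:1'.

    Stray spaces around the colon (e.g. '18: 1') are tolerated.
    """
    match = re.fullmatch(r"\s*(\d+)\s*:\s*(\d+)\s*", shorthand)
    if match is None:
        raise ValueError(f"not a carbons:double-bonds shorthand: {shorthand!r}")
    return FattyAcid(int(match.group(1)), int(match.group(2)))
```

Under this sketch, `parse_fatty_acid("18:0").saturated` is true (stearic acid), while `parse_fatty_acid("18:1").saturated` is false (oleic acid).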
  • High 18:1 (oleic) fatty acid oils provide low temperature fluidity with relatively good oxidative stability. Accordingly, several commercial products, such as high oleic soybean oil, high oleic sunflower oil, and high oleic algal oil, have been developed with high oleic compositions. Oleic acid is an alkene, however, and subject to oxidative degradation. A superior alternative is the addition of a fully saturated methyl branch to the fatty acid chain.
  • the oil composition is produced by cultivating a cell culture and recovering the oil composition from the cell culture, wherein the oil composition comprises 10-methyl fatty acids, and wherein the 10-methyl fatty acids comprise at least about 1% by weight of the total fatty acids in the oil composition. In some embodiments, the 10-methyl fatty acids comprise at least about 15% by weight of the total fatty acids in the oil composition.
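The weight-percentage thresholds recited above (at least about 1%, and in some embodiments at least about 15%, of total fatty acids) reduce to a simple ratio. The sketch below illustrates the arithmetic only; the function name, the sample labels, and the masses are hypothetical and are not data from the application.

```python
from typing import Iterable, Mapping


def percent_of_total(masses: Mapping[str, float], selected: Iterable[str]) -> float:
    """Return the selected fatty acids as a weight percentage of all fatty acids."""
    total = sum(masses.values())
    if total <= 0:
        raise ValueError("total fatty acid mass must be positive")
    return 100.0 * sum(masses[name] for name in selected) / total


# Hypothetical masses (mg) for one oil sample; the values are illustrative only.
sample = {"16:0": 10.0, "18:0": 5.0, "18:1": 70.0, "10-methyl 18:0": 15.0}

share = percent_of_total(sample, ["10-methyl 18:0"])
meets_1_percent = share >= 1.0    # "at least about 1% by weight"
meets_15_percent = share >= 15.0  # "at least about 15% by weight"
```

How "about" is interpreted at the boundary is a matter of claim construction and is not modeled here.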
  • Some aspects relate to a method of producing an oil composition, the method comprising: cultivating a cell culture comprising any of the cells disclosed herein; and recovering the oil composition from the cell culture.
  • the method further comprises contacting the cell culture with a substrate comprising a fatty acid from 14 to 18 carbons long with a double bond in the Δ9, Δ10, or Δ11 position.
  • recovering the oil composition from the cell culture comprises recovering lipids that have been secreted by the cell.
  • This reaction may be catalyzed by a tmpB protein as described herein, infra.
  • 10-methylenestearic acid e.g., present as an acyl chain of a glycerolipid or phospholipid
  • the reduction may be catalyzed by a tmpA protein as described herein, infra, for example, without limitation, using NADPH as a reducing agent.
  • Other examples of the reducing agent may include, without limitation, ferredoxin, flavodoxin, rubredoxin, cytochrome c, or combinations thereof.
  • the language of the specification and claims, however, is not limited to any particular reaction mechanism.
  • Figure 5 is a graph showing the percentage of 10-methylene fatty acids in
  • 14:0 Myristic acid
  • 16:0 Palmitic acid
  • 16:1 Δ9 palmitoleic acid
  • 16:0cyc 17Δ, cis-9,10-methylenehexadecanoic acid
  • 10-methylene 16:0 10-methylene hexadecenoic acid
  • 18:1 Δ11 vaccenic acid
  • 18:0 stearic acid
  • SD standard deviation.
  • Figures 7A-7D show a CLUSTAL OMEGA alignment of tmpB protein sequences encoded by the tmpB genes from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, and Marinobacter aquaeolei, along with the cyclopropane fatty acid synthase (Cfa) enzyme from Escherichia coli.
  • Cfa cyclopropane fatty acid synthase
  • an element means one element or more than one element.
  • biologically-active portion refers to an amino acid sequence that is less than a full-length amino acid sequence, but exhibits at least one activity of the full length sequence.
  • a biologically-active portion of a methyltransferase may refer to one or more domains of tmpB having biological activity for converting oleic acid (e.g., a phospholipid comprising an ester of oleate) and methionine (e.g., S-adenosyl methionine) into 10-methylenestearic acid (e.g., a phospholipid comprising an ester of 10-methylenestearate).
  • a biologically-active portion of a reductase may refer to one or more domains of tmpA having biological activity for converting 10-methylenestearic acid (e.g., a phospholipid comprising an ester of 10-methylenestearate) and a reducing agent (e.g., ferredoxin, flavodoxin, rubredoxin, cytochrome c, NADH, NADPH, FAD, FADH2, FMNH2) into 10-methylstearic acid (e.g., a phospholipid comprising an ester of 10-methylstearate).
  • a biologically-active portion of a protein may comprise, comprise at least, or comprise at most, for example, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134
  • biologically-active portions comprise a domain or motif having a catalytic activity, such as catalytic activity for producing 10-methylenestearic acid or 10-methylstearic acid.
  • a biologically-active portion of a protein includes portions of the protein that have the same activity as the full-length peptide and every portion that has more activity than background.
  • a biologically-active portion of an enzyme may have, have at least, or have at most 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 10%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 100%, 100.1%, 100.2%, 100.3%, 100.4%, 100.5%, 100.6%, 100.7%, 100.8%, 100.9%, 101%, 105%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 160%, 170%, 180%, 190%, 200%, 220%, 240%, 260%, 280%, 300%, 320%, 340%, 360%, 380%, 400
  • codon optimized and “codon-optimized for the cell” refer to coding nucleotide sequences (e.g., genes) that have been altered to substitute at least one codon that is relatively rare in a desired host cell with a synonymous codon that is relatively prevalent in the host cell. Codon optimization thereby allows for better utilization of the tRNA of a host cell by matching the codons of a recombinant gene with the tRNA of the host cell. For example, the codon usage of the species of Gammaproteobacteria (prokaryotes) varies from the codon usage of yeast (eukaryotes).
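The substitution described above, swapping a codon that is rare in the host for a synonymous codon that is prevalent in the host, can be sketched as a codon-by-codon lookup. This is a minimal illustration under stated assumptions, not the applicants' procedure: the `RARE_TO_PREFERRED` table is hypothetical, and a real table would be derived from codon-usage data for the chosen host cell.

```python
from typing import Mapping

# Hypothetical table: each key is a codon assumed to be rare in the host and
# each value is a synonymous codon assumed to be more prevalent there.
RARE_TO_PREFERRED = {
    "CGG": "AGA",  # arginine -> arginine
    "CTA": "CTG",  # leucine -> leucine
}


def codon_optimize(coding_sequence: str, substitutions: Mapping[str, str]) -> str:
    """Replace listed codons with their synonymous substitutes, in frame."""
    # Treat U as interchangeable with T and normalize case.
    seq = coding_sequence.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(substitutions.get(codon, codon) for codon in codons)


optimized = codon_optimize("ATGCGGCTATAA", RARE_TO_PREFERRED)
```

Because every substitution is synonymous, the encoded amino acid sequence is unchanged; only the nucleotide sequence differs.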
  • DGAT2 refers to a gene that encodes a type 2 diacylglycerol acyltransferase protein, such as a gene that encodes a yeast DGA1 protein.
  • diacylglycerides are esters comprising glycerol and two fatty acids.
  • diacylglycerol acyltransferase and “DGA” refer to any protein that catalyzes the formation of triacylglycerides from diacylglycerol.
  • Diacylglycerol acyltransferases include type 1 diacylglycerol acyltransferases (DGA2), type 2 diacylglycerol acyltransferases (DGA1), and type 3 diacylglycerol acyltransferases (DGA3) and all homologs that catalyze the above-mentioned reaction.
  • type 1 diacylglycerol acyltransferases refer to DGA2 and DGA2 orthologs.
  • drug refers to any molecule that inhibits cell growth or proliferation, thereby providing a selective advantage to cells that contain a gene that confers resistance to the drug. Drugs include antibiotics, antimicrobials, toxins, and pesticides.
  • “Dry weight” and “dry cell weight” mean weight determined in the relative absence of water. For example, reference to oleaginous cells as comprising a specified percentage of a particular component by dry cell weight means that the percentage is calculated based on the weight of the cell after substantially all water has been removed. The term “% dry weight,” when referring to a specific fatty acid (e.g.
  • oleic acid or 10-methylstearic acid includes fatty acids that are present as carboxylates, esters, thioesters, and amides.
  • a cell that comprises 10-methylstearic acid as a percentage of total fatty acids by % dry cell weight includes 10-methylstearic acid, 10-methylstearate, the 10-methylstearate portion of a diacylglycerol comprising a 10-methylstearate ester, the 10-methylstearate portion of a triacylglycerol comprising a 10-methylstearate ester, the 10-methylstearate portion of a phospholipid comprising a 10-methylstearate ester, and the 10-methylstearate portion of 10-methylstearate CoA.
  • % dry weight when referring to a specific type of fatty acid (e.g., C16 fatty acids, C18 fatty acids), includes fatty acids that are present as carboxylates, esters, thioesters, and amides as described above (e.g., for 10-methylstearic acid).
  • the term “gene,” as used herein, may encompass genomic sequences that contain exons, particularly polynucleotide sequences encoding polypeptide sequences involved in a specific activity.
  • the term further encompasses synthetic nucleic acids that did not derive from genomic sequence.
  • the genes lack introns, as they are synthesized based on the known DNA sequence of cDNA and protein sequence.
  • the genes are synthesized, non-native cDNA wherein the codons have been optimized for expression in Y. lipolytica or A. adeninivorans based on codon usage.
  • the term can further include nucleic acid molecules comprising upstream, downstream, and/or intron nucleotide sequences.
  • inducible promoter refers to a promoter that mediates the transcription of an operably linked gene in response to a particular stimulus.
  • integrated refers to a nucleic acid that is maintained in a cell as an insertion into the cell's genome, such as insertion into a chromosome, including insertions into a plastid genome.
  • operable linkage and “operably linked” refer to a functional linkage between two nucleic acid sequences, such as a control sequence (typically a promoter) and the linked sequence (typically a sequence that encodes a protein, also called a coding sequence).
  • a promoter is in operable linkage with a gene or is operably linked to a gene if it can mediate transcription of the gene.
  • nucleotide structure may be imparted before or after assembly of the polymer.
  • a polynucleotide may be further modified, such as by conjugation with a labeling component.
  • U nucleotides are interchangeable with T nucleotides.
  • phospholipid refers to esters comprising glycerol, two fatty acids, and a phosphate.
  • the phosphate may be covalently linked to carbon-3 of the glycerol and comprise no further substitution, i.e., the phospholipid may be a phosphatidic acid.
  • the phosphate may be substituted with ethanolamine (e.g., phosphatidylethanolamine), choline (e.g., phosphatidylcholine), serine (e.g., phosphatidylserine), inositol (e.g., phosphatidylinositol), inositol phosphate (e.g., phosphatidylinositol-3-phosphate, phosphatidylinositol-4-phosphate, phosphatidylinositol-5-phosphate), inositol bisphosphate (e.g., phosphatidylinositol-4,5-bisphosphate), or inositol triphosphate (e.g., phosphatidylinositol-3,4,5-trisphosphate).
  • Plasmid refers to a circular DNA molecule that is physically separate from an organism's genomic DNA. Plasmids may be linearized before being introduced into a host cell (referred to herein as a linearized plasmid). Linearized plasmids may not be self-replicating, but may integrate into and be replicated with the genomic DNA of an organism.
  • a “promoter” is a nucleic acid control sequence that directs the transcription of a nucleic acid.
  • a promoter includes the necessary nucleic acid sequences near the start site of transcription.
  • protein refers to molecules that comprise an amino acid sequence, wherein the amino acids are linked by peptide bonds.
  • Transformation refers to the transfer of a nucleic acid into a host organism or into the genome of a host organism, resulting in genetically stable inheritance.
  • Host organisms containing the transformed nucleic acid are referred to as “recombinant,” “transgenic,” or “transformed” organisms.
  • nucleic acids of the present invention can be incorporated into recombinant constructs, typically DNA constructs, capable of introduction into and replication in a host cell.
  • Such a construct can be a vector that includes a replication system and sequences that are capable of transcription and translation of a polypeptide-encoding sequence in a given host cell.
  • expression vectors include, for example, one or more cloned genes under the transcriptional control of 5' and 3' regulatory sequences and a selectable marker.
  • Such vectors also can contain a promoter regulatory region (e.g. , a regulatory region controlling inducible or constitutive, environmentally- or developmentally-regulated, or location-specific expression), a transcription initiation start site, a ribosome binding site, a transcription termination site, and/or a polyadenylation signal.
  • the term “recombinant gene” refers to a gene that (1) is operatively linked to a polynucleotide to which it is not linked in nature or (2) has a nucleotide sequence different from the naturally-occurring nucleotide sequence, such as, for example, a non-naturally occurring mutation, a codon-optimized sequence, or a cDNA that lacks naturally-occurring introns that are found at the gene's genomic locus.
  • recombinant can be used in reference to cloned DNA isolates, chemically synthesized polynucleotide analogs, or polynucleotide analogs that are biologically synthesized by heterologous systems, as well as proteins and/or mRNAs encoded by such nucleic acids.
  • a protein synthesized by a microorganism is recombinant, if it is synthesized from an mRNA that is synthesized from a recombinant gene present in the cell.
  • a gene may be a recombinant gene if it is operably linked to a promoter different from the promoter to which it is operably linked in nature or if it is connected to another gene or portion thereof and, together with the other gene or portion thereof, encodes a protein that is not found in nature, such as a fusion protein or an epitope-tagged protein.
  • Suitable host cells for expression of the genes and nucleic acid molecules are microbial hosts that can be found broadly within the fungal or bacterial families.
  • suitable host strains include but are not limited to fungal or yeast species, such as Arxula, Aspergillus, Aurantiochytrium, Candida, Claviceps, Cryptococcus, Cunninghamella, Hansenula, Kluyveromyces, Leucosporidiella, Lipomyces, Mortierella, Ogataea, Pichia, Prototheca, Rhizopus, Rhodosporidium, Rhodotorula, Saccharomyces, Schizosaccharomyces, Tremella, Trichosporon, Yarrowia, or bacterial species, such as members of proteobacteria and actinomycetes, as well as the genera Acinetobacter, Arthrobacter, Brevibacterium, Acidovorax, Bacillus,
  • Yarrowia lipolytica and Arxula adeninivorans are suited for use as a host microorganism because they can accumulate a large percentage of their weight as triacylglycerols.
  • Microbial expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are known to those skilled in the art. Any of these could be used to construct chimeric genes to produce any one of the gene products of the instant sequences. These chimeric genes could then be introduced into appropriate microorganisms via transformation techniques to provide high-level expression of the enzymes.
  • a gene encoding an enzyme can be cloned in a suitable plasmid, and an aforementioned starting parent strain as a host can be transformed with the resulting plasmid.
  • This approach can increase the copy number of each of the genes encoding the enzymes and, as a result, the activities of the enzymes can be increased.
  • the plasmid is not particularly limited so long as it renders a desired genetic modification inheritable to the microorganism's progeny.
  • Vectors or cassettes useful for the transformation of suitable host cells are well known.
  • the vector or cassette contains sequences that direct the transcription and translation of the relevant gene, a selectable marker, and sequences that allow autonomous replication or chromosomal integration.
  • Suitable vectors comprise a region 5' of the gene harboring transcriptional initiation controls and a region 3' of the DNA fragment which controls transcriptional termination.
  • both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host.
  • Promoters, cDNAs, and 3' UTRs, as well as other elements of the vectors can be generated through cloning techniques using fragments isolated from native sources (see, e.g., Green & Sambrook, Molecular Cloning: A Laboratory Manual, (4th ed., 2012); U.S. Patent No. 4,683,202 (incorporated by reference)). Alternatively, elements can be generated synthetically using known methods (see, e.g., Gene 164:49-53 (1995)).
  • homologous recombination is a precise gene targeting event and, hence, most transgenic lines generated with the same targeting sequence will be essentially identical in terms of phenotype, necessitating the screening of far fewer transformation events.
  • Homologous recombination also targets gene insertion events into the host chromosome, potentially resulting in excellent genetic stability, even in the absence of genetic selection. Because different chromosomal loci will likely impact gene expression, even from exogenous promoters/UTRs, homologous recombination can be a method of querying loci in an unfamiliar genome environment and to assess the impact of these environments on gene expression.
  • Because homologous recombination is a precise gene targeting event, it can be used to precisely modify any nucleotide(s) within a gene or region of interest, so long as sufficient flanking regions have been identified. Therefore, homologous recombination can be used as a means to modify regulatory sequences impacting gene expression of RNA and/or proteins. It can also be used to modify protein coding regions in an effort to modify enzyme activities such as substrate specificity, affinities and Km, thereby effecting a desired change in the metabolism of the host cell.
  • Homologous recombination provides a powerful means to manipulate the host genome resulting in gene targeting, gene conversion, gene deletion, gene duplication, gene inversion, and exchanging gene expression regulatory elements such as promoters, enhancers and 3' UTRs.
  • Homologous recombination can be achieved by using targeting constructs containing pieces of endogenous sequences to "target" the gene or region of interest within the endogenous host cell genome. Such targeting sequences can either be located 5' of the gene or region of interest, 3' of the gene/region of interest or even flank the gene/region of interest.
  • Such targeting constructs can be transformed into the host cell either as a supercoiled plasmid DNA with additional vector backbone, a PCR product with no vector backbone, or as a linearized molecule.
  • Other methods of increasing recombination efficiency include using PCR to generate transforming transgenic DNA containing linear ends homologous to the genomic sequences being targeted.
  • Vectors for transforming microorganisms in accordance with the present invention can be prepared by known techniques familiar to those skilled in the art in view of the disclosure herein.
  • a vector typically contains one or more genes, in which each gene codes for the expression of a desired product (the gene product) and is operably linked to one or more control sequences that regulate gene expression or target the gene product to a particular location in the recombinant cell.
  a. Control Sequences
  • Control sequences are nucleic acids that regulate the expression of a coding sequence or direct a gene product to a particular location in or outside a cell.
  • Control sequences that regulate expression include, for example, promoters that regulate transcription of a coding sequence and terminators that terminate transcription of a coding sequence.
  • Another control sequence is a 3' untranslated sequence located at the end of a coding sequence that encodes a polyadenylation signal.
  • Control sequences that direct gene products to particular locations include those that encode signal peptides, which direct the protein to which they are attached to a particular location inside or outside the cell.
  • an exemplary vector design for expression of a gene in a microbe contains a coding sequence for a desired gene product (for example, a selectable marker, or an enzyme) in operable linkage with a promoter active in yeast.
  • the coding sequence can be transformed into the cells such that it becomes operably linked to an endogenous promoter at the point of vector integration.
  • the promoter used to express a gene can be the promoter naturally linked to that gene or a different promoter.
  • a promoter can generally be characterized as constitutive or inducible.
  • Constitutive promoters are generally active or function to drive expression at all times (or at certain times in the cell life cycle) at the same level.
  • Inducible promoters, conversely, are active (or rendered inactive) or are significantly up- or down-regulated only in response to a stimulus. Both types of promoters find application in the methods of the invention.
  • Inducible promoters useful in the invention include those that mediate transcription of an operably linked gene in response to a stimulus, such as an exogenously provided small molecule, temperature (heat or cold), lack of nitrogen in culture media, etc.
  • Suitable promoters can activate transcription of an essentially silent gene or upregulate, e.g., substantially, transcription of an operably linked gene that is transcribed at a low level.
  • termination region control sequence is optional, and if employed, then the choice is primarily one of convenience, as the termination region is relatively interchangeable.
  • the termination region may be native to the transcriptional initiation region (the promoter), may be native to the DNA sequence of interest, or may be obtainable from another source (see, e.g., Chen & Orozco, Nucleic Acids Research 16:8411 (1988)).
  b. Genes and Codon Optimization
  • a common gene present on a vector is a gene that codes for a protein, the expression of which allows the recombinant cell containing the protein to be differentiated from cells that do not express the protein.
  • Such a gene, and its corresponding gene product, is called a selectable marker or selection marker. Any of a wide variety of selectable markers can be employed in a transgene construct useful for transforming the organisms of the invention.
  • For optimal expression of a recombinant protein it is beneficial to employ coding sequences that produce mRNA with codons optimally used by the host cell to be transformed. Thus, proper expression of transgenes can require that the codon usage of the transgene matches the specific codon bias of the organism in which the transgene is being expressed.
  • embodiments of the invention include cells transformed with one or more nucleic acids encoding a methyltransferase and/or reductase protein.
  • the transformed cell is a prokaryotic cell, such as a bacterial cell.
  • the cell is a eukaryotic cell, such as a mammalian cell, a yeast cell, a filamentous fungi cell, a protist cell, an algae cell, an avian cell, a plant cell, or an insect cell.
  • the cell is a yeast. Those with skill in the art will recognize that many forms of filamentous fungi produce yeast-like growth, and the definition of yeast herein encompasses such cells.
  • Aurantiochytrium, Candida, Claviceps, Cryptococcus, Cunninghamella, Geotrichum, Hansenula, Kluyveromyces, Kodamaea, Leucosporidiella, Lipomyces, Mortierella, Ogataea, Pichia, Prototheca, Rhizopus, Rhodosporidium, Rhodotorula, Saccharomyces, Schizosaccharomyces, Tremella, Trichosporon, Wickerhamomyces, and Yarrowia. It is specifically contemplated that one or more of these cell types may be excluded from embodiments of this invention.
  • the transformed cell comprises about, at least about, or at most about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, or more lipid as measured by % dry cell weight, or any range derivable therein.
  • the transformed cell comprises C18 fatty acids at a concentration of about, at least about, or at most about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, or 95% as a percentage of total C16 and C18 fatty acids in the cell by weight, or any range derivable therein.
  • the transformed cell comprises oleic acid at a concentration of about, at least about, or at most about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90% or higher as a percentage of total C16 and C18 fatty acids in the cell by weight, or any range derivable therein.
  • the transformed cell comprises 10-methylstearic acid at a concentration of about, at least about, or at most about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 77%, 7
  • a cell may be modified to increase its oleate content, which serves as a substrate for 10-methylstearate synthesis.
  • Genetic modifications that increase oleate content are known (see, e.g., PCT Patent Application Publication No. WO16/094520, published June 16, 2016, hereby incorporated by reference in its entirety).
  • a cell may comprise a Δ12 desaturase knockdown or knockout, which favors the accumulation of oleate and disfavors the production of linoleate.
  • a cell may comprise a recombinant Δ9 desaturase gene, which favors the production of oleate and disfavors the accumulation of stearate.
  • the recombinant Δ9 desaturase gene may be, for example, the Δ9 desaturase gene from Y. lipolytica, Arxula adeninivorans, or Puccinia graminis.
  • a cell may comprise a recombinant elongase 1 gene, which favors the production of oleate and disfavors the accumulation of palmitate and palmitoleate.
  • the recombinant elongase 1 gene may be the elongase 1 gene from Y. lipolytica.
  • a cell may comprise a recombinant elongase 2 gene, which favors the production of oleate and disfavors the accumulation of palmitate and palmitoleate.
  • the recombinant elongase 2 gene may be the elongase 2 gene from R. norvegicus.
  • a cell may be modified to increase its triacylglycerol content, thereby increasing its 10-methylstearate content. Genetic modifications that increase triacylglycerol content are known (see, e.g., PCT Patent Application Publication No. WO16/094520, published June 16, 2016, hereby incorporated by reference in its entirety).
  • a cell may comprise a recombinant diacylglycerol acyltransferase gene (e.g., DGAT1, DGAT2, or DGAT3), which favors the production of triacylglycerols and disfavors the accumulation of diacylglycerols.
  • the recombinant diacylglycerol acyltransferase gene may be, for example, DGAT2 (encoding protein DGA1) from Y. lipolytica, DGAT1 (encoding protein DGA2) from C. purpurea, or DGAT2 (encoding protein DGA1) from R. toruloides.
  • the cell may comprise a glycerol-3-phosphate acyltransferase gene (Sct1) knockdown or knockout, which may favor the accumulation of triacylglycerols, depending on the cell type.
  • the cell may comprise a recombinant glycerol-3-phosphate acyltransferase gene (Sct1) such as the Sct1 gene from A. adeninivorans, which may favor the accumulation of triacylglycerols.
  • the cell may comprise a triacylglycerol lipase gene (TGL) knockdown or knockout, which may favor the accumulation of triacylglycerols in the cell.
  • the transformed cell may comprise a recombinant methyltransferase gene (e.g., a tmpB gene), a recombinant reductase gene (e.g., a tmpA gene), an exomethylene-substituted lipid, and/or a branched (methyl)lipid.
  • a branched (methyl)lipid may be a carboxylic acid (e.g., 10-methylstearic acid, 10-methylpalmitic acid, 12-methyloleic acid, 13-methyloleic acid, 10-methyl-octadec-12-enoic acid), carboxylate (e.g., 10-methylstearate, 10-methylpalmitate, 12-methyloleate, 13-methyloleate, 10-methyl-octadec-12-enoate), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA, 10-methylpalmityl CoA, 12-methyloleoyl CoA, 13-methyloleoyl CoA, 10-methyl-octadec-12-enoyl CoA), or amide.
  • An exomethylene-substituted lipid may be a carboxylic acid (e.g., 10-methylenestearic acid, 10-methylenepalmitic acid, 12-methyleneoleic acid, 13-methyleneoleic acid, 10-methylene-octadec-12-enoic acid), carboxylate (e.g., 10-methylenestearate, 10-methylenepalmitate, 12-methyleneoleate, 13-methyleneoleate, 10-methylene-octadec-12-enoate), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g.
  • the methyltransferase gene and reductase gene may have the capability of together producing a methylated branch from any fatty acid from 14 to 18 carbons long with an unsaturated double bond in the Δ9, Δ10, or Δ11 position.
  • the fatty acid may be 14, 15, 16, 17, or 18 carbons, or any range derivable therein.
  • Fatty acids generally exist in a cell as a phospholipid or triacylglycerol, although they may also exist as a monoacylglycerol or diacylglycerol, for example, as a metabolic intermediate. Free fatty acids also exist in the cell in equilibrium between a relatively abundant carboxylate anion and a relatively scarce, neutrally-charged acid.
  • a fatty acid may exist in a cell as a thioester, especially as a thioester with coenzyme A (CoA), during biosynthesis or oxidation.
  • a fatty acid may exist in a cell as an amide, for example, when covalently bound to a protein to anchor the protein to a membrane.
  • a cell may comprise any one of the nucleic acids described herein, infra (see, e.g., Section B, below).
  • a cell may comprise multiple copies of any one of the nucleic acids described herein. This can be accomplished by, for example, including a tmpA and/or tmpB gene on a high-copy-number plasmid that is transformed into a cell.
  • An exomethylene-substituted lipid may comprise a branched aliphatic chain
  • a branched (methyl)lipid may be 10-methylstearate, or an acid (10-methylstearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA), or amide (e.g., 10-methylstearyl amide) thereof.
  • the branched (methyl)lipid may be a diacylglycerol, triacylglycerol, or phospholipid, and the diacylglycerol, triacylglycerol, or phospholipid may comprise an ester of 10-methylstearate.
  • An exomethylene-substituted lipid may be 10-methylenestearate, or an acid (10-methylenestearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylenestearyl CoA), or amide (e.g., 10-methylenestearyl amide) thereof.
  • the exomethylene-substituted lipid may be a diacylglycerol, triacylglycerol, or phospholipid, and the diacylglycerol, triacylglycerol, or phospholipid may comprise an ester of 10-methylenestearate.
  • about, at least about, or at most about 1% by weight of the fatty acids of the cell may be one or more of the branched (methyl)lipids described herein.
  • the cell may comprise about, at least about, or at most about 1% 10-methylstearic acid as measured by % dry cell weight.
  • the cell may comprise about, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, or 50% 10-methylstearic acid as measured by % dry cell weight, or any range derivable therein.
  • the cell may comprise about, at least about, or at most about 1% 10-methylenestearic acid as measured by % dry cell weight.
  • the cell may comprise about, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, or 50% 10-methylenestearic acid as measured by % dry cell weight, or any range derivable therein.
  • An unmodified cell of the same type (e.g., species) as a cell of the invention may not comprise 10-methylstearate, or an acid (10-methylstearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA), or amide (e.g., 10-methylstearyl amide) thereof (e.g., wherein the unmodified cell does not comprise a recombinant methyltransferase gene or a recombinant reductase gene).
  • An unmodified cell of the same type (e.g., species) as a cell of the invention may not comprise 10-methylenestearate, or an acid (10-methylenestearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylenestearyl CoA), or amide (e.g., 10-methylenestearyl amide) thereof (e.g., wherein the unmodified cell does not comprise a recombinant methyltransferase gene or a recombinant reductase gene).
  • an unmodified cell of the same species as the cell does not comprise a branched (methyl)lipid and/or an exomethylene-substituted lipid. In some embodiments, an unmodified cell of the same species as the cell does not comprise one or more of the branched (methyl)lipids or exomethylene-substituted lipids described herein.
  • a cell may constitutively express the protein encoded by a recombinant methyltransferase gene and/or reductase gene.
  • a cell may constitutively express a methyltransferase protein and/or reductase protein.
  • nucleic acid comprising a recombinant methyltransferase gene, a recombinant reductase gene, or both.
  • the nucleic acid may be, for example, a plasmid.
  • a recombinant methyltransferase gene and/or a recombinant reductase gene is integrated into the genome of a cell, and thus, the nucleic acid may be a chromosome.
  • the invention relates to a cell comprising a recombinant methyltransferase gene, e.g.
  • the invention relates to a cell comprising a recombinant reductase gene, e.g., wherein the recombinant reductase gene is present in a plasmid or chromosome.
  • a recombinant methyltransferase gene and a recombinant reductase gene may be present in a cell in the same nucleic acid (e.g., same plasmid or chromosome) or in different nucleic acids (e.g., different plasmids or chromosomes).
  • a nucleic acid may be inheritable to the progeny of a transformed cell.
  • a gene such as a recombinant methyltransferase gene or recombinant reductase gene may be inheritable because it resides on a plasmid or chromosome. In certain embodiments, a gene may be inheritable because it is integrated into the genome of the transformed cell.
  • a gene may comprise conservative substitutions, deletions, and/or insertions while still encoding a protein that has activity.
  • codons may be optimized for a particular host cell, different codons may be substituted for convenience, such as to introduce a restriction site or to create optimal PCR primers, or codons may be substituted for another purpose.
  • the nucleotide sequence may be altered to create conservative amino acid substitutions, deletions, and/or insertions.
  • Proteins may comprise conservative substitutions, deletions, and/or insertions while still maintaining activity.
  • Conservative substitution tables are well known in the art (Creighton, Proteins (2d. ed., 1992)).
  • Amino acid substitutions, deletions and/or insertions may readily be made using recombinant DNA manipulation techniques. Methods for the manipulation of DNA sequences to produce substitution, insertion or deletion variants of a protein are well known in the art. These methods include M13 mutagenesis, T7-Gen in vitro mutagenesis (USB, Cleveland, OH), Quick Change Site Directed mutagenesis (Stratagene, San Diego, CA), PCR-mediated site- directed mutagenesis, and other site-directed mutagenesis protocols.
  • a "coding sequence" or "coding region" refers to a nucleic acid molecule having sequence information necessary to produce a protein product, such as an amino acid or polypeptide, when the sequence is expressed.
  • the coding sequence may comprise and/or consist of untranslated sequences (including introns or 5' or 3' untranslated regions) within translated regions, or may lack such intervening untranslated sequences (e.g., as in cDNA).
  • a methyltransferase gene (e.g., a recombinant methyltransferase gene) encodes a methyltransferase protein, which is an enzyme capable of transferring a carbon atom and one or more protons bound thereto from a substrate such as S-adenosyl methionine to a fatty acid such as oleic acid (e.g., wherein the fatty acid is present as a free fatty acid, carboxylate, phospholipid, diacylglycerol, or triacylglycerol).
  • the methyltransferase gene (e.g., a recombinant methyltransferase gene) may be a 10-methylstearic B gene (tmpB) as described herein, or a biologically-active portion thereof (i.e., wherein the biologically-active portion thereof comprises methyltransferase activity).
  • the methyltransferase gene (e.g., a recombinant methyltransferase gene) may be derived from a species of Gammaproteobacteria, such as bacteria from the genera Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, or Halofilum.
  • the methyltransferase gene (e.g., a recombinant methyltransferase gene) may be selected from the group consisting of Desulfobacula balticum gene tmpB (SEQ ID NO:1), Marinobacter hydrocarbonclasticus gene tmpB (SEQ ID NO:3), Thiohalospira halophila gene tmpB (SEQ ID NO:5), Desulfobacter curvatus gene tmpB (SEQ ID NO:7), Desulfobacter phenolica gene tmpB (SEQ ID NO:9), Desulfobacula toluolica gene tmpB (SEQ ID NO:11), Desulfobacter postgatei gene tmpB (SEQ ID NO:13), Halofilum ochraceum gene tmpB (SEQ ID NO:15), and Marinobacter aquaeolei gene tmpB (SEQ ID NO:17). It is specifically contemplated that one or more of the above
  • a recombinant methyltransferase gene may be recombinant because it is operably linked to a promoter other than the naturally-occurring promoter of the methyltransferase gene. Such genes may be useful to drive transcription in a particular species of cell.
  • a recombinant methyltransferase gene may be recombinant because it contains one or more nucleotide substitutions relative to a naturally-occurring methyltransferase gene. Such genes may be useful to increase the translation efficiency of the methyltransferase gene's mRNA transcript in a particular species of cell.
  • a nucleic acid may comprise a recombinant methyltransferase gene and a promoter, wherein the recombinant methyltransferase gene and promoter are operably linked.
  • the recombinant methyltransferase gene and promoter may be derived from different species.
  • the recombinant methyltransferase gene may encode the methyltransferase protein of a species of Gammaproteobacteria, and the recombinant methyltransferase gene may be operably-linked to a promoter that can drive transcription in another type of bacteria or a eukaryote (e.g., an algae cell, yeast cell, or plant cell).
  • the promoter may be a eukaryotic promoter.
  • a cell may comprise the nucleic acid, and the promoter may be capable of driving transcription in the cell.
  • a cell may comprise a recombinant methyltransferase gene, and the recombinant methyltransferase gene may be operably linked to a promoter capable of driving transcription of the recombinant methyltransferase gene in the cell.
  • the cell may be a species of yeast, and the promoter may be a yeast promoter.
  • the cell may be a species of bacteria, and the promoter may be a bacterial promoter (e.g., wherein the bacterial promoter is not a promoter from a Gammaproteobacterium).
  • the cell may be a species of algae, and the promoter may be an algae promoter.
  • the cell may be a species of plant, and the promoter may be a plant promoter.
  • a recombinant methyltransferase gene may be operably linked to a promoter that cannot drive transcription in the cell from which the recombinant methyltransferase gene originated.
  • the promoter may not be capable of binding an RNA polymerase of the cell from which a recombinant methyltransferase gene originated.
  • the promoter cannot bind a prokaryotic RNA polymerase and/or initiate transcription mediated by a prokaryotic RNA polymerase.
  • a recombinant methyltransferase gene is operably-linked to a promoter that cannot drive transcription in the cell from which the protein encoded by the gene originated.
  • the promoter may not be capable of binding an RNA polymerase of a cell that naturally expresses the methyltransferase enzyme encoded by a recombinant methyltransferase gene.
  • a promoter may be an inducible promoter or a constitutive promoter.
  • a promoter may be any one of the promoters described in PCT Patent Application Publication No. WO 2016/014900, published January 28, 2016 (hereby incorporated by reference in its entirety).
  • WO 2016/014900 describes various promoters derived from yeast species Yarrowia lipolytica and Arxula adeninivorans, which may be particularly useful as promoters for driving the transcription of a recombinant gene in a yeast cell.
  • a promoter may be a promoter from a gene encoding a Translation Elongation factor EF-1α; Glycerol-3-phosphate dehydrogenase; Triosephosphate isomerase 1; Fructose-1,6-bisphosphate aldolase; Phosphoglycerate mutase; Pyruvate kinase; Export protein EXP1; Ribosomal protein S7; Alcohol dehydrogenase; Phosphoglycerate kinase; Hexose Transporter; General amino acid permease; Serine protease; Isocitrate lyase; Acyl-CoA oxidase; ATP-sulfurylase; Hexokinase; 3-phosphoglycerate dehydrogenase; Pyruvate Dehydrogenase Alpha subunit; Pyruvate Dehydrogenase Beta subunit; Aconitase; Enolase; Actin; Multidrug
  • a recombinant methyltransferase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17, and the recombinant methyltransferase gene may encode a methyltransferase protein with, with at least, or with at most 65%, 66%,
  • a gene that is codon-optimized for expression in yeast may have about 70% sequence identity with SEQ ID NO:1, while the protein encoded by such a codon-optimized gene may have 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2.
  • the codon-optimized gene encodes the same amino acid sequence as the original gene.
  • a recombinant methyltransferase gene may vary from a naturally-occurring methyltransferase gene because the recombinant methyltransferase gene may be codon-optimized for expression in a eukaryotic cell, such as a plant cell, algae cell, or yeast cell.
  • a cell may comprise a recombinant methyltransferase gene, wherein the recombinant methyltransferase gene is codon-optimized for the cell.
  • a recombinant methyltransferase gene may comprise a nucleotide sequence with at least about 65% sequence identity with the naturally-occurring nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17 (e.g., at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity), and at least 5 codons of the nucleotide sequence of the recombinant methyltransferase gene may vary from the naturally-occurring nucleotide sequence (e.g., at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 codons).
  • a recombinant methyltransferase gene may encode a methyltransferase protein, and the methyltransferase protein may be substantially identical to any one of the foregoing enzymes, but the recombinant methyltransferase gene may vary from the naturally-occurring gene that encodes the enzyme.
  • the recombinant methyltransferase gene may vary from the naturally-occurring gene because the recombinant methyltransferase gene may be codon-optimized for expression in a specific phylum, class, order, family, genus, species, or strain of cell.
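The relationship between codon optimization, nucleotide sequence identity, and protein sequence identity described above can be illustrated with a minimal sketch. The preferred-codon table and the 12-base sequence below are hypothetical and chosen only for illustration; they are not taken from any SEQ ID NO, and a real optimization would use a measured codon-usage table for the host and would typically weigh additional factors.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' marks a stop codon).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# HYPOTHETICAL host-preferred codons (assumption for illustration only).
PREFERRED = {"M": "ATG", "A": "GCC", "K": "AAG", "G": "GGC"}


def codons(seq):
    """Split a coding sequence into codons; length must be a multiple of 3."""
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def codon_optimize(seq, preferred):
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons(seq))


def percent_identity(a, b):
    """Position-by-position identity of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def codons_changed(a, b):
    """Number of codon positions at which two coding sequences differ."""
    return sum(x != y for x, y in zip(codons(a), codons(b)))


native = "ATGGCTAAAGGT"  # toy sequence encoding M-A-K-G
optimized = codon_optimize(native, PREFERRED)
print(optimized, percent_identity(native, optimized), codons_changed(native, optimized))
print(translate(native) == translate(optimized))
```

In this toy case three of four codons change and nucleotide identity drops, while the encoded amino acid sequence stays identical, which is the pattern the text describes for a codon-optimized gene.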
  • a recombinant methyltransferase gene may or may not encode a protein comprising 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, and SEQ ID NO: 18.
  • a recombinant methyltransferase gene may or may not encode a protein comprising 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18.
  • a recombinant methyltransferase gene may encode a protein having 100% sequence identity with a biologically-active portion of an amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18.
  • a recombinant methyltransferase gene may encode a protein having at least 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 con
  • Substrates for the methyltransferase protein may include any fatty acid from 14 to 18 carbons long with an unsaturated double bond in the Δ9, Δ10, or Δ11 position.
  • the substrate may have a chain that is 14, 15, 16, 17, or 18 carbons long, or any range derivable therein.
  • the methyltransferase protein may be capable of catalyzing the formation of a methylene substitution at the Δ9, Δ10, or Δ11 position of such a substrate.
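The substrate criteria stated above (chain length of 14 to 18 carbons and a double bond at the 9, 10, or 11 position counted from the carboxyl carbon) can be written as a simple predicate. This is a sketch of the stated rule only; the `FattyAcid` type and function name are hypothetical, and the predicate says nothing about actual enzyme activity on any given compound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    name: str
    carbons: int
    # Double-bond positions counted from the carboxyl carbon (delta numbering).
    double_bonds: tuple = ()


ALLOWED_CHAIN_LENGTHS = range(14, 19)   # 14, 15, 16, 17, 18
ALLOWED_POSITIONS = {9, 10, 11}


def meets_substrate_criteria(fa):
    """True if chain length and at least one double-bond position fit the stated ranges."""
    return fa.carbons in ALLOWED_CHAIN_LENGTHS and any(
        pos in ALLOWED_POSITIONS for pos in fa.double_bonds
    )


oleic = FattyAcid("oleic acid", 18, (9,))
stearic = FattyAcid("stearic acid", 18)
erucic = FattyAcid("erucic acid", 22, (13,))

for fa in (oleic, stearic, erucic):
    print(fa.name, meets_substrate_criteria(fa))
```

Oleic acid satisfies both conditions, stearic acid fails for lack of a double bond, and erucic acid fails on both chain length and bond position.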
  • the recombinant methyltransferase gene encodes a methyltransferase protein that has specific amino acids unchanged from the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18.
  • the unchanged amino acids can include 1, 2, 3, 4, 5, 6, 7, 8, or 9 amino acids selected from Y163, T175, R199, E211, G269, Y271, N313, N319, and W389 of Marinobacter hydrocarbonclasticus tmpB or corresponding amino acids in tmpB from Desulfobacula balticum, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, or Marinobacter aquaeolei, according to the alignment set forth in Figures 7A-D.
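Checking which listed residues remain unchanged in a variant protein can be sketched as follows. The sequences and residue labels are toy values, not the tmpB sequences, and the sketch assumes the variant uses the same numbering as the reference (no insertions or deletions); mapping positions between species through the alignment of Figures 7A-D is not attempted here.

```python
import re


def parse_residue(label):
    """Split a label such as 'Y163' into ('Y', 163)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    return match.group(1), int(match.group(2))


def unchanged_residues(protein, labels):
    """Return the labels whose amino acid is present at the stated 1-based position."""
    kept = []
    for label in labels:
        amino_acid, position = parse_residue(label)
        if position <= len(protein) and protein[position - 1] == amino_acid:
            kept.append(label)
    return kept


# HYPOTHETICAL toy data: reference numbering M1 K2 Y3 T4 R5 E6 G7 W8.
reference = "MKYTREGW"
variant = "MKYSREGW"          # T4 replaced by S
conserved = ["Y3", "T4", "R5", "W8"]

print(unchanged_residues(reference, conserved))
print(unchanged_residues(variant, conserved))
```

The length of the returned list corresponds to the count of unchanged amino acids (1, 2, 3, and so on) referred to in the text.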
  • a reductase gene (e.g., a recombinant reductase gene) encodes a reductase protein, which is an enzyme capable of reducing a double bond of a fatty acid (e.g., wherein the fatty acid is present as a free fatty acid, carboxylate, phospholipid, diacylglycerol, or triacylglycerol).
  • the reductase gene (e.g., a recombinant reductase gene) may have a coding region that is identical to one from a bacterium of the class Gammaproteobacteria.
  • the reductase gene may comprise any one of the nucleotide sequences set forth in SEQ ID NO: 19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, and SEQ ID NO:35.
  • the reductase gene (e.g., a recombinant reductase gene) may be a 10-methylstearic A gene (tmpA) as described herein, or a biologically-active portion thereof (i.e., wherein the biologically-active portion thereof comprises reductase activity).
  • the reductase gene (e.g., a recombinant reductase gene) may be derived from a species of Gammaproteobacteria, such as bacteria from the genera Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, or Halofilum.
  • the reductase gene (e.g., a recombinant reductase gene) may be selected from the group consisting of Desulfobacula balticum gene tmpA (SEQ ID NO: 19), Marinobacter hydrocarbonclasticus gene tmpA (SEQ ID NO:21), Thiohalospira halophila gene tmpA (SEQ ID NO:23), Desulfobacter curvatus gene tmpA (SEQ ID NO:25), Desulfobacter phenolica gene tmpA (SEQ ID NO:27), Desulfobacula toluolica gene tmpA (SEQ ID NO:29), Desulfobacter postgatei gene tmpA (SEQ ID NO:31), Halofilum ochraceum gene tmpA (SEQ ID NO:33), and Marinobacter aquaeolei gene tmpA (SEQ ID NO:35).
  • a recombinant reductase gene may be recombinant because it is operably linked to a promoter other than the naturally-occurring promoter of the reductase gene. Such genes may be useful to drive transcription in a particular species of cell.
  • a recombinant reductase gene may be recombinant because it contains one or more nucleotide substitutions relative to a naturally-occurring reductase gene. Such genes may be useful to increase the translation efficiency of the reductase gene's mRNA transcript in a particular species of cell.
  • a promoter may be an inducible promoter or a constitutive promoter.
  • a promoter may be any one of the promoters described in PCT Patent Application Publication No. WO 2016/014900, published January 28, 2016 (hereby incorporated by reference in its entirety).
  • WO 2016/014900 describes various promoters derived from yeast species Yarrowia lipolytica and Arxula adeninivorans, which may be particularly useful as promoters for driving the transcription of a recombinant gene in a yeast cell.
  • a promoter may be a promoter from a gene encoding a Translation Elongation factor EF-1α; Glycerol-3-phosphate dehydrogenase; Triosephosphate isomerase 1; Fructose-1,6-bisphosphate aldolase; Phosphoglycerate mutase; Pyruvate kinase; Export protein EXP1; Ribosomal protein S7; Alcohol dehydrogenase; Phosphoglycerate kinase; Hexose Transporter; General amino acid permease; Serine protease; Isocitrate lyase; Acyl-CoA oxidase; ATP-sulfurylase; Hexokinase; 3-phosphoglycerate dehydrogenase; Pyruvate Dehydrogenase Alpha subunit; Pyruvate Dehydrogenase Beta subunit; Aconitase; Enolase; Actin; Multidrug
  • a recombinant reductase gene may or may not have 100% sequence identity with any one of the nucleotide sequences set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, and SEQ ID NO:35.
  • a recombinant reductase gene may or may not have 100% sequence identity with 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous base pairs of the nucleotide sequence set forth in SEQ ID NO: 19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35.
  • a recombinant reductase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the nucleotide sequence set forth in SEQ ID NO: 19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35, and the recombinant reductase gene may encode a reductase protein with, with at least, or with at most 80%, 81%, 82%
  • a gene that is codon-optimized for expression in yeast may have about 70% sequence identity with SEQ ID NO: 19, while the protein encoded by such a codon-optimized gene may have 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:20.
  • the codon-optimized gene encodes the same amino acid sequence of the original gene.
  • a recombinant reductase gene may vary from a naturally-occurring reductase gene because the recombinant reductase gene may be codon-optimized for expression in a eukaryotic cell, such as a plant cell, algae cell, or yeast cell.
  • a cell may comprise a recombinant reductase gene, wherein the recombinant reductase gene is codon-optimized for the cell.
  • a recombinant reductase gene may encode a reductase protein selected from the group consisting of Desulfobacula balticum protein tmpA (SEQ ID NO:20), Marinobacter hydrocarbonclasticus protein tmpA (SEQ ID NO:22), Thiohalospira halophila protein tmpA (SEQ ID NO:24), Desulfobacter curvatus protein tmpA (SEQ ID NO:26), Desulfobacter phenolica protein tmpA (SEQ ID NO:28), Desulfobacula toluolica protein tmpA (SEQ ID NO:30), Desulfobacter postgatei protein tmpA (SEQ ID NO:32), Halofilum ochraceum protein tmpA (SEQ ID NO:34), and Marinobacter aquaeolei protein tmpA (SEQ ID NO:36).
  • a recombinant reductase gene may encode a protein having 100% sequence identity with a biologically-active portion of an amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36.
  • Substrates for the reductase protein may include any fatty acid from 14 to 18 carbons long with a methylene substitution in the Δ9, Δ10, or Δ11 position.
  • the substrate may be 14, 15, 16, 17, or 18 carbons long, or any range derivable therein.
  • the reductase protein may be capable of catalyzing the reduction of a methylene-substituted fatty acid substrate to a (methyl)lipid.
  • the unchanged amino acids can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 amino acids selected from I8, L22, F37, P38, R39, K41, G45, W46, P49, G144, C148, P149, E169, E171, L197, I212, C249, H250, Y252, I270, G275, L276, E283, A296, and A299 of Marinobacter hydrocarbonclasticus tmpA or corresponding amino acids in tmpA from Desulfobacula balticum, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, or Marinobacter aquaeolei, according to the alignment set forth in Figures 8A-D.
  • the term "complementary" and derivatives thereof are used in reference to pairing of nucleic acids by the well-known rules that A pairs with T or U and C pairs with G. Complement can be "partial" or "complete". In partial complement, only some of the nucleic acid bases are matched according to the base pairing rules, while in complete or total complement, all the bases are matched according to the pairing rules. The degree of complement between the nucleic acid strands may have significant effects on the efficiency and strength of hybridization between nucleic acid strands, as is well known in the art. The efficiency and strength of said hybridization depends upon the detection method.
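The base-pairing rules above (A with T or U, C with G) and the distinction between partial and complete complement can be sketched in a few lines. The function names are hypothetical, the strands are compared antiparallel and ungapped, and the returned fraction is only a count of matched positions, not a prediction of hybridization strength.

```python
# Watson-Crick pairs named in the text: A pairs with T or U, C pairs with G.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def complementarity(strand, other):
    """Fraction of positions at which `other`, read antiparallel, pairs with `strand`."""
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    antiparallel = other.upper()[::-1]
    matched = sum((x, y) in PAIRS for x, y in zip(strand.upper(), antiparallel))
    return matched / len(strand)


strand = "ATGC"
print(reverse_complement(strand))          # complete complement of the strand
print(complementarity(strand, "GCAT"))     # all positions paired
print(complementarity(strand, "GCAA"))     # only some positions paired
```

A value of 1.0 corresponds to complete complement as defined above, and any value between 0 and 1 corresponds to partial complement.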
  • nucleic acid that is referred to herein as having a certain percent sequence identity to a sequence set forth in a SEQ ID NO, includes nucleic acids that have the certain percent sequence identity to the complement of the sequence set forth in the SEQ ID NO. d. Nucleic acids comprising a recombinant methyltransferase gene and a recombinant reductase gene
  • a nucleic acid may comprise both a recombinant methyltransferase gene and a recombinant reductase gene.
  • the recombinant methyltransferase gene and the recombinant reductase gene may encode proteins from the same species or from different species.
  • a nucleic acid may comprise the nucleotide sequence of an expression vector comprising a tmp operon that includes both a methyltransferase gene and a reductase gene.
  • Such vectors may include pNC1071 (SEQ ID NO:39), which includes a Desulfobacter postgatei tmp operon; pNC1072 (SEQ ID NO:40), which includes a Desulfobacula balticum tmp operon; pNC1073 (SEQ ID NO:41), which includes a Desulfobacula toluolica tmp operon; pNC1074 (SEQ ID NO:42), which includes a Marinobacter hydrocarbonclasticus tmp operon; and pNC1076 (SEQ ID NO:43), which includes a Thiohalospira halophila tmp operon.
  • compositions produced by the cells described herein may be an oil composition comprised of about, at least about, or at most about 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100% lipids by weight.
  • the composition may comprise branched (methyl)lipids and/or exomethylene-substituted lipids.
  • the exomethylene-substituted lipid may be a carboxylic acid (e.g., 10-methylenestearic acid, 10-methylenepalmitic acid, 12-methyleneoleic acid, 13-methyleneoleic acid, 10-methylene-octadec-12-enoic acid), carboxylate (e.g.
  • Lipids and lipid derivatives can also be extracted using liquefaction, oil liquefaction, and supercritical CO2 extraction.
  • the recovery process may include harvesting cultured cells, such as by filtration or centrifugation, lysing cells to create a lysate, and extracting the lipid/hydrocarbon components using a hydrophobic solvent.
  • the lipids described herein may be secreted by the cells.
  • a process for recovering the lipid may not require creating a lysate from the cells, but collecting the secreted lipid from the culture medium.
  • the compositions described herein may be made by culturing a cell that secretes one of the lipids described herein, such as a linear fatty acid with a chain length of 14-18 carbons with a methyl branch at the Δ9, Δ10, or Δ11 position.
  • 10-methyl fatty acids comprise about, at least about, or at most about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%,
  • the amount of 10-methyl fatty acids in a cell can be optimized by various methods. For example, increasing the expression of tmpA and/or tmpB can increase the methyltransferase and/or reductase activity within the cell, which may lead to accumulation of greater amounts of branched (methyl)lipids. One way this can be accomplished is by increasing the number of copies of the gene in the cell, such as by including the genes on high-copy-number plasmids. Additionally or alternatively, the tmpA and/or tmpB genes can be operably linked to a promoter that drives high levels of expression.
  • the method may comprise incubating a cell or plurality of cells as described herein, supra, with media.
  • the media may optionally be supplemented with an unbranched, unsaturated fatty acid, such as oleic acid, that serves as a substrate for methylation.
  • the substrate may include one or more fatty acids from 14 to 18 carbons long with a double bond in the Δ9, Δ10, or Δ11 position.
  • the substrate may be 14, 15, 16, 17, or 18 carbons long, or any range derivable therein.
  • the media may optionally be supplemented with methionine or S-adenosyl methionine, which may similarly serve as a substrate.
  • the method may comprise contacting a cell or plurality of cells with oleic acid (or some other substrate to be methylated), methionine, or both.
  • the method may comprise incubating a cell or plurality of cells as described herein, supra, in a bioreactor.
  • the method may comprise recovering lipids from the cells, such as by extraction with an organic solvent.
  • the method may comprise degumming the cell or plurality of cells, e.g., to remove proteins.
  • the method may comprise transesterification or esterification of the lipids of the cells.
  • An alcohol such as methanol or ethanol may be used for transesterification or esterification, e.g., thereby producing a fatty acid methyl ester or fatty acid ethyl ester.
  • tmp gene operon was responsible for Gammaproteobacteria 10-methyl fatty acid production
  • the genes were designed in an E. coli expression vector using the DNA manipulation software A Plasmid Editor and synthesized by Thermofisher Scientific - GeneArt.
  • the native codon usage of the tmp genes was not changed.
  • tmpB gene transcription was controlled using the constitutively active tac promoter (de Boer 1983), followed by the E. coli lacZ-lacY intergene linker region, the tmpA gene, and the trpT' gene terminator (Wu 1981). These synthetic gene operons were cloned into an E.
  • the plasmid vectors are named pNC1071 (SEQ ID NO:39), which includes the Desulfobacter postgatei tmp operon; pNC1072 (SEQ ID NO:40), which includes the Desulfobacula balticum tmp operon; pNC1073 (SEQ ID NO:41), which includes the Desulfobacula toluolica tmp operon; pNC1074 (SEQ ID NO:42), which includes the Marinobacter hydrocarbonclasticus tmp operon; and pNC1076 (SEQ ID NO:43), which includes the Thiohalospira halophila tmp operon.
  • Plasmids pNC1071, pNC1072, pNC1073, pNC1074, pNC1076, and the control plasmid pNC53 containing the AmpR gene, ColE1 origin, and tac promoter were transformed into E. coli Top10 (Invitrogen) using a standard electrotransformation protocol utilizing 50 µL suspended cells, 1 µL of plasmid DNA at a concentration of 200 ng per µL, a 1 mm gap electrotransformation cuvette, and a pulse with 1.8 kV voltage, 200 Ω, and 25 µF with exponential decay and a time constant of approximately 4.5 milliseconds.
  • the resulting plasmids are pNC996 (Desulfobacter postgatei tmpB), pNC998 (Desulfobacula balticum tmpB), pNC1000 (Desulfobacula toluolica tmpB), pNC1002 (Marinobacter hydrocarbonclasticus tmpB), and pNC1006 (Thiohalospira halophila tmpB).
  • plasmids were transformed into NS20 by standard heat shock protocol.
  • Single cells of the resulting transformations were selected and further grown in 96-well shaking plates in YPD supplemented with 50 µg/mL nourseothricin for 2 days at 30° C.
  • plasmids were transformed into strain NS 1009.
  • Resulting transformed strains were grown in 96-well shaking plates in standard nitrogen limited media for 4 days at 30° C.
  • cell pellets were isolated by centrifugation and freeze dried for fatty acid analysis by gas chromatography as performed for E. coli samples. Total fatty acids were measured and the total amount of C16 and C18 fatty acids containing the methylene intermediates were quantified.
  • TmpB protein sequences encoded by the tmpB genes from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, and Marinobacter aquaeolei were aligned with the cyclopropane fatty acid synthase (Cfa) enzyme from Escherichia coli with the CLUSTAL OMEGA software program (European Molecular Biology Laboratory, EMBL).
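A multiple alignment such as the one described above is typically read column by column to find positions that are identical in every sequence. The following is a minimal sketch of that reading, assuming the alignment is already available as equal-length gapped strings. The function name `conserved_columns` and the toy sequences are hypothetical and are not the tmpB or tmpA sequences; the reported numbers are alignment columns, not residue positions in any one protein.

```python
from __future__ import annotations


def conserved_columns(aligned: dict[str, str]) -> list[tuple[int, str]]:
    """Return (1-based alignment column, residue) for columns that hold
    the same residue in every sequence. Columns containing a gap are skipped."""
    seqs = list(aligned.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for i in range(length):
        column = {s[i] for s in seqs}  # distinct characters in this column
        if len(column) == 1 and "-" not in column:
            result.append((i + 1, seqs[0][i]))
    return result


# Toy alignment (made-up sequences, for illustration only).
toy = {
    "seq_a": "MKL-GWP",
    "seq_b": "MRLAGWP",
    "seq_c": "MKLAGFP",
}
print(conserved_columns(toy))
```

Mapping a conserved column back to a numbered residue in a particular protein requires counting the non-gap characters of that protein up to the column, which this sketch does not do.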


Abstract

Disclosed herein are cells, nucleic acids, and proteins that can be used to produce branched (methyl)lipids, such as 10-methylstearic acids, and compositions that include such lipids. Cells disclosed herein comprise methyltransferase and/or reductase genes from bacteria of the class Gammaproteobacteria, which encode enzymes capable of catalyzing the production of branched (methyl)lipids from unbranched, unsaturated lipids. Saturated branched (methyl)lipids produced using embodiments of the present invention have favorable low-temperature fluidity and favorable oxidative stability, which are desirable properties for lubricants and specialty fluids.

Description

HETEROLOGOUS PRODUCTION OF 10-METHYLSTEARIC ACID BY CELLS EXPRESSING RECOMBINANT METHYLTRANSFERASE
DESCRIPTION
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of priority to U.S. Provisional Patent Application Serial No. 62/561,136, filed September 20, 2017, hereby incorporated by reference in its entirety.
[0002] This application is related to U.S. Serial No. 15/710,734 and PCT/US17/52491, both filed September 20, 2017.
BACKGROUND OF THE INVENTION
A. Field of the Invention
[0003] The invention generally concerns production of branched (methyl)lipids by cells expressing recombinant methyltransferases and/or reductases derived from Gammaproteobacteria.
B. Description of Related Art
[0004] Fatty acids derived from agricultural plant and animal oils find use as industrial lubricants, hydraulic fluids, greases, and other specialty fluids in addition to oleochemical feedstocks for processing. The physical and chemical properties of these fatty acids result in large part from their carbon chain length and number of unsaturated double bonds. Fatty acids are typically 16:0 (sixteen carbons, zero double bonds), 16:1 (sixteen carbons, one double bond), 18:0, 18:1, 18:2, or 18:3. Importantly, fatty acids with no double bonds (saturated) have high oxidative stability, but they solidify at low temperature. Double bonds improve low-temperature fluidity, but decrease oxidative stability. This trade-off poses challenges for lubricant and other specialty-fluid formulations because consistent long-term performance (high oxidative stability) over a wide range of operating temperatures is desirable. High 18:1 (oleic) fatty acid oils provide low-temperature fluidity with relatively good oxidative stability. Accordingly, several commercial products, such as high oleic soybean oil, high oleic sunflower oil, and high oleic algal oil, have been developed with high oleic compositions. Oleic acid is an alkene, however, and subject to oxidative degradation.
[0005] A superior alternative is the addition of a fully saturated methyl branch to the fatty acid chain. This creates a similar melting-temperature depression as a double bond, but with no decrease in oxidative stability versus fully saturated linear fatty acids. Methyl branches located near the middle of the fatty acid chain have the largest melting-temperature depression. Several chemical processes have been explored to introduce methyl branches; however, the preferred industrial method results in random placement of the methyl branch and creates a substantial amount of by-product. There remains a need for efficient and economical processes of producing branched (methyl)lipids.
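The carbons:double-bonds shorthand used in the background above (16:0, 18:1, and so on) can be read mechanically. The following is a minimal sketch, assuming the shorthand is supplied as a plain string; the names `FattyAcid` and `parse_shorthand` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class FattyAcid(NamedTuple):
    carbons: int        # carbon chain length
    double_bonds: int   # number of carbon-carbon double bonds

    @property
    def saturated(self) -> bool:
        # A saturated fatty acid has no double bonds.
        return self.double_bonds == 0


def parse_shorthand(code: str) -> FattyAcid:
    """Parse shorthand such as '18:1' (stray spaces are tolerated)."""
    carbons, double_bonds = code.replace(" ", "").split(":")
    return FattyAcid(int(carbons), int(double_bonds))


for code in ("16:0", "16:1", "18:0", "18:1", "18:2", "18:3"):
    fa = parse_shorthand(code)
    print(code, "saturated" if fa.saturated else "unsaturated")
```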
SUMMARY OF THE INVENTION
[0006] Disclosed herein are cells, nucleic acids, and proteins that can be used to produce branched (methyl)lipids, such as 10-methylstearic acids, and compositions that include such lipids. Saturated branched (methyl)lipids produced using embodiments of the present invention have favorable low-temperature fluidity and favorable oxidative stability, which are desirable properties for lubricants and specialty fluids.
[0007] Various aspects relate to nucleic acids comprising a recombinant tmpB gene encoding a methyltransferase protein and/or a recombinant tmpA gene encoding a reductase protein. The methyltransferase protein and/or reductase protein may be proteins expressed by species of the class Gammaproteobacteria (phylum, Proteobacteria), and the recombinant tmpB gene and/or recombinant tmpA gene may be codon-optimized for expression in a different phylum of bacteria or in eukaryotes (e.g., yeast, such as Arxula adeninivorans (also known as Blastobotrys adeninivorans or Trichosporon adeninivorans), Saccharomyces cerevisiae, or Yarrowia lipolytica). The recombinant tmpB gene or recombinant tmpA gene may be operably linked to a promoter capable of driving expression in a phylum of bacteria other than Gammaproteobacteria or in eukaryotes (e.g., yeast). The nucleic acid may be a plasmid or a chromosome.
[0008] Some aspects relate to a cell comprising a nucleic acid as described herein. The cell may comprise a branched (methyl)lipid, such as 10-methylstearic acid, and/or an exomethylene-substituted lipid, such as 10-methylenestearic acid. The cell may be a eukaryotic cell, such as an algae cell, yeast cell, or plant cell.
[0009] Some aspects relate to a composition produced by cultivating a cell culture comprising cells as described herein. The oil composition may comprise a branched (methyl)lipid, such as 10-methylstearic acid, and/or an exomethylene-substituted lipid, such as 10-methylenestearic acid. In some embodiments, the oil composition is produced by cultivating a cell culture and recovering the oil composition from the cell culture, wherein the oil composition comprises 10-methyl fatty acids, and wherein the 10-methyl fatty acids comprise at least about 1% by weight of the total fatty acids in the oil composition. In some embodiments, the 10-methyl fatty acids comprise at least about 15% by weight of the total fatty acids in the oil composition.
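The weight-percent thresholds stated above can be illustrated with a short calculation. This is a sketch only: the function name `ten_methyl_weight_percent`, the profile labels, and the masses are hypothetical values invented for illustration and are not measurements from this disclosure.

```python
from __future__ import annotations


def ten_methyl_weight_percent(profile_mg: dict[str, float],
                              ten_methyl_names: set[str]) -> float:
    """Weight percent of the named 10-methyl fatty acids relative to
    the total mass of all fatty acids in the profile."""
    total = sum(profile_mg.values())
    if total <= 0:
        raise ValueError("total fatty acid mass must be positive")
    branched = sum(mass for name, mass in profile_mg.items()
                   if name in ten_methyl_names)
    return 100.0 * branched / total


# Hypothetical fatty acid profile, in mg (illustrative numbers only).
profile = {
    "16:0": 20.0,
    "18:0": 10.0,
    "18:1": 50.0,
    "10-methyl 18:0": 15.0,
    "10-methylene 18:0": 5.0,  # intermediate, not counted as 10-methyl
}
pct = ten_methyl_weight_percent(profile, {"10-methyl 18:0"})
print(f"{pct:.1f}% 10-methyl fatty acids; meets 1%: {pct >= 1}; meets 15%: {pct >= 15}")
```

Passing the set of names explicitly keeps 10-methylene intermediates from being counted as 10-methyl products by accident.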
[0010] Some aspects relate to a method of producing an oil composition, the method comprising: cultivating a cell culture comprising any of the cells disclosed herein; and recovering the oil composition from the cell culture. In some embodiments, the method further comprises contacting the cell culture with a substrate comprising a fatty acid from 14 to 18 carbons long with a double bond in the Δ9, Δ10, or Δ11 position. In some embodiments, recovering the oil composition from the cell culture comprises recovering lipids that have been secreted by the cell. In some embodiments, producing the oil composition comprises performing chemical reactions or causing chemical reactions to be performed in which oleic acid and methionine substrates are converted to 10-methylenestearic acid, wherein the chemical reactions are catalyzed by a tmpB protein. In some embodiments, producing the oil composition comprises performing chemical reactions or causing chemical reactions to be performed in which 10-methylene stearic acid is reduced to 10-methylstearic acid, wherein the chemical reactions are catalyzed by tmpA protein. In some embodiments, the reduction is performed using NADPH, ferredoxin, flavodoxin, rubredoxin, cytochrome c, or combinations thereof as reducing agents. In any of the methods disclosed herein that involve reduction reactions any one of, or any combination of, NADPH, ferredoxin, flavodoxin, rubredoxin, and cytochrome c may be used.
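The substrate criterion in the paragraph above (14 to 18 carbons with a double bond at the Δ9, Δ10, or Δ11 position) can be written as a simple predicate. The following is a minimal sketch; the class `UnsaturatedFattyAcid` and the function `is_candidate_substrate` are hypothetical names, and the check reflects only the stated chain-length and position ranges, not enzyme activity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class UnsaturatedFattyAcid:
    name: str
    carbons: int
    # Delta positions of double bonds, counted from the carboxyl carbon.
    double_bond_positions: tuple[int, ...] = ()


def is_candidate_substrate(fa: UnsaturatedFattyAcid) -> bool:
    """True if the chain is 14-18 carbons long and at least one double
    bond sits at the delta-9, delta-10, or delta-11 position."""
    in_length_range = 14 <= fa.carbons <= 18
    has_position = any(p in (9, 10, 11) for p in fa.double_bond_positions)
    return in_length_range and has_position


examples = [
    UnsaturatedFattyAcid("oleic acid", 18, (9,)),
    UnsaturatedFattyAcid("vaccenic acid", 18, (11,)),
    UnsaturatedFattyAcid("palmitoleic acid", 16, (9,)),
    UnsaturatedFattyAcid("stearic acid", 18, ()),
]
for fa in examples:
    print(fa.name, is_candidate_substrate(fa))
```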
[0011] Other objects, features and advantages of the present invention will become apparent from the following figures, detailed description, and examples. It should be understood, however, that the figures, detailed description, and examples, while indicating specific embodiments of the invention, are given by way of illustration only and are not meant to be limiting. Additionally, it is contemplated that changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] Figure 1 depicts one possible mechanism for the conversion of oleic acid to 10-methylstearic acid. An oleic acid substrate may be present as an acyl chain of a glycerolipid or phospholipid. A methionine substrate, which donates the methyl group, may be present as S-adenosyl methionine. The oleic acid and methionine substrates may be converted to 10-methylenestearic acid (e.g., present as an acyl chain of a glycerolipid or phospholipid) and homocysteine (e.g., present as S-adenosyl homocysteine). This reaction may be catalyzed by a tmpB protein as described herein, infra. 10-methylenestearic acid (e.g., present as an acyl chain of a glycerolipid or phospholipid) may be reduced to 10-methylstearic acid. The reduction may be catalyzed by a tmpA protein as described herein, infra, for example, without limitation, using NADPH as a reducing agent. Other examples of the reducing agent may include, without limitation, ferredoxin, flavodoxin, rubredoxin, cytochrome c, or combinations thereof. The language of the specification and claims, however, is not limited to any particular reaction mechanism.
[0013] Figure 2 shows the occurrence of cyclopropane fatty acyl phospholipid synthase (cfa) homologs and 10-methylpalmitic acid (10Me16) in certain Gammaproteobacteria with sequenced genomes and observed lipid profiles.
[0014] Figures 3A-3B depict maps of the following vectors, which encode a tmp operon: pNC1071 (SEQ ID NO:39), which includes a Desulfobacter postgatei tmp operon; pNC1072 (SEQ ID NO:40), which includes a Desulfobacula balticum tmp operon; pNC1073 (SEQ ID NO:41), which includes a Desulfobacula toluolica tmp operon; pNC1074 (SEQ ID NO:42), which includes a Marinobacter hydrocarbonclasticus tmp operon; and pNC1076 (SEQ ID NO:43), which includes a Thiohalospira halophila tmp operon.
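The net atom bookkeeping of the two steps described for Figure 1 can be checked with a short sketch. It assumes the free-acid molecular formulas (oleic acid as C18H34O2), treats the methylation as a net gain of CH2 and the reduction as a net gain of H2, and uses rounded standard atomic masses. The helper names `hill` and `molar_mass` are hypothetical, and the sketch says nothing about the actual enzymatic mechanism or about acyl chains bound in glycerolipids or phospholipids.

```python
from collections import Counter

# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

OLEIC_ACID = Counter(C=18, H=34, O=2)
NET_METHYLENE = Counter(C=1, H=2)   # net change in the methyltransferase step
NET_HYDROGEN = Counter(H=2)         # net change in the reductase step

# Step 1 (methylation): oleic acid -> 10-methylenestearic acid.
methylenestearic = OLEIC_ACID + NET_METHYLENE
# Step 2 (reduction): 10-methylenestearic acid -> 10-methylstearic acid.
methylstearic = methylenestearic + NET_HYDROGEN


def hill(formula: Counter) -> str:
    """Format a C/H/O formula as a string such as 'C19H38O2'."""
    return "".join(f"{el}{formula[el]}" for el in ("C", "H", "O") if formula[el])


def molar_mass(formula: Counter) -> float:
    """Molar mass in g/mol from the rounded atomic masses above."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())


for label, f in (("oleic acid", OLEIC_ACID),
                 ("10-methylenestearic acid", methylenestearic),
                 ("10-methylstearic acid", methylstearic)):
    print(f"{label}: {hill(f)}, {molar_mass(f):.2f} g/mol")
```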
[0015] Figure 4 is a graph showing the percentage of 10-methylene fatty acids in Saccharomyces cerevisiae transformed with plasmids expressing tmpB from the indicated species: D. postgatei (D.po.), D. balticum (D.ba.), D. toluolica (D.to.), M. hydrocarbonclasticus (M.hy.) and T. halophila (T.ha.), or an empty vector control (-).
[0016] Figure 5 is a graph showing the percentage of 10-methylene fatty acids in Yarrowia lipolytica transformed with plasmids expressing tmpB from the indicated species: D. postgatei (D.po.), D. balticum (D.ba.), D. toluolica (D.to.), M. hydrocarbonclasticus (M.hy.) and T. halophila (T.ha.), or an empty vector control (-).
[0017] Figure 6 shows the fatty acid profile of E. coli Top10 cells with plasmids pNC1071, pNC1072, pNC1073, pNC1074, pNC1076, and pNC53 (empty control vector) grown in LB medium. Percentage values show the weight percent of the indicated fatty acid as a percentage of all fatty acids. 14:0 = myristic acid, 16:0 = palmitic acid, 16:1Δ9 = palmitoleic acid, 16:0cyc = 17Δ, cis-9,10-methylenehexadecanoic acid, 10-methylene 16:0 = 10-methylene hexadecenoic acid, 18:1Δ11 = vaccenic acid, 18:0 = stearic acid, SD = standard deviation.
[0018] Figures 7A-7D show a CLUSTAL OMEGA alignment of tmpB protein sequences encoded by the tmpB genes from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, and Marinobacter aquaeolei, along with the cyclopropane fatty acid synthase (Cfa) enzyme from Escherichia coli.
[0019] Figures 8A-8D show a CLUSTAL OMEGA alignment of tmpA protein sequences encoded by the tmpA genes from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, and Marinobacter aquaeolei, along with the Archaeoglobus fulgidus geranylgeranyl reductase protein AF0464.
DETAILED DESCRIPTION OF THE INVENTION
A. Definitions
[0020] The articles "a" and "an" are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element.
[0021] The term "biologically-active portion" refers to an amino acid sequence that is less than a full-length amino acid sequence, but exhibits at least one activity of the full-length sequence. For example, a biologically-active portion of a methyltransferase may refer to one or more domains of tmpB having biological activity for converting oleic acid (e.g., a phospholipid comprising an ester of oleate) and methionine (e.g., S-adenosyl methionine) into 10-methylenestearic acid (e.g., a phospholipid comprising an ester of 10-methylenestearate). A biologically-active portion of a reductase may refer to one or more domains of tmpA having biological activity for converting 10-methylenestearic acid (e.g., a phospholipid comprising an ester of 10-methylenestearate) and a reducing agent (e.g., ferredoxin, flavodoxin, rubredoxin, cytochrome c, NADH, NADPH, FAD, FADH2, FMNH2) into 10-methylstearic acid (e.g., a phospholipid comprising an ester of 10-methylstearate). Biologically-active portions of a protein include peptides or polypeptides comprising amino acid sequences sufficiently identical to or derived from the amino acid sequence of the protein, e.g., the amino acid sequence set forth in SEQ ID NO:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, or 36, that include fewer amino acids than the full-length protein, and exhibit at least one activity of the protein, especially methyltransferase or reductase activity.
A biologically-active portion of a protein may comprise, comprise at least, or comprise at most, for example, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291 , 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 
437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, or more amino acids, or any range derivable therein. Typically, biologically-active portions comprise a domain or motif having a catalytic activity, such as catalytic activity for producing 10-methylenestearic acid or 10-methylstearic acid. A biologically-active portion of a protein includes portions of the protein that have the same activity as the full-length peptide and every portion that has more activity than background. For example, a biologically-active portion of an enzyme may have, have at least, or have at most 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 10%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 100%, 100.1%, 100.2%, 100.3%, 100.4%, 100.5%, 100.6%, 100.7%, 100.8%, 100.9%, 101%, 105%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 160%, 170%, 180%, 190%, 200%, 220%, 240%, 260%, 280%, 300%, 320%, 340%, 360%, 380%, 400% or higher activity relative to the full-length enzyme (or any range derivable therein). A biologically-active portion of a protein may include portions of a protein that lack a domain that targets the protein to a cellular compartment.
[0022] The terms "codon optimized" and "codon-optimized for the cell" refer to coding nucleotide sequences (e.g., genes) that have been altered to substitute at least one codon that is relatively rare in a desired host cell with a synonymous codon that is relatively prevalent in the host cell. Codon optimization thereby allows for better utilization of the tRNA of a host cell by matching the codons of a recombinant gene with the tRNA of the host cell. For example, the codon usage of the species of Gammaproteobacteria (prokaryotes) varies from the codon usage of yeast (eukaryotes). The translation efficiency in a yeast host cell of an mRNA encoding a Gammaproteobacteria protein may be increased by substituting the codons of the corresponding Gammaproteobacteria gene with codons that are more prevalent in the particular species of yeast. A codon optimized gene thereby has a nucleotide sequence that varies from a naturally-occurring gene.
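The codon substitution described above can be sketched as follows. This is a minimal illustration, assuming a simple "replace every codon with one preferred synonymous codon" strategy. The preferred-codon table is a hypothetical, heavily truncated stand-in for a real host codon-usage table, and the input sequence is invented. Practical codon optimization usually weighs additional factors that this sketch ignores.

```python
# Truncated standard genetic code: only the codons used in this example.
CODON_TO_AA = {
    "ATG": "M",
    "CGG": "R", "AGA": "R",
    "CTG": "L", "TTG": "L",
    "GCG": "A", "GCT": "A",
    "TAA": "*",
}

# Hypothetical host preferences (one assumed prevalent codon per amino acid).
PREFERRED_CODON = {"M": "ATG", "R": "AGA", "L": "TTG", "A": "GCT", "*": "TAA"}


def split_codons(cds: str) -> list:
    """Split a coding sequence into upper-case triplets."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3].upper() for i in range(0, len(cds), 3)]


def translate(cds: str) -> str:
    """Translate using the truncated table above."""
    return "".join(CODON_TO_AA[c] for c in split_codons(cds))


def codon_optimize(cds: str) -> str:
    """Replace each codon with the assumed preferred synonymous codon."""
    return "".join(PREFERRED_CODON[CODON_TO_AA[c]] for c in split_codons(cds))


original = "ATGCGGCTGGCGTAA"          # invented example sequence
optimized = codon_optimize(original)
print(original, "->", optimized)
# The nucleotide sequence changes while the encoded protein does not.
assert translate(original) == translate(optimized)
```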
[0023] The term "constitutive promoter" refers to a promoter that mediates the transcription of an operably linked gene independent of a particular stimulus (e.g., independent of the presence of a reagent such as isopropyl β-D-1-thiogalactopyranoside).
[0024] The term "DGAT1" refers to a gene that encodes a type 1 diacylglycerol acyltransferase protein, such as a gene that encodes a yeast DGA2 protein.
[0025] The term "DGAT2" refers to a gene that encodes a type 2 diacylglycerol acyltransferase protein, such as a gene that encodes a yeast DGA1 protein.
[0026] "Diacylglyceride," "diacylglycerol," and "diglyceride," are esters comprised of glycerol and two fatty acids.
[0027] The terms "diacylglycerol acyltransferase" and "DGA" refer to any protein that catalyzes the formation of triacylglycerides from diacylglycerol. Diacylglycerol acyltransferases include type 1 diacylglycerol acyltransferases (DGA2), type 2 diacylglycerol acyltransferases (DGA1), and type 3 diacylglycerol acyltransferases (DGA3) and all homologs that catalyze the above-mentioned reaction.
[0028] The terms "diacylglycerol acyltransferase, type 1" and "type 1 diacylglycerol acyltransferases" refer to DGA2 and DGA2 orthologs.
[0029] The terms "diacylglycerol acyltransferase, type 2" and "type 2 diacylglycerol acyltransferases" refer to DGA1 and DGA1 orthologs.
[0030] The term "domain" refers to a part of the amino acid sequence of a protein that is able to fold into a stable three-dimensional structure independent of the rest of the protein.
[0031] The term "drug" refers to any molecule that inhibits cell growth or proliferation, thereby providing a selective advantage to cells that contain a gene that confers resistance to the drug. Drugs include antibiotics, antimicrobials, toxins, and pesticides.
[0032] "Dry weight" and "dry cell weight" mean weight determined in the relative absence of water. For example, reference to oleaginous cells as comprising a specified percentage of a particular component by dry cell weight means that the percentage is calculated based on the weight of the cell after substantially all water has been removed. The term "% dry weight," when referring to a specific fatty acid (e.g., oleic acid or 10-methylstearic acid), includes fatty acids that are present as carboxylates, esters, thioesters, and amides. For example, a cell that comprises 10-methylstearic acid as a percentage of total fatty acids by % dry cell weight includes 10-methylstearic acid, 10-methylstearate, the 10-methylstearate portion of a diacylglycerol comprising a 10-methylstearate ester, the 10-methylstearate portion of a triacylglycerol comprising a 10-methylstearate ester, the 10-methylstearate portion of a phospholipid comprising a 10-methylstearate ester, and the 10-methylstearate portion of 10-methylstearate CoA. The term "% dry weight," when referring to a specific type of fatty acid (e.g., C16 fatty acids, C18 fatty acids), includes fatty acids that are present as carboxylates, esters, thioesters, and amides as described above (e.g., for 10-methylstearic acid).
[0033] The term "gene," as used herein, may encompass genomic sequences that contain exons, particularly polynucleotide sequences encoding polypeptide sequences involved in a specific activity. The term further encompasses synthetic nucleic acids that did not derive from genomic sequence. In certain embodiments, the genes lack introns, as they are synthesized based on the known DNA sequence of cDNA and protein sequence. In other embodiments, the genes are synthesized, non-native cDNA wherein the codons have been optimized for expression in Y. lipolytica or A. adeninivorans based on codon usage. The term can further include nucleic acid molecules comprising upstream, downstream, and/or intron nucleotide sequences.
[0034] The term "inducible promoter" refers to a promoter that mediates the transcription of an operably linked gene in response to a particular stimulus.
[0035] The term "integrated" refers to a nucleic acid that is maintained in a cell as an insertion into the cell's genome, such as insertion into a chromosome, including insertions into a plastid genome.
[0036] "In operable linkage" and "operably linked" refer to a functional linkage between two nucleic acid sequences, such as a control sequence (typically a promoter) and the linked sequence (typically a sequence that encodes a protein, also called a coding sequence). A promoter is in operable linkage with a gene or is operably linked to a gene if it can mediate transcription of the gene.
[0037] The term "nucleic acid" refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function. The following are non- limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. A polynucleotide may be further modified, such as by conjugation with a labeling component. In all nucleic acid sequences provided herein, U nucleotides are interchangeable with T nucleotides.
[0038] The term "phospholipid" refers to esters comprising glycerol, two fatty acids, and a phosphate. The phosphate may be covalently linked to carbon-3 of the glycerol and comprise no further substitution, i.e., the phospholipid may be a phosphatidic acid. The phosphate may be substituted with ethanolamine (e.g., phosphatidylethanolamine), choline (e.g., phosphatidylcholine), serine (e.g., phosphatidylserine), inositol (e.g., phosphatidylinositol), inositol phosphate (e.g., phosphatidylinositol-3-phosphate, phosphatidylinositol-4-phosphate, phosphatidylinositol-5-phosphate), inositol bisphosphate (e.g., phosphatidylinositol-4,5-bisphosphate), or inositol trisphosphate (e.g., phosphatidylinositol-3,4,5-trisphosphate).
[0039] As used herein, the term "plasmid" refers to a circular DNA molecule that is physically separate from an organism's genomic DNA. Plasmids may be linearized before being introduced into a host cell (referred to herein as a linearized plasmid). Linearized plasmids may not be self-replicating, but may integrate into and be replicated with the genomic DNA of an organism.
[0040] A "promoter" is a nucleic acid control sequence that directs the transcription of a nucleic acid. As used herein, a promoter includes the necessary nucleic acid sequences near the start site of transcription.
[0041] The term "protein" refers to molecules that comprise an amino acid sequence, wherein the amino acids are linked by peptide bonds.
[0042] "Transformation" refers to the transfer of a nucleic acid into a host organism or into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid are referred to as "recombinant," "transgenic," or "transformed" organisms. Thus, nucleic acids of the present invention can be incorporated into recombinant constructs, typically DNA constructs, capable of introduction into and replication in a host cell. Such a construct can be a vector that includes a replication system and sequences that are capable of transcription and translation of a polypeptide-encoding sequence in a given host cell. Typically, expression vectors include, for example, one or more cloned genes under the transcriptional control of 5' and 3' regulatory sequences and a selectable marker. Such vectors also can contain a promoter regulatory region (e.g., a regulatory region controlling inducible or constitutive, environmentally- or developmentally-regulated, or location-specific expression), a transcription initiation start site, a ribosome binding site, a transcription termination site, and/or a polyadenylation signal.
[0043] The term "transformed cell" refers to a cell that has undergone a transformation. Thus, a transformed cell comprises the parent's genome and an inheritable genetic modification.
[0044] The terms "triacylglyceride," "triacylglycerol," "triglyceride," and "TAG" refer to esters composed of glycerol and three fatty acids.
[0045] The term "recombinant gene" refers to a gene that (1) is operatively linked to a polynucleotide to which it is not linked in nature or (2) has a nucleotide sequence different from the naturally-occurring nucleotide sequence, such as, for example, a non-naturally occurring mutation, a codon-optimized sequence, or a cDNA that lacks naturally-occurring introns that are found at the gene's genomic locus. The term "recombinant" can be used in reference to cloned DNA isolates, chemically synthesized polynucleotide analogs, or polynucleotide analogs that are biologically synthesized by heterologous systems, as well as proteins and/or mRNAs encoded by such nucleic acids. Thus, for example, a protein synthesized by a microorganism is recombinant, if it is synthesized from an mRNA that is synthesized from a recombinant gene present in the cell. As other examples, a gene may be a recombinant gene if it is operably linked to a promoter different from the promoter to which it is operably linked in nature or if it is connected to another gene or portion thereof and, together with the other gene or portion thereof, encodes a protein that is not found in nature, such as a fusion protein or an epitope-tagged protein.
B. Microbe Engineering
1. Overview
[0046] Genes and gene products may be introduced into microbial host cells. Suitable host cells for expression of the genes and nucleic acid molecules are microbial hosts that can be found broadly within the fungal or bacterial families. Examples of suitable host strains include but are not limited to fungal or yeast species, such as Arxula, Aspergillus, Aurantiochytrium, Candida, Claviceps, Cryptococcus, Cunninghamella, Hansenula, Kluyveromyces, Leucosporidiella, Lipomyces, Mortierella, Ogataea, Pichia, Prototheca, Rhizopus, Rhodosporidium, Rhodotorula, Saccharomyces, Schizosaccharomyces, Tremella, Trichosporon, Yarrowia, or bacterial species, such as members of proteobacteria and actinomycetes, as well as the genera Acinetobacter, Arthrobacter, Brevibacterium, Acidovorax, Bacillus, Clostridia, Streptomyces, Escherichia, Salmonella, Pseudomonas, and Corynebacterium. Yarrowia lipolytica and Arxula adeninivorans are suited for use as a host microorganism because they can accumulate a large percentage of their weight as triacylglycerols.
[0047] Microbial expression systems and expression vectors containing regulatory sequences that direct high-level expression of foreign proteins are known to those skilled in the art. Any of these could be used to construct chimeric genes to produce any one of the gene products of the instant sequences. These chimeric genes could then be introduced into appropriate microorganisms via transformation techniques to provide high-level expression of the enzymes.
[0048] For example, a gene encoding an enzyme can be cloned in a suitable plasmid, and an aforementioned starting parent strain as a host can be transformed with the resulting plasmid. This approach can increase the copy number of each of the genes encoding the enzymes and, as a result, the activities of the enzymes can be increased. The plasmid is not particularly limited so long as it renders a desired genetic modification inheritable to the microorganism's progeny.
[0049] Vectors or cassettes useful for the transformation of suitable host cells are well known. Typically the vector or cassette contains sequences that direct the transcription and translation of the relevant gene, a selectable marker, and sequences that allow autonomous replication or chromosomal integration. Suitable vectors comprise a region 5' of the gene harboring transcriptional initiation controls and a region 3' of the DNA fragment which controls transcriptional termination. In certain embodiments both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host.
[0050] Promoters, cDNAs, and 3' UTRs, as well as other elements of the vectors, can be generated through cloning techniques using fragments isolated from native sources (see, e.g., Green & Sambrook, Molecular Cloning: A Laboratory Manual, (4th ed., 2012); U.S. Patent No. 4,683,202 (incorporated by reference)). Alternatively, elements can be generated synthetically using known methods (see, e.g., Gene 164:49-53 (1995)).
2. Homologous Recombination
[0051] Homologous recombination is the ability of complementary DNA sequences to align and exchange regions of homology. Transgenic DNA ("donor") containing sequences homologous to the genomic sequences being targeted ("template") is introduced into the organism and then undergoes recombination into the genome at the site of the corresponding homologous genomic sequences.
[0052] The ability to carry out homologous recombination in a host organism has many practical implications for what can be carried out at the molecular genetic level and is useful in the generation of a microbe that can produce a desired product. By its nature homologous recombination is a precise gene targeting event and, hence, most transgenic lines generated with the same targeting sequence will be essentially identical in terms of phenotype, necessitating the screening of far fewer transformation events. Homologous recombination also targets gene insertion events into the host chromosome, potentially resulting in excellent genetic stability, even in the absence of genetic selection. Because different chromosomal loci will likely impact gene expression, even from exogenous promoters/UTRs, homologous recombination can be a method of querying loci in an unfamiliar genome environment and of assessing the impact of these environments on gene expression.
[0053] A particularly useful genetic engineering approach using homologous recombination is to co-opt specific host regulatory elements, such as promoters/UTRs, to drive heterologous gene expression in a highly specific fashion.
[0054] Because homologous recombination is a precise gene targeting event, it can be used to precisely modify any nucleotide(s) within a gene or region of interest, so long as sufficient flanking regions have been identified. Therefore, homologous recombination can be used as a means to modify regulatory sequences impacting gene expression of RNA and/or proteins. It can also be used to modify protein coding regions in an effort to modify enzyme activities such as substrate specificity, affinities and Km, thereby effecting a desired change in the metabolism of the host cell. Homologous recombination provides a powerful means to manipulate the host genome resulting in gene targeting, gene conversion, gene deletion, gene duplication, gene inversion, and exchanging gene expression regulatory elements such as promoters, enhancers and 3' UTRs.
[0055] Homologous recombination can be achieved by using targeting constructs containing pieces of endogenous sequences to "target" the gene or region of interest within the endogenous host cell genome. Such targeting sequences can either be located 5' of the gene or region of interest, 3' of the gene/region of interest or even flank the gene/region of interest. Such targeting constructs can be transformed into the host cell either as a supercoiled plasmid DNA with additional vector backbone, a PCR product with no vector backbone, or as a linearized molecule. In some cases, it may be advantageous to first expose the homologous sequences within the transgenic DNA (donor DNA) by cutting the transgenic DNA with a restriction enzyme. This step can increase the recombination efficiency and decrease the occurrence of undesired events. Other methods of increasing recombination efficiency include using PCR to generate transforming transgenic DNA containing linear ends homologous to the genomic sequences being targeted.
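The use of a targeting construct whose ends are homologous to the genomic sequences flanking a region of interest can be sketched in Python as follows. This is an idealized illustration only: the function names `build_donor` and `recombine`, the toy sequences, and the arm length are assumptions made for the example, and the model ignores recombination efficiency, off-target integration, and every other biological detail.

```python
def build_donor(genome: str, start: int, end: int, payload: str, arm_len: int) -> str:
    """Flank a payload with homology arms copied from either side of genome[start:end]."""
    if start - arm_len < 0 or end + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the requested arm length")
    left_arm = genome[start - arm_len:start]   # sequence 5' of the targeted region
    right_arm = genome[end:end + arm_len]      # sequence 3' of the targeted region
    return left_arm + payload + right_arm


def recombine(genome: str, donor: str, arm_len: int) -> str:
    """Idealized double crossover: swap the region between the arms for the donor payload."""
    left_arm, right_arm = donor[:arm_len], donor[-arm_len:]
    i = genome.find(left_arm)
    if i < 0:
        raise ValueError("left homology arm not found in genome")
    j = genome.find(right_arm, i + arm_len)
    if j < 0:
        raise ValueError("right homology arm not found downstream of left arm")
    return genome[:i] + donor + genome[j + arm_len:]
```

In this toy model, a donor built against positions 8 to 16 of a 24-nucleotide genome with 8-nucleotide arms replaces exactly those eight nucleotides with the payload and leaves the flanking sequence unchanged.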
3. Vectors and Vector Components
[0056] Vectors for transforming microorganisms in accordance with the present invention can be prepared by known techniques familiar to those skilled in the art in view of the disclosure herein. A vector typically contains one or more genes, in which each gene codes for the expression of a desired product (the gene product) and is operably linked to one or more control sequences that regulate gene expression or target the gene product to a particular location in the recombinant cell.
a. Control Sequences
[0057] Control sequences are nucleic acids that regulate the expression of a coding sequence or direct a gene product to a particular location in or outside a cell. Control sequences that regulate expression include, for example, promoters that regulate transcription of a coding sequence and terminators that terminate transcription of a coding sequence. Another control sequence is a 3' untranslated sequence located at the end of a coding sequence that encodes a polyadenylation signal. Control sequences that direct gene products to particular locations include those that encode signal peptides, which direct the protein to which they are attached to a particular location inside or outside the cell.
[0058] Thus, an exemplary vector design for expression of a gene in a microbe contains a coding sequence for a desired gene product (for example, a selectable marker, or an enzyme) in operable linkage with a promoter active in yeast. Alternatively, if the vector does not contain a promoter in operable linkage with the coding sequence of interest, the coding sequence can be transformed into the cells such that it becomes operably linked to an endogenous promoter at the point of vector integration. The promoter used to express a gene can be the promoter naturally linked to that gene or a different promoter.
[0059] A promoter can generally be characterized as constitutive or inducible. Constitutive promoters are generally active or function to drive expression at all times (or at certain times in the cell life cycle) at the same level. Inducible promoters, conversely, are active (or rendered inactive) or are significantly up- or down-regulated only in response to a stimulus. Both types of promoters find application in the methods of the invention. Inducible promoters useful in the invention include those that mediate transcription of an operably linked gene in response to a stimulus, such as an exogenously provided small molecule, temperature (heat or cold), lack of nitrogen in culture media, etc. Suitable promoters can activate transcription of an essentially silent gene or upregulate, e.g., substantially, transcription of an operably linked gene that is transcribed at a low level.
[0060] Inclusion of a termination region control sequence is optional, and if employed, then the choice is primarily one of convenience, as the termination region is relatively interchangeable. The termination region may be native to the transcriptional initiation region (the promoter), may be native to the DNA sequence of interest, or may be obtainable from another source (see, e.g., Chen & Orozco, Nucleic Acids Research 16:8411 (1988)).
b. Genes and Codon Optimization
[0061] Typically, a gene includes a promoter, a coding sequence, and termination control sequences. When assembled by recombinant DNA technology, a gene may be termed an expression cassette and may be flanked by restriction sites for convenient insertion into a vector that is used to introduce the recombinant gene into a host cell. The expression cassette can be flanked by DNA sequences from the genome or other nucleic acid target to facilitate stable integration of the expression cassette into the genome by homologous recombination. Alternatively, the vector and its expression cassette may remain unintegrated (e.g., an episome), in which case, the vector typically includes an origin of replication, which is capable of providing for replication of the vector DNA.
[0062] A common gene present on a vector is a gene that codes for a protein, the expression of which allows the recombinant cell containing the protein to be differentiated from cells that do not express the protein. Such a gene, and its corresponding gene product, is called a selectable marker or selection marker. Any of a wide variety of selectable markers can be employed in a transgene construct useful for transforming the organisms of the invention.
[0063] For optimal expression of a recombinant protein, it is beneficial to employ coding sequences that produce mRNA with codons optimally used by the host cell to be transformed. Thus, proper expression of transgenes can require that the codon usage of the transgene matches the specific codon bias of the organism in which the transgene is being expressed. The precise mechanisms underlying this effect are many, but include the proper balancing of available aminoacylated tRNA pools with proteins being synthesized in the cell, coupled with more efficient translation of the transgenic messenger RNA (mRNA) when this need is met. When codon usage in the transgene is not optimized, available tRNA pools are not sufficient to allow for efficient translation of the transgenic mRNA, resulting in ribosomal stalling and termination and possible instability of the transgenic mRNA. Resources for codon optimization of gene sequences are described in Puigbo et al., Nucleic Acids Research 35:W126-31 (2007), and principles underlying codon optimization strategies are described in Angov, Biotechnology Journal 6:650-69 (2011). Public databases providing statistics for codon usage by different organisms are available, including at www.kazusa.or.jp/codon/ and other publicly available databases and resources.
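One simple way to match a coding sequence to a host's codon bias is to encode each amino acid with the codon the host uses most often. The Python sketch below illustrates that single strategy under stated assumptions: the table `HYPOTHETICAL_USAGE` contains invented frequencies for a few residues only (it is not measured Y. lipolytica or A. adeninivorans data), and the function names are hypothetical. Practical codon optimization tools typically weigh further factors that this sketch ignores.

```python
# Invented relative codon frequencies for illustration only; a real table would be
# taken from a codon usage database for the chosen host organism.
HYPOTHETICAL_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAG": 0.80, "AAA": 0.20},
    "L": {"CTG": 0.40, "CTC": 0.30, "TTG": 0.30},
    "*": {"TAA": 0.60, "TAG": 0.40},
}


def most_used_codon(usage: dict, residue: str) -> str:
    """Return the codon with the highest listed frequency for one residue."""
    if residue not in usage:
        raise ValueError(f"no codon usage listed for residue {residue!r}")
    codons = usage[residue]
    return max(codons, key=codons.get)


def optimize_coding_sequence(protein: str, usage: dict = HYPOTHETICAL_USAGE) -> str:
    """Back-translate a protein using the most frequently used codon per residue."""
    return "".join(most_used_codon(usage, aa) for aa in protein.upper())
```

With the invented table above, the peptide `MKL*` is back-translated using ATG, AAG, CTG, and TAA in that order.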
4. Transformation
[0064] Cells can be transformed by any suitable technique including, e.g., biolistics, electroporation, glass bead transformation, and silicon carbide whisker transformation. Any convenient technique for introducing a transgene into a microorganism can be employed in the present invention. Transformation can be achieved by, for example, the method of D. M. Morrison (Methods in Enzymology 68:326 (1979)), the method of increasing the permeability of recipient cells to DNA with calcium chloride (Mandel & Higa, J. Molecular Biology, 53:159 (1970)), or the like.
[0065] Examples of expression of transgenes in oleaginous yeast (e.g., Yarrowia lipolytica) can be found in the literature (Bordes et al., J. Microbiological Methods, 70:493 (2007); Chen et al., Applied Microbiology & Biotechnology 48:232 (1997)). Examples of expression of exogenous genes in bacteria such as E. coli are well known (Green & Sambrook, Molecular Cloning: A Laboratory Manual (4th ed., 2012)).
[0066] Vectors for transformation of microorganisms in accordance with the present invention can be prepared by known techniques familiar to those skilled in the art. In one embodiment, an exemplary vector design for expression of a gene in a microorganism contains a gene encoding an enzyme in operable linkage with a promoter active in the microorganism. Alternatively, if the vector does not contain a promoter in operable linkage with the gene of interest, the gene can be transformed into the cells such that it becomes operably linked to a native promoter at the point of vector integration. The vector can also contain a second gene that encodes a protein. Optionally, one or both gene(s) is/are followed by a 3' untranslated sequence containing a polyadenylation signal. Expression cassettes encoding the two genes can be physically linked in the vector or on separate vectors. Co-transformation of microbes can also be used, in which distinct vector molecules are simultaneously used to transform cells (Protist 155:381-93 (2004)). The transformed cells can be optionally selected based upon the ability to grow in the presence of the antibiotic or other selectable marker under conditions in which cells lacking the resistance cassette would not grow.
C. Exemplary Cells, Nucleic Acids, Compositions, and Methods
1. Transformed Cells
[0067] In some aspects, embodiments of the invention include cells transformed with one or more nucleic acids encoding a methyltransferase and/or reductase protein. In some embodiments, the transformed cell is a prokaryotic cell, such as a bacterial cell. In some embodiments, the cell is a eukaryotic cell, such as a mammalian cell, a yeast cell, a filamentous fungal cell, a protist cell, an algal cell, an avian cell, a plant cell, or an insect cell. In some embodiments, the cell is a yeast. Those with skill in the art will recognize that many forms of filamentous fungi produce yeast-like growth, and the definition of yeast herein encompasses such cells. The cell may be selected from the group consisting of algae, bacteria, molds, fungi, plants, and yeasts. The cell may be a yeast, fungus, or yeast-like algae. The cell may be selected from thraustochytrids (Aurantiochytrium) and achlorophyllic unicellular algae (Prototheca).
[0068] The cell may be selected from the group consisting of Arxula, Aspergillus, Aurantiochytrium, Candida, Claviceps, Cryptococcus, Cunninghamella, Geotrichum, Hansenula, Kluyveromyces, Kodamaea, Leucosporidiella, Lipomyces, Mortierella, Ogataea, Pichia, Prototheca, Rhizopus, Rhodosporidium, Rhodotorula, Saccharomyces, Schizosaccharomyces, Tremella, Trichosporon, Wickerhamomyces, and Yarrowia. It is specifically contemplated that one or more of these cell types may be excluded from embodiments of this invention.
[0069] The cell may be selected from the group consisting of Arxula adeninivorans, Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Aurantiochytrium limacinum, Candida utilis, Claviceps purpurea, Cryptococcus albidus, Cryptococcus curvatus, Cryptococcus ramirezgomezianus, Cryptococcus terreus, Cryptococcus wieringae, Cunninghamella echinulata, Cunninghamella japonica, Geotrichum fermentans, Hansenula polymorpha, Kluyveromyces lactis, Kluyveromyces marxianus, Kodamaea ohmeri, Leucosporidiella creatinivora, Lipomyces lipofer, Lipomyces starkeyi, Lipomyces tetrasporus, Mortierella isabellina, Mortierella alpina, Ogataea polymorpha, Pichia ciferrii, Pichia guilliermondii, Pichia pastoris, Pichia stipitis, Prototheca zopfii, Rhizopus arrhizus, Rhodosporidium babjevae, Rhodosporidium toruloides, Rhodosporidium paludigenum, Rhodotorula glutinis, Rhodotorula mucilaginosa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Tremella encephala, Trichosporon cutaneum, Trichosporon fermentans, Wickerhamomyces ciferrii, and Yarrowia lipolytica. It is specifically contemplated that one or more of these cell types may be excluded from embodiments of this invention.
[0070] In certain embodiments, the transformed cell comprises about, at least about, or at most about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, or more lipid as measured by % dry cell weight, or any range derivable therein. In some embodiments, the transformed cell comprises C18 fatty acids at a concentration of about, at least about, or at most about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, or 95% as a percentage of total C16 and C18 fatty acids in the cell by weight, or any range derivable therein.
[0071] In some embodiments, the transformed cell comprises oleic acid at a concentration of about, at least about, or at most about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90% or higher as a percentage of total C16 and C18 fatty acids in the cell by weight, or any range derivable therein. In some embodiments, the transformed cell comprises 10-methylstearic acid at a concentration of about, at least about, or at most about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, or higher as a percentage of total fatty acids in the cell by weight, or any range derivable therein.
[0072] A cell may be modified to increase its oleate content, which serves as a substrate for 10-methylstearate synthesis. Genetic modifications that increase oleate content are known (see, e.g., PCT Patent Application Publication No. WO16/094520, published June 16, 2016, hereby incorporated by reference in its entirety). For example, a cell may comprise a Δ12 desaturase knockdown or knockout, which favors the accumulation of oleate and disfavors the production of linoleate. A cell may comprise a recombinant Δ9 desaturase gene, which favors the production of oleate and disfavors the accumulation of stearate. The recombinant Δ9 desaturase gene may be, for example, the Δ9 desaturase gene from Y. lipolytica, Arxula adeninivorans, or Puccinia graminis. A cell may comprise a recombinant elongase 1 gene, which favors the production of oleate and disfavors the accumulation of palmitate and palmitoleate. The recombinant elongase 1 gene may be the elongase 1 gene from Y. lipolytica. A cell may comprise a recombinant elongase 2 gene, which favors the production of oleate and disfavors the accumulation of palmitate and palmitoleate. The recombinant elongase 2 gene may be the elongase 2 gene from R. norvegicus.
[0073] A cell may be modified to increase its triacylglycerol content, thereby increasing its 10-methylstearate content. Genetic modifications that increase triacylglycerol content are known (see, e.g., PCT Patent Application Publication No. WO16/094520, published June 16, 2016, hereby incorporated by reference in its entirety). A cell may comprise a recombinant diacylglycerol acyltransferase gene (e.g., DGAT1, DGAT2, or DGAT3), which favors the production of triacylglycerols and disfavors the accumulation of diacylglycerols. The recombinant diacylglycerol acyltransferase gene may be, for example, DGAT2 (encoding protein DGA1) from Y. lipolytica, DGAT1 (encoding protein DGA2) from C. purpurea, or DGAT2 (encoding protein DGA1) from R. toruloides. The cell may comprise a glycerol-3-phosphate acyltransferase gene (SCT1) knockdown or knockout, which may favor the accumulation of triacylglycerols, depending on the cell type. The cell may comprise a recombinant glycerol-3-phosphate acyltransferase gene (SCT1) such as the SCT1 gene from A. adeninivorans, which may favor the accumulation of triacylglycerols. The cell may comprise a triacylglycerol lipase gene (TGL) knockdown or knockout, which may favor the accumulation of triacylglycerols in the cell.
[0074] Various aspects of the invention relate to a transformed cell. The transformed cell may comprise a recombinant methyltransferase gene (e.g., a tmpB gene), a recombinant reductase gene (e.g., a tmpA gene), an exomethylene-substituted lipid, and/or a branched (methyl)lipid. A branched (methyl)lipid may be a carboxylic acid (e.g., 10-methylstearic acid, 10-methylpalmitic acid, 12-methyloleic acid, 13-methyloleic acid, 10-methyl-octadec-12-enoic acid), carboxylate (e.g., 10-methylstearate, 10-methylpalmitate, 12-methyloleate, 13-methyloleate, 10-methyl-octadec-12-enoate), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA, 10-methylpalmityl CoA, 12-methyloleoyl CoA, 13-methyloleoyl CoA, 10-methyl-octadec-12-enoyl CoA), or amide. An exomethylene-substituted lipid may be a carboxylic acid (e.g., 10-methylenestearic acid, 10-methylenepalmitic acid, 12-methyleneoleic acid, 13-methyleneoleic acid, 10-methylene-octadec-12-enoic acid), carboxylate (e.g., 10-methylenestearate, 10-methylenepalmitate, 12-methyleneoleate, 13-methyleneoleate, 10-methylene-octadec-12-enoate), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylenestearyl CoA, 10-methylenepalmityl CoA, 12-methyleneoleoyl CoA, 13-methyleneoleoyl CoA, 10-methylene-octadec-12-enoyl CoA), or amide. It is specifically contemplated that one or more of the above lipids may be excluded from embodiments of this invention. The methyltransferase gene and reductase gene may have the capability of together producing a methylated branch from any fatty acid from 14 to 18 carbons long with a double bond in the Δ9, Δ10, or Δ11 position. The fatty acid may be 14, 15, 16, 17, or 18 carbons, or any range derivable therein.
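The substrate range described above (a chain of 14 to 18 carbons with a double bond at the Δ9, Δ10, or Δ11 position) can be expressed as a simple predicate. The Python sketch below is illustrative only: the class `FattyAcid` and the function `is_candidate_substrate` are hypothetical names, and the check encodes nothing beyond the chain-length and double-bond criteria stated in the text. It makes no prediction about actual enzyme activity on any given fatty acid.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FattyAcid:
    """Minimal description of a fatty acid: chain length and delta positions of double bonds."""
    name: str
    carbons: int
    double_bonds: Tuple[int, ...] = ()


def is_candidate_substrate(fa: FattyAcid) -> bool:
    """True if the chain has 14 to 18 carbons and a double bond at delta-9, -10, or -11."""
    right_length = 14 <= fa.carbons <= 18
    right_position = any(pos in (9, 10, 11) for pos in fa.double_bonds)
    return right_length and right_position


oleic = FattyAcid("oleic acid", 18, (9,))
stearic = FattyAcid("stearic acid", 18)
palmitoleic = FattyAcid("palmitoleic acid", 16, (9,))
```

Under these criteria oleic acid and palmitoleic acid qualify, whereas stearic acid, which has no double bond, does not.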
[0075] "Fatty acids" generally exist in a cell as a phospholipid or triacylglycerol, although they may also exist as a monoacylglycerol or diacylglycerol, for example, as a metabolic intermediate. Free fatty acids also exist in the cell in equilibrium between a relatively abundant carboxylate anion and a relatively scarce, neutrally-charged acid. A fatty acid may exist in a cell as a thioester, especially as a thioester with coenzyme A (CoA), during biosynthesis or oxidation. A fatty acid may exist in a cell as an amide, for example, when covalently bound to a protein to anchor the protein to a membrane.
[0076] A cell may comprise any one of the nucleic acids described herein, infra (see, e.g., Section B, below). A cell may comprise multiple copies of any one of the nucleic acids described herein. This can be accomplished by, for example, including a tmpA and/or tmpB gene on a high-copy-number plasmid that is transformed into a cell.
[0077] A branched (methyl)lipid may comprise a saturated branched aliphatic chain (e.g., 10-methylstearic acid, 10-methylpalmitic acid) or an unsaturated branched aliphatic chain (e.g., 12-methyloleic acid, 13-methyloleic acid, 10-methyl-octadec-12-enoic acid). The branched (methyl)lipid may comprise a saturated or unsaturated branched aliphatic chain comprising a branching methyl group.
[0078] An exomethylene-substituted lipid may comprise a branched aliphatic chain (e.g., 10-methylenestearic acid, 10-methylenepalmitic acid, 12-methyleneoleic acid, 13-methyleneoleic acid, 10-methylene-octadec-12-enoic acid). The aliphatic chain may be branched because the aliphatic chain is substituted with an exomethylene group.
[0079] A branched (methyl)lipid may be 10-methylstearate, or an acid (10-methylstearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA), or amide (e.g., 10-methylstearyl amide) thereof. For example, the branched (methyl)lipid may be a diacylglycerol, triacylglycerol, or phospholipid, and the diacylglycerol, triacylglycerol, or phospholipid may comprise an ester of 10-methylstearate.
[0080] An exomethylene-substituted lipid may be 10-methylenestearate, or an acid (10-methylenestearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylenestearyl CoA), or amide (e.g., 10-methylenestearyl amide) thereof. For example, the exomethylene-substituted lipid may be a diacylglycerol, triacylglycerol, or phospholipid, and the diacylglycerol, triacylglycerol, or phospholipid may comprise an ester of 10-methylenestearate.
[0081] In some embodiments, about, at least about, or at most about 1% of the fatty acids of the cell may be 10-methylstearic acid by weight. About, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11 %, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of the fatty acids of the cell may be 10-methylstearic acid, or any range derivable therein.
[0082] In some embodiments, about, at least about, or at most about 1% of the fatty acids of the cell may be 10-methylenestearic acid by weight. About, at leat about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11 %, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of the fatty acids of the cell may be 10-methylenestearic acid, or any range derivable therein. [0083] In some embodiments, about, at least about, or at most about 1% by weight of the fatty acids of the cell may be one or more of the branched (methyl)lipids described herein. About, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of the fatty acids of the cell may be one or more of the branched (methyl)lipids described herein, or any range derivable therein.
[0084] In some embodiments, about, at least about, or at most about 1% by weight of the fatty acids of the cell may be one or more of the branched (methyl)lipids described herein. About, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of the fatty acids of the cell may be one or more of the branched (methyl)lipids described herein, or any range derivable therein.
[0085] In some embodiments, the cell may comprise about, at least about, or at most about 1% 10-methylstearic acid as measured by % dry cell weight. The cell may comprise about, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, or 50% 10-methylstearic acid as measured by % dry cell weight, or any range derivable therein.
[0086] In some embodiments, the cell may comprise about, at least about, or at most about 1% 10-methylenestearic acid as measured by % dry cell weight. The cell may comprise about, at least about, or at most about 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, or 50% 10-methylenestearic acid as measured by % dry cell weight, or any range derivable therein.
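The two measurement bases used in the preceding paragraphs (percent of total fatty acids by weight, and percent of dry cell weight) differ only in the denominator. The following is a minimal Python sketch of that arithmetic; the function name and the sample masses are hypothetical illustrations and are not data from this specification.

```python
def weight_percent(part_mass_mg: float, total_mass_mg: float) -> float:
    """Return part_mass as a percentage of total_mass (same units for both)."""
    if total_mass_mg <= 0:
        raise ValueError("total mass must be positive")
    if not 0 <= part_mass_mg <= total_mass_mg:
        raise ValueError("part mass must lie between 0 and the total mass")
    return 100.0 * part_mass_mg / total_mass_mg


# Hypothetical sample (assumed values): 100 mg of dry cells containing
# 40 mg of total fatty acids, of which 10 mg is 10-methylstearic acid.
dry_cell_mg = 100.0
total_fatty_acid_mg = 40.0
methylstearic_mg = 10.0

pct_of_fatty_acids = weight_percent(methylstearic_mg, total_fatty_acid_mg)
pct_of_dry_cell_weight = weight_percent(methylstearic_mg, dry_cell_mg)
```

In this hypothetical sample, the same 10 mg of product corresponds to 25% of the fatty acids by weight but only 10% of the dry cell weight.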
[0087] An unmodified cell of the same type (e.g., species) as a cell of the invention may not comprise 10-methylstearate, or an acid (10-methylstearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA), or amide (e.g., 10-methylstearyl amide) thereof (e.g., wherein the unmodified cell does not comprise a recombinant methyltransferase gene or a recombinant reductase gene). An unmodified cell of the same type (e.g., species) as a cell of the invention may not comprise 10-methylenestearate, or an acid (10-methylenestearic acid), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylenestearyl CoA), or amide (e.g., 10-methylenestearyl amide) thereof (e.g., wherein the unmodified cell does not comprise a recombinant methyltransferase gene or a recombinant reductase gene). In some embodiments, an unmodified cell of the same species as the cell does not comprise a branched (methyl)lipid and/or an exomethylene-substituted lipid. In some embodiments, an unmodified cell of the same species as the cell does not comprise one or more of the branched (methyl)lipids or exomethylene-substituted lipids described herein.
[0088] In some embodiments, a cell may constitutively express the protein encoded by a recombinant methyltransferase gene and/or reductase gene. A cell may constitutively express a methyltransferase protein and/or reductase protein.
2. Nucleic Acids
a. General
[0089] Various aspects of the invention relate to a nucleic acid comprising a recombinant methyltransferase gene, a recombinant reductase gene, or both. The nucleic acid may be, for example, a plasmid. In some embodiments, a recombinant methyltransferase gene and/or a recombinant reductase gene is integrated into the genome of a cell, and thus, the nucleic acid may be a chromosome. In some embodiments, the invention relates to a cell comprising a recombinant methyltransferase gene, e.g., wherein the recombinant methyltransferase gene is present in a plasmid or chromosome. In some embodiments, the invention relates to a cell comprising a recombinant reductase gene, e.g., wherein the recombinant reductase gene is present in a plasmid or chromosome. A recombinant methyltransferase gene and a recombinant reductase gene may be present in a cell in the same nucleic acid (e.g., same plasmid or chromosome) or in different nucleic acids (e.g., different plasmids or chromosomes).
[0090] A nucleic acid may be inheritable to the progeny of a transformed cell. A gene such as a recombinant methyltransferase gene or recombinant reductase gene may be inheritable because it resides on a plasmid or chromosome. In certain embodiments, a gene may be inheritable because it is integrated into the genome of the transformed cell.
[0091] A gene may comprise conservative substitutions, deletions, and/or insertions while still encoding a protein that has activity. For example, codons may be optimized for a particular host cell, different codons may be substituted for convenience, such as to introduce a restriction site or to create optimal PCR primers, or codons may be substituted for another purpose. Similarly, the nucleotide sequence may be altered to create conservative amino acid substitutions, deletions, and/or insertions.
[0092] Proteins may comprise conservative substitutions, deletions, and/or insertions while still maintaining activity. Conservative substitution tables are well known in the art (Creighton, Proteins (2d. ed., 1992)).
[0093] Amino acid substitutions, deletions and/or insertions may readily be made using recombinant DNA manipulation techniques. Methods for the manipulation of DNA sequences to produce substitution, insertion or deletion variants of a protein are well known in the art. These methods include M13 mutagenesis, T7-Gen in vitro mutagenesis (USB, Cleveland, OH), Quick Change Site Directed mutagenesis (Stratagene, San Diego, CA), PCR-mediated site-directed mutagenesis, and other site-directed mutagenesis protocols.
[0094] A "coding sequence" or "coding region" refers to a nucleic acid molecule having sequence information necessary to produce a protein product, such as an amino acid or polypeptide, when the sequence is expressed. The coding sequence may comprise and/or consist of untranslated sequences (including introns or 5' or 3' untranslated regions) within translated regions, or may lack such intervening untranslated sequences (e.g., as in cDNA).
[0095] The abbreviations used throughout the specification to refer to nucleic acids comprising and/or consisting of nucleotide sequences are the conventional one-letter abbreviations. Thus, when included in a nucleic acid, the naturally occurring encoding nucleotides are abbreviated as follows: adenine (A), guanine (G), cytosine (C), thymine (T) and uracil (U). Also, unless otherwise specified, the nucleic acid sequences presented herein are in the 5'→3' direction.
b. Nucleic acids comprising a recombinant methyltransferase gene
[0096] A methyltransferase gene (e.g., a recombinant methyltransferase gene) encodes a methyltransferase protein, which is an enzyme capable of transferring a carbon atom and one or more protons bound thereto from a substrate such as S-adenosyl methionine to a fatty acid such as oleic acid (e.g., wherein the fatty acid is present as a free fatty acid, carboxylate, phospholipid, diacylglycerol, or triacylglycerol). The methyltransferase gene (e.g., a recombinant methyltransferase gene) may have a coding region that is identical to one from a bacterium of the class Gammaproteobacteria. The methyltransferase gene may comprise any one of the nucleotide sequences set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, and SEQ ID NO:17. The methyltransferase gene (e.g., a recombinant methyltransferase gene) may be a 10-methylstearic B gene (tmpB) as described herein, or a biologically-active portion thereof (i.e., wherein the biologically-active portion thereof comprises methyltransferase activity).
[0097] The methyltransferase gene (e.g., a recombinant methyltransferase gene) may be derived from a species of Gammaproteobacteria, such as bacteria from the genera Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, or Halofilum. The methyltransferase gene (e.g., a recombinant methyltransferase gene) may be selected from the group consisting of Desulfobacula balticum gene tmpB (SEQ ID NO:1), Marinobacter hydrocarbonclasticus gene tmpB (SEQ ID NO:3), Thiohalospira halophila gene tmpB (SEQ ID NO:5), Desulfobacter curvatus gene tmpB (SEQ ID NO:7), Desulfobacter phenolica gene tmpB (SEQ ID NO:9), Desulfobacula toluolica gene tmpB (SEQ ID NO:11), Desulfobacter postgatei gene tmpB (SEQ ID NO:13), Halofilum ochraceum gene tmpB (SEQ ID NO:15), and Marinobacter aquaeolei gene tmpB (SEQ ID NO:17). It is specifically contemplated that one or more of the above methyltransferase genes may be excluded from embodiments of this invention.
[0098] A recombinant methyltransferase gene may be recombinant because it is operably linked to a promoter other than the naturally-occurring promoter of the methyltransferase gene. Such genes may be useful to drive transcription in a particular species of cell. A recombinant methyltransferase gene may be recombinant because it contains one or more nucleotide substitutions relative to a naturally-occurring methyltransferase gene. Such genes may be useful to increase the translation efficiency of the methyltransferase gene's mRNA transcript in a particular species of cell.
[0099] A nucleic acid may comprise a recombinant methyltransferase gene and a promoter, wherein the recombinant methyltransferase gene and promoter are operably linked. The recombinant methyltransferase gene and promoter may be derived from different species. For example, the recombinant methyltransferase gene may encode the methyltransferase protein of a species of Gammaproteobacteria, and the recombinant methyltransferase gene may be operably-linked to a promoter that can drive transcription in another type of bacteria or a eukaryote (e.g., an algae cell, yeast cell, or plant cell). The promoter may be a eukaryotic promoter. A cell may comprise the nucleic acid, and the promoter may be capable of driving transcription in the cell. A cell may comprise a recombinant methyltransferase gene, and the recombinant methyltransferase gene may be operably linked to a promoter capable of driving transcription of the recombinant methyltransferase gene in the cell. The cell may be a species of yeast, and the promoter may be a yeast promoter. The cell may be a species of bacteria, and the promoter may be a bacterial promoter (e.g., wherein the bacterial promoter is not a promoter from a Gammaproteobacterium). The cell may be a species of algae, and the promoter may be an algae promoter. The cell may be a species of plant, and the promoter may be a plant promoter.
[00100] A recombinant methyltransferase gene may be operably linked to a promoter that cannot drive transcription in the cell from which the recombinant methyltransferase gene originated. For example, the promoter may not be capable of binding an RNA polymerase of the cell from which a recombinant methyltransferase gene originated. In some embodiments, the promoter cannot bind a prokaryotic RNA polymerase and/or initiate transcription mediated by a prokaryotic RNA polymerase. In some embodiments, a recombinant methyltransferase gene is operably-linked to a promoter that cannot drive transcription in the cell from which the protein encoded by the gene originated. For example, the promoter may not be capable of binding an RNA polymerase of a cell that naturally expresses the methyltransferase enzyme encoded by a recombinant methyltransferase gene.
[00101] A promoter may be an inducible promoter or a constitutive promoter. A promoter may be any one of the promoters described in PCT Patent Application Publication No. WO 2016/014900, published January 28, 2016 (hereby incorporated by reference in its entirety). WO 2016/014900 describes various promoters derived from yeast species Yarrowia lipolytica and Arxula adeninivorans, which may be particularly useful as promoters for driving the transcription of a recombinant gene in a yeast cell. A promoter may be a promoter from a gene encoding a Translation Elongation factor EF-1α; Glycerol-3-phosphate dehydrogenase; Triosephosphate isomerase 1; Fructose-1,6-bisphosphate aldolase; Phosphoglycerate mutase; Pyruvate kinase; Export protein EXP1; Ribosomal protein S7; Alcohol dehydrogenase; Phosphoglycerate kinase; Hexose Transporter; General amino acid permease; Serine protease; Isocitrate lyase; Acyl-CoA oxidase; ATP-sulfurylase; Hexokinase; 3-phosphoglycerate dehydrogenase; Pyruvate Dehydrogenase Alpha subunit; Pyruvate Dehydrogenase Beta subunit; Aconitase; Enolase; Actin; Multidrug resistance protein (ABC-transporter); Ubiquitin; GTPase; Plasma membrane Na+/Pi cotransporter; Pyruvate decarboxylase; Phytase; or Alpha-amylase, e.g., wherein the gene is a yeast gene, such as a gene from Yarrowia lipolytica or Arxula adeninivorans.
[00102] A recombinant methyltransferase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17.
A recombinant methyltransferase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous base pairs of the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17. A recombinant methyltransferase gene may or may not have 100% sequence identity with any one of the nucleotide sequences set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17. A recombinant methyltransferase gene may or may not have 100% sequence identity with 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous base pairs starting at nucleotide position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134,
135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153,
154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172,
173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191,
192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210,
211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229,
230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248,
249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267,
268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286,
287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305,
306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324,
325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343,
344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362,
363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381,
382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400,
401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419,
420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438,
439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457,
458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476,
477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495,
496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514,
515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533,
534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552,
553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571,
572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590,
591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609,
610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628,
629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647,
648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666,
667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685,
686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704,
705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723,
724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742,
743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 
1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1170, 1171, 1172, 1173, 1174, 1175, 1176, 1177, 1178, 1179, 1180, 1181, 1182, 1183, 1184, 1185, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193, 1194, 1195, 1196, 1197, 1198, 1199, or 1200 of the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17. A recombinant methyltransferase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17, and the recombinant methyltransferase gene may encode a methyltransferase protein with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18. For example, a gene that is codon-optimized for expression in yeast may have about 70% sequence identity with SEQ ID NO:1, while the protein encoded by such a codon-optimized gene may have 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2.
Thus, even though a codon-optimized gene may have only about 70% sequence identity or less to the original gene, the codon-optimized gene encodes the same amino acid sequence as the original gene.
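The relationship described above, in which nucleotide identity falls while the encoded amino acid sequence is unchanged, can be illustrated with a short Python sketch. The two 15-nucleotide sequences below are toy sequences invented for illustration and are not taken from SEQ ID NO:1 or any other sequence of this specification. Identity is computed here as a simple ungapped, position-by-position comparison of equal-length sequences, which is an assumption made for brevity; identity between real genes would ordinarily be determined from a sequence alignment.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Toy sequences invented for illustration (hypothetical, not a real gene).
native = "ATGCTGAGCAAACGT"
recoded = "ATGTTATCTAAGAGA"  # synonymous codons substituted where possible

nucleotide_identity = percent_identity(native, recoded)
protein_identity = percent_identity(translate(native), translate(recoded))
```

In this toy case both sequences translate to the same five residues, so the protein identity is 100% even though fewer than half of the nucleotide positions match.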
[00103] A recombinant methyltransferase gene may vary from a naturally-occurring methyltransferase gene because the recombinant methyltransferase gene may be codon-optimized for expression in a eukaryotic cell, such as a plant cell, algae cell, or yeast cell. A cell may comprise a recombinant methyltransferase gene, wherein the recombinant methyltransferase gene is codon-optimized for the cell.
[00104] Exactly, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 
434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, or 500 codons of a recombinant methyltransferase gene may vary from a naturally-occurring methyltransferase gene or may be unchanged from a naturally-occurring methyltransferase gene. For example, a recombinant methyltransferase gene may comprise a nucleotide sequence with at least about 65% sequence identity with the naturally-occurring nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, or SEQ ID NO:17 (e.g., at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity), and at least 5 codons of the nucleotide sequence of the recombinant methyltransferase gene may vary from the naturally-occurring nucleotide sequence (e.g., at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 codons).
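One way to count how many codons of a recombinant gene vary from a naturally-occurring gene is to compare the two coding sequences codon by codon. The Python sketch below assumes both sequences are in frame and of equal length (no insertions or deletions); the function name and the toy sequences are hypothetical and are not sequences of this specification.

```python
def count_codon_differences(recombinant: str, native: str) -> int:
    """Count in-frame codon positions at which two coding sequences differ."""
    if len(recombinant) != len(native) or len(native) % 3 != 0:
        raise ValueError("sequences must be equal-length and in frame")
    return sum(
        recombinant[i:i + 3] != native[i:i + 3]
        for i in range(0, len(native), 3)
    )


# Toy five-codon sequences invented for illustration (hypothetical).
native_cds = "ATGCTGAGCAAACGT"
recombinant_cds = "ATGTTATCTAAGAGA"

changed_codons = count_codon_differences(recombinant_cds, native_cds)
```

Here the start codon is unchanged and the remaining four codons differ, so four of the five codons vary.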
[00105] A methyltransferase gene encodes a methyltransferase protein. A methyltransferase protein may be a protein expressed by a species of Gammaproteobacteria, such as bacteria from the genera Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, or Halofilum. A recombinant methyltransferase gene may encode a naturally-occurring methyltransferase protein even if the recombinant methyltransferase gene is not a naturally-occurring methyltransferase gene. For example, a recombinant methyltransferase gene may vary from a naturally-occurring methyltransferase gene because the recombinant methyltransferase gene is codon-optimized for expression in a specific cell. The codon-optimized, recombinant methyltransferase gene and the naturally-occurring methyltransferase gene may nevertheless encode the same naturally-occurring methyltransferase protein.
[00106] A recombinant methyltransferase gene may encode a methyltransferase protein selected from the group consisting of Desulfobacula balticum protein tmpB (SEQ ID NO:2), Marinobacter hydrocarbonclasticus protein tmpB (SEQ ID NO:4), Thiohalospira halophila protein tmpB (SEQ ID NO:6), Desulfobacter curvatus protein tmpB (SEQ ID NO:8), Desulfobacter phenolica protein tmpB (SEQ ID NO:10), Desulfobacula toluolica protein tmpB (SEQ ID NO:12), Desulfobacter postgatei protein tmpB (SEQ ID NO:14), Halofilum ochraceum protein tmpB (SEQ ID NO:16), and Marinobacter aquaeolei protein tmpB (SEQ ID NO:18). It is specifically contemplated that one or more of the above methyltransferase proteins may be excluded from embodiments of this invention. A recombinant methyltransferase gene may encode a methyltransferase protein, and the methyltransferase protein may be substantially identical to any one of the foregoing enzymes, but the recombinant methyltransferase gene may vary from the naturally-occurring gene that encodes the enzyme. The recombinant methyltransferase gene may vary from the naturally-occurring gene because the recombinant methyltransferase gene may be codon-optimized for expression in a specific phylum, class, order, family, genus, species, or strain of cell.
[00107] The sequences of naturally-occurring methyltransferase proteins are set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, and SEQ ID NO:18. A recombinant methyltransferase gene may or may not encode a protein comprising 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18. For example, a recombinant methyltransferase gene may encode a protein having 100% sequence identity with a biologically-active portion of an amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18.
[00108] A recombinant methyltransferase gene may encode a methyltransferase protein having, having at least, or having at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18, or a biologically-active portion thereof. A recombinant methyltransferase gene may encode a methyltransferase protein having about, at least about, or at most about 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 10%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 100%, 100.1%, 100.2%, 100.3%, 100.4%, 100.5%, 100.6%, 100.7%, 100.8%, 100.9%, 101%, 105%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 160%, 170%, 180%, 190%, 200%, 220%, 240%, 260%, 280%, 300%, 320%, 340%, 360%, 380%, or 400% methyltransferase activity relative to a protein comprising the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18.
A recombinant methyltransferase gene may encode a protein having at least 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 contiguous amino acids starting at position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 
309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, or 500 of the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, or SEQ ID NO: 18.
[00109] Substrates for the methyltransferase protein may include any fatty acid from 14 to 18 carbons long with an unsaturated double bond in the Δ9, Δ10, or Δ11 position. The substrate may have a chain that is 14, 15, 16, 17, or 18 carbons long, or any range derivable therein. The methyltransferase protein may be capable of catalyzing the formation of a methylene substitution at the Δ9, Δ10, or Δ11 position of such a substrate.
[00110] In some embodiments, the recombinant methyltransferase gene encodes a methyltransferase protein that has specific amino acids unchanged from the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, or SEQ ID NO:18. The unchanged amino acids can include 1, 2, 3, 4, 5, 6, 7, 8, or 9 amino acids selected from Y163, T175, R199, E211, G269, Y271, N313, N319, and W389 of Marinobacter hydrocarbonclasticus tmpB or corresponding amino acids in tmpB from Desulfobacula balticum, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, or Marinobacter aquaeolei, according to the alignment set forth in Figures 7A-D.
c. Nucleic acids comprising a recombinant reductase gene
[00111] A reductase gene (e.g., a recombinant reductase gene) encodes a reductase protein, which is an enzyme capable of reducing a double bond of a fatty acid (e.g., wherein the fatty acid is present as a free fatty acid, carboxylate, phospholipid, diacylglycerol, or triacylglycerol). The reductase gene (e.g., a recombinant reductase gene) may have a coding region that is identical to one from a bacterium of the class Gammaproteobacteria. The reductase gene may comprise any one of the nucleotide sequences set forth in SEQ ID NO: 19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, and SEQ ID NO:35. The reductase gene (e.g., a recombinant reductase gene) may be a 10-methylstearic A gene (tmpA) as described herein, or a biologically-active portion thereof (i.e., wherein the biologically-active portion thereof comprises reductase activity).
[00112] The reductase gene (e.g., a recombinant reductase gene) may be derived from a species of Gammaproteobacteria, such as bacteria from the genera Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, or Halofilum. The reductase gene (e.g., a recombinant reductase gene) may be selected from the group consisting of Desulfobacula balticum gene tmpA (SEQ ID NO:19), Marinobacter hydrocarbonclasticus gene tmpA (SEQ ID NO:21), Thiohalospira halophila gene tmpA (SEQ ID NO:23), Desulfobacter curvatus gene tmpA (SEQ ID NO:25), Desulfobacter phenolica gene tmpA (SEQ ID NO:27), Desulfobacula toluolica gene tmpA (SEQ ID NO:29), Desulfobacter postgatei gene tmpA (SEQ ID NO:31), Halofilum ochraceum gene tmpA (SEQ ID NO:33), and Marinobacter aquaeolei gene tmpA (SEQ ID NO:35). It is specifically contemplated that one or more of the above reductase genes may be excluded from embodiments of this invention.
[00113] A recombinant reductase gene may be recombinant because it is operably linked to a promoter other than the naturally-occurring promoter of the reductase gene. Such genes may be useful to drive transcription in a particular species of cell. A recombinant reductase gene may be recombinant because it contains one or more nucleotide substitutions relative to a naturally-occurring reductase gene. Such genes may be useful to increase the translation efficiency of the reductase gene's mRNA transcript in a particular species of cell.
[00114] A nucleic acid may comprise a recombinant reductase gene and a promoter, wherein the recombinant reductase gene and promoter are operably linked. The recombinant reductase gene and promoter may be derived from different species. For example, the recombinant reductase gene may encode the reductase protein of a species of Gammaproteobacteria, and the recombinant reductase gene may be operably-linked to a promoter that can drive transcription in another type of bacteria or a eukaryote (e.g., an algae cell, yeast cell, or plant cell). The promoter may be a eukaryotic promoter. A cell may comprise the nucleic acid, and the promoter may be capable of driving transcription in the cell. A cell may comprise a recombinant reductase gene, and the recombinant reductase gene may be operably linked to a promoter capable of driving transcription of the recombinant reductase gene in the cell. The cell may be a species of yeast, and the promoter may be a yeast promoter. The cell may be a species of bacteria, and the promoter may be a bacterial promoter (e.g., wherein the bacterial promoter is not a promoter from a Gammaproteobacterium). The cell may be a species of algae, and the promoter may be an algae promoter. The cell may be a species of plant, and the promoter may be a plant promoter.
[00115] A recombinant reductase gene may be operably linked to a promoter that cannot drive transcription in the cell from which the recombinant reductase gene originated. For example, the promoter may not be capable of binding an RNA polymerase of the cell from which a recombinant reductase gene originated. In some embodiments, the promoter cannot bind a prokaryotic RNA polymerase and/or initiate transcription mediated by a prokaryotic RNA polymerase. In some embodiments, a recombinant reductase gene is operably-linked to a promoter that cannot drive transcription in the cell from which the protein encoded by the gene originated. For example, the promoter may not be capable of binding an RNA polymerase of a cell that naturally expresses the reductase enzyme encoded by a recombinant reductase gene.
[00116] A promoter may be an inducible promoter or a constitutive promoter. A promoter may be any one of the promoters described in PCT Patent Application Publication No. WO 2016/014900, published January 28, 2016 (hereby incorporated by reference in its entirety). WO 2016/014900 describes various promoters derived from the yeast species Yarrowia lipolytica and Arxula adeninivorans, which may be particularly useful as promoters for driving the transcription of a recombinant gene in a yeast cell. A promoter may be a promoter from a gene encoding a Translation Elongation factor EF-1α; Glycerol-3-phosphate dehydrogenase; Triosephosphate isomerase 1; Fructose-1,6-bisphosphate aldolase; Phosphoglycerate mutase; Pyruvate kinase; Export protein EXP1; Ribosomal protein S7; Alcohol dehydrogenase; Phosphoglycerate kinase; Hexose Transporter; General amino acid permease; Serine protease; Isocitrate lyase; Acyl-CoA oxidase; ATP-sulfurylase; Hexokinase; 3-phosphoglycerate dehydrogenase; Pyruvate Dehydrogenase Alpha subunit; Pyruvate Dehydrogenase Beta subunit; Aconitase; Enolase; Actin; Multidrug resistance protein (ABC-transporter); Ubiquitin; GTPase; Plasma membrane Na+/Pi cotransporter; Pyruvate decarboxylase; Phytase; or Alpha-amylase, e.g., wherein the gene is a yeast gene, such as a gene from Yarrowia lipolytica or Arxula adeninivorans.
[00117] A recombinant reductase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the nucleotide sequence set forth in SEQ ID NO: 19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35. A recombinant reductase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous base pairs starting at nucleotide position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113,
114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570,
571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980,
981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1170, 1171, 1172, 1173, 1174, 1175, 1176, 1177, 1178, 1179, 1180, 1181, 1182, 1183, 1184, 1185, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193, 1194, 1195, 1196, 1197, 1198, 1199, or 1200 of the nucleotide sequence set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35. A recombinant reductase gene may or may not have 100% sequence identity with any one of the nucleotide sequences set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, and SEQ ID NO:35.
A recombinant reductase gene may or may not have 100% sequence identity with 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or 1300 contiguous base pairs of the nucleotide sequence set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35. A recombinant reductase gene may comprise a nucleotide sequence with, with at least, or with at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the nucleotide sequence set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35, and the recombinant reductase gene may encode a reductase protein with, with at least, or with at most 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36. For example, a gene that is codon-optimized for expression in yeast may have about 70% sequence identity with SEQ ID NO:19, while the protein encoded by such a codon-optimized gene may have 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:20. Thus, even though a codon-optimized gene may have only about 70% sequence identity or less to the original gene, the codon-optimized gene encodes the same amino acid sequence as the original gene.
[00118] A recombinant reductase gene may vary from a naturally-occurring reductase gene because the recombinant reductase gene may be codon-optimized for expression in a eukaryotic cell, such as a plant cell, algae cell, or yeast cell. A cell may comprise a recombinant reductase gene, wherein the recombinant reductase gene is codon-optimized for the cell.
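The relationship between nucleotide-level and protein-level identity in a codon-optimized gene can be illustrated with a minimal sketch. The two short coding sequences below are hypothetical toy sequences and are not taken from any SEQ ID NO; the function names are likewise illustrative, and identity is computed position by position without alignment, which is an assumption that only holds for equal-length, ungapped sequences.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence codon by codon (no stop handling)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical toy sequences: same peptide, different synonymous codons.
native_cds = "ATGCTGAGCCGTGCG"
recoded_cds = "ATGTTATCTAGAGCT"

nucleotide_identity = percent_identity(native_cds, recoded_cds)
protein_identity = percent_identity(translate(native_cds), translate(recoded_cds))
print(f"nucleotide identity: {nucleotide_identity:.1f}%")
print(f"protein identity:    {protein_identity:.1f}%")
```

In this toy case the recoded sequence differs from the native one at more than half of its nucleotide positions while both translate to the same peptide, which mirrors the statement that a codon-optimized gene may share only about 70% nucleotide identity with the original gene yet encode an identical protein.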
[00119] Exactly, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132,
133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151,
152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170,
171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189,
190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208,
209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227,
228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246,
247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265,
266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284,
285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303,
304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322,
323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341,
342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360,
361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379,
380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398,
399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417,
418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436,
437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455,
456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474,
475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493,
494, 495, 496, 497, 498, 499, or 500 codons of a recombinant reductase gene may vary from a naturally-occurring reductase gene or may be unchanged from a naturally-occurring reductase gene. For example, a recombinant reductase gene may comprise a nucleotide sequence with at least about 65% sequence identity with the naturally-occurring nucleotide sequence set forth in SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, or SEQ ID NO:35 (e.g., at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity), and at least 5 codons of the nucleotide sequence of the recombinant reductase gene may vary from the naturally-occurring nucleotide sequence (e.g., at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 codons).
[00120] A reductase gene encodes a reductase protein. A reductase protein may be a protein expressed by a species of Gammaproteobacteria, such as bacteria from the genera Desulfobacter, Desulfobacula, Marinobacter, Thiohalospira, or Halofilum. A recombinant reductase gene may encode a naturally-occurring reductase protein even if the recombinant reductase gene is not a naturally-occurring reductase gene. For example, a recombinant reductase gene may vary from a naturally-occurring reductase gene because the recombinant reductase gene is codon-optimized for expression in a specific cell. The codon-optimized, recombinant reductase gene and the naturally-occurring reductase gene may nevertheless encode the same naturally-occurring reductase protein.
[00121] A recombinant reductase gene may encode a reductase protein selected from the group consisting of Desulfobacula balticum protein tmpA (SEQ ID NO:20), Marinobacter hydrocarbonclasticus protein tmpA (SEQ ID NO:22), Thiohalospira halophila protein tmpA (SEQ ID NO:24), Desulfobacter curvatus protein tmpA (SEQ ID NO:26), Desulfobacter phenolica protein tmpA (SEQ ID NO:28), Desulfobacula toluolica protein tmpA (SEQ ID NO:30), Desulfobacter postgatei protein tmpA (SEQ ID NO:32), Halofilum ochraceum protein tmpA (SEQ ID NO:34), and Marinobacter aquaeolei protein tmpA (SEQ ID NO:36). It is specifically contemplated that one or more of the above reductase proteins may be excluded from embodiments of this invention. A recombinant reductase gene may encode a reductase protein, and the reductase protein may be substantially identical to any one of the foregoing enzymes, but the recombinant reductase gene may vary from the naturally-occurring gene that encodes the enzyme. The recombinant reductase gene may vary from the naturally-occurring gene because the recombinant reductase gene may be codon-optimized for expression in a specific phylum, class, order, family, genus, species, or strain of cell.
[00122] The sequences of naturally-occurring reductase proteins are set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, and SEQ ID NO:36. A recombinant reductase gene may or may not encode a protein comprising 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36. For example, a recombinant reductase gene may encode a protein having 100% sequence identity with a biologically-active portion of an amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36.
[00123] A recombinant reductase gene may encode a reductase protein having, having at least, or having at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36, or a biologically-active portion thereof. A recombinant reductase gene may encode a reductase protein having about, at least about, or at most about 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 10%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 100%, 100.1%, 100.2%, 100.3%, 100.4%, 100.5%, 100.6%, 100.7%, 100.8%, 100.9%, 101%, 105%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 160%, 170%, 180%, 190%, 200%, 220%, 240%, 260%, 280%, 300%, 320%, 340%, 360%, 380%, or 400% reductase activity relative to a protein comprising the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36. 
A recombinant reductase gene may encode a protein having, having at least, or having at most 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, or 500 contiguous amino acids starting at amino acid position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 
303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, or 500 of the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36.
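One way to read "sequence identity with N contiguous amino acids starting at amino acid position P" is sketched below. The reference and candidate sequences are hypothetical toy strings, not any SEQ ID NO, and the helper names are illustrative. The sketch assumes 1-based positions and an ungapped, equal-length comparison; identity in practice is typically determined with an alignment tool, which this sketch does not attempt to reproduce.

```python
def reference_window(reference, start, length):
    """Return `length` contiguous residues of `reference` from 1-based `start`."""
    if start < 1 or length < 1 or start - 1 + length > len(reference):
        raise ValueError("window falls outside the reference sequence")
    return reference[start - 1:start - 1 + length]


def ungapped_identity(segment, window):
    """Percent of positions at which two equal-length sequences agree."""
    if len(segment) != len(window):
        raise ValueError("this sketch only compares equal-length sequences")
    return 100.0 * sum(a == b for a, b in zip(segment, window)) / len(window)


# Hypothetical toy reference; a real use would substitute a listed sequence.
reference = "MKTLLVAGGSGYIGSHTA"
window = reference_window(reference, start=3, length=10)

# Hypothetical candidate segment with one substitution relative to the window.
candidate = "TLLVSGGSGY"
identity = ungapped_identity(candidate, window)
print(window, candidate, f"{identity:.0f}%")
```

Each combination of window length, start position, and identity threshold recited above corresponds to one call of this kind followed by a comparison against the chosen percentage.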
[00124] Substrates for the reductase protein may include any fatty acid from 14 to 18 carbons long with a methylene substitution in the Δ9, Δ10, or Δ11 position. The substrate may be 14, 15, 16, 17, or 18 carbons long, or any range derivable therein. The reductase protein may be capable of catalyzing the reduction of a methylene-substituted fatty acid substrate to a (methyl)lipid. The reductase protein, together with a methyltransferase protein, may be capable of catalyzing the production of a methylated branch from any fatty acid from 14 to 18 carbons long with an unsaturated double bond in the Δ9, Δ10, or Δ11 position, including fatty acids that are 14, 15, 16, 17, or 18 carbons long, or any range derivable therein.
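The substrate criteria stated above (chain length of 14 to 18 carbons and a double bond at the Δ9, Δ10, or Δ11 position) can be written as a simple predicate. This is a minimal sketch: the `FattyAcid` record and the function name are hypothetical, and the model ignores head group, stereochemistry, and every other property that could matter to a real enzyme.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FattyAcid:
    """Hypothetical minimal description of a fatty acid chain."""
    name: str
    carbons: int
    double_bond_positions: Tuple[int, ...]  # delta positions, e.g. (9,)


ALLOWED_CHAIN_LENGTHS = range(14, 19)   # 14 to 18 carbons inclusive
ALLOWED_DELTA_POSITIONS = {9, 10, 11}   # Δ9, Δ10, or Δ11


def meets_stated_substrate_criteria(fatty_acid):
    """True if chain length and double-bond position match the stated ranges."""
    has_allowed_bond = any(
        position in ALLOWED_DELTA_POSITIONS
        for position in fatty_acid.double_bond_positions
    )
    return fatty_acid.carbons in ALLOWED_CHAIN_LENGTHS and has_allowed_bond


oleic = FattyAcid("oleic acid", 18, (9,))
palmitoleic = FattyAcid("palmitoleic acid", 16, (9,))
stearic = FattyAcid("stearic acid", 18, ())
long_chain = FattyAcid("hypothetical 20-carbon delta-11 acid", 20, (11,))

for fa in (oleic, palmitoleic, stearic, long_chain):
    print(fa.name, meets_stated_substrate_criteria(fa))
```

The predicate captures only the ranges recited in the text; whether a given enzyme actually accepts a particular fatty acid is an experimental question that the sketch does not address.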
[00125] In some embodiments, the recombinant reductase gene encodes a reductase protein that has specific amino acids unchanged from the amino acid sequence set forth in SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36. The unchanged amino acids can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 amino acids selected from I8, L22, F37, P38, R39, K41, G45, W46, P49, G144, C148, P149, E169, E171, L197, I212, C249, H250, Y252, I270, G275, L276, E283, A296, and A299 of Marinobacter hydrocarbonclasticus tmpA or corresponding amino acids in tmpA from Desulfobacula balticum, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, or Marinobacter aquaeolei, according to the alignment set forth in Figures 8A-D.
[00126] As used herein, the term "complementary" and derivatives thereof are used in reference to pairing of nucleic acids by the well-known rules that A pairs with T or U and C pairs with G. Complement can be "partial" or "complete". In partial complement, only some of the nucleic acid bases are matched according to the base pairing rules, while in complete or total complement, all the bases are matched according to the pairing rules. The degree of complement between the nucleic acid strands may have significant effects on the efficiency and strength of hybridization between nucleic acid strands, as is well known in the art. The efficiency and strength of said hybridization depends upon the detection method.
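The distinction between partial and complete complement can be sketched as a short calculation using the pairing rules stated above. The sketch assumes both strands are written 5' to 3', have equal length, and pair in antiparallel orientation without gaps; the function names and the example strands are illustrative only.

```python
# Pairing rules as stated in the text: A pairs with T or U, C pairs with G.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def fraction_complementary(strand, other):
    """Fraction of positions that pair when `other` is read antiparallel."""
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    antiparallel = other[::-1].upper()
    paired = sum((a, b) in PAIRS for a, b in zip(strand.upper(), antiparallel))
    return paired / len(strand)


def classify_complement(strand, other):
    """Label the degree of complement as 'complete', 'partial', or 'none'."""
    fraction = fraction_complementary(strand, other)
    if fraction == 1.0:
        return "complete"
    return "partial" if fraction > 0 else "none"


print(classify_complement("ATGC", "GCAT"))  # every base pairs
print(classify_complement("ATGC", "GCAA"))  # one mismatch
```

A fraction of this kind says nothing about hybridization strength under particular conditions, which, as noted above, also depends on the detection method.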
[00127] Any nucleic acid that is referred to herein as having a certain percent sequence identity to a sequence set forth in a SEQ ID NO includes nucleic acids that have the certain percent sequence identity to the complement of the sequence set forth in the SEQ ID NO.
d. Nucleic acids comprising a recombinant methyltransferase gene and a recombinant reductase gene
[00128] A nucleic acid may comprise both a recombinant methyltransferase gene and a recombinant reductase gene. The recombinant methyltransferase gene and the recombinant reductase gene may encode proteins from the same species or from different species.
[00129] A nucleic acid may comprise the nucleotide sequence of an expression vector comprising a tmp operon that includes both a methyltransferase gene and a reductase gene. Such vectors may include pNC1071 (SEQ ID NO:39), which includes a Desulfobacter postgatei tmp operon; pNC1072 (SEQ ID NO:40), which includes a Desulfobacula balticum tmp operon; pNC1073 (SEQ ID NO:41), which includes a Desulfobacula toluolica tmp operon; pNC1074 (SEQ ID NO:42), which includes a Marinobacter hydrocarbonclasticus tmp operon; and pNC1076 (SEQ ID NO:43), which includes a Thiohalospira halophila tmp operon.
[00130] In some embodiments, the nucleic acid encodes a fusion protein that includes both a methyltransferase and a reductase or fragments thereof. In the context of the present invention, "fusion protein" means a single protein molecule containing two or more distinct proteins or fragments thereof, covalently linked via peptide bond in a single peptide chain. In some embodiments, the fusion protein comprises enzymatically active domains from both a methyltransferase protein and a reductase protein. The nucleic acid may further encode a linker peptide between the methyltransferase and the reductase. In some embodiments, the linker peptide comprises the amino acid sequence AGGAEGGNGGGA. The linker may comprise about or at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, or 30 amino acids, or any range derivable therein. The nucleic acid may comprise any of the methyltransferase and reductase genes described herein, and the fusion protein encoded by the nucleic acid can comprise any of the methyltransferase and reductase proteins described herein, including biologically active fragments thereof. In some embodiments, the fusion protein is a tmpA-B protein, in which the tmpA protein is closer to the N-terminus than the tmpB protein.
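The arrangement of a tmpA-B fusion protein with a linker peptide can be sketched at the amino acid level as follows. The linker sequence is the one recited above; the two domain sequences are hypothetical placeholders rather than real tmpA or tmpB sequences, and the function name is illustrative. The sketch simply concatenates the parts and makes no assumption about whether, for example, the initiator methionine of the downstream protein is retained in an actual construct.

```python
# Linker peptide recited in the text.
LINKER = "AGGAEGGNGGGA"


def build_fusion(n_terminal, c_terminal, linker=LINKER):
    """Concatenate two protein sequences with a linker in a single chain."""
    if not n_terminal or not c_terminal:
        raise ValueError("both fusion partners must be non-empty")
    return n_terminal + linker + c_terminal


# Hypothetical placeholder sequences standing in for tmpA and tmpB.
tmpA_placeholder = "MSEQAAA"
tmpB_placeholder = "MSEQBBB"

# tmpA-B orientation: tmpA is closer to the N-terminus than tmpB.
fusion = build_fusion(tmpA_placeholder, tmpB_placeholder)
print(fusion)
print(len(LINKER), "linker residues")
```

Swapping the argument order would give the opposite orientation, and passing a different `linker` string covers the other linker lengths contemplated in the text.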
3. Compositions
[00131] Various aspects of the invention relate to compositions produced by the cells described herein. The composition may be an oil composition comprised of about, at least about, or at most about 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% lipids by weight. The composition may comprise branched (methyl)lipids and/or exomethylene-substituted lipids. The branched (methyl)lipid may be a carboxylic acid (e.g., 10-methylstearic acid, 10-methylpalmitic acid, 12-methyloleic acid, 13-methyloleic acid, 10-methyl-octadec-12-enoic acid), carboxylate (e.g., 10-methylstearate, 10-methylpalmitate, 12-methyloleate, 13-methyloleate, 10-methyl-octadec-12-enoate), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylstearyl CoA, 10-methylpalmityl CoA, 12-methyloleoyl CoA, 13-methyloleoyl CoA, 10-methyl-octadec-12-enoyl CoA), or amide. The exomethylene-substituted lipid may be a carboxylic acid (e.g., 10-methylenestearic acid, 10-methylenepalmitic acid, 12-methyleneoleic acid, 13-methyleneoleic acid, 10-methylene-octadec-12-enoic acid), carboxylate (e.g., 10-methylenestearate, 10-methylenepalmitate, 12-methyleneoleate, 13-methyleneoleate, 10-methylene-octadec-12-enoate), ester (e.g., diacylglycerol, triacylglycerol, phospholipid), thioester (e.g., 10-methylenestearyl CoA, 10-methylenepalmityl CoA, 12-methyleneoleoyl CoA, 13-methyleneoleoyl CoA, 10-methylene-octadec-12-enoyl CoA), or amide. The composition may comprise 10-methyl lipids, 10-methylene lipids, or both. It is specifically contemplated that one or more of the above lipids may be excluded from certain embodiments.
[00132] In some aspects, the composition is produced by cultivating a culture comprising any of the cells described herein and recovering the oil composition from the cell culture. The cells in the culture may contain any of the recombinant methyltransferase genes described herein and/or any of the recombinant reductase genes described herein. The culture medium and conditions can be chosen based on the species of the cell to be cultured and can be optimized to provide for maximal production of the desired lipid profile.
[00133] Various methods are known for recovering an oil composition from a culture of cells. For example, lipids, lipid derivatives, and hydrocarbons can be extracted with a hydrophobic solvent such as hexane. Lipids and lipid derivatives can also be extracted using liquefaction, oil liquefaction, and supercritical CO2 extraction. The recovery process may include harvesting cultured cells, such as by filtration or centrifugation, lysing cells to create a lysate, and extracting the lipid/hydrocarbon components using a hydrophobic solvent.
[00134] In addition to accumulating within cells, the lipids described herein may be secreted by the cells. In that case, a process for recovering the lipid may not require creating a lysate from the cells, but only collecting the secreted lipid from the culture medium. Thus, the compositions described herein may be made by culturing a cell that secretes one of the lipids described herein, such as a linear fatty acid with a chain length of 14-18 carbons with a methyl branch at the Δ9, Δ10, or Δ11 position.
[00135] In some embodiments, the oil composition comprises about, at least about, or at most about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of a branched (methyl)lipid, such as a 10-methyl fatty acid, or any range derivable therein. In some embodiments, 10-methyl fatty acids comprise about, at least about, or at most about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% by weight of the fatty acids in the composition, or any range derivable therein.
[00136] The amount of 10-methyl fatty acids in a cell can be optimized by various methods. For example, increasing the expression of tmpA and/or tmpB can increase the methyltransferase and/or reductase activity within the cell, which may lead to accumulation of greater amounts of branched (methyl)lipids. One way this can be accomplished is by increasing the number of copies of the gene in the cell, such as by including the genes on high-copy-number plasmids. Additionally or alternatively, the tmpA and/or tmpB genes can be operably linked to a promoter that drives high levels of expression.
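The weight percentages recited above can be computed from a measured fatty acid profile. The sketch below is illustrative only: the function name, the fatty acid labels, and the masses are hypothetical, and the calculation assumes that every fatty acid in the composition has been quantified on the same mass basis.

```python
def percent_by_weight(masses: dict[str, float], targets: set[str]) -> float:
    """Return the weight percent of the target species among all species."""
    total = sum(masses.values())
    if total <= 0:
        raise ValueError("total mass must be positive")
    target_mass = sum(m for name, m in masses.items() if name in targets)
    return 100.0 * target_mass / total


# Hypothetical fatty acid masses in mg; not data from the specification.
profile = {
    "16:0": 20.0,
    "18:1": 50.0,
    "10-methyl-16:0": 10.0,
    "10-methyl-18:0": 20.0,
}

pct_10_methyl = percent_by_weight(profile, {"10-methyl-16:0", "10-methyl-18:0"})
print(f"{pct_10_methyl:.1f}% of fatty acids by weight are 10-methyl fatty acids")
```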
4. Methods of producing branched (methyl)lipid
[00137] Various aspects of the invention relate to a method of producing a branched (methyl)lipid. The method may comprise incubating a cell or plurality of cells as described herein, supra, with media. The media may optionally be supplemented with an unbranched, unsaturated fatty acid, such as oleic acid, that serves as a substrate for methylation. The substrate may include one or more fatty acids from 14 to 18 carbons long with a double bond in the Δ9, Δ10, or Δ11 position. The substrate may be 14, 15, 16, 17, or 18 carbons long, or any range derivable therein. The media may optionally be supplemented with methionine or S-adenosyl methionine, which may similarly serve as a substrate. Thus, the method may comprise contacting a cell or plurality of cells with oleic acid (or some other substrate to be methylated), methionine, or both. The method may comprise incubating a cell or plurality of cells as described herein, supra, in a bioreactor. The method may comprise recovering lipids from the cells, such as by extraction with an organic solvent.
[00138] The method may comprise degumming the cell or plurality of cells, e.g., to remove proteins. The method may comprise transesterification or esterification of the lipids of the cells. An alcohol such as methanol or ethanol may be used for transesterification or esterification, e.g., thereby producing a fatty acid methyl ester or fatty acid ethyl ester.
EXAMPLES
[00139] The present invention will be described in greater detail by way of specific examples. The following examples are offered for illustrative purposes only, and are not intended to limit the invention in any manner. Those of skill in the art will readily recognize a variety of noncritical parameters which can be changed or modified to yield essentially the same results.
Example 1
Identification of tmpB and tmpA Genes in Gammaproteobacteria
[00140] Select Gammaproteobacteria are known to produce branched 10-methyl fatty acids. The acetate-oxidizing, sulfate-reducing Desulfobacter bacteria were reported to produce 10-methylhexadecanoic acid at 6% - 24% of total phospholipid-ester linked fatty acid content (Dowling, Microbiology 32:1815-25 (1986)). Other reports of 10-methyl branched fatty acid production exist for bacteria in the genera Marinobacter (Marquez, J. Syst. Evol. Microbiol. 55:1349-51 (2005); Huu, Int. J. Syst. Evol. Microbiol. 49:367-75 (1999); Gauthier, Int. J. Syst. Evol. Microbiol. 42:568-76 (1992)), Thiohalospira (Sorokin, Int. J. Syst. Evol. Microbiol. 58:2890-97 (2008)), Thiohalorhabdus (Sorokin, Int. J. Syst. Evol. Microbiol. 58:2890-97 (2008)), Desulfobacula, and Desulfotignum (Kuever, Int. J. Syst. Evol. Microbiol. 51:171-77 (2001)). However, no genes or enzymes involved in Gammaproteobacteria 10-methyl fatty acid production have been described. This Example describes a pair of genes, distinct in phylogeny and sequence homology, that are present in certain Gammaproteobacteria and direct production of 10-methyl fatty acids in heterologous hosts.
[00141] A list of Gammaproteobacteria that produce 10-methyl fatty acids and have sequenced genomes was compiled from literature reports. Additionally, representative Gammaproteobacteria that are not reported to produce 10-methyl fatty acids were included for comparison. According to a biochemical study on the unrelated bacterium Mycobacterium phlei using unpurified enzyme preparations, the first step of 10-methyl fatty acid synthesis occurs via a mechanism similar to cyclopropane fatty acid synthesis and is followed by an enzymatic reduction step (Akamatsu, J. Biol. Chem. 245:701-08 (1970)). To find gene candidates responsible for 10-methyl fatty acid production, the Gammaproteobacteria genomes were scanned for homologs of E. coli cyclopropane fatty acyl phospholipid synthase (cfa), which is responsible for methylation of unsaturated fatty acids to produce cyclopropane fatty acids (Wang, Biochemistry 31:11020-28 (1992); Taylor, Biochemistry 18:3292-3300 (1979)). This was performed using the NCBI BLAST protein analysis tool and the BioCyc genomic database (Caspi, Nucleic Acids Res. 40:D742-53 (2016)). Next, the cfa homologs were scanned for adjacent genes in an operon structure that had homology to an oxidoreductase or electron transfer function. Interestingly, Gammaproteobacteria able to produce 10-methyl fatty acids all possessed a gene operon (referred to herein as the tmp operon) with a cyclopropane fatty acid synthase gene homolog (referred to herein as tmpB) and a gene with homology to a geranylgeranyl reductase (referred to herein as tmpA). These results are summarized in Figure 2. It is unlikely that tmpA is a true geranylgeranyl reductase, since that enzyme is involved in chlorophyll and tocopherol biosynthesis, and the bacteria produce neither of these compounds.
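The two-step screen described in this Example (find cfa homologs, then keep those with a neighboring oxidoreductase-like gene) can be sketched as a simple adjacency filter. The sketch is not the authors' pipeline: the `Gene` record, the family labels, and the rule that operon partners sit at consecutive positions on the same strand are assumptions made for illustration, and the example genomes are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    genome: str
    contig: str
    index: int    # ordinal position of the gene along the contig
    strand: str   # "+" or "-"
    family: str   # hypothetical label assigned by a prior homology search


def candidate_operons(genes, methylase="cfa_like",
                      partners=("oxidoreductase_like",)):
    """Pair each cfa-like gene with an adjacent, same-strand partner gene."""
    by_position = {(g.genome, g.contig, g.index): g for g in genes}
    hits = []
    for g in genes:
        if g.family != methylase:
            continue
        for offset in (-1, 1):  # look at the immediate neighbors only
            n = by_position.get((g.genome, g.contig, g.index + offset))
            if n is not None and n.strand == g.strand and n.family in partners:
                hits.append((g, n))
    return hits


# Invented annotations for two hypothetical genomes.
genes = [
    Gene("producer", "c1", 10, "+", "oxidoreductase_like"),
    Gene("producer", "c1", 11, "+", "cfa_like"),
    Gene("non_producer", "c1", 5, "+", "cfa_like"),
    Gene("non_producer", "c1", 6, "-", "oxidoreductase_like"),
]

hits = candidate_operons(genes)
print([(m.genome, m.index, p.index) for m, p in hits])
```

In this toy input only the first genome passes, because the neighbor in the second genome lies on the opposite strand.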
Example 2
E. coli Expression of the tmpB and tmpA Gene Products
[00142] To test if the tmp gene operon was responsible for Gammaproteobacteria 10-methyl fatty acid production, the genes were designed in an E. coli expression vector using the DNA manipulation software A Plasmid Editor and synthesized by Thermofisher Scientific - GeneArt. The native codon usage of the tmp genes was not changed. tmpB gene transcription was controlled using the constitutively active tac promoter (de Boer 1983), followed by the E. coli lacZ-lacY intergene linker region, the tmpA gene, and the trpT' gene terminator (Wu 1981). These synthetic gene operons were cloned into an E. coli expression vector containing the AmpR ampicillin resistance gene and the ColE1 origin of replication (Figure 3A-3B). The plasmid vectors are named pNC1071 (SEQ ID NO:39), which includes the Desulfobacter postgatei tmp operon; pNC1072 (SEQ ID NO:40), which includes the Desulfobacula balticum tmp operon; pNC1073 (SEQ ID NO:41), which includes the Desulfobacula toluolica tmp operon; pNC1074 (SEQ ID NO:42), which includes the Marinobacter hydrocarbonclasticus tmp operon; and pNC1076 (SEQ ID NO:43), which includes the Thiohalospira halophila tmp operon.
[00143] Plasmids pNC1071, pNC1072, pNC1073, pNC1074, pNC1076, and the control plasmid pNC53 containing the AmpR gene, ColE1 origin, and tac promoter were transformed into E. coli Top10 (Invitrogen) using a standard electrotransformation protocol utilizing 50 μL suspended cells, 1 μL of plasmid DNA at a concentration of 200 ng per μL, a 1 mm gap electrotransformation cuvette, and a pulse of 1.8 kV, 200 Ω, and 25 μF with exponential decay and a time constant of approximately 4.5 milliseconds. During the protocol cells were kept on ice and the cuvette was pre-chilled before pulsing with a Bio-Rad Gene Pulser Electroporation System. After pulsing, cells were transferred to 1 mL SOC medium and incubated at 37°C for 1 hour before plating on LB agar containing 100 μg per mL ampicillin antibiotic.
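The pulse settings quoted above can be checked with a short calculation. The sketch below, with variable names chosen here for illustration, computes the nominal RC time constant, the nominal field strength across the cuvette gap, and the DNA mass per transformation. The nominal time constant ignores the conductance of the sample itself, which is one plausible reason (an assumption, not stated in the text) that the reported value of approximately 4.5 milliseconds is slightly lower.

```python
# Settings taken from the protocol text.
resistance_ohm = 200.0       # pulse resistance
capacitance_f = 25e-6        # 25 microfarads
voltage_v = 1800.0           # 1.8 kV
gap_cm = 0.1                 # 1 mm cuvette gap
dna_volume_ul = 1.0
dna_conc_ng_per_ul = 200.0

# Nominal exponential-decay time constant: tau = R * C.
nominal_tau_ms = resistance_ohm * capacitance_f * 1e3

# Nominal field strength: voltage divided by gap width.
field_kv_per_cm = voltage_v / 1e3 / gap_cm

# DNA mass added per transformation.
dna_ng = dna_volume_ul * dna_conc_ng_per_ul

print(f"nominal time constant: {nominal_tau_ms:.1f} ms")
print(f"nominal field strength: {field_kv_per_cm:.0f} kV/cm")
print(f"DNA per transformation: {dna_ng:.0f} ng")
```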
[00144] Single colonies from the transformation plates were chosen and grown in 5 mL LB liquid media in 14 mL plastic falcon tubes overnight at 37°C. These were used to prepare freezer vials with 0.75 mL culture broth and 0.75 mL of 50% glycerol/water which were stored at -80°C.
[00145] Fermentation studies were performed in 50 mL LB media with 100 μg per mL ampicillin in 250 mL baffled shake flasks. 10 μL of frozen culture stock was added to the media and the flask was incubated at 37°C and shaken at 200 rpm in a New Brunswick orbital incubator for 24 hours. Cells were harvested by centrifugation at 4000 rpm for 15 minutes in an Eppendorf 5810 R clinical centrifuge, resuspended in 0.5 mL deionized water, and frozen at -80°C.
[00146] Figure 6 shows that E. coli transformed with pNC1071, pNC1073, pNC1074, and pNC1076, but not the empty vector control (pNC53), produced 10-methylene hexadecenoic acid.
[00147] To test the acyl chain substrate range for the tmpB and tmpA enzymes, E. coli transformed with pNC1074 (M. hydrocarbonclasticus tmp operon) or pNC1076 (T. halophila tmp operon) were grown in LB media supplemented with ampicillin and 100 mg/L of one of the fatty acids indicated in Table 1 below. After culturing, cells were harvested by centrifugation, washed with deionized water, resuspended in deionized water, and frozen. Cells were then lyophilized to dryness and used to perform an HCl-methanol catalyzed transesterification reaction to produce fatty acid methyl esters (FAME). These samples were dissolved in isooctane and injected into a gas chromatography system (Agilent Technologies) equipped with a flame ionization detector. Table 1 shows the percentage of each fatty acid that was converted to methylene- and methyl-branched fatty acids.
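The text does not state how percent conversion was calculated from the chromatograms. One plausible definition, shown here purely as an assumption, is the summed peak area of the methylene- and methyl-branched products divided by the summed area of products plus remaining substrate, treating detector response as equal for all three species. The function name and the peak areas below are hypothetical.

```python
def percent_conversion(substrate_area: float,
                       methylene_area: float,
                       methyl_area: float) -> float:
    """Percent of fed fatty acid found as methylene or methyl product.

    Assumes equal detector response for substrate and products and that
    all three peaks derive from the supplemented fatty acid.
    """
    total = substrate_area + methylene_area + methyl_area
    if total <= 0:
        raise ValueError("no peak area to evaluate")
    return 100.0 * (methylene_area + methyl_area) / total


# Invented peak areas (arbitrary units), not measurements from the study.
example = percent_conversion(substrate_area=58.0,
                             methylene_area=30.0,
                             methyl_area=12.0)
print(f"{example:.0f}% conversion")
```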
Table 1. Fatty acid conversion to methylene and methyl branched fatty acids with E. coli expressing the tmpB and tmpA genes from M. hydrocarbonclasticus and T. halophila.
Fatty acid / E. coli + pNC1074 (M. hydrocarbonclasticus tmpBA) percent conversion / E. coli + pNC1076 (T. halophila tmpBA) percent conversion
12:1Δ11 0% 0%
13:1Δ12 0% 0%
14:1Δ9 89% 95%
15:1Δ10 86% 69%
16:1Δ9 55% 95%
17:1Δ10 36% 19%
18:1Δ6 0% 0%
18:1Δ9 42% 47%
18:1Δ11 9% 8%
19:1Δ7 0% 0%
19:1Δ10 0% 0%
20:1Δ5 0% 0%
20:1Δ8 0% 0%
20:1Δ11 0% 0%
22:1Δ13 0% 0%
24:1Δ15 0% 0%
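The rows of Table 1 can be tabulated programmatically to see which chain lengths and double bond positions were accepted. The following sketch copies the table rows as text and parses them; the helper name and the row layout (fatty acid, then the pNC1074 and pNC1076 percentages) are conventions of this sketch.

```python
import re

# Rows of Table 1: fatty acid, % conversion (pNC1074), % conversion (pNC1076).
TABLE_1 = """\
12:1Δ11 0% 0%
13:1Δ12 0% 0%
14:1Δ9 89% 95%
15:1Δ10 86% 69%
16:1Δ9 55% 95%
17:1Δ10 36% 19%
18:1Δ6 0% 0%
18:1Δ9 42% 47%
18:1Δ11 9% 8%
19:1Δ7 0% 0%
19:1Δ10 0% 0%
20:1Δ5 0% 0%
20:1Δ8 0% 0%
20:1Δ11 0% 0%
22:1Δ13 0% 0%
24:1Δ15 0% 0%
"""


def parse_rows(text):
    """Return (carbons, double_bond_position, pct_pNC1074, pct_pNC1076)."""
    rows = []
    for line in text.strip().splitlines():
        m = re.fullmatch(r"(\d+):1Δ(\d+) (\d+)% (\d+)%", line.strip())
        if m is None:
            raise ValueError(f"unrecognized row: {line!r}")
        rows.append(tuple(int(x) for x in m.groups()))
    return rows


rows = parse_rows(TABLE_1)
converted = [r for r in rows if r[2] > 0 or r[3] > 0]
chain_lengths = sorted({r[0] for r in converted})
positions = sorted({r[1] for r in converted})

# Substrates with an accepted double bond position that were not converted.
in_window_but_unconverted = [
    (c, p) for c, p, a, b in rows if p in positions and a == 0 and b == 0
]

print(chain_lengths, positions, in_window_but_unconverted)
```

The last list shows that an accepted double bond position is not sufficient on its own: the 12-, 19-, and 20-carbon substrates were not converted even with a Δ10 or Δ11 double bond.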
As shown in Table 1, methylation occurred on fatty acids with 14, 15, 16, 17, and 18 carbons, and on Δ9, Δ10, and Δ11 double bond positions.
Example 3
tmpB Gene Expression in Yeast
[00148] To test the production of 10-methylene fatty acids by the tmpB genes in the yeasts Saccharomyces cerevisiae and Yarrowia lipolytica, the genes containing native bacterial codons were cloned into a standard Yarrowia overexpression vector. The vector contains a selectable NAT marker and a 2μ origin of replication for high copy maintenance in Saccharomyces cerevisiae. The resulting plasmids are pNC996 (Desulfobacter postgatei tmpB), pNC998 (Desulfobacula balticum tmpB), pNC1000 (Desulfobacula toluolica tmpB), pNC1002 (Marinobacter hydrocarbonclasticus tmpB), and pNC1006 (Thiohalospira halophila tmpB). For Saccharomyces, plasmids were transformed into NS20 by standard heat shock protocol. Single cells of the resulting transformations were selected and further grown in 96-well shaking plates in YPD supplemented with 50 μg/mL nourseothricin for 2 days at 30°C. For Yarrowia, plasmids were transformed into strain NS1009. Resulting transformed strains were grown in 96-well shaking plates in standard nitrogen limited media for 4 days at 30°C. For all yeast experiments, cell pellets were isolated by centrifugation and freeze dried for fatty acid analysis by gas chromatography as performed for E. coli samples. Total fatty acids were measured and the total amount of C16 and C18 fatty acids containing the methylene intermediates was quantified.
[00149] Results: Three tmpB genes produced 10-methylene fatty acids in NS20: those from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, and Thiohalospira halophila (Figure 4). The tmpB genes from Marinobacter hydrocarbonclasticus and Thiohalospira halophila were able to produce 10-methylene fatty acids in Yarrowia lipolytica (Figure 5).
Example 4
tmpB and tmpA sequence analysis
[00150] TmpB protein sequences encoded by the tmpB genes from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, and Marinobacter aquaeolei were aligned with the cyclopropane fatty acid synthase (Cfa) enzyme from Escherichia coli with the CLUSTAL OMEGA software program (European Molecular Biology Laboratory, EMBL). Figures 7A-D show the alignment of these protein sequences and indicate a number of amino acids that are conserved in the TmpB protein sequences but not in the E. coli Cfa sequence. The following amino acids are conserved in the TmpB aligned proteins, but not present in the E. coli Cfa protein: Y163, T175, R199, E211, G269, Y271, N313, N319, W389 (amino acid number based on the M. hydrocarbonclasticus TmpB protein). The percent sequence identity of each of the aligned proteins as compared to M. hydrocarbonclasticus TmpB is indicated below:
% Identity of amino acid sequence
Desulfobacula balticum TmpB 37%
Thiohalospira halophila TmpB 58%
Desulfobacter curvatus TmpB 43%
Desulfobacter phenolica TmpB 39%
Desulfobacula toluolica TmpB 39%
Desulfobacter postgatei TmpB 43%
Halofilum ochraceum TmpB 59%
Marinobacter aquaeolei TmpB 88%
Escherichia coli Cfa 46%
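Percent identity values such as those listed above are derived from aligned sequences, and the result depends on which columns are counted. The sketch below uses one common convention, counting only columns where both sequences have a residue; the specification does not say which convention was used, so this is an assumption, and the two short aligned strings are invented rather than taken from Figures 7A-D.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns without a gap in either row."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Invented aligned fragments ('-' marks a gap).
identity = percent_identity("MKT-AYL", "MRTQAY-")
print(f"{identity:.0f}% identity")
```

Dividing by the full alignment length or by the length of the shorter sequence instead would give different values for the same alignment.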
[00151] TmpA protein sequences encoded by the tmpA genes from Desulfobacula balticum, Marinobacter hydrocarbonclasticus, Thiohalospira halophila, Desulfobacter curvatus, Desulfobacter phenolica, Desulfobacula toluolica, Desulfobacter postgatei, Halofilum ochraceum, and Marinobacter aquaeolei were aligned with the Archaeoglobus fulgidus geranylgeranyl reductase protein AF0464 with the CLUSTAL OMEGA software program (European Molecular Biology Laboratory, EMBL). Figures 8A-D show the alignment of these protein sequences and indicate a number of amino acids that are conserved in the TmpA protein sequences but not in the Archaeoglobus fulgidus geranylgeranyl reductase protein AF0464. The following amino acids are conserved in the TmpA aligned proteins, but not present in the Archaeoglobus fulgidus geranylgeranyl reductase protein AF0464: I8, L22, F37, P38, R39, K41, G45, W46, P49, G144, C148, P149, E169, E171, L197, I212, C249, H250, Y252, I270, G275, L276, E283, A296, A299 (amino acid number based on the M. hydrocarbonclasticus TmpA protein).
% Identity of amino acid sequence
Desulfobacula balticum TmpA 33%
Thiohalospira halophila TmpA 57%
Desulfobacter curvatus TmpA 36%
Desulfobacter phenolica TmpA 34%
Desulfobacula toluolica TmpA 34%
Desulfobacter postgatei TmpA 34%
Halofilum ochraceum TmpA 64%
Marinobacter aquaeolei TmpA 83%
Archaeoglobus fulgidus AF0464 27%
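Lists of residues that are conserved within a protein family but differ in a reference protein, such as those given in this Example, can be extracted from a multiple alignment column by column. The sketch below is a generic illustration under stated assumptions: the function name is hypothetical, a column counts as conserved only if every family member has the identical residue, and residue numbers follow one chosen family member. The toy alignment is invented and much shorter than the alignments of Figures 7A-D and 8A-D.

```python
def group_specific_positions(group, reference, numbering_seq):
    """Residues identical across `group` but different in `reference`.

    All arguments are rows of one alignment ('-' marks a gap).
    `numbering_seq` is the group member whose residue numbers are reported.
    """
    length = len(reference)
    if any(len(s) != length for s in list(group) + [numbering_seq]):
        raise ValueError("all aligned rows must have equal length")
    found = []
    residue_no = 0
    for col in range(length):
        if numbering_seq[col] != "-":
            residue_no += 1  # count residues of the numbering sequence
        column = {s[col] for s in group}
        conserved = len(column) == 1 and "-" not in column
        if conserved and reference[col] not in column:
            found.append(f"{numbering_seq[col]}{residue_no}")
    return found


# Invented toy alignment: three family members and one reference row.
family = ["MKA-LW", "MRACLW", "MKAGLW"]
outgroup = "MKSGIW"

specific = group_specific_positions(family, outgroup, numbering_seq=family[0])
print(specific)
```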

Claims

1. A cell comprising a recombinant methyltransferase gene encoding a methyltransferase protein from a bacterium of the class Gammaproteobacteria and either a branched (methyl)lipid or an exomethylene-substituted lipid, wherein the branched (methyl)lipid is a carboxylic acid, carboxylate, ester, thioester, or amide; and the branched (methyl)lipid comprises a saturated or unsaturated branched aliphatic chain comprising a branching methyl group; or the exomethylene-substituted lipid is a carboxylic acid, carboxylate, ester, thioester, or amide; and the exomethylene-substituted lipid comprises a branched aliphatic chain; and the aliphatic chain is branched because the aliphatic chain is substituted with an exomethylene group.
2. The cell of claim 1, wherein the branched (methyl)lipid or the exomethylene-substituted lipid is a fatty acid from 14 to 18 carbons long with a methyl moiety in the Δ9, Δ10, or Δ11 position.
3. The cell of claim 2, wherein the branched (methyl)lipid is 10-methylstearate, or an ester, thioester, or amide thereof or the exomethylene-substituted lipid is 10-methylenestearate, or an ester, thioester, or amide thereof.
4. The cell of claim 3, wherein the branched (methyl)lipid or the exomethylene-substituted lipid is a diacylglycerol, triacylglycerol, or phospholipid, and the diacylglycerol, triacylglycerol, or phospholipid comprises an ester of 10-methylstearate.
5. The cell of claim 1, wherein the methyltransferase protein is selected from Desulfobacula balticum enzyme tmpB, Marinobacter hydrocarbonclasticus enzyme tmpB, or Thiohalospira halophila enzyme tmpB.
6. The cell of claim 1, wherein the methyltransferase protein has at least 90% sequence identity to SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6.
7. The cell of claim 1, further comprising a recombinant reductase gene encoding a reductase protein from a bacterium of the class Gammaproteobacteria.
8. The cell of claim 7, wherein the reductase protein is capable of converting a methylene-substituted lipid to a methyl-substituted lipid.
9. The cell of claim 8, wherein the methylene-substituted lipid is a fatty acid from 14 to 18 carbons long with a methylene substitution in the Δ9, Δ10, or Δ11 position and the methyl-substituted lipid is a fatty acid from 14 to 18 carbons long with a methyl moiety in the Δ9, Δ10, or Δ11 position.
10. The cell of claim 7, wherein the reductase protein is selected from Desulfobacula balticum enzyme tmpA, Marinobacter hydrocarbonclasticus enzyme tmpA, or Thiohalospira halophila enzyme tmpA.
11. The cell of claim 7, wherein the reductase protein has at least 90% sequence identity to SEQ ID NO:20, SEQ ID NO:22, or SEQ ID NO:24.
12. The cell of claim 7, wherein the methyltransferase gene and the reductase gene are included in a single open reading frame encoding a fusion protein comprising both the methyltransferase protein and the reductase protein.
13. The cell of claim 1, wherein at least about 1% to about 15% by weight of the fatty acids of the cell are 10-methyl fatty acids or 10-methylene fatty acids.
14. The cell of claim 1, wherein the cell lacks an endogenous methyltransferase gene.
15. The cell of claim 1, wherein the cell lacks the endogenous ability to produce the branched (methyl)lipid or exomethylene-substituted lipid.
16. The cell of claim 1, wherein the cell is a bacterial cell, a fungal cell, an algal cell, a mold cell, a plant cell, or a yeast cell.
17. The cell of claim 1, wherein the cell is selected from the group consisting of Arxula, Aspergillus, Aurantiochytrium, Candida, Claviceps, Cryptococcus, Cunninghamella, Geotrichum, Hansenula, Kluyveromyces, Kodamaea, Leucosporidiella, Lipomyces, Mortierella, Ogataea, Pichia, Prototheca, Rhizopus, Rhodosporidium, Rhodotorula, Saccharomyces, Schizosaccharomyces, Tremella, Trichosporon, Wickerhamomyces, and Yarrowia.
18. The cell of claim 17, wherein the cell is selected from the group consisting of Arxula adeninivorans, Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Aurantiochytrium limacinum, Candida utilis, Claviceps purpurea, Cryptococcus albidus, Cryptococcus curvatus, Cryptococcus ramirezgomezianus, Cryptococcus terreus, Cryptococcus wieringae, Cunninghamella echinulata, Cunninghamella japonica, Geotrichum fermentans, Hansenula polymorpha, Kluyveromyces lactis, Kluyveromyces marxianus, Kodamaea ohmeri, Leucosporidiella creatinivora, Lipomyces lipofer, Lipomyces starkeyi, Lipomyces tetrasporus, Mortierella isabellina, Mortierella alpina, Ogataea polymorpha, Pichia ciferrii, Pichia guilliermondii, Pichia pastoris, Pichia stipitis, Prototheca zopfii, Rhizopus arrhizus, Rhodosporidium babjevae, Rhodosporidium toruloides, Rhodosporidium paludigenum, Rhodotorula glutinis, Rhodotorula mucilaginosa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Tremella encephala, Trichosporon cutaneum, Trichosporon fermentans, Wickerhamomyces ciferrii, and Yarrowia lipolytica.
19. A method of producing a branched (methyl)lipid or exomethylene-substituted lipid, comprising contacting the cell of any one of claims 1 to 18 with a substrate fatty acid, methionine, or both a substrate fatty acid and methionine, wherein the substrate fatty acid comprises a fatty acid from 14 to 18 carbons long with a double bond in the Δ9, Δ10, or Δ11 position.
20. A nucleic acid comprising a recombinant methyltransferase gene encoding a methyltransferase protein from a bacterium of the class Gammaproteobacteria and a first promoter operably linked to the recombinant methyltransferase gene.
21. The nucleic acid of claim 20, wherein the methyltransferase gene has at least 80% sequence identity to SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5.
22. The nucleic acid of claim 20, wherein the methyltransferase gene encodes a protein having at least 90% sequence identity to SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6.
23. The nucleic acid of claim 20, wherein the methyltransferase gene is codon-optimized for expression in yeast, algae, or plants.
24. The nucleic acid of claim 20, further comprising a reductase gene encoding a reductase protein from a bacterium of the class Gammaproteobacteria.
25. The nucleic acid of claim 24, wherein the reductase gene has at least 80% sequence identity to SEQ ID NO:19, SEQ ID NO:21, or SEQ ID NO:23.
26. The nucleic acid of claim 24, wherein the reductase gene encodes a protein having at least 90% sequence identity to SEQ ID NO:20, SEQ ID NO:22, or SEQ ID NO:24.
27. The nucleic acid of claim 24, wherein the reductase gene is fused in frame with the methyltransferase gene.
28. The nucleic acid of claim 27, further comprising a nucleic acid linker sequence between the methyltransferase gene and the reductase gene, wherein the nucleic acid linker sequence encodes a linker peptide between the methyltransferase protein and the reductase protein.
29. The nucleic acid of claim 24, wherein the reductase gene is operably linked to a second promoter.
EP18788933.2A 2017-09-20 2018-09-20 Heterologous production of 10-methylstearic acid by cells expressing recombinant methyltransferase Pending EP3684896A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762561136P 2017-09-20 2017-09-20
PCT/US2018/051919 WO2019060527A1 (en) 2017-09-20 2018-09-20 Heterologous production of 10-methylstearic acid by cells expressing recombinant methyltransferase

Publications (1)

Publication Number Publication Date
EP3684896A1 true EP3684896A1 (en) 2020-07-29

Family

ID=63878797

Family Applications (1)

Application Number Title Priority Date Filing Date
EP18788933.2A Pending EP3684896A1 (en) 2017-09-20 2018-09-20 Heterologous production of 10-methylstearic acid by cells expressing recombinant methyltransferase

Country Status (5)

Country Link
US (2) US11236373B2 (en)
EP (1) EP3684896A1 (en)
BR (1) BR112020005278A2 (en)
CA (1) CA3076321A1 (en)
WO (1) WO2019060527A1 (en)

Also Published As

Publication number Publication date
CA3076321A1 (en) 2019-03-28
US20220340939A1 (en) 2022-10-27
BR112020005278A2 (en) 2020-09-24
WO2019060527A1 (en) 2019-03-28
US11236373B2 (en) 2022-02-01
US20200231998A1 (en) 2020-07-23

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20200319

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

RIN1 Information on inventor provided before grant (corrected)

Inventor name: CRABTREE, DONALD V.

Inventor name: BLITZBLAU, HANNAH

Inventor name: SHAW, ARTHUR J.

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: GINKGO BIOWORKS, INC.

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20210706

P01 Opt-out of the competence of the unified patent court (upc) registered

Effective date: 20230516