EP2971024A1 - Thioesterases and cells for production of tailored oils - Google Patents

Thioesterases and cells for production of tailored oils

Info

Publication number
EP2971024A1
Authority
EP
European Patent Office
Prior art keywords
seq
sequence
oil
amino acid
nucleic acid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP14769502.7A
Other languages
German (de)
French (fr)
Other versions
EP2971024A4 (en)
Inventor
George N. RUDENKO
Jason Casolari
Scott Franklin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Corbion Biotech Inc
Original Assignee
Solazyme Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/837,996 (patent US9290749B2)
Application filed by Solazyme Inc
Publication of EP2971024A1
Publication of EP2971024A4
Legal status: Withdrawn

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • C12N15/8243 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine
    • C12N15/8247 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine involving modified lipid metabolism, e.g. seed oil composition
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00 Preparation of oxygen-containing organic compounds
    • C12P7/64 Fats; Fatty oils; Ester-type waxes; Higher fatty acids, i.e. having at least seven carbon atoms in an unbroken chain bound to a carboxyl group; Oxidised oils or fats
    • C12P7/6409 Fatty acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y301/00 Hydrolases acting on ester bonds (3.1)
    • C12Y301/02 Thioester hydrolases (3.1.2)
    • C12Y301/02014 Oleoyl-[acyl-carrier-protein] hydrolase (3.1.2.14), i.e. ACP-thioesterase
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E50/00 Technologies for the production of fuel of non-fossil origin
    • Y02E50/10 Biofuels, e.g. bio-diesel

Definitions

  • Type II fatty acid biosynthesis typically involves extension of a growing acyl-ACP (acyl-carrier protein) chain by two carbon units followed by cleavage by an acyl-ACP thioesterase.
  • In plants, two main classes of acyl-ACP thioesterases have been identified: (i) those encoded by genes of the FatA class, which tend to hydrolyze oleoyl-ACP into oleate (an 18:1 fatty acid) and ACP, and (ii) those encoded by genes of the FatB class, which liberate C8-C16 fatty acids from corresponding acyl-ACP molecules.
  • invention(s) contemplated herein may include, but need not be limited to, any one or more of the following embodiments:
  • Embodiment 1 A nucleic acid construct including a regulatory element and a FatB gene expressing an active acyl-ACP thioesterase operable to produce an altered fatty acid profile in an oil produced by a cell expressing the nucleic acid construct, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 5 of Table 1a, the sequence having at least 94.6% sequence identity with each of SEQ ID NOs: 88, 82, 85, and 103, and optionally wherein the fatty acid of the oil is enriched in C8 and C10 fatty acids.
  • Embodiment 2 A nucleic acid construct including a regulatory element and a FatB gene expressing an active acyl-ACP thioesterase operable to produce an altered fatty acid profile in an oil produced by a cell expressing the nucleic acid construct, wherein the FatB gene expresses a protein having an amino acid sequence falling within one of clades 1-12 of Table 1a.
  • Embodiment 3 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade
  • Embodiment 4 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade
  • Embodiment 5 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade
  • Embodiment 6 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 79, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
  • Embodiment 7 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade
  • Embodiment 8 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade
  • Embodiment 9 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade
  • Embodiment 10 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 9 of Table 1a, the sequence having at least 83.8% sequence identity with each of SEQ ID NOs: 187-189, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
  • Embodiment 11 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 10 of Table 1a, the sequence having at least 95.9% sequence identity with each of SEQ ID NOs: 147, 149, 146, 150, 152, 151, 148, 154, 156, 155, 157, 108, 75, 190, 191, and 192, and optionally wherein the fatty acid of the oil is enriched in C14 and C16 fatty acids.
  • Embodiment 12 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 11 of Table 1a, the sequence having at least 88.7% sequence identity with SEQ ID NO: 121, and optionally wherein the fatty acid of the oil is enriched in C14 and C16 fatty acids.
  • Embodiment 13 The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 12 of Table 1a, the sequence having at least 72.8% sequence identity with each of SEQ ID NOs: 129 and 186, and optionally wherein the fatty acid of the oil is enriched in C16 fatty acids.
  • Embodiment 14 An isolated nucleic acid or recombinant DNA construct including a nucleic acid, wherein the nucleic acid has at least 80% sequence identity to any of SEQ ID NOS: 2, 3, 5, 6, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 23, 24, 26, 27, 29, 30, 32, 33, 35, 36, 38, 39, 41, 42, 44, 45, 47, 48, 50, 51, 53, 54, 56, 57, 59, 60, 62, 63, 65, 66, 68, 69, 71, 72, 74, 76, 78, 80, 81, 83, 84, 86, 87, 89, 90, 92, 93, 95, 96, 98, 99, 101, 102, 104, 105, 107, 109 or any equivalent sequences by virtue of the degeneracy of the genetic code.
  • Embodiment 15 An isolated nucleic acid sequence encoding a protein or a host cell expressing a protein having at least 80% sequence identity to any of SEQ ID NOS: 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58, 61, 64, 67, 70, 73, 75, 77, 79, 82, 85, 88, 91, 94, 97, 100, 103, 106, 108, 110-192 or a fragment thereof having acyl-ACP thioesterase activity.
  • Embodiment 16 The isolated nucleic acid of embodiment 15, wherein, the protein has acyl-ACP thioesterase activity operable to alter the fatty acid profile of an oil produced by a recombinant cell including that sequence.
  • Embodiment 17 A method of producing a recombinant cell that produces an altered fatty acid profile, the method including transforming the cell with a nucleic acid according to any of embodiments 1-3.
  • Embodiment 18 A host cell produced by the method of embodiment
  • Embodiment 19 The host cell of embodiment 18, wherein the host cell is selected from a plant cell, a microbial cell, and a microalgal cell.
  • Embodiment 20 A method for producing an oil or oil-derived product, the method including cultivating a host cell of embodiment 5 or 6, and extracting oil produced thereby, optionally wherein the cultivation is heterotrophic growth on sugar.
  • Embodiment 21 The method of embodiment 20, further including producing a fatty acid, fuel, chemical, or other oil-derived product from the oil.
  • Embodiment 22 An oil produced by the method of embodiment 20, optionally having a fatty acid profile including at least 20% C8, C10, C12, C14 or C16 fatty acids.
  • Embodiment 23 An oil-derived product produced by the method of embodiment 21.
  • Embodiment 24 The oil of embodiment 23, wherein the oil is produced by a microalgae and optionally, lacks C24-alpha sterols.
  • isolated refers to a nucleic acid that is free of at least one other component that is typically present with the naturally occurring nucleic acid. Thus, a naturally occurring nucleic acid is isolated if it has been purified away from at least one other component that occurs naturally with the nucleic acid.
  • a "natural oil” or “natural fat” shall mean a predominantly triglyceride oil obtained from an organism, where the oil has not undergone blending with another natural or synthetic oil, or fractionation so as to substantially alter the fatty acid profile of the triglyceride.
  • the natural oil or natural fat has not been subjected to interesterification or other synthetic process to obtain that regiospecific triglyceride profile, rather the regiospecificity is produced naturally, by a cell or population of cells.
  • the terms oil and fat are used interchangeably, except where otherwise noted.
  • an “oil” or a “fat” can be liquid, solid, or partially solid at room temperature, depending on the makeup of the substance and other conditions.
  • fractionation means removing material from the oil in a way that changes its fatty acid profile relative to the profile produced by the organism, however accomplished.
  • natural oil and natural fat encompass such oils obtained from an organism, where the oil has undergone minimal processing, including refining, bleaching and/or degumming, which does not substantially change its triglyceride profile.
  • a natural oil can also be a "noninteresterified natural oil", which means that the natural oil has not undergone a process in which fatty acids have been redistributed in their acyl linkages to glycerol and remain essentially in the same configuration as when recovered from the organism.
  • Exogenous gene shall mean a nucleic acid that codes for the expression of an RNA and/or protein that has been introduced into a cell (e.g. by transformation/transfection), and is also referred to as a "transgene".
  • a cell comprising an exogenous gene may be referred to as a recombinant cell, into which additional exogenous gene(s) may be introduced.
  • the exogenous gene may be from a different species (and so heterologous), or from the same species (and so
  • an exogenous gene can include a homologous gene that occupies a different location in the genome of the cell or is under different control, relative to the endogenous copy of the gene.
  • An exogenous gene may be present in more than one copy in the cell.
  • An exogenous gene may be maintained in a cell, for example, as an insertion into the genome (nuclear or plastid) or as an episomal molecule.
  • Fatty acids shall mean free fatty acids, fatty acid salts, or fatty acyl moieties in a glycerolipid. It will be understood that fatty acyl groups of glycerolipids can be described in terms of the carboxylic acid or anion of a carboxylic acid that is produced when the triglyceride is hydrolyzed or saponified.
  • Microalgae are microbial organisms that contain a chloroplast or other plastid, and optionally that are capable of performing photosynthesis, or a prokaryotic microbial organism capable of performing photosynthesis.
  • Microalgae include obligate photoautotrophs, which cannot metabolize a fixed carbon source as energy, as well as heterotrophs, which can live solely off of a fixed carbon source.
  • Microalgae include unicellular organisms that separate from sister cells shortly after cell division, such as Chlamydomonas, as well as microbes such as, for example, Volvox, which is a simple multicellular photosynthetic microbe of two distinct cell types.
  • Microalgae include cells such as Chlorella, Dunaliella, and Prototheca.
  • Microalgae also include other microbial photosynthetic organisms that exhibit cell-cell adhesion, such as Agmenellum, Anabaena, and Pyrobotrys.
  • Microalgae also include obligate heterotrophic microorganisms that have lost the ability to perform photosynthesis, such as certain dinoflagellate algae species and species of the genus Prototheca.
  • An "oleaginous” cell is a cell capable of producing at least 20% lipid by dry cell weight, naturally or through recombinant or classical strain improvement.
  • An "oleaginous microbe” or “oleaginous microorganism” is a microbe, including a microalga that is oleaginous.
  • sequence comparison to determine percent nucleotide or amino acid identity
  • test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated.
  • sequence comparison algorithm calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
  • Optimal alignment of sequences for comparison can be conducted using the NCBI BLAST software (ncbi.nlm.nih.gov/BLAST/) set to default parameters.
  • BLAST 2 Sequences Version 2.0.12 (Apr. 21, 2000) set at the following default parameters: Matrix: BLOSUM62; Reward for match: 1; Penalty for mismatch: -2; Open Gap: 5 and Extension Gap: 2 penalties; Gap x drop-off: 50;
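The percent identity described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (for example by BLAST) and uses one common convention, matches divided by the number of aligned columns; it does not reproduce the BLAST algorithm or its scoring parameters. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_test: str, aligned_ref: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = identical columns / aligned columns * 100.
    Columns where both sequences carry a gap are ignored.
    """
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_test.upper(), aligned_ref.upper()):
        if a == "-" and b == "-":
            continue  # a column with gaps in both rows carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned peptides: 5 identical columns out of 6.
    print(round(percent_identity("MKT-LV", "MKTALV"), 1))
```

Other conventions (for example dividing by the length of the shorter sequence) give different values, so the denominator should be stated whenever an identity cutoff is applied.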
  • a “profile” is the distribution of particular species or triglycerides or fatty acyl groups within the oil.
  • a “fatty acid profile” is the distribution of fatty acyl groups in the triglycerides of the oil without reference to attachment to a glycerol backbone.
  • Fatty acid profiles are typically determined by conversion to a fatty acid methyl ester (FAME), followed by gas chromatography (GC) analysis with flame ionization detection (FID).
  • the fatty acid profile can be expressed as one or more percent of a fatty acid in the total fatty acid signal determined from the area under the curve for that fatty acid.
  • an oil is said to be "enriched" in one or more particular fatty acids if there is at least a 10% increase in the mass of that fatty acid in the oil relative to the non-enriched oil.
  • the oil produced by the cell is said to be enriched in, e.g., C8 and C16 fatty acids if the mass of these fatty acids in the oil is at least 10% greater than in oil produced by a cell of the same type that does not express the heterologous FatB gene (e.g., wild type oil).
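As a rough illustration of the two definitions above, the sketch below converts GC-FID peak areas into an area-percent fatty acid profile and then applies the 10% enrichment test. It assumes that the "10% increase" is a relative increase over the control oil and that area percent is an acceptable proxy for mass; both are interpretations, not statements from the application. All function names and numbers are hypothetical.

```python
def fatty_acid_profile(peak_areas: dict[str, float]) -> dict[str, float]:
    """Convert FAME peak areas into area-percent of the total fatty acid signal."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {fa: 100.0 * area / total for fa, area in peak_areas.items()}


def is_enriched(engineered_pct: float, control_pct: float,
                min_relative_increase: float = 0.10) -> bool:
    """True if the engineered oil has at least a 10% relative increase.

    Assumption: enrichment is judged relative to the control (e.g. wild-type) oil.
    """
    if control_pct <= 0:
        # A fatty acid absent from the control counts as enriched if present at all.
        return engineered_pct > 0
    return (engineered_pct - control_pct) / control_pct >= min_relative_increase


if __name__ == "__main__":
    # Hypothetical peak areas for an engineered and a control strain.
    engineered = fatty_acid_profile({"C12:0": 30.0, "C14:0": 20.0, "C18:1": 50.0})
    control = fatty_acid_profile({"C12:0": 1.0, "C14:0": 2.0, "C18:1": 97.0})
    for fa in ("C12:0", "C14:0"):
        print(fa, is_enriched(engineered[fa], control[fa]))
```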
  • Recombinant is a cell, nucleic acid, protein or vector that has been modified due to the introduction of an exogenous nucleic acid or the alteration of a native nucleic acid.
  • recombinant (host) cells can express genes that are not found within the native (non-recombinant) form of the cell or express native genes differently than those genes are expressed by a non-recombinant cell.
  • Recombinant cells can, without limitation, include recombinant nucleic acids that encode a gene product or suppression elements such as mutations, knockouts, antisense, interfering RNA (RNAi) or dsRNA that reduce the levels of active gene product in a cell.
  • a "recombinant nucleic acid” is a nucleic acid originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using polymerases, ligases, exonucleases, and endonucleases, using chemical synthesis, or otherwise is in a form not normally found in nature.
  • Recombinant nucleic acids may be produced, for example, to place two or more nucleic acids in operable linkage.
  • an isolated nucleic acid or an expression vector formed in vitro by ligating DNA molecules that are not normally joined in nature are both considered recombinant for the purposes of this invention.
  • Recombinant nucleic acids can also be produced in other ways; e.g., using chemical DNA synthesis.
  • Once a recombinant nucleic acid is made and introduced into a host cell or organism, it may replicate using the in vivo cellular machinery of the host cell; however, such nucleic acids, once produced recombinantly, although subsequently replicated intracellularly, are still considered recombinant for purposes of this invention.
  • a "recombinant protein” is a protein made using recombinant techniques, i.e., through the expression of a recombinant nucleic acid.
  • Embodiments of the present invention relate to the use of FatB genes isolated from plants, which can be expressed in a host cell in order to alter the fatty acid profile of an oil produced by the recombinant cell.
  • the microalga Prototheca moriformis
  • the genes are useful in a wide variety of host cells.
  • the genes can be expressed in bacteria, other microalgae, or higher plants.
  • the genes can be expressed in higher plants according to the methods of US Patent Nos. 5,850,022; 5,723,761; 5,639,790; 5,807,893; 5,455,167; 5,654,495;
  • the fatty acids can be further converted to triglycerides, fatty aldehydes, fatty alcohols and other oleochemicals either synthetically or biosynthetically.
  • triglycerides are produced by a host cell expressing a novel FatB gene.
  • a triglyceride-containing natural oil can be recovered from the host cell.
  • the natural oil can be refined, degummed, bleached and/or deodorized.
  • the oil in its natural or processed form, can be used for foods, chemicals, fuels, cosmetics, plastics, and other uses.
  • the FatB gene may not be novel, but the expression of the gene in a microalga is novel.
  • the genes can be used in a variety of genetic constructs including plasmids or other vectors for expression or recombination in a host cell.
  • the genes can be codon optimized for expression in a target host cell.
  • the proteins produced by the genes can be used in vivo or in purified form.
  • the gene can be prepared in an expression vector comprising an operably linked promoter and 5' UTR.
  • a suitably active plastid targeting peptide can be fused to the FATB gene, as in the examples below.
  • this transit peptide is replaced with a 38 amino acid sequence that is effective in the Prototheca moriformis host cell for transporting the enzyme to the plastids of those cells.
  • the invention contemplates deletions and fusion proteins in order to optimize enzyme activity in a given host cell.
  • a transit peptide from the host or related species may be used instead of that of the newly discovered plant genes described here.
  • a selectable marker gene may be included in the vector to assist in isolating a transformed cell.
  • selectable markers useful in microalgae include sucrose invertase and antibiotic resistance genes.
  • the gene sequences disclosed can also be used to prepare antisense or inhibitory RNA (e.g., RNAi or hairpin RNA) to inhibit complementary genes in a plant or other organism.
  • FatB genes found to be useful in producing desired fatty acid profiles in a cell are summarized below in Table 1. Nucleic acids or proteins having the sequence of SEQ ID NOS: 1-109 can be used to alter the fatty acid profile of a recombinant cell.
  • Variant nucleic acids can also be used; e.g., variants having at least 70, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to SEQ ID NOS: 2, 3, 5, 6, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 23, 24, 26, 27, 29, 30, 32, 33, 35, 36, 38, 39, 41, 42, 44, 45, 47, 48, 50, 51, 53, 54, 56, 57, 59, 60, 62, 63, 65, 66, 68, 69, 71, 72, 74, 76, 78, 80, 81, 83, 84, 86, 87, 89, 90, 92, 93, 95, 96, 98, 99, 101, 102, 104, 105, 107 or 109.
  • Codon optimization of the genes for a variety of host organisms is contemplated, as is the use of gene fragments.
  • Preferred codons for Prototheca strains and for Chlorella protothecoides are shown below in Tables 2 and 3, respectively.
  • Codon usage for Cuphea wrightii is shown in Table 3a.
  • Codon usage for Arabidopsis is shown in Table 3b; for example, the most preferred codon for each amino acid can be selected. Codon tables for other organisms including microalgae and higher plants are known in the art.
  • the first and/or second most preferred Prototheca codons are employed for codon optimization.
  • novel amino acid sequences contained in the sequence listings below are converted into nucleic acid sequences according to the most preferred codon usage in Prototheca, Chlorella, Cuphea wrightii, or Arabidopsis as set forth in tables 2 through 3b or nucleic acid sequences having at least 70, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to these derived nucleic acid sequences.
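The conversion of an amino acid sequence into a nucleic acid sequence using the most preferred codon for each residue can be sketched as below. The codon table shown is a small illustrative placeholder, not the content of Tables 2 through 3b; a real table for the chosen host would have to be substituted. The names `PLACEHOLDER_PREFERRED_CODONS` and `back_translate` are hypothetical.

```python
# Placeholder mapping of one-letter amino acid codes to a single preferred codon.
# Assumption: illustrative values only; replace with the host's actual codon table.
PLACEHOLDER_PREFERRED_CODONS = {
    "M": "ATG",
    "A": "GCC",
    "N": "AAC",
    "G": "GGC",
    "V": "GTG",
}


def back_translate(protein: str, preferred_codons: dict[str, str]) -> str:
    """Return a coding sequence that uses one preferred codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in preferred_codons:
            raise KeyError(f"no preferred codon listed for residue {residue!r}")
        codons.append(preferred_codons[residue])
    return "".join(codons)


if __name__ == "__main__":
    # Hypothetical four-residue peptide.
    print(back_translate("MAVG", PLACEHOLDER_PREFERRED_CODONS))
```

Using only the single most preferred codon is the simplest policy; the text also contemplates using the first and/or second most preferred codons, which would require a table holding more than one codon per residue.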
  • protein or a nucleic acid encoding a protein having any of SEQ ID NOS: 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58, 61, 64, 67, 70, 73, 75, 77, 79, 82, 85, 88, 91, 94, 97, 100, 103, 106, 108, or 110-192.
  • the invention encompasses a fragment of any of the above-described proteins or nucleic acids
  • the fragment includes a domain of an acyl-ACP thioesterase that mediates a particular function, e.g., a specificity-determining domain.
  • Illustrative fragments can be produced by C-terminal and/or N-terminal truncations and include at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the full-length sequences disclosed herein.
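A terminal truncation that retains a minimum fraction of the full-length sequence can be sketched as follows. The function name `truncate` and its parameters are hypothetical, and the sketch only checks length; it says nothing about whether the resulting fragment retains thioesterase activity.

```python
def truncate(sequence: str, n_term: int = 0, c_term: int = 0,
             min_fraction: float = 0.50) -> str:
    """Remove n_term residues from the start and c_term from the end.

    Raises ValueError if the fragment would keep less than min_fraction
    of the full-length sequence.
    """
    if n_term < 0 or c_term < 0:
        raise ValueError("truncation lengths must be non-negative")
    if n_term + c_term >= len(sequence):
        raise ValueError("truncation would remove the whole sequence")
    fragment = sequence[n_term:len(sequence) - c_term]
    if len(fragment) / len(sequence) < min_fraction:
        raise ValueError("fragment shorter than the required fraction")
    return fragment


if __name__ == "__main__":
    # Hypothetical 10-residue sequence; the fragment keeps 7 of 10 residues.
    print(truncate("ABCDEFGHIJ", n_term=2, c_term=1))
```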
  • percent sequence identity for variants of the nucleic acids or proteins discussed above can be calculated by using the full-length nucleic acid sequence (e.g., one of SEQ ID NOS: 2, 3, 5, 6, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 23, 24, 26, 27, 29, 30, 32, 33, 35, 36, 38, 39, 41, 42, 44, 45, 47, 48, 50, 51, 53, 54, 56, 57, 59, 60, 62, 63, 65, 66, 68, 69, 71, 72, 74, 76, 78, 80, 81, 83, 84, 86, 87, 89, 90, 92, 93, 95, 96, 98, 99, 101, 102, 104, 105, 107 or 109) or full-length amino acid sequence (e.g., one of SEQ ID NOS: 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58, 61, 64
  • the nucleic acids can be in isolated form, or part of a vector or other construct, chromosome or host cell. It has been found that in many cases the full-length gene (and protein) is not needed; for example, deletion of some or all of the N-terminal hydrophobic domain (typically an 18 amino acid domain starting with LPDW) yields a still-functional gene. In addition, fusions of the specificity determining regions of the genes in Table 1 with catalytic domains of other acyl-ACP thioesterases can yield functional genes.
  • the invention encompasses functional fragments (e.g., specificity determining regions) of the disclosed nucleic acid or amino acids fused to heterologous acyl-ACP thioesterase nucleic acid or amino acid sequences, respectively.
  • Cuphea viscosissima; CvisFATB1; published; SEQ ID NO: 73; N/A; SEQ ID NO: 74
  • Cuphea viscosissima; CvisFATB2; published; SEQ ID NO: 75; N/A; SEQ ID NO: 76
  • Cuphea viscosissima; CvisFATB3; published; SEQ ID NO: 77; N/A; SEQ ID NO: 78
  • Consensus JcFATB2 Consensus SEQ ID None, SEQ ID sequence NO: 108 can be NO: 109 codon
  • a host cell e.g. plant or microalgal cell
  • a recombinant FATB protein falling into one of clades 1-12 of Table 1a.
  • These clades were determined by sequence alignment and observation of changes in fatty acid profile when expressed in Prototheca. See Example 5.
  • the FATB amino acid sequence can fall within x% amino acid sequence identity of each sequence in that clade listed in Table 1a, where x is a first, second, or third cutoff value, also listed in Table 1a.
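The clade test described above, in which a candidate must meet the cutoff against each listed member of the clade, can be sketched as follows. The identity values in the example are hypothetical; the cutoffs 72.8 and 85 and the member identifiers SEQ ID NO: 129 and 186 are taken from the clade that lists them in this document. The function name `falls_within_clade` is an assumption.

```python
def falls_within_clade(identity_to_members: dict[str, float], cutoff: float) -> bool:
    """True if the candidate meets the cutoff against every clade member.

    identity_to_members maps each clade member to the candidate's percent
    identity with that member, computed beforehand by any alignment tool.
    """
    if not identity_to_members:
        return False  # an empty clade cannot confirm membership
    return all(identity >= cutoff for identity in identity_to_members.values())


if __name__ == "__main__":
    # Hypothetical identities of one candidate protein to the two members.
    candidate = {"SEQ ID NO: 129": 91.2, "SEQ ID NO: 186": 74.0}
    print(falls_within_clade(candidate, 72.8))  # first cutoff
    print(falls_within_clade(candidate, 85.0))  # second cutoff
```

Requiring the cutoff against every member, rather than against the best match, is the stricter reading of "each sequence in that clade".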
  • Table 1a. Groupings of novel FatB genes into clades.
  • CwFATB3 (SEQ ID NO: 112), CwFATB3a (SEQ ID NO: 113): Increase C12/C14 fatty acids; cutoffs 85.9, 98.9, 99.5
  • ChtFATB2e (SEQ ID NO: 142)
  • ChtFATB2h (SEQ ID NO: 145)
  • ChtFATB2f (SEQ ID NO: 143)
  • ChtFATB2g (SEQ ID NO: 144)
  • ChtFATB2a (SEQ ID NO: 139)
  • ChtFATB2c (SEQ ID NO: 140)
  • ChtFATB2b (SEQ ID NO: 138)
  • ChtFATB2d (SEQ ID NO: 141)
  • CcrFATB2c (SEQ ID NO: 187), CcrFATB2 (SEQ ID NO: 188): Increase C12/C14 fatty acids; cutoffs 83.8, 90, 95
  • ChtFATB3b (SEQ ID NO: 147), ChtFATB3d (SEQ ID NO: 149), ChtFATB3a (SEQ ID NO: 146), ChtFATB3e (SEQ ID NO: 150), ChtFATB3g (SEQ ID NO: 152), ChtFATB3f (SEQ ID NO: 151), ChtFATB3c (SEQ ID NO: 148), ChsFATB2 (SEQ ID NO: 154), ChsFATB2c (SEQ ID NO: 156), ChsFATB2b (SEQ ID NO: 155), ChsFATB2d (SEQ ID NO: 157), JcFATB2/SzFATB2 (SEQ ID NO: 108): Increase C14/C16; cutoffs 95.9, 98, 99
  • CcFATB3 (SEQ ID NO: 129), UcFATB3 (SEQ ID NO: 186): Increase C16 fatty acids; cutoffs 72.8, 85, 90
  • GCA 66 (0.07); AAC 201 (0.96); GCT 101 (0.11)
  • TGC 105 (0.90); CCC 267 (0.49); Asp GAT 43 (0.12); Gln CAG 226 (0.82)
  • CAC 154 (0.79); TCC 173 (0.31); Ile ATA 4 (0.01); ACG 184 (0.38)
  • GCC (Ala); AAC (Asn); GGC (Gly); GTG (Val)
  • GUC V 0.21 15.0 (40); GCC A 0.20 18.0 (48); GAC D 0.37 21.0 (56); GGC G 0.20 18.0 (48); GUA V 0.14 10.1 (27); GCA A 0.33 29.6 (79); GAA E 0.41 18.3 (49); GGA G 0.35 31.4 (84)
  • GUA V 0.15 9.9 (308605); GCA A 0.27 17.5 (543180); GAA E 0.52 34.3 (1068012); GGA G 0.37 24.2 (751489); GUG V 0.26 17.4 (539873); GCG A 0.14 9.0 (280804); GAG E 0.48 32.2 (1002594); GGG G 0.16 10.2 (316620)
  • Host Cells
  • the host cell can be a single cell (e.g., microalga, bacteria, yeast) or part of a multicellular organism such as a plant or fungus.
  • Methods for expressing FatB genes in a plant are given in US Patent Nos. 5,850,022; 5,723,761; 5,639,790; 5,807,893;
  • oleaginous host cells include plant cells and microbial cells having a type II fatty acid biosynthetic pathway, including plastidic oleaginous cells such as those of oleaginous algae.
  • microalgal cells include heterotrophic or obligate heterotrophic microalgae of the phylum Chlorophyta, the class Trebouxiophyceae, the order Chlorellales, or the family Chlorellaceae.
  • oleaginous microalgae are provided in Published PCT Patent Applications
  • WO2008/151149, WO2010/06032, WO2011/150410, and WO2011/150411 including species of Chlorella and Prototheca, a genus comprising obligate heterotrophs.
  • the oleaginous cells can be, for example, capable of producing 25, 30, 40, 50, 60, 70, 80, 85, or about 90% oil by cell weight, ±5%.
  • the oils produced can be low in DHA or EPA fatty acids.
  • the oils can comprise less than 5%, 2%, or 1% DHA and/or EPA.
  • the above-mentioned publications also disclose methods for cultivating such cells and extracting oil, especially from microalgal cells; such methods are applicable to the cells disclosed herein and incorporated by reference for these teachings.
  • When microalgal cells are used they can be cultivated autotrophically (unless an obligate heterotroph) or in the dark using a sugar (e.g., glucose, fructose and/or sucrose).
  • the cells can be heterotrophic cells comprising an exogenous invertase gene so as to allow the cells to produce oil from a sucrose feedstock.
  • the cells can metabolize xylose from cellulosic feedstocks.
  • the cells can be genetically engineered to express one or more xylose metabolism genes such as those encoding an active xylose transporter, a xylulose-5-phosphate transporter, a xylose isomerase, a xylulokinase, a xylitol dehydrogenase and a xylose reductase.
  • xylose metabolism genes such as those encoding an active xylose transporter, a xylulose-5-phosphate transporter, a xylose isomerase, a xylulokinase, a xylitol dehydrogenase and a xylose reductase. See WO2012/154626, "GENETICALLY ENGINEERED
  • the oleaginous cells express one or more exogenous genes encoding fatty acid biosynthesis enzymes.
  • some embodiments feature natural oils that were not obtainable from a non-plant or non-seed oil, or not obtainable at all.
  • the oleaginous cells produce a storage oil, which is primarily triacylglyceride and may be stored in storage bodies of the cell.
  • a raw oil may be obtained from the cells by disrupting the cells and isolating the oil.
  • WO2008/151149, WO2010/06032, WO2011/150410, and WO2011/1504 disclose heterotrophic cultivation and oil isolation techniques. For example, oil may be obtained by cultivating, drying and pressing the cells.
  • the oils produced may be refined, bleached and deodorized (RBD) as known in the art or as described in WO2010/120939.
  • the raw or RBD oils may be used in a variety of food, chemical, and industrial products or processes. After recovery of the oil, a valuable residual biomass remains. Uses for the residual biomass include the production of paper, plastics, absorbents, adsorbents, as animal feed, for human nutrition, or for fertilizer.
  • a fatty acid profile of a triglyceride (also referred to as a "triacylglyceride" or "TAG") cell oil
  • the oil may be subjected to an RBD process to remove phospholipids, free fatty acids and odors yet have only minor or negligible changes to the fatty acid profile of the triglycerides in the oil. Because the cells are oleaginous, in some cases the storage oil will constitute the bulk of all the TAGs in the cell.
  • the stable carbon isotope value δ13C is an expression of the ratio of
  • the stable carbon isotope value δ13C (‰) of the oils can be related to the δ13C value of the feedstock used.
  • the oils are derived from oleaginous organisms
  • the δ13C (‰) of the oil is from -10 to -17 ‰ or from -13 to -16 ‰.
  • the oils produced according to the above methods in some cases are made using a microalgal host cell.
  • the microalga can, without limitation, fall in the classification of Chlorophyta, Trebouxiophyceae, Chlorellales, Chlorellaceae, or Chlorophyceae. It has been found that microalgae of Trebouxiophyceae can be distinguished from vegetable oils based on their sterol profiles.
  • Oil produced by Chlorella protothecoides was found to produce sterols that appeared to be brassicasterol, ergosterol, campesterol, stigmasterol, and β-sitosterol, when detected by GC-MS.
  • all sterols produced by Chlorella have C24β stereochemistry.
  • the molecules detected as campesterol, stigmasterol, and β-sitosterol are actually 22,23-dihydrobrassicasterol, poriferasterol and clionasterol, respectively.
  • the oils produced by the microalgae described above can be distinguished from plant oils by the presence of sterols with C24β stereochemistry and the absence of C24α stereochemistry in the sterols present.
  • the oils produced may contain 22,23-dihydrobrassicasterol while lacking campesterol; contain clionasterol while lacking β-sitosterol; and/or contain poriferasterol while lacking stigmasterol.
  • oils may contain significant amounts of Δ7-poriferasterol.
  • the oils provided herein are not vegetable oils.
  • Vegetable oils are oils extracted from plants and plant seeds. Vegetable oils can be distinguished from the non-plant oils provided herein on the basis of their oil content. A variety of methods for analyzing the oil content can be employed to determine the source of the oil or whether adulteration of an oil provided herein with an oil of a different (e.g. plant) origin has occurred. The determination can be made on the basis of one or a combination of the analytical methods. These tests include but are not limited to analysis of one or more of free fatty acids, fatty acid profile, total triacylglycerol content, diacylglycerol content, peroxide values, spectroscopic properties (e.g. UV absorption), sterol profile, sterol degradation products, antioxidants (e.g. tocopherols), pigments (e.g. chlorophyll), δ13C values and sensory analysis (e.g. taste, odor, and mouth feel).
  • Sterol profile analysis is a particularly well-known method for determining the biological source of organic matter.
  • Campesterol, β-sitosterol, and stigmasterol are common plant sterols, with β-sitosterol being a principal plant sterol.
  • β-sitosterol was found to be in greatest abundance in an analysis of certain seed oils, approximately 64% in corn, 29% in rapeseed, 64% in sunflower, 74% in cottonseed, 26% in soybean, and 79% in olive oil (Gul et al. J. Cell and Molecular Biology 5:71-79, 2006).
  • Oil samples isolated from Prototheca moriformis strain UTEX1435 were separately clarified (CL), refined and bleached (RB), or refined, bleached and deodorized (RBD) and were tested for sterol content according to the procedure described in JAOCS vol.
  • ergosterol was found to be the most abundant of all the sterols, accounting for about 50% or more of the total sterols. The amount of ergosterol is greater than that of campesterol, β-sitosterol, and stigmasterol combined. Ergosterol is a steroid commonly found in fungi and not commonly found in plants, and its presence, particularly in significant amounts, serves as a useful marker for non-plant oils. Secondly, the oil was found to contain brassicasterol. With the exception of rapeseed oil, brassicasterol is not commonly found in plant-based oils. Thirdly, less than 2% β-sitosterol was found to be present.
  • β-sitosterol is a prominent plant sterol not commonly found in microalgae, and its presence, particularly in significant amounts, serves as a useful marker for oils of plant origin.
  • Prototheca moriformis strain UTEX1435 has been found to contain both significant amounts of ergosterol and only trace amounts of β-sitosterol as a percentage of total sterol content. Accordingly, the ratio of ergosterol:β-sitosterol, alone or in combination with the presence of brassicasterol, can be used to distinguish this oil from plant oils.
  • the oil content of an oil provided herein contains, as a percentage of total sterols, less than 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1% β-sitosterol. In other embodiments the oil is free from β-sitosterol.
  • the oil is free from one or more of β-sitosterol, campesterol, or stigmasterol. In some embodiments the oil is free from β-sitosterol, campesterol, and stigmasterol. In some embodiments the oil is free from campesterol. In some embodiments the oil is free from stigmasterol.
  • the oil content of an oil provided herein comprises, as a percentage of total sterols, less than 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1% 24-ethylcholest-5-en-3-ol.
  • the 24-ethylcholest-5-en-3-ol is clionasterol.
  • the oil content of an oil provided herein comprises, as a percentage of total sterols, at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10% clionasterol.
  • the oil content of an oil provided herein contains, as a percentage of total sterols, less than 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1% 24-methylcholest-5-en-3-ol.
  • the 24-methylcholest-5-en-3-ol is 22,23-dihydrobrassicasterol.
  • the oil content of an oil provided herein comprises, as a percentage of total sterols, at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10% 22,23-dihydrobrassicasterol.
  • the oil content of an oil provided herein contains, as a percentage of total sterols, less than 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1% 5,22-cholestadien-24-ethyl-3-ol.
  • the 5,22-cholestadien-24-ethyl-3-ol is poriferasterol.
  • the oil content of an oil provided herein comprises, as a percentage of total sterols, at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10% poriferasterol.
  • the oil content of an oil provided herein contains ergosterol or brassicasterol or a combination of the two. In some embodiments, the oil content contains, as a percentage of total sterols, at least 5%, 10%, 20%, 25%, 35%, 40%, 45%, 50%, 55%, 60%, or 65% ergosterol. In some embodiments, the oil content contains, as a percentage of total sterols, at least 25% ergosterol. In some embodiments, the oil content contains, as a percentage of total sterols, at least 40% ergosterol.
  • the oil content contains, as a percentage of total sterols, at least 5%, 10%, 20%, 25%, 35%, 40%, 45%, 50%, 55%, 60%, or 65% of a combination of ergosterol and brassicasterol.
  • the oil content contains, as a percentage of total sterols, at least 1%, 2%, 3%, 4% or 5% brassicasterol. In some embodiments, the oil content contains, as a percentage of total sterols less than 10%, 9%, 8%, 7%, 6%, or 5% brassicasterol.
  • the ratio of ergosterol to brassicasterol is at least 5: 1, 10: 1, 15: 1, or 20: 1.
  • the oil content contains, as a percentage of total sterols, at least 5%, 10%, 20%, 25%, 35%, 40%, 45%, 50%, 55%, 60%, or 65% ergosterol and less than 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1% β-sitosterol. In some embodiments, the oil content contains, as a percentage of total sterols, at least 25% ergosterol and less than 5% β-sitosterol. In some embodiments, the oil content further comprises brassicasterol.
  • [0070] Sterols contain from 27 to 29 carbon atoms (C27 to C29) and are found in all eukaryotes.
  • Animals exclusively make C27 sterols as they lack the ability to further modify the C27 sterols to produce C28 and C29 sterols. Plants, however, are able to synthesize C28 and C29 sterols, and C28/C29 plant sterols are often referred to as phytosterols.
  • the sterol profile of a given plant is high in C29 sterols, and the primary sterols in plants are typically the C29 sterols β-sitosterol and stigmasterol.
  • the sterol profile of non-plant organisms contains greater percentages of C27 and C28 sterols. For example, the sterols in fungi and in many microalgae are principally C28 sterols.
  • the primary sterols in the microalgal oils provided herein are sterols other than β-sitosterol and stigmasterol.
  • C29 sterols make up less than 50%, 40%, 30%, 20%, 10%, or 5% by weight of the total sterol content.
  • the microalgal oils provided herein contain C28 sterols in excess of C29 sterols. In some embodiments of the microalgal oils, C28 sterols make up greater than 50%, 60%, 70%, 80%, 90%, or 95% by weight of the total sterol content. In some embodiments the C28 sterol is ergosterol. In some embodiments the C28 sterol is brassicasterol.
  • oleaginous cells expressing one or more of the genes of Table 1 can produce an oil with at least 20, 40, 60 or 70% of C8, C10, C12, C14 or C16 fatty acids.
  • the level of myristate (C14:0) in the oil is greater than 30%.
  • the transformed cell is cultivated to produce an oil and, optionally, the oil is extracted.
  • Oil extracted in this way can be used to produce food, oleochemicals or other products.
  • the oils discussed above alone or in combination are useful in the production of foods, fuels and chemicals (including plastics, foams, films, etc).
  • the oils, triglycerides, fatty acids from the oils may be subjected to C-H activation, hydroamino methylation, methoxy-carbonation, ozonolysis, enzymatic
  • a residual biomass may be left, which may have use as a fuel, as an animal feed, or as an ingredient in paper, plastic, or other product.
  • a residual biomass from heterotrophic algae can be used in such products.
  • Seeds of oleaginous plants were obtained from local grocery stores or requested through USDA ARS National Plant Germplasm System (NPGS) from North Central Regional Plant Introduction Station (NCRIS) or USDA ARS North Central Soil Conservation Research Laboratory (Morris, MI). Dry seeds were homogenized in liquid nitrogen to powder, resuspended in cold extraction buffer containing 6-8M Urea and 3M LiCl and left on ice for a few hours to overnight at 4 °C. The seed homogenate was passed through NucleoSpin Filters (Macherey-Nagel) by centrifugation at 20,000g for 20 minutes in the refrigerated microcentrifuge (4 °C).
  • RNA pellets were resuspended in buffer containing 20 mM Tris-HCl (pH 7.5), 0.5% SDS, 100 mM NaCl, 25 mM EDTA, and 2% PVPP, and RNA was subsequently extracted once with phenol-chloroform-isoamyl alcohol (25:24:1, v/v) and once with chloroform. RNA was finally precipitated with isopropyl alcohol (0.7 vol.) in the presence of 150 mM sodium acetate, pH 5.2, washed with 80% ethanol by centrifugation, and dried. RNA samples were treated with Turbo DNAse (Lifetech) and purified further using RNeasy kits (Qiagen) following the manufacturers' protocols. The resulting purified RNA samples were converted to paired-end cDNA libraries and subjected to next-generation sequencing (2x100 bp) using the Illumina HiSeq 2000 platform. RNA sequence reads were assembled into corresponding seed transcriptomes using Trinity or Oases packages. Putative thioesterase-containing cDNA contigs were identified by mining transcriptomes for sequences with homology to known thioesterases. These in silico identified putative thioesterase cDNAs were further verified by direct reverse transcription PCR analysis using seed RNA and primer pairs targeting full-length thioesterase cDNAs. The resulting amplified products were cloned and sequenced de novo to confirm authenticity of the identified thioesterase genes.
  • ChtFATB3a SEQ ID NO: 146
  • ChtFATB3f SEQ ID NO: 151
  • ChtFATB3g SEQ ID NO: 152
  • Example 1: RNA was extracted from dried plant seeds and submitted for paired-end sequencing using the Illumina HiSeq 2000 platform. RNA sequence reads were assembled into corresponding seed transcriptomes using Trinity or Oases packages, and putative thioesterase-containing cDNA contigs were identified by mining transcriptomes for sequences with homology to known thioesterases.
  • Cinnamomum camphora, Cuphea hyssopifolia, Cuphea PSR23, Cuphea wrightii, Cuphea heterophylla, and Cuphea viscosissima were synthesized in a codon-optimized form to reflect Prototheca moriformis (UTEX 1435) codon usage.
  • Of the 27 genes synthesized, 24 were identified by our transcriptome sequencing efforts and the 3 genes from Cuphea viscosissima were from published sequences in GenBank.
  • Strain A Prototheca moriformis, derived from UTEX 1435 by classical mutation and screening for high oil production
  • the construct pSZ2760 encoding Cinnamomum camphora (Cc) FATB1b is shown as an example, but identical methods were used to generate each of the remaining 26 constructs encoding the different respective thioesterases.
  • Construct pSZ2760 can be written as
  • Bold, lowercase sequences at the 5' and 3' ends of the construct represent genomic DNA from UTEX 1435 that target integration to the 6S locus via homologous recombination. Proceeding in the 5' to 3' direction, the selection cassette has the C. reinhardtii β-tubulin promoter driving expression of the S. cerevisiae gene SUC2 (conferring the ability to grow on sucrose) and the Chlorella vulgaris Nitrate Reductase (NR) gene 3' UTR.
  • the promoter is indicated by lowercase, boxed text.
  • the initiator ATG and terminator TGA for ScSUC2 are indicated by bold, uppercase italics, while the coding region is indicated with lowercase italics.
  • the 3' UTR is indicated by lowercase underlined text.
  • the spacer region between the two cassettes is indicated by upper case text.
  • Cinnamomum camphora is driven by the Prototheca moriformis endogenous AMT3 promoter, and has the Chlorella vulgaris Nitrate Reductase (NR) gene 3' UTR.
  • the AMT3 promoter is indicated by lowercase, boxed text.
  • the initiator ATG and terminator TGA for the CcFATB1b gene are indicated in bold, uppercase italics, while the coding region is indicated by lowercase italics and the spacer region is indicated by upper case text.
  • the 3' UTR is indicated by lowercase underlined text. The final construct was sequenced to ensure correct reading frame and targeting sequences.
  • CcFATB1b from pSZ2760 in Table 6 were transformed into Strain A, and selected for the ability to grow on sucrose. Transformations, cell culture, lipid production and fatty acid analysis were all carried out as previously described. After cultivating on sucrose under low nitrogen conditions to accumulate oil, fatty acid profiles were determined by FAME-GC. The top performer from each transformation, as judged by the ability to produce the highest level of midchain fatty acids, is shown in Table 4.
  • CcFATB1b causes an increase in myristate levels from 2% of total fatty acids in the parent, Strain A, to ~15% in the D1670-13 primary transformant.
  • Other examples include CcFATB4, which exhibits an increase in laurate levels from 0% in Strain A to ~33%, and ChsFATB3, which exhibits an increase in myristate levels to ~34%.
  • constructs such as the deduced amino acid sequence of the encoded acyl-ACP thioesterase, the native CDS coding sequence, the Prototheca moriformis codon-optimized coding sequence, and the nature of the sequence variants examined, is provided as SEQ ID NOS: 1-78.
  • the nine putative acyl-ACP FatB thioesterases from the species Cuphea calcarata, Cuphea painteri, Cuphea hookeriana, Cuphea avigera var. pulcherrima, Cuphea paucipetala, Cuphea procumbens, and Cuphea ignea were synthesized in a codon-optimized form to reflect UTEX 1435 codon usage.
  • the new Acyl-ACP FatB thioesterases were synthesized with a modified transit peptide from Chlorella protothecoides (Cp) in place of the native transit peptide.
  • the modified transit peptide derived from the CpSAD1 gene, "CpSAD1tp_trimmed", was synthesized as an in-frame, N-terminal fusion to the FatB acyl-ACP thioesterases in place of the native transit peptide; the resulting sequences are listed below.
  • the novel FatB genes were cloned into Prototheca moriformis as described above. Constructs encoding heterologous FatB genes were transformed into strain S6165 (a descendant of S3150/Strain A) and selected for the ability to grow on sucrose. Transformations, cell culture, lipid production and fatty acid analysis were all carried out as previously described. The results for the nine novel FatB acyl-ACP thioesterases are displayed in the table immediately below.
  • CigneaFATB1, which exhibits 8% C10:0 and 1% C12:0 fatty acid levels
  • CcalcFATB1, which exhibits 18% C14:0 and 12% C12:0 levels
  • CaFATB1, which exhibits 22% C8:0 and 9% C10:0 fatty acid levels.
  • CaFATB1, which exhibits high C8:0 and C10:0 levels, is of particular interest.
  • CaFATB1 arose from two separate contigs that were assembled from the Cuphea avigera var. pulcherrima transcriptome, S17_Cavig_trinity_7406 and
  • The coding sequence of CpuFATB3 is 100% identical to the CaFATB1 gene we identified and contains one nucleotide difference in the RNA sequence outside the predicted coding region. Tjellstrom et al. (2013) showed that CpuFATB3 produces an average of 4.8% C8:0 when expressed in Arabidopsis, and further requires deletion of two acyl-ACP synthetases, AAE15/16, to produce an average of 9.2% C8:0 with a maximum level of ~12% C8:0.
  • the CaFATB1 gene we identified was codon-optimized for expression in UTEX1435 and generated as a CpSAD1tp_trimmed transit peptide fusion before introduction into S6165.
  • the CpSAD1tp_trimmed:CaFATB1 gene produces an average C8:0 level of 14% and a maximum level of 22% C8:0 without requiring the deletion of endogenous acyl-ACP synthetases.
  • [0093] Table 7. Amino Acid Sequences of Additional Novel FatB Acyl-ACP
  • CprocFATB1 (Cuphea procumbens FATB1) SEQ ID NO: 172
  • CprocFATB2 (Cuphea procumbens FATB2) SEQ ID NO: 173
  • CprocFATB3 (Cuphea procumbens FATB3) SEQ ID NO: 174
  • ChookFATB4 (Cuphea hookeriana FATB4) SEQ ID NO: 177
  • CaFATB1 (Cuphea avigera var. pulcherrima FATB1) SEQ ID NO: 178
  • CpauFATB1 (Cuphea paucipetala FATB1) SEQ ID NO: 179
  • CprocFATB1 (Cuphea procumbens FATB1) SEQ ID NO: 180
  • CprocFATB2 (Cuphea procumbens FATB2) SEQ ID NO: 181
  • CprocFATB3 (Cuphea procumbens FATB3) SEQ ID NO: 182
  • thioesterases in UTEX1435, S3150 we identified several thioesterases with increased C10:0 and C16:0 activity above the background midchain levels found in the strain. We reasoned that a consensus sequence could be obtained for an idealized C10:0 thioesterase and C16:0 thioesterase from aligning the best-performing C10:0 and C16:0 thioesterases.
  • a consensus C10:0-specific thioesterase sequence was generated using the C. palustris FatB1 (CpFATB1), C. PSR23 FatB3 (CuPSR23FATB3), C. viscosissima FatB1 (CvisFATB1), C. glossostoma FatB1 (CgFATB1), and C. carthagenensis FatB2 (CcrFATB2) sequences as inputs, resulting in a C10:0-specific consensus sequence termed JcFATB1/SzFATB1.
  • a consensus C16:0-specific thioesterase sequence was generated using the C. heterophylla FatB3a (ChtFATB3a), C. carthagenensis FatB1 (CcrFATB1), C. viscosissima FatB2 (CvisFATB2), C. hookeriana FatB1 (ChFATB1; AAC48990), C. hyssopifolia FatB2 (ChsFATB2), C. calophylla FatB2 (CcalFATB2; ABB71581), C. hookeriana FatB1-1 (ChFATB1-1; AAC72882), C. lanceolata FatB1 (ClFATB1; CAC19933), and C. wrightii FatB4a (CwFATB4a) sequences as inputs, resulting in a C16:0-specific consensus sequence termed JcFATB2/SzFATB2.
  • the resulting consensus sequences were synthesized, cloned into a vector identical to that used to test other FatB thioesterases, and introduced into S3150 as described above.
  • the consensus amino acid sequences are given as SEQ ID NOs. 106 and 107; the nucleic acid sequences were based on these amino acid sequences using codon optimization for Prototheca moriformis.
  • the transformants were selected and cultivated, and the oil was extracted and analyzed by FAME-GC-FID. The fatty acid profiles obtained are given in the table below.
  • Cinnamomum camphora (Cc) FATB1b variant M25L, M322R, ΔT367-D368 amino acid sequence MATTSLASAFCSMKAVMLARDGRGLKPRSSDLQLRAGNAQTSLKMINGTKFSYTESLKKLPD WSMLFAVITTIFSAAEKQWTNLEWKPKPNPPQLLDDHFGPHGLVFRRTFAIRSYEVGPDRSTSI VAVMNHLQEAALNHAKSVGILGDGFGTTLEMSKRDLIWVVKRTHVAVERYPAWGDTVEVE CWVGASGNNGRRHDFLVRDCKTGEILTRCTSLSVMMNTRTRRLSKIPEEVRGEIGPAFIDNVA VKDEEIKKPQKLNDSTADYIQGGLTPRWNDLDINQHVNNIKYVDWILETVPDSIFESHHISSFTI EYRRECTRDSVLQSLTTVSGGSSEAGLVCEHLLQLEGGSEVLRAKTEWRPKLSFRGISVIPAES
  • Cinnamomum camphora (Cc) FATB1b variant M25L, M322R, ΔT367-D368 coding DNA sequence
  • Cinnamomum camphora (Cc) FATB1b variant M25L, M322R, ΔT367-D368 coding DNA sequence codon optimized for Prototheca moriformis
  • Cinnamomum camphora (Cc) FATB4 amino acid sequence
  • Cinnamomum camphora (Cc) FATB3 amino acid sequence
  • Cuphea hyssopifolia (Chs) FATB1 amino acid sequence
  • Cuphea hyssopifolia (Chs) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea hyssopifolia (Chs) FATB2 amino acid sequence
  • Cuphea hyssopifolia (Chs) FATB2b +a.a.248-259 variant amino acid sequence
  • Cuphea hyssopifolia (Chs) FATB2b+a.a.248-259 variant coding DNA sequence
  • Cuphea hyssopifolia (Chs) FATB3 amino acid sequence
  • Cuphea hyssopifolia (Chs) FATB3 coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea hyssopifolia (Chs) FATB3b (V204I, C239F, E243D, M251V variant) amino acid sequence
  • Cuphea hyssopifolia (Chs) FATB3b (V204I, C239F, E243D, M251V variant) coding DNA sequence
  • Cuphea hyssopifolia (Chs) FATB3b (V204I, C239F, E243D, M251V variant) coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea heterophylla (Cht) FATB1b (P16S, T20P, G94S, G105W, S293F, L305F variant) amino acid sequence
  • Cuphea heterophylla (Cht) FATB1b (P16S, T20P, G94S, G105W, S293F, L305F variant) coding DNA sequence
  • Cuphea heterophylla (Cht) FATB1b (P16S, T20P, G94S, G105W, S293F, L305F variant) coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea heterophylla (Cht) FATB2a (S17P, P21S, T28N, L30P, S33L, G76D, S78P, G137W variant) amino acid sequence
  • Cuphea heterophylla (Cht) FATB2a (S17P, P21S, T28N, L30P, S33L, G76D, S78P, G137W variant) coding DNA sequence
  • Cuphea heterophylla (Cht) FATB2a (S17P, P21S, T28N, L30P, S33L, G76D, S78P, G137W variant) coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea heterophylla (Cht) FATB2d (S21P, T28N, L30P, S33L, G76D, R97L, H124L, W127L, I132S, K258N, C303R, E309G, K334T, T386A variant) amino acid sequence
  • Cuphea heterophylla (Cht) FATB2d (S21P, T28N, L30P, S33L, G76D, R97L, H124L, W127L, I132S, K258N, C303R, E309G, K334T, T386A variant) coding DNA sequence
  • Cuphea heterophylla (Cht) FATB2d (S21P, T28N, L30P, S33L, G76D, R97L, H124L, W127L, I132S, K258N, C303R, E309G, K334T, T386A variant) coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea heterophylla (Cht) FATB2e (G76D, R97L, H124L, I132S, G152S, H165L, T211N, K258N, C303R, E309G, K334T, T386A variant) amino acid sequence
  • Cuphea heterophylla (Cht) FATB2e (G76D, R97L, H124L, I132S, G152S, H165L, T211N, K258N, C303R, E309G, K334T, T386A variant) coding DNA sequence
  • Cuphea heterophylla (Cht) FATB2e (G76D, R97L, H124L, I132S, G152S, H165L, T211N, K258N, C303R, E309G, K334T, T386A variant) coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea heterophylla (Cht) FATB2f (R97L, H124L, I132S, G152S, H165L, T211N variant) amino acid sequence
  • Cuphea heterophylla (Cht) FATB2f (R97L, H124L, I132S, G152S, H165L, T211N variant) coding DNA sequence
  • Cuphea heterophylla (Cht) FATB2f (R97L, H124L, I132S, G152S, H165L, T211N variant) coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea heterophylla (Cht) FATB2g (A6T, A16V, S17P, G76D, R97L, H124L, I132S, S143I, G152S, A157T, H165L, T211N, G414A variant) amino acid sequence
  • Cuphea heterophylla (Cht) FATB2g (A6T, A16V, S17P, G76D, R97L, H124L, I132S, S143I, G152S, A157T, H165L, T211N, G414A variant) coding DNA sequence
  • Cuphea heterophylla (Cht) FATB2g (A6T, A16V, S17P, G76D, R97L, H124L, I132S, S143I, G152S, A157T, H165L, T211N, G414A variant) coding DNA sequence codon optimized for Prototheca moriformis
  • CM Cuphea heterophylla
  • Cuphea avigera var. pulcherrima (Ca) FATB1 amino acid sequence
  • Cuphea avigera var. pulcherrima (Ca) FATB1 coding DNA sequence
  • Cuphea avigera var. pulcherrima (Ca) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea procumbens (Cproc) FATB1 amino acid sequence
  • Cuphea procumbens (Cproc) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea procumbens (Cproc) FATB2 amino acid sequence
  • Cuphea procumbens (Cproc) FATB2 coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea procumbens (Cproc) FATB3 amino acid sequence MVAAAASSAFFPAPAPGSSPKPGKSGNWPSSLSPSFKSKSIPYGRFQVKANASAHPKANGSAV NLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVD SVGLKNIVRDGLVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGM CKNDLIWVLTKMQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVW AMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLD VNQHV VKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGEGGYGSQFQ HLLRLEDGGEIVKGRTEWRPKNAGINGVLPT
  • Cuphea procumbens (Cproc) FATB3 coding DNA sequence codon optimized for Prototheca moriformis
  • Cuphea ignea (Cignea) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
  • CaFATB1 (Cuphea avigera var. pulcherrima FATB1)
  • DTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVDSVGLKNIVRDG LVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGMCKNDLIWVLTK MQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVWAMMNQKTRRFS RLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLDVNQHVS VKYI GWILESMPIEVLEAQELCSLTVEYRRECGMDSVLESVTAVDPSEDGGRSQYNHLLRLEDGTDV VKGRTEWRPKNAETNGAISPGNTSNGNSIS SEQ ID NO: 181
  • Cuphea carthagenensis CcrFATB2c (V138L variant of FATB2)
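
The sterol-profile criteria recited earlier in this list can be checked mechanically once a sterol quantitation (for example, from GC-MS) is available. The following is a minimal Python sketch, not the applicants' analytical procedure. The function names, the input dictionary layout, and the default thresholds are illustrative assumptions; the defaults shown (at least 25% ergosterol, less than 5% β-sitosterol, and C28 sterols in excess of C29 sterols) are only one of the combinations recited above.

```python
# Carbon numbers for sterols that the text itself classes as C28 or C29.
# Sterols not listed here still count toward the total but not toward a class.
CARBON_NUMBER = {
    "ergosterol": 28,
    "brassicasterol": 28,
    "beta-sitosterol": 29,
    "stigmasterol": 29,
}


def sterol_percentages(amounts):
    """Convert raw sterol amounts (any consistent unit) to % of total sterols."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total sterol amount must be positive")
    return {name: 100.0 * value / total for name, value in amounts.items()}


def carbon_class_percent(percentages, carbons):
    """Sum the percentages of all classified sterols with the given carbon number."""
    return sum(
        pct for name, pct in percentages.items()
        if CARBON_NUMBER.get(name) == carbons
    )


def looks_like_non_plant_oil(amounts, min_ergosterol=25.0, max_beta_sitosterol=5.0):
    """Hypothetical helper: apply one example combination of thresholds."""
    pct = sterol_percentages(amounts)
    c28 = carbon_class_percent(pct, 28)
    c29 = carbon_class_percent(pct, 29)
    return (
        pct.get("ergosterol", 0.0) >= min_ergosterol
        and pct.get("beta-sitosterol", 0.0) < max_beta_sitosterol
        and c28 > c29
    )
```

Calling `looks_like_non_plant_oil({"ergosterol": 50, "brassicasterol": 5, "beta-sitosterol": 2, "other": 43})` with these invented amounts returns `True`, while a profile dominated by β-sitosterol returns `False`. Other recited thresholds can be tried by changing the two keyword arguments.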

Abstract

The invention features plant acyl-ACP thioesterase genes of the FatB class and proteins encoded by these genes. The genes are useful for constructing recombinant host cells having altered fatty acid profiles. Oleaginous microalga host cells with the new genes or previously identified FatB genes are disclosed. The microalgae cells produce triglycerides with useful fatty acid profiles.

Description

THIOESTERASES AND CELLS FOR PRODUCTION OF TAILORED OILS
Cross Reference to Related Applications
[0001] This application is a continuation-in-part of United States Patent Application No. 13/837,996, filed March 15, 2013, and claims the benefit of United States Provisional Patent Application Serial No. 61/791,861, filed March 15, 2013, and United States Provisional Patent Application Serial No. 61/917,217, filed December 17, 2013, each of which is hereby incorporated by reference herein in its entirety.
Background
[0002] Certain organisms including plants and some microalgae use a type II fatty acid biosynthetic pathway, characterized by the use of discrete, monofunctional enzymes for fatty acid synthesis. In contrast, mammals and fungi use a single, large, multifunctional protein.
[0003] Type II fatty acid biosynthesis typically involves extension of a growing acyl-ACP (acyl-carrier protein) chain by two carbon units followed by cleavage by an acyl-ACP thioesterase. In plants, two main classes of acyl-ACP thioesterases have been identified: (i) those encoded by genes of the FatA class, which tend to hydrolyze oleoyl-ACP into oleate (an 18:1 fatty acid) and ACP, and (ii) those encoded by genes of the FatB class, which liberate C8-C16 fatty acids from corresponding acyl-ACP molecules.
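
The interplay of two-carbon chain extension and thioesterase cleavage described above can be illustrated with a toy calculation. The Python sketch below is a deliberately simplified model and not a kinetic description of any real enzyme: it assumes chains start at C4, grow in two-carbon steps, and at each length a fixed, user-supplied fraction of the remaining acyl-ACP pool is cleaved. The function name, the `cleavage_fraction` mapping, and the cutoff at `max_len` are all hypothetical.

```python
def released_profile(cleavage_fraction, start_len=4, max_len=18):
    """Toy model of chain-length distribution under two-carbon extension.

    cleavage_fraction maps a chain length (e.g. 12) to the fraction of the
    acyl-ACP pool at that length that a thioesterase releases as free fatty
    acid. Whatever is still bound at max_len is treated as released there.
    Returns {chain_length: fraction_of_total_released}.
    """
    remaining = 1.0  # fraction of chains still attached to ACP
    profile = {}
    for length in range(start_len, max_len + 1, 2):
        fraction = 1.0 if length == max_len else cleavage_fraction.get(length, 0.0)
        released = remaining * fraction
        if released > 0:
            profile[length] = released
        remaining -= released
    return profile
```

For instance, `released_profile({12: 0.5})` gives half of the chains released at C12 and the other half carried through to C18, which is the qualitative effect the text attributes to a chain-length-specific FatB enzyme acting before the pathway reaches its usual end products.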
[0004] Different FatB genes from various plants have specificities for different acyl chain lengths. As a result, different gene products will produce different fatty acid profiles in plant seeds. See US Patent Nos. 5,850,022; 5,723,761; 5,639,790; 5,807,893; 5,455,167; 5,654,495; 5,512,482; 5,298,421; 5,667,997; 5,344,771; and 5,304,481. Recently, FatB genes have been cloned into oleaginous microalgae to produce triglycerides with altered fatty acid profiles. See WO2010/063032, WO2011/150411, WO2012/106560, and WO2013/158938.
Summary
[0005] In various aspects, the invention(s) contemplated herein may include, but need not be limited to, any one or more of the following embodiments:
[0006] Embodiment 1: A nucleic acid construct including a regulatory element and a FatB gene expressing an active acyl-ACP thioesterase operable to produce an altered fatty acid profile in an oil produced by a cell expressing the nucleic acid construct, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 5 of Table 1a, the sequence having at least 94.6% sequence identity with each of SEQ ID NOs: 88, 82, 85, and 103, and optionally wherein the fatty acid of the oil is enriched in C8 and C10 fatty acids.
[0007] Embodiment 2: A nucleic acid construct including a regulatory element and a FatB gene expressing an active acyl-ACP thioesterase operable to produce an altered fatty acid profile in an oil produced by a cell expressing the nucleic acid construct, wherein the FatB gene expresses a protein having an amino acid sequence falling within one of clades 1-12 of Table 1a.
[0008] Embodiment 3: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 1 of Table 1a, the sequence having at least 85.9% sequence identity with each of SEQ ID NOs: 19, 161, 22, and 160, and optionally wherein the fatty acid of the oil is enriched in C14 and C16 fatty acids.
[0009] Embodiment 4: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 2 of Table 1a, the sequence having at least 89.5% sequence identity with each of SEQ ID NOs: 134-136, 132, 133, 137, 124, 122, 123, 125, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
[0010] Embodiment 5: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 3 of Table 1a, the sequence having at least 92.5% sequence identity with each of SEQ ID NOs: 126 and 127, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
[0011] Embodiment 6: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 79, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
[0012] Embodiment 7: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 6 of Table 1a, the sequence having at least 99.9% sequence identity with each of SEQ ID NOs: 111 and 110, and optionally wherein the fatty acid of the oil is enriched in C10 fatty acids.
[0013] Embodiment 8: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 7 of Table 1a, the sequence having at least 89.5% sequence identity with each of SEQ ID NOs: 73, 106, 185, 172, 171, 173, 174, and optionally wherein the fatty acid of the oil is enriched in C10 and C12 fatty acids.
[0014] Embodiment 9: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 8 of Table 1a, the sequence having at least 85.9% sequence identity with each of SEQ ID NOs: 112, 113, 142, 145, 143, 144, 139, 140, 138, 141, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
[0015] Embodiment 10: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 9 of Table 1a, the sequence having at least 83.8% sequence identity with each of SEQ ID NOs: 187-189, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
[0016] Embodiment 11: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 10 of Table 1a, the sequence having at least 95.9% sequence identity with each of SEQ ID NOs: 147, 149, 146, 150, 152, 151, 148, 154, 156, 155, 157, 108, 75, 190, 191, and 192, and optionally wherein the fatty acid of the oil is enriched in C14 and C16 fatty acids.
[0017] Embodiment 12: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 11 of Table 1a, the sequence having at least 88.7% sequence identity with SEQ ID NO: 121, and optionally wherein the fatty acid of the oil is enriched in C14 and C16 fatty acids.
[0018] Embodiment 13: The nucleic acid construct of embodiment 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 12 of Table 1a, the sequence having at least 72.8% sequence identity with each of SEQ ID NOs: 129 and 186, and optionally wherein the fatty acid of the oil is enriched in C16 fatty acids.
[0019] Embodiment 14: An isolated nucleic acid or recombinant DNA construct including a nucleic acid, wherein the nucleic acid has at least 80% sequence identity to any of SEQ ID NOS: 2, 3, 5, 6, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 23, 24, 26, 27, 29, 30, 32, 33, 35, 36, 38, 39, 41, 42, 44, 45, 47, 48, 50, 51, 53, 54, 56, 57, 59, 60, 62, 63, 65, 66, 68, 69, 71, 72, 74, 76, 78, 80, 81, 83, 84, 86, 87, 89, 90, 92, 93, 95, 96, 98, 99, 101, 102, 104, 105, 107, 109 or any equivalent sequences by virtue of the degeneracy of the genetic code.
[0020] Embodiment 15: An isolated nucleic acid sequence encoding a protein or a host cell expressing a protein having at least 80% sequence identity to any of SEQ ID NOS: 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58, 61, 64, 67, 70, 73, 75, 77, 79, 82, 85, 88, 91, 94, 97, 100, 103, 106, 108, 110-192 or a fragment thereof having acyl-ACP thioesterase activity.
[0021] Embodiment 16: The isolated nucleic acid of embodiment 15, wherein the protein has acyl-ACP thioesterase activity operable to alter the fatty acid profile of an oil produced by a recombinant cell including that sequence.
[0022] Embodiment 17: A method of producing a recombinant cell that produces an altered fatty acid profile, the method including transforming the cell with a nucleic acid according to any of embodiments 1-3.
[0023] Embodiment 18: A host cell produced by the method of embodiment 17.
[0024] Embodiment 19: The host cell of embodiment 18, wherein the host cell is selected from a plant cell, a microbial cell, and a microalgal cell.
[0025] Embodiment 20: A method for producing an oil or oil-derived product, the method including cultivating a host cell of embodiment 5 or 6, and extracting oil produced thereby, optionally wherein the cultivation is heterotrophic growth on sugar.
[0026] Embodiment 21: The method of embodiment 20, further including producing a fatty acid, fuel, chemical, or other oil-derived product from the oil.
[0027] Embodiment 22: An oil produced by the method of embodiment 20, optionally having a fatty acid profile including at least 20% C8, C10, C12, C14 or C16 fatty acids.
[0028] Embodiment 23: An oil-derived product produced by the method of embodiment 21.
[0029] Embodiment 24: The oil of embodiment 23, wherein the oil is produced by a microalga and, optionally, lacks C24-alpha sterols.
Description of Illustrative Embodiments of the Invention
Definitions
[0030] As used with respect to nucleic acids, the term "isolated" refers to a nucleic acid that is free of at least one other component that is typically present with the naturally occurring nucleic acid. Thus, a naturally occurring nucleic acid is isolated if it has been purified away from at least one other component that occurs naturally with the nucleic acid.
[0031] A "natural oil" or "natural fat" shall mean a predominantly triglyceride oil obtained from an organism, where the oil has not undergone blending with another natural or synthetic oil, or fractionation so as to substantially alter the fatty acid profile of the triglyceride. In connection with an oil comprising triglycerides of a particular regiospecificity, the natural oil or natural fat has not been subjected to interesterification or other synthetic process to obtain that regiospecific triglyceride profile, rather the regiospecificity is produced naturally, by a cell or population of cells. In connection with a natural oil or natural fat, and as used generally throughout the present disclosure, the terms oil and fat are used interchangeably, except where otherwise noted. Thus, an "oil" or a "fat" can be liquid, solid, or partially solid at room temperature, depending on the makeup of the substance and other conditions. Here, the term "fractionation" means removing material from the oil in a way that changes its fatty acid profile relative to the profile produced by the organism, however accomplished. The terms "natural oil" and "natural fat" encompass such oils obtained from an organism, where the oil has undergone minimal processing, including refining, bleaching and/or degumming, which does not substantially change its triglyceride profile. A natural oil can also be a "noninteresterified natural oil", which means that the natural oil has not undergone a process in which fatty acids have been redistributed in their acyl linkages to glycerol and remain essentially in the same configuration as when recovered from the organism.
[0032] "Exogenous gene" shall mean a nucleic acid that codes for the expression of an RNA and/or protein that has been introduced into a cell (e.g. by transformation/transfection), and is also referred to as a "transgene". A cell comprising an exogenous gene may be referred to as a recombinant cell, into which additional exogenous gene(s) may be introduced. The exogenous gene may be from a different species (and so heterologous), or from the same species (and so homologous), relative to the cell being transformed. Thus, an exogenous gene can include a homologous gene that occupies a different location in the genome of the cell or is under different control, relative to the endogenous copy of the gene. An exogenous gene may be present in more than one copy in the cell. An exogenous gene may be maintained in a cell, for example, as an insertion into the genome (nuclear or plastid) or as an episomal molecule.
[0033] "Fatty acids" shall mean free fatty acids, fatty acid salts, or fatty acyl moieties in a glycerolipid. It will be understood that fatty acyl groups of glycerolipids can be described in terms of the carboxylic acid or anion of a carboxylic acid that is produced when the triglyceride is hydrolyzed or saponified.
[0034] "Microalgae" are microbial organisms that contain a chloroplast or other plastid, and optionally that are capable of performing photosynthesis, or a prokaryotic microbial organism capable of performing photosynthesis. Microalgae include obligate photoautotrophs, which cannot metabolize a fixed carbon source as energy, as well as heterotrophs, which can live solely off of a fixed carbon source. Microalgae include unicellular organisms that separate from sister cells shortly after cell division, such as Chlamydomonas, as well as microbes such as, for example, Volvox, which is a simple multicellular photosynthetic microbe of two distinct cell types. Microalgae include cells such as Chlorella, Dunaliella, and Prototheca. Microalgae also include other microbial photosynthetic organisms that exhibit cell-cell adhesion, such as Agmenellum, Anabaena, and Pyrobotrys. Microalgae also include obligate heterotrophic microorganisms that have lost the ability to perform photosynthesis, such as certain dinoflagellate algae species and species of the genus Prototheca.
[0035] An "oleaginous" cell is a cell capable of producing at least 20% lipid by dry cell weight, naturally or through recombinant or classical strain improvement. An "oleaginous microbe" or "oleaginous microorganism" is a microbe, including a microalga, that is oleaginous.
[0036] The term "percent sequence identity," in the context of two or more amino acid or nucleic acid sequences, refers to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm or by visual inspection. For sequence comparison to determine percent nucleotide or amino acid identity, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters. Optimal alignment of sequences for comparison can be conducted using the NCBI BLAST software (ncbi.nlm.nih.gov/BLAST/) set to default parameters. For example, to compare two nucleic acid sequences, one may use blastn with the "BLAST 2 Sequences" tool Version 2.0.12 (Apr. 21, 2000) set at the following default parameters: Matrix: BLOSUM62; Reward for match: 1; Penalty for mismatch: -2; Open Gap: 5 and Extension Gap: 2 penalties; Gap x drop-off: 50; Expect: 10; Word Size: 11; Filter: on. For a pairwise comparison of two amino acid sequences, one may use the "BLAST 2 Sequences" tool Version 2.0.12 (Apr. 21, 2000) with blastp set, for example, at the following default parameters: Matrix: BLOSUM62; Open Gap: 11 and Extension Gap: 1 penalties; Gap x drop-off: 50; Expect: 10; Word Size: 3; Filter: on.
[0037] In connection with a natural oil, a "profile" is the distribution of particular species or triglycerides or fatty acyl groups within the oil. A "fatty acid profile" is the distribution of fatty acyl groups in the triglycerides of the oil without reference to attachment to a glycerol backbone. Fatty acid profiles are typically determined by conversion to a fatty acid methyl ester (FAME), followed by gas chromatography (GC) analysis with flame ionization detection (FID). The fatty acid profile can be expressed as one or more percent of a fatty acid in the total fatty acid signal determined from the area under the curve for that fatty acid. FAME-GC-FID measurements approximate weight percentages of the fatty acids.
[0038] As used herein, an oil is said to be "enriched" in one or more particular fatty acids if there is at least a 10% increase in the mass of that fatty acid in the oil relative to the non-enriched oil. For example, in the case of a cell expressing a heterologous FatB gene described herein, the oil produced by the cell is said to be enriched in, e.g., C8 and C16 fatty acids if the mass of these fatty acids in the oil is at least 10% greater than in oil produced by a cell of the same type that does not express the heterologous FatB gene (e.g., wild type oil).
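By way of non-limiting illustration, the following Python sketch shows one way the fatty acid profile and the 10% enrichment criterion defined above might be computed from FAME-GC-FID peak areas. The function names and the peak areas are hypothetical and are not data from the examples. Area percent is used as a stand-in for mass, on the stated basis that FAME-GC-FID measurements approximate weight percentages.

```python
def fatty_acid_profile(peak_areas):
    """Express each fatty acid as a percent of the total fatty acid signal."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


def is_enriched(test_percent, reference_percent, min_relative_increase=0.10):
    """Return True if the test oil shows at least a 10% relative increase."""
    if reference_percent <= 0:
        # Assumption: any detectable amount counts when the reference has none.
        return test_percent > 0
    relative_increase = (test_percent - reference_percent) / reference_percent
    return relative_increase >= min_relative_increase


# Hypothetical integrated peak areas (arbitrary units), not measured data.
wild_type_areas = {"C14:0": 20.0, "C16:0": 280.0, "C18:1": 700.0}
engineered_areas = {"C14:0": 150.0, "C16:0": 300.0, "C18:1": 550.0}

wild_type = fatty_acid_profile(wild_type_areas)
engineered = fatty_acid_profile(engineered_areas)
enriched = {
    name: is_enriched(engineered[name], wild_type[name]) for name in wild_type
}
```

In this hypothetical case only C14:0 meets the criterion; C16:0 rises from 28% to 30%, a relative increase of about 7%, and so falls short of the 10% threshold.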
[0039] "Recombinant" is a cell, nucleic acid, protein or vector that has been modified due to the introduction of an exogenous nucleic acid or the alteration of a native nucleic acid. Thus, e.g., recombinant (host) cells can express genes that are not found within the native (non-recombinant) form of the cell or express native genes differently than those genes are expressed by a non-recombinant cell. Recombinant cells can, without limitation, include recombinant nucleic acids that encode a gene product or suppression elements such as mutations, knockouts, antisense, interfering RNA (RNAi) or dsRNA that reduce the levels of active gene product in a cell. A "recombinant nucleic acid" is a nucleic acid originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using polymerases, ligases, exonucleases, and endonucleases, using chemical synthesis, or otherwise is in a form not normally found in nature. Recombinant nucleic acids may be produced, for example, to place two or more nucleic acids in operable linkage. Thus, an isolated nucleic acid or an expression vector formed in vitro by ligating DNA molecules that are not normally joined in nature are both considered recombinant for the purposes of this invention. Recombinant nucleic acids can also be produced in other ways; e.g., using chemical DNA synthesis. Once a recombinant nucleic acid is made and introduced into a host cell or organism, it may replicate using the in vivo cellular machinery of the host cell; however, such nucleic acids, once produced recombinantly, although subsequently replicated intracellularly, are still considered recombinant for purposes of this invention. Similarly, a "recombinant protein" is a protein made using recombinant techniques, i.e., through the expression of a recombinant nucleic acid.
[0040] Embodiments of the present invention relate to the use of FatB genes isolated from plants, which can be expressed in a host cell in order to alter the fatty acid profile of an oil produced by the recombinant cell. Although the microalga Prototheca moriformis was used to screen the genes for the ability to alter the fatty acid profile, the genes are useful in a wide variety of host cells. For example, the genes can be expressed in bacteria, other microalgae, or higher plants. The genes can be expressed in higher plants according to the methods of US Patent Nos. 5,850,022; 5,723,761; 5,639,790; 5,807,893; 5,455,167; 5,654,495; 5,512,482; 5,298,421; 5,667,997; 5,344,771; and 5,304,481. The fatty acids can be further converted to triglycerides, fatty aldehydes, fatty alcohols and other oleochemicals either synthetically or biosynthetically.
[0041] In specific embodiments, triglycerides are produced by a host cell expressing a novel FatB gene. A triglyceride-containing natural oil can be recovered from the host cell. The natural oil can be refined, degummed, bleached and/or deodorized. The oil, in its natural or processed form, can be used for foods, chemicals, fuels, cosmetics, plastics, and other uses. In other embodiments, the FatB gene may not be novel, but the expression of the gene in a microalga is novel.
[0042] The genes can be used in a variety of genetic constructs including plasmids or other vectors for expression or recombination in a host cell. The genes can be codon optimized for expression in a target host cell. The proteins produced by the genes can be used in vivo or in purified form.
[0043] For example, the gene can be prepared in an expression vector comprising an operably linked promoter and 5' UTR. Where a plastidic cell is used as the host, a suitably active plastid targeting peptide can be fused to the FATB gene, as in the examples below. Generally, for the newly identified FATB genes, there are roughly 50 amino acids at the N-terminus that constitute a plastid transit peptide, which is responsible for transporting the enzyme to the chloroplast. In the examples below, this transit peptide is replaced with a 38 amino acid sequence that is effective in the Prototheca moriformis host cell for transporting the enzyme to the plastids of those cells. Thus, the invention contemplates deletions and fusion proteins in order to optimize enzyme activity in a given host cell. For example, a transit peptide from the host or related species may be used instead of that of the newly discovered plant genes described here.
[0044] A selectable marker gene may be included in the vector to assist in isolating a transformed cell. Examples of selectable markers useful in microalgae include sucrose invertase and antibiotic resistance genes.
[0045] The gene sequences disclosed can also be used to prepare antisense, or inhibitory RNA (e.g., RNAi or hairpin RNA) to inhibit complementary genes in a plant or other organism.
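By way of non-limiting illustration, the transit peptide replacement described above can be sketched in Python as a simple N-terminal swap at the protein-sequence level. The function name, the cleavage position, and all sequences shown are hypothetical placeholders chosen only to match the approximate lengths stated above (about 50 native residues replaced by 38); they are not sequences from the sequence listing.

```python
def swap_transit_peptide(precursor, native_tp_length, replacement_tp):
    """Remove the native N-terminal transit peptide and fuse a replacement."""
    if not 0 <= native_tp_length < len(precursor):
        raise ValueError("native_tp_length must be shorter than the precursor")
    mature_region = precursor[native_tp_length:]
    return replacement_tp + mature_region


# Placeholder sequences for illustration only (not real FATB sequences).
native_transit_peptide = "M" + "S" * 49        # roughly 50 residues
mature_enzyme = "LPDWGHIKLMNPQRST"             # toy mature region
replacement_transit_peptide = "M" + "A" * 37   # 38 residues

native_precursor = native_transit_peptide + mature_enzyme
chimeric_protein = swap_transit_peptide(
    native_precursor,
    native_tp_length=len(native_transit_peptide),
    replacement_tp=replacement_transit_peptide,
)
```

In practice the boundary between transit peptide and mature enzyme would be chosen from sequence analysis or experiment rather than fixed at a set length.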
[0046] FatB genes found to be useful in producing desired fatty acid profiles in a cell are summarized below in Table 1. Nucleic acids or proteins having the sequence of SEQ ID NOS: 1-109 can be used to alter the fatty acid profile of a recombinant cell. Variant nucleic acids can also be used; e.g., variants having at least 70, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to SEQ ID NOS: 2, 3, 5, 6, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 23, 24, 26, 27, 29, 30, 32, 33, 35, 36, 38, 39, 41, 42, 44, 45, 47, 48, 50, 51, 53, 54, 56, 57, 59, 60, 62, 63, 65, 66, 68, 69, 71, 72, 74, 76, 78, 80, 81, 83, 84, 86, 87, 89, 90, 92, 93, 95, 96, 98, 99, 101, 102, 104, 105, 107 or 109. Codon optimization of the genes for a variety of host organisms is contemplated, as is the use of gene fragments. Preferred codons for Prototheca strains and for Chlorella protothecoides are shown below in Tables 2 and 3, respectively. Codon usage for Cuphea wrightii is shown in Table 3a. Codon usage for Arabidopsis is shown in Table 3b; for example, the most preferred codon for each amino acid can be selected. Codon tables for other organisms including microalgae and higher plants are known in the art. In some embodiments, the first and/or second most preferred Prototheca codons are employed for codon optimization. In specific embodiments, the novel amino acid sequences contained in the sequence listings below are converted into nucleic acid sequences according to the most preferred codon usage in Prototheca, Chlorella, Cuphea wrightii, or Arabidopsis as set forth in Tables 2 through 3b or nucleic acid sequences having at least 70, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to these derived nucleic acid sequences.
[0047] In embodiments of the invention, there is a protein or a nucleic acid encoding a protein having any of SEQ ID NOS: 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58, 61, 64, 67, 70, 73, 75, 77, 79, 82, 85, 88, 91, 94, 97, 100, 103, 106, 108, or 110-192. In an embodiment, there is a protein or a nucleic acid encoding a protein having at least 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100% sequence identity with any of SEQ ID NOS: 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58, 61, 64, 67, 70, 73, 75, 77, 79, 82, 85, 88, 91, 94, 97, 100, 103, 106, 108, or 110-192. In certain embodiments, the invention encompasses a fragment of any of the above-described proteins or nucleic acids (including fragments of protein or nucleic acid variants), wherein the protein fragment has acyl-ACP thioesterase activity or the nucleic acid fragment encodes such a protein fragment. In other embodiments, the fragment includes a domain of an acyl-ACP thioesterase that mediates a particular function, e.g., a specificity-determining domain. Illustrative fragments can be produced by C-terminal and/or N-terminal truncations and include at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the full-length sequences disclosed herein.
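By way of non-limiting illustration, conversion of an amino acid sequence into a nucleic acid sequence using the most preferred codon for each amino acid can be sketched in Python as follows. The codon assignments are those listed in Table 3 for Chlorella protothecoides; the function name and the short example peptide are hypothetical. This is a minimal sketch only, and it ignores considerations such as restriction sites, GC content, or use of second most preferred codons.

```python
# Most preferred codon per amino acid, taken from Table 3
# (Chlorella protothecoides); "*" denotes the stop codon.
PREFERRED_CODON = {
    "F": "TTC", "Y": "TAC", "C": "TGC", "*": "TGA",
    "W": "TGG", "P": "CCC", "H": "CAC", "R": "CGC",
    "L": "CTG", "Q": "CAG", "I": "ATC", "T": "ACC",
    "D": "GAC", "S": "TCC", "M": "ATG", "K": "AAG",
    "A": "GCC", "N": "AAC", "G": "GGC", "V": "GTG",
    "E": "GAG",
}


def back_translate(protein, codon_table=PREFERRED_CODON, add_stop=True):
    """Back-translate a protein using one preferred codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in codon_table:
            raise ValueError(f"no codon defined for residue {residue!r}")
        codons.append(codon_table[residue])
    if add_stop and not protein.endswith("*"):
        codons.append(codon_table["*"])
    return "".join(codons)


# Hypothetical five-residue peptide used only to demonstrate the mapping.
example_cds = back_translate("MLPDW")
```

Swapping in a different dictionary, for example one built from the highest-fraction codons of Table 2, 3a, or 3b, would retarget the same function to another host.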
[0048] In certain embodiments, percent sequence identity for variants of the nucleic acids or proteins discussed above can be calculated by using the full-length nucleic acid sequence (e.g., one of SEQ ID NOS: 2, 3, 5, 6, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 23, 24, 26, 27, 29, 30, 32, 33, 35, 36, 38, 39, 41, 42, 44, 45, 47, 48, 50, 51, 53, 54, 56, 57, 59, 60, 62, 63, 65, 66, 68, 69, 71, 72, 74, 76, 78, 80, 81, 83, 84, 86, 87, 89, 90, 92, 93, 95, 96, 98, 99, 101, 102, 104, 105, 107 or 109) or full-length amino acid sequence (e.g., one of SEQ ID NOS: 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58, 61, 64, 67, 70, 73, 75, 77, 79, 82, 85, 88, 91, 94, 97, 100, 103, 106, 108, or 110-192) as the reference sequence and comparing the full-length test sequence to this reference sequence. In some embodiments relating to fragments, percent sequence identity for variants of nucleic acid or protein fragments can be calculated over the entire length of the fragment.
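By way of non-limiting illustration, once a test sequence has been aligned to a reference sequence (for example with BLAST as described above), percent sequence identity can be computed from the aligned strings as in the following Python sketch. The function name and the toy aligned sequences are hypothetical. The denominator used here is the number of alignment columns, which is one common convention and an assumption of this sketch; other conventions (e.g., dividing by the reference length) give different values.

```python
def percent_identity(aligned_reference, aligned_test, gap="-"):
    """Percent of alignment columns with identical residues in both sequences."""
    if len(aligned_reference) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [
        (ref, test)
        for ref, test in zip(aligned_reference, aligned_test)
        if not (ref == gap and test == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for ref, test in columns if ref == test and ref != gap)
    return 100.0 * matches / len(columns)


# Toy pre-aligned sequences (hypothetical): one substitution and one gap.
reference = "MLPDWSKL-T"
candidate = "MLPEWSKLAT"
identity = percent_identity(reference, candidate)
```

Here 8 of the 10 alignment columns match, so the sketch reports 80% identity.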
[0049] The nucleic acids can be in isolated form, or part of a vector or other construct, chromosome or host cell. It has been found that in many cases the full-length gene (and protein) is not needed; for example, deletion of some or all of the N-terminal hydrophobic domain (typically an 18 amino acid domain starting with LPDW) yields a still-functional gene. In addition, fusions of the specificity determining regions of the genes in Table 1 with catalytic domains of other acyl-ACP thioesterases can yield functional genes. Thus, in certain embodiments, the invention encompasses functional fragments (e.g., specificity determining regions) of the disclosed nucleic acid or amino acids fused to heterologous acyl-ACP thioesterase nucleic acid or amino acid sequences, respectively.
Table 1: FatB genes according to embodiments of the present invention
Species | Gene | Sequence variation | SEQ ID NO (amino acid) | SEQ ID NO (nucleic acid) | SEQ ID NO (nucleic acid)
Cuphea hyssopifolia | ChsFATB2b | +a.a.248-259 | SEQ ID NO: 16 | SEQ ID NO: 17 | SEQ ID NO: 18
Cuphea hyssopifolia | ChsFATB3 | "wild-type" | SEQ ID NO: 19 | SEQ ID NO: 20 | SEQ ID NO: 21
Cuphea hyssopifolia | ChsFATB3b | V204I, C239F, E243D, M251V | SEQ ID NO: 22 | SEQ ID NO: 23 | SEQ ID NO: 24
Cuphea PSR23 | CuPSR23FATB3 | "wild-type" | SEQ ID NO: 25 | SEQ ID NO: 26 | SEQ ID NO: 27
Cuphea wrightii | CwFATB3 | "wild-type" | SEQ ID NO: 28 | SEQ ID NO: 29 | SEQ ID NO: 30
Cuphea wrightii | CwFATB4a | "wild-type" | SEQ ID NO: 31 | SEQ ID NO: 32 | SEQ ID NO: 33
Cuphea wrightii | CwFATB4b | "wild-type" | SEQ ID NO: 34 | SEQ ID NO: 35 | SEQ ID NO: 36
Cuphea wrightii | CwFATB5 | "wild-type" | SEQ ID NO: 37 | SEQ ID NO: 38 | SEQ ID NO: 39
Cuphea heterophylla | ChtFATB1a | "wild-type" | SEQ ID NO: 40 | SEQ ID NO: 41 | SEQ ID NO: 42
Cuphea heterophylla | ChtFATB1b | P16S, T20P, G94S, G105W, S293F, L305F | SEQ ID NO: 43 | SEQ ID NO: 44 | SEQ ID NO: 45
Cuphea heterophylla | ChtFATB2b | "wild-type" | SEQ ID NO: 46 | SEQ ID NO: 47 | SEQ ID NO: 48
Cuphea heterophylla | ChtFATB2a | S17P, P21S, T28N, L30P, S33L, G76D, S78P, G137W | SEQ ID NO: 49 | SEQ ID NO: 50 | SEQ ID NO: 51
Cuphea heterophylla | ChtFATB2c | G76D, S78P | SEQ ID NO: 52 | SEQ ID NO: 53 | SEQ ID NO: 54
Cuphea heterophylla | ChtFATB2d | S21P, T28N, L30P, S33L, G76D, R97L, H124L, W127L, I132S, K258N, C303R, E309G, K334T, T386A | SEQ ID NO: 55 | SEQ ID NO: 56 | SEQ ID NO: 57
Cuphea heterophylla | ChtFATB2e | G76D, R97L, H124L, I132S, G152S, H165L, T211N, K258N, C303R, E309G, K334T, T386A | SEQ ID NO: 58 | SEQ ID NO: 59 | SEQ ID NO: 60
Cuphea heterophylla | ChtFATB2f | R97L, H124L, I132S, G152S, H165L, T211N | SEQ ID NO: 61 | SEQ ID NO: 62 | SEQ ID NO: 63
Cuphea heterophylla | ChtFATB2g | A6T, A16V, S17P, G76D, R97L, H124L, I132S, S143I, G152S, A157T, H165L, T211N, G414A | SEQ ID NO: 64 | SEQ ID NO: 65 | SEQ ID NO: 66
Cuphea heterophylla | ChtFATB3a | "wild-type" | SEQ ID NO: 67 | SEQ ID NO: 68 | SEQ ID NO: 69
Cuphea heterophylla | ChtFATB3b | C67G, H72Q, L128F, N179I | SEQ ID NO: 70 | SEQ ID NO: 71 | SEQ ID NO: 72
Cuphea viscosissima | CvisFATB1 | published | SEQ ID NO: 73 | N/A | SEQ ID NO: 74
Cuphea viscosissima | CvisFATB2 | published | SEQ ID NO: 75 | N/A | SEQ ID NO: 76
Cuphea viscosissima | CvisFATB3 | published | SEQ ID NO: 77 | N/A | SEQ ID NO: 78
Cuphea calcarata | CcalcFATB1 | "wild-type" | SEQ ID NO: 79 | SEQ ID NO: 80 | SEQ ID NO: 81
Cuphea painteri | CpaiFATB1 | "wild-type" | SEQ ID NO: 82 | SEQ ID NO: 83 | SEQ ID NO: 84
Cuphea hookeriana | ChookFATB4 | "wild-type" | SEQ ID NO: 85 | SEQ ID NO: 86 | SEQ ID NO: 87
Cuphea avigera var. pulcherrima | CaFATB1 | "wild-type" | SEQ ID NO: 88 | SEQ ID NO: 89 | SEQ ID NO: 90
Cuphea paucipetala | CPauFATB1 | "wild-type" | SEQ ID NO: 91 | SEQ ID NO: 92 | SEQ ID NO: 93
Cuphea procumbens | CprocFATB1 | "wild-type" | SEQ ID NO: 94 | SEQ ID NO: 95 | SEQ ID NO: 96
Cuphea procumbens | CprocFATB2 | "wild-type" | SEQ ID NO: 97 | SEQ ID NO: 98 | SEQ ID NO: 99
Cuphea procumbens | CprocFATB3 | "wild-type" | SEQ ID NO: 100 | SEQ ID NO: 101 | SEQ ID NO: 102
Cuphea ignea | CigneaFATB1 | "wild-type"; partial (missing N-terminal portion of native transit peptide, fused to CpSAD1tp_trimmed transit peptide) | SEQ ID NO: 103 | SEQ ID NO: 104 | SEQ ID NO: 105
Consensus sequence | JcFATB1 | Consensus | SEQ ID NO: 106 | None, can be codon optimized for a given host | SEQ ID NO: 107
Consensus sequence | JcFATB2 | Consensus | SEQ ID NO: 108 | None, can be codon optimized for a given host | SEQ ID NO: 109
In certain embodiments, a host cell (e.g. plant or microalgal cell) is transformed to produce a recombinant FATB protein falling into one of clades 1-12 of Table 1a. These clades were determined by sequence alignment and observation of changes in fatty acid profile when expressed in Prototheca. See Example 5. The FATB amino acid sequence can fall within x% amino acid sequence identity of each sequence in that clade listed in Table 1a, where x is a first, second, or third cutoff value, also listed in Table 1a.
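By way of non-limiting illustration, the clade test described above, in which a candidate sequence must meet the cutoff against each listed member of a clade, can be sketched in Python as follows. The member SEQ ID NOs and the three cutoff values are those given for the clade containing SEQ ID NOs: 187-189; the function names and the candidate identity values are hypothetical and would in practice come from a sequence comparison such as BLAST.

```python
# Members and cutoffs (percent identity) for the clade containing
# SEQ ID NOs: 187-189, as listed in Table 1a.
CLADE_MEMBERS = ("SEQ ID NO: 187", "SEQ ID NO: 188", "SEQ ID NO: 189")
CLADE_CUTOFFS = (83.8, 90.0, 95.0)  # first, second, third cutoff


def falls_within_clade(identities, members, cutoff):
    """True if identity to EACH clade member meets or exceeds the cutoff."""
    missing = [member for member in members if member not in identities]
    if missing:
        raise KeyError(f"no identity value supplied for: {missing}")
    return all(identities[member] >= cutoff for member in members)


def cutoffs_met(identities, members=CLADE_MEMBERS, cutoffs=CLADE_CUTOFFS):
    """Evaluate the candidate against each cutoff in turn."""
    return [falls_within_clade(identities, members, c) for c in cutoffs]


# Hypothetical percent identities of one candidate to each clade member.
candidate_identities = {
    "SEQ ID NO: 187": 96.1,
    "SEQ ID NO: 188": 92.4,
    "SEQ ID NO: 189": 91.0,
}
result = cutoffs_met(candidate_identities)
```

The hypothetical candidate satisfies the first and second cutoffs but not the third, because its lowest identity to any member (91.0%) is below 95%.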
Table 1a: Groupings of Novel FatB genes into clades.
Clade | Sequences | Function | First cutoff (%) | Second cutoff (%) | Third cutoff (%)
7 | CgFATB1b (SEQ ID NO: 185); CprocFATB1 (SEQ ID NO: 172); CpauFATB1 (SEQ ID NO: 171); CprocFATB2 (SEQ ID NO: 173); CprocFATB3 (SEQ ID NO: 174) | | | |
8 | CwFATB3 (SEQ ID NO: 112); CwFATB3a (SEQ ID NO: 113); ChtFATB2e (SEQ ID NO: 142); ChtFATB2h (SEQ ID NO: 145); ChtFATB2f (SEQ ID NO: 143); ChtFATB2g (SEQ ID NO: 144); ChtFATB2a (SEQ ID NO: 139); ChtFATB2c (SEQ ID NO: 140); ChtFATB2b (SEQ ID NO: 138); ChtFATB2d (SEQ ID NO: 141) | Increase C12/C14 fatty acids | 85.9 | 98.9 | 99.5
9 | CcrFATB2c (SEQ ID NO: 187); CcrFATB2 (SEQ ID NO: 188); CcrFATB2b (SEQ ID NO: 189) | Increase C12/C14 fatty acids | 83.8 | 90 | 95
10 | ChtFATB3b (SEQ ID NO: 147); ChtFATB3d (SEQ ID NO: 149); ChtFATB3a (SEQ ID NO: 146); ChtFATB3e (SEQ ID NO: 150); ChtFATB3g (SEQ ID NO: 152); ChtFATB3f (SEQ ID NO: 151); ChtFATB3c (SEQ ID NO: 148); ChsFATB2 (SEQ ID NO: 154); ChsFATB2c (SEQ ID NO: 156); ChsFATB2b (SEQ ID NO: 155); ChsFATB2d (SEQ ID NO: 157); JcFATB2/SzFATB2 (SEQ ID NO: 108); CvisFATB2 (SEQ ID NO: 75); CcrFATB1 (SEQ ID NO: 190); CcrFATB1b (SEQ ID NO: 191); CcrFATB1c (SEQ ID NO: 192) | Increase C14/C16 fatty acids | 95.9 | 98 | 99
11 | CwFATB4b.1 (SEQ ID NO: 121) | Increase C14/C16 fatty acids | 88.7 | 94.5 | 97
12 | CcFATB3 (SEQ ID NO: 129); UcFATB3 (SEQ ID NO: 186) | Increase C16 fatty acids (predicted) | 72.8 | 85 | 90
Table 2: Preferred codon usage in Prototheca strains
Ala | GCG 345 (0.36) | GCA 66 (0.07) | GCT 101 (0.11) | GCC 442 (0.46)
Cys | TGT 12 (0.10) | TGC 105 (0.90)
Asp | GAT 43 (0.12) | GAC 316 (0.88)
Glu | GAG 377 (0.96) | GAA 14 (0.04)
Phe | TTT 89 (0.29) | TTC 216 (0.71)
Gly | GGG 92 (0.12) | GGA 56 (0.07) | GGT 76 (0.10) | GGC 559 (0.71)
His | CAT 42 (0.21) | CAC 154 (0.79)
Ile | ATA 4 (0.01) | ATT 30 (0.08) | ATC 338 (0.91)
Lys | AAG 284 (0.98) | AAA 7 (0.02)
Leu | TTG 26 (0.04) | TTA 3 (0.00) | CTG 447 (0.61) | CTA 20 (0.03) | CTT 45 (0.06) | CTC 190 (0.26)
Met | ATG 191 (1.00)
Asn | AAT 8 (0.04) | AAC 201 (0.96)
Pro | CCG 161 (0.29) | CCA 49 (0.09) | CCT 71 (0.13) | CCC 267 (0.49)
Gln | CAG 226 (0.82) | CAA 48 (0.18)
Arg | AGG 33 (0.06) | AGA 14 (0.02) | CGG 102 (0.18) | CGA 49 (0.08) | CGT 51 (0.09) | CGC 331 (0.57)
Ser | AGT 16 (0.03) | AGC 123 (0.22) | TCG 152 (0.28) | TCA 31 (0.06) | TCT 55 (0.10) | TCC 173 (0.31)
Thr | ACG 184 (0.38) | ACA 24 (0.05) | ACT 21 (0.05) | ACC 249 (0.52)
Val | GTG 308 (0.50) | GTA 9 (0.01) | GTT 35 (0.06) | GTC 262 (0.43)
Trp | TGG 107 (1.00)
Tyr | TAT 10 (0.05) | TAC 180 (0.95)
Stop | TGA/TAG/TAA
Table 3: Preferred codon usage in Chlorella protothecoides
TTC (Phe) | TAC (Tyr) | TGC (Cys) | TGA (Stop)
TGG (Trp) | CCC (Pro) | CAC (His) | CGC (Arg)
CTG (Leu) | CAG (Gln) | ATC (Ile) | ACC (Thr)
GAC (Asp) | TCC (Ser) | ATG (Met) | AAG (Lys)
GCC (Ala) | AAC (Asn) | GGC (Gly) | GTG (Val)
GAG (Glu)
Table 3a: Codon usage for Cuphea wrightii
UUU F 0.48 19.5 (52) | UCU S 0.21 19.5 (52) | UAU Y 0.45 6.4 (17) | UGU C 0.41 10.5 (28)
UUC F 0.52 21.3 (57) | UCC S 0.26 23.6 (63) | UAC Y 0.55 7.9 (21) | UGC C 0.59 15.0 (40)
UUA L 0.07 5.2 (14) | UCA S 0.18 16.8 (45) | UAA * 0.33 0.7 (2) | UGA * 0.33 0.7 (2)
UUG L 0.19 14.6 (39) | UCG S 0.11 9.7 (26) | UAG * 0.33 0.7 (2) | UGG W 1.00 15.4 (41)
CUU L 0.27 21.0 (56) | CCU P 0.48 21.7 (58) | CAU H 0.60 11.2 (30) | CGU R 0.09 5.6 (15)
CUC L 0.22 17.2 (46) | CCC P 0.16 7.1 (19) | CAC H 0.40 7.5 (20) | CGC R 0.13 7.9 (21)
CUA L 0.13 10.1 (27) | CCA P 0.21 9.7 (26) | CAA Q 0.31 8.6 (23) | CGA R 0.11 6.7 (18)
CUG L 0.12 9.7 (26) | CCG P 0.16 7.1 (19) | CAG Q 0.69 19.5 (52) | CGG R 0.16 9.4 (25)
AUU I 0.44 22.8 (61) | ACU T 0.33 16.8 (45) | AAU N 0.66 31.4 (84) | AGU S 0.18 16.1 (43)
AUC I 0.29 15.4 (41) | ACC T 0.27 13.9 (37) | AAC N 0.34 16.5 (44) | AGC S 0.07 6.0 (16)
AUA I 0.27 13.9 (37) | ACA T 0.26 13.5 (36) | AAA K 0.42 21.0 (56) | AGA R 0.24 14.2 (38)
AUG M 1.00 28.1 (75) | ACG T 0.14 7.1 (19) | AAG K 0.58 29.2 (78) | AGG R 0.27 16.1 (43)
GUU V 0.28 19.8 (53) | GCU A 0.35 31.4 (84) | GAU D 0.63 35.9 (96) | GGU G 0.29 26.6 (71)
GUC V 0.21 15.0 (40) | GCC A 0.20 18.0 (48) | GAC D 0.37 21.0 (56) | GGC G 0.20 18.0 (48)
GUA V 0.14 10.1 (27) | GCA A 0.33 29.6 (79) | GAA E 0.41 18.3 (49) | GGA G 0.35 31.4 (84)
GUG V 0.36 25.1 (67) | GCG A 0.11 9.7 (26) | GAG E 0.59 26.2 (70) | GGG G 0.16 14.2 (38)
Table 3b: Codon usage for Arabidopsis
UUU F 0.51 21.8 (678320) | UCU S 0.28 25.2 (782818) | UAU Y 0.52 14.6 (455089) | UGU C 0.60 10.5 (327640)
UUC F 0.49 20.7 (642407) | UCC S 0.13 11.2 (348173) | UAC Y 0.48 13.7 (427132) | UGC C 0.40 7.2 (222769)
UUA L 0.14 12.7 (394867) | UCA S 0.20 18.3 (568570) | UAA * 0.36 0.9 (29405) | UGA * 0.44 1.2 (36260)
UUG L 0.22 20.9 (649150) | UCG S 0.10 9.3 (290158) | UAG * 0.20 0.5 (16417) | UGG W 1.00 12.5 (388049)
CUU L 0.26 24.1 (750114) | CCU P 0.38 18.7 (580962) | CAU H 0.61 13.8 (428694) | CGU R 0.17 9.0 (280392)
CUC L 0.17 16.1 (500524) | CCC P 0.11 5.3 (165252) | CAC H 0.39 8.7 (271155) | CGC R 0.07 3.8 (117543)
CUA L 0.11 9.9 (307000) | CCA P 0.33 16.1 (502101) | CAA Q 0.56 19.4 (604800) | CGA R 0.12 6.3 (195736)
CUG L 0.11 9.8 (305822) | CCG P 0.18 8.6 (268115) | CAG Q 0.44 15.2 (473809) | CGG R 0.09 4.9 (151572)
AUU I 0.41 21.5 (668227) | ACU T 0.34 17.5 (544807) | AAU N 0.52 22.3 (693344) | AGU S 0.16 14.0 (435738)
AUC I 0.35 18.5 (576287) | ACC T 0.20 10.3 (321640) | AAC N 0.48 20.9 (650826) | AGC S 0.13 11.3 (352568)
AUA I 0.24 12.6 (391867) | ACA T 0.31 15.7 (487161) | AAA K 0.49 30.8 (957374) | AGA R 0.35 19.0 (589788)
AUG M 1.00 24.5 (762852) | ACG T 0.15 7.7 (240652) | AAG K 0.51 32.7 (1016176) | AGG R 0.20 11.0 (340922)
GUU V 0.40 27.2 (847061) | GCU A 0.43 28.3 (880808) | GAU D 0.68 36.6 (1139637) | GGU G 0.34 22.2 (689891)
GUC V 0.19 12.8 (397008) | GCC A 0.16 10.3 (321500) | GAC D 0.32 17.2 (535668) | GGC G 0.14 9.2 (284681)
GUA V 0.15 9.9 (308605) | GCA A 0.27 17.5 (543180) | GAA E 0.52 34.3 (1068012) | GGA G 0.37 24.2 (751489)
GUG V 0.26 17.4 (539873) | GCG A 0.14 9.0 (280804) | GAG E 0.48 32.2 (1002594) | GGG G 0.16 10.2 (316620)
Host Cells
[0050] The host cell can be a single cell (e.g., microalga, bacteria, yeast) or part of a multicellular organism such as a plant or fungus. Methods for expressing FatB genes in a plant are given in US Patent Nos. 5,850,022; 5,723,761; 5,639,790; 5,807,893; 5,455,167; 5,654,495; 5,512,482; 5,298,421; 5,667,997; 5,344,771; and 5,304,481, or can be accomplished using other techniques generally known in plant biotechnology. Engineering of oleaginous microbes including those of Chlorophyta is disclosed in WO2010/063032, WO2011/150411, and WO2012/106560 and in the examples below.
[0051] Examples of oleaginous host cells include plant cells and microbial cells having a type II fatty acid biosynthetic pathway, including plastidic oleaginous cells such as those of oleaginous algae. Specific examples of microalgal cells include heterotrophic or obligate heterotrophic microalgae of the phylum Chlorophyta, the class Trebouxiophyceae, the order Chlorellales, or the family Chlorellaceae. Examples of oleaginous microalgae are provided in Published PCT Patent Applications WO2008/151149, WO2010/063032, WO2011/150410, and WO2011/150411, including species of Chlorella and Prototheca, a genus comprising obligate heterotrophs. The oleaginous cells can be, for example, capable of producing 25, 30, 40, 50, 60, 70, 80, 85, or about 90% oil by cell weight, ±5%. Optionally, the oils produced can be low in DHA or EPA fatty acids. For example, the oils can comprise less than 5%, 2%, or 1% DHA and/or EPA. The above-mentioned publications also disclose methods for cultivating such cells and extracting oil, especially from microalgal cells; such methods are applicable to the cells disclosed herein and incorporated by reference for these teachings. When microalgal cells are used they can be cultivated autotrophically (unless an obligate heterotroph) or in the dark using a sugar (e.g., glucose, fructose and/or sucrose). In any of the embodiments described herein, the cells can be heterotrophic cells comprising an exogenous invertase gene so as to allow the cells to produce oil from a sucrose feedstock. Alternatively, or in addition, the cells can metabolize xylose from cellulosic feedstocks. For example, the cells can be genetically engineered to express one or more xylose metabolism genes such as those encoding an active xylose transporter, a xylulose-5-phosphate transporter, a xylose isomerase, a xylulokinase, a xylitol dehydrogenase and a xylose reductase. See WO2012/154626, "GENETICALLY ENGINEERED MICROORGANISMS THAT METABOLIZE XYLOSE", published Nov 15, 2012.
Oils and Related Products
[0052] The oleaginous cells express one or more exogenous genes encoding fatty acid biosynthesis enzymes. As a result, some embodiments feature natural oils that were previously not obtainable from a non-plant or non-seed source, or not obtainable at all.
[0053] The oleaginous cells produce a storage oil, which is primarily triacylglyceride and may be stored in storage bodies of the cell. A raw oil may be obtained from the cells by disrupting the cells and isolating the oil. WO2008/151149, WO2010/063032, WO2011/150410, and WO2011/1504 disclose heterotrophic cultivation and oil isolation techniques. For example, oil may be obtained by cultivating, drying and pressing the cells. The oils produced may be refined, bleached and deodorized (RBD) as known in the art or as described in WO2010/120939. The raw or RBD oils may be used in a variety of food, chemical, and industrial products or processes. After recovery of the oil, a valuable residual biomass remains. Uses for the residual biomass include the production of paper, plastics, absorbents, and adsorbents, as well as use as animal feed, for human nutrition, or as fertilizer.
[0054] Where a fatty acid profile of a triglyceride (also referred to as a "triacylglyceride" or "TAG") cell oil is given here, it will be understood that this refers to a nonfractionated sample of the storage oil extracted from the cell analyzed under conditions in which phospholipids have been removed or with an analysis method that is substantially insensitive to the fatty acids of the phospholipids (e.g., using chromatography and mass spectrometry). The oil may be subjected to an RBD process to remove phospholipids, free fatty acids and odors yet have only minor or negligible changes to the fatty acid profile of the triglycerides in the oil. Because the cells are oleaginous, in some cases the storage oil will constitute the bulk of all the TAGs in the cell.
[0055] The stable carbon isotope value δ13C is an expression of the ratio of 13C/12C relative to a standard (e.g., PDB, carbonate of fossil skeleton of Belemnite americana from the Peedee formation of South Carolina). The stable carbon isotope value δ13C (‰) of the oils can be related to the δ13C value of the feedstock used. In some embodiments, the oils are derived from oleaginous organisms heterotrophically grown on sugar derived from a C4 plant such as corn or sugarcane. In some embodiments the δ13C (‰) of the oil is from -10 to -17‰ or from -13 to -16‰.
[0056] The oils produced according to the above methods in some cases are made using a microalgal host cell. As described above, the microalga can, without limitation, fall within the classification of Chlorophyta, Trebouxiophyceae, Chlorellales, Chlorellaceae, or Chlorophyceae. It has been found that oils from microalgae of Trebouxiophyceae can be distinguished from vegetable oils based on their sterol profiles. Oil produced by Chlorella protothecoides was found to contain sterols that appeared to be brassicasterol, ergosterol, campesterol, stigmasterol, and β-sitosterol when detected by GC-MS. However, it is believed that all sterols produced by Chlorella have C24β stereochemistry. Thus, it is believed that the molecules detected as campesterol, stigmasterol, and β-sitosterol are actually 22,23-dihydrobrassicasterol, poriferasterol, and clionasterol, respectively. Thus, the oils produced by the microalgae described above can be distinguished from plant oils by the presence of sterols with C24β stereochemistry and the absence of C24α stereochemistry in the sterols present. For example, the oils produced may contain 22,23-dihydrobrassicasterol while lacking campesterol; contain clionasterol while lacking β-sitosterol; and/or contain poriferasterol while lacking stigmasterol. Alternatively, or in addition, the oils may contain significant amounts of Δ7-poriferasterol.
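As a worked illustration of the δ13C value defined in paragraph [0055], the following Python sketch computes δ13C in per mil from a measured 13C/12C ratio and checks it against a range. The function names, the sample ratio, and the default PDB ratio (0.0112372, a commonly cited figure) are assumptions made for illustration and are not taken from this disclosure.

```python
# Commonly cited 13C/12C ratio of the PDB standard.
# Assumed value for illustration; it does not come from this disclosure.
R_PDB = 0.0112372


def delta13c_permil(r_sample: float, r_standard: float = R_PDB) -> float:
    """Return delta13C in per mil: (R_sample / R_standard - 1) * 1000."""
    if r_standard <= 0:
        raise ValueError("standard ratio must be positive")
    return (r_sample / r_standard - 1.0) * 1000.0


def in_range(delta: float, low: float, high: float) -> bool:
    """True when delta lies within [low, high] per mil (bounds inclusive)."""
    return low <= delta <= high


if __name__ == "__main__":
    # Hypothetical measured 13C/12C ratio for an oil sample.
    r_measured = 0.0110686
    d = delta13c_permil(r_measured)
    print(f"delta13C = {d:.1f} per mil")
    print("within -17 to -10 per mil:", in_range(d, -17.0, -10.0))
```

Passing the standard ratio as a parameter keeps the sketch usable with a different reference scale if one is preferred.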
[0057] In one embodiment, the oils provided herein are not vegetable oils. Vegetable oils are oils extracted from plants and plant seeds. Vegetable oils can be distinguished from the non-plant oils provided herein on the basis of their oil content. A variety of methods for analyzing the oil content can be employed to determine the source of the oil or whether adulteration of an oil provided herein with an oil of a different (e.g., plant) origin has occurred. The determination can be made on the basis of one or a combination of the analytical methods. These tests include but are not limited to analysis of one or more of free fatty acids, fatty acid profile, total triacylglycerol content, diacylglycerol content, peroxide values, spectroscopic properties (e.g., UV absorption), sterol profile, sterol degradation products, antioxidants (e.g., tocopherols), pigments (e.g., chlorophyll), δ13C values and sensory analysis (e.g., taste, odor, and mouth feel). Many such tests have been standardized for commercial oils such as the Codex Alimentarius standards for edible fats and oils.
[0058] Sterol profile analysis is a particularly well-known method for determining the biological source of organic matter. Campesterol, β-sitosterol, and stigmasterol are common plant sterols, with β-sitosterol being a principal plant sterol. For example, β-sitosterol was found to be in greatest abundance in an analysis of certain seed oils, approximately 64% in corn, 29% in rapeseed, 64% in sunflower, 74% in cottonseed, 26% in soybean, and 79% in olive oil (Gul et al. J. Cell and Molecular Biology 5:71-79, 2006).
[0059] Oil samples isolated from Prototheca moriformis strain UTEX1435 were separately clarified (CL), refined and bleached (RB), or refined, bleached and deodorized (RBD) and were tested for sterol content according to the procedure described in JAOCS vol. 60, no. 8, August 1983. Results of the analysis are shown below (units in mg/100 g):
[0060] These results show three striking features. First, ergosterol was found to be the most abundant of all the sterols, accounting for about 50% or more of the total sterols. The amount of ergosterol is greater than that of campesterol, β-sitosterol, and stigmasterol combined. Ergosterol is a sterol commonly found in fungi and not commonly found in plants, and its presence, particularly in significant amounts, serves as a useful marker for non-plant oils. Secondly, the oil was found to contain brassicasterol. With the exception of rapeseed oil, brassicasterol is not commonly found in plant-based oils. Thirdly, less than 2% β-sitosterol was found to be present. β-sitosterol is a prominent plant sterol not commonly found in microalgae, and its presence, particularly in significant amounts, serves as a useful marker for oils of plant origin. In summary, Prototheca moriformis strain UTEX1435 has been found to contain both significant amounts of ergosterol and only trace amounts of β-sitosterol as a percentage of total sterol content. Accordingly, the ratio of ergosterol to β-sitosterol, alone or in combination with the presence of brassicasterol, can be used to distinguish this oil from plant oils.
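The marker logic described above (high ergosterol, low β-sitosterol, each as a percentage of total sterols) can be sketched in Python as follows. This is a minimal illustration only: the sterol amounts are hypothetical and are not the measured values, and the default thresholds of 25% and 5% are one combination recited among the embodiments, used here as an assumption.

```python
from __future__ import annotations

# Hypothetical sterol amounts in mg/100 g (illustrative, not measured data).
HYPOTHETICAL_SAMPLE = {
    "ergosterol": 300.0,
    "brassicasterol": 30.0,
    "beta-sitosterol": 8.0,
    "other sterols": 200.0,
}


def sterol_percentages(amounts: dict[str, float]) -> dict[str, float]:
    """Convert absolute sterol amounts to percentages of total sterols."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total sterol amount must be positive")
    return {name: 100.0 * value / total for name, value in amounts.items()}


def consistent_with_non_plant_oil(
    amounts: dict[str, float],
    min_ergosterol_pct: float = 25.0,
    max_beta_sitosterol_pct: float = 5.0,
) -> bool:
    """True when ergosterol is high and beta-sitosterol is low (assumed thresholds)."""
    pct = sterol_percentages(amounts)
    return (
        pct.get("ergosterol", 0.0) >= min_ergosterol_pct
        and pct.get("beta-sitosterol", 0.0) < max_beta_sitosterol_pct
    )


if __name__ == "__main__":
    pct = sterol_percentages(HYPOTHETICAL_SAMPLE)
    for name, value in pct.items():
        print(f"{name}: {value:.1f}% of total sterols")
    print("consistent with a non-plant oil:",
          consistent_with_non_plant_oil(HYPOTHETICAL_SAMPLE))
```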
[0061] In some embodiments, the oil content of an oil provided herein contains, as a percentage of total sterols, less than 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1% β-sitosterol. In other embodiments the oil is free from β-sitosterol.
[0062] In some embodiments, the oil is free from one or more of β-sitosterol, campesterol, or stigmasterol. In some embodiments the oil is free from β-sitosterol, campesterol, and stigmasterol. In some embodiments the oil is free from campesterol. In some embodiments the oil is free from stigmasterol.
[0063] In some embodiments, the oil content of an oil provided herein comprises, as a percentage of total sterols, less than 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1% 24-ethylcholest-5-en-3-ol. In some embodiments, the 24-ethylcholest-5-en-3-ol is clionasterol. In some embodiments, the oil content of an oil provided herein comprises, as a percentage of total sterols, at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10% clionasterol.
[0064] In some embodiments, the oil content of an oil provided herein contains, as a percentage of total sterols, less than 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1% 24-methylcholest-5-en-3-ol. In some embodiments, the 24-methylcholest-5-en-3-ol is 22,23-dihydrobrassicasterol. In some embodiments, the oil content of an oil provided herein comprises, as a percentage of total sterols, at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10% 22,23-dihydrobrassicasterol.
[0065] In some embodiments, the oil content of an oil provided herein contains, as a percentage of total sterols, less than 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1% 5,22-cholestadien-24-ethyl-3-ol. In some embodiments, the 5,22-cholestadien-24-ethyl-3-ol is poriferasterol. In some embodiments, the oil content of an oil provided herein comprises, as a percentage of total sterols, at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10% poriferasterol.
[0066] In some embodiments, the oil content of an oil provided herein contains ergosterol or brassicasterol or a combination of the two. In some embodiments, the oil content contains, as a percentage of total sterols, at least 5%, 10%, 20%, 25%, 35%, 40%, 45%, 50%, 55%, 60%, or 65% ergosterol. In some embodiments, the oil content contains, as a percentage of total sterols, at least 25% ergosterol. In some embodiments, the oil content contains, as a percentage of total sterols, at least 40% ergosterol. In some embodiments, the oil content contains, as a percentage of total sterols, at least 5%, 10%, 20%, 25%, 35%, 40%, 45%, 50%, 55%, 60%, or 65% of a combination of ergosterol and brassicasterol.
[0067] In some embodiments, the oil content contains, as a percentage of total sterols, at least 1%, 2%, 3%, 4% or 5% brassicasterol. In some embodiments, the oil content contains, as a percentage of total sterols, less than 10%, 9%, 8%, 7%, 6%, or 5% brassicasterol.
[0068] In some embodiments the ratio of ergosterol to brassicasterol is at least 5:1, 10:1, 15:1, or 20:1.
[0069] In some embodiments, the oil content contains, as a percentage of total sterols, at least 5%, 10%, 20%, 25%, 35%, 40%, 45%, 50%, 55%, 60%, or 65% ergosterol and less than 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1% β-sitosterol. In some embodiments, the oil content contains, as a percentage of total sterols, at least 25% ergosterol and less than 5% β-sitosterol. In some embodiments, the oil content further comprises brassicasterol.
[0070] Sterols contain from 27 to 29 carbon atoms (C27 to C29) and are found in all eukaryotes. Animals exclusively make C27 sterols as they lack the ability to further modify the C27 sterols to produce C28 and C29 sterols. Plants, however, are able to synthesize C28 and C29 sterols, and C28/C29 plant sterols are often referred to as phytosterols. The sterol profile of a given plant is high in C29 sterols, and the primary sterols in plants are typically the C29 sterols β-sitosterol and stigmasterol. In contrast, the sterol profile of non-plant organisms contains greater percentages of C27 and C28 sterols. For example, the sterols in fungi and in many microalgae are principally C28 sterols. The sterol profile, and particularly the striking predominance of C29 sterols over C28 sterols in plants, has been exploited for determining the proportion of plant and marine matter in soil samples (Huang, Wen-Yen, Meinschein W. G., "Sterols as ecological indicators"; Geochimica et Cosmochimica Acta. Vol 43. pp 739-745).
[0071] In some embodiments the primary sterols in the microalgal oils provided herein are sterols other than β-sitosterol and stigmasterol. In some embodiments of the microalgal oils, C29 sterols make up less than 50%, 40%, 30%, 20%, 10%, or 5% by weight of the total sterol content.
[0072] In some embodiments the microalgal oils provided herein contain C28 sterols in excess of C29 sterols. In some embodiments of the microalgal oils, C28 sterols make up greater than 50%, 60%, 70%, 80%, 90%, or 95% by weight of the total sterol content. In some embodiments the C28 sterol is ergosterol. In some embodiments the C28 sterol is brassicasterol.
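The comparison of C28 and C29 sterols by weight can be sketched in Python by grouping sterols according to carbon number. The mapping below uses the standard carbon counts of the named sterols; the sample amounts and the function name are hypothetical and serve only to illustrate the calculation.

```python
from __future__ import annotations

# Carbon number of some sterols named in this section.
CARBON_NUMBER = {
    "cholesterol": 27,
    "ergosterol": 28,
    "brassicasterol": 28,
    "campesterol": 28,
    "beta-sitosterol": 29,
    "stigmasterol": 29,
}


def carbon_class_percentages(amounts: dict[str, float]) -> dict[int, float]:
    """Sum sterol amounts per carbon number and return weight percentages."""
    totals: dict[int, float] = {}
    for name, amount in amounts.items():
        if name not in CARBON_NUMBER:
            raise ValueError(f"no carbon number recorded for {name!r}")
        carbons = CARBON_NUMBER[name]
        totals[carbons] = totals.get(carbons, 0.0) + amount
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total sterol amount must be positive")
    return {c: 100.0 * t / grand_total for c, t in totals.items()}


if __name__ == "__main__":
    # Hypothetical sterol amounts by weight (illustrative only).
    sample = {
        "ergosterol": 60.0,
        "brassicasterol": 5.0,
        "beta-sitosterol": 2.0,
        "cholesterol": 3.0,
    }
    pct = carbon_class_percentages(sample)
    for carbons in sorted(pct):
        print(f"C{carbons}: {pct[carbons]:.1f}% of total sterols")
    print("C28 in excess of C29:", pct.get(28, 0.0) > pct.get(29, 0.0))
```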
[0073] In embodiments of the present invention, oleaginous cells expressing one or more of the genes of Table 1 can produce an oil with at least 20, 40, 60 or 70% of C8, C10, C12, C14 or C16 fatty acids. In a specific embodiment, the level of myristate (C14:0) in the oil is greater than 30%.
[0074] Thus, in embodiments of the invention, there is a process for producing an oil, triglyceride, fatty acid, or derivative of any of these, comprising transforming a cell with any of the nucleic acids discussed herein. In another embodiment, the transformed cell is cultivated to produce an oil and, optionally, the oil is extracted. Oil extracted in this way can be used to produce food, oleochemicals or other products.
[0075] The oils discussed above, alone or in combination, are useful in the production of foods, fuels and chemicals (including plastics, foams, films, etc.). The oils, triglycerides, or fatty acids from the oils may be subjected to C-H activation, hydroaminomethylation, methoxy-carbonation, ozonolysis, enzymatic transformations, epoxidation, methylation, dimerization, thiolation, metathesis, hydroalkylation, lactonization, or other chemical processes.
[0076] After extracting the oil, a residual biomass may be left, which may have use as a fuel, as an animal feed, or as an ingredient in paper, plastic, or other product. For example, residual biomass from heterotrophic algae can be used in such products.
Example 1. Discovery of Novel FATB sequences
[0077] Sequences of novel plant acyl-ACP thioesterases involved in seed-specific mid-chain (C8-C16) fatty acid biosynthesis in higher plants were isolated. Seed-specific lipid production genes were isolated through direct interrogation of RNA pools accumulating in oilseeds. Based on phylogenetic analysis, the novel enzymes can be classified as members of the FatB family of acyl-ACP thioesterases.
[0078] Seeds of oleaginous plants were obtained from local grocery stores or requested through the USDA ARS National Plant Germplasm System (NPGS) from the North Central Regional Plant Introduction Station (NCRIS) or the USDA ARS North Central Soil Conservation Research Laboratory (Morris, MN). Dry seeds were homogenized in liquid nitrogen to a powder, resuspended in cold extraction buffer containing 6-8 M urea and 3 M LiCl, and left on ice for a few hours to overnight at 4 °C. The seed homogenate was passed through NucleoSpin Filters (Macherey-Nagel) by centrifugation at 20,000 g for 20 minutes in a refrigerated microcentrifuge (4 °C). The resulting RNA pellets were resuspended in buffer (20 mM Tris-HCl, pH 7.5, 0.5% SDS, 100 mM NaCl, 25 mM EDTA, 2% PVPP), and RNA was subsequently extracted once with phenol-chloroform-isoamyl alcohol (25:24:1, v/v) and once with chloroform. RNA was finally precipitated with isopropyl alcohol (0.7 vol.) in the presence of 150 mM sodium acetate, pH 5.2, washed with 80% ethanol by centrifugation, and dried. RNA samples were treated with Turbo DNAse (Lifetech) and purified further using RNeasy kits (Qiagen) following the manufacturers' protocols. The resulting purified RNA samples were converted to paired-end cDNA libraries and subjected to next-generation sequencing (2×100 bp) using the Illumina Hiseq 2000 platform. RNA sequence reads were assembled into corresponding seed transcriptomes using the Trinity or Oases packages. Putative thioesterase-containing cDNA contigs were identified by mining transcriptomes for sequences with homology to known thioesterases. These in silico identified putative thioesterase cDNAs were further verified by direct reverse transcription PCR analysis using seed RNA and primer pairs targeting full-length thioesterase cDNAs. The resulting amplified products were cloned and sequenced de novo to confirm the authenticity of the identified thioesterase genes.
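The step of mining a transcriptome for sequences with homology to known thioesterases can be pictured with the toy Python sketch below. It is not the method used in this example, which is not specified beyond a homology search; it substitutes a simple shared k-mer score for a real alignment tool. The reference fragment is the first 25 residues of ChsFATB3j as listed in this document, while the contig names, contig sequences, k-mer size, and threshold are hypothetical.

```python
from __future__ import annotations


def kmers(seq: str, k: int = 3) -> set[str]:
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(query: str, reference: str, k: int = 3) -> float:
    """Fraction of the query's k-mers that also occur in the reference."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        return 0.0
    return len(query_kmers & kmers(reference, k)) / len(query_kmers)


def screen_contigs(
    contigs: dict[str, str],
    references: list[str],
    k: int = 3,
    min_fraction: float = 0.5,
) -> list[str]:
    """Names of translated contigs whose best score reaches the threshold."""
    hits = []
    for name, protein in contigs.items():
        best = max(
            (shared_kmer_fraction(protein, ref, k) for ref in references),
            default=0.0,
        )
        if best >= min_fraction:
            hits.append(name)
    return hits


if __name__ == "__main__":
    # First 25 residues of ChsFATB3j, used as a stand-in "known" sequence.
    reference = "MVAAEASSALFSVRTPGTSPKPGKF"
    # Hypothetical translated contigs.
    contigs = {
        "contig_1": "AAEASSALFSVRTPGTSPKP",
        "contig_2": "MKKLLWQYHDDE",
    }
    print(screen_contigs(contigs, [reference]))
```

In practice an alignment-based search would replace the k-mer score; the sketch only shows the shape of the filter (score each contig against known sequences, keep those above a cutoff).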
[0079] To interrogate the evolutionary and functional relationships between the novel acyl-ACP thioesterases and the members of the two existing thioesterase classes (FatA and FatB), we performed a phylogenetic analysis using published full-length (Mayer and Shanklin, 2007) and truncated (THYME database) amino acid thioesterase sequences. The novel proteins appear to group with known acyl-ACP FatB thioesterases involved in biosynthesis of C8-C16 fatty acids. Moreover, the novel thioesterases appear to cluster into 3 predominant out-groups, suggesting distinct functional similarity and evolutionary relatedness among members of each cluster.
[0080] The amino acid sequences of the FatB genes are shown in Table 4.
Table 4: Amino acid sequences of FatB genes:
CwFATB4b SEQ ID NO: 120
CwFATB4b.1 SEQ ID NO: 121
CwFATB5 SEQ ID NO: 122
CwFATB5a SEQ ID NO: 123
CwFATB5b SEQ ID NO: 124
CwFATB5c SEQ ID NO: 125
CwFATB5.1 SEQ ID NO: 126
CwFATB5.1a SEQ ID NO: 127
CcFATB2b SEQ ID NO: 128
CcFATB3 SEQ ID NO: 129
CcFATB3b SEQ ID NO: 130
CcFATB3c SEQ ID NO: 131
ChtFATB1a SEQ ID NO: 132
ChtFATB1a.1 SEQ ID NO: 133
ChtFATB1a.2 SEQ ID NO: 134
ChtFATB1a.3 SEQ ID NO: 135
ChtFATB1a.4 SEQ ID NO: 136
ChtFATB1b SEQ ID NO: 137
ChtFATB2b SEQ ID NO: 138
ChtFATB2a SEQ ID NO: 139
ChtFATB2c SEQ ID NO: 140
ChtFATB2d SEQ ID NO: 141
ChtFATB2e SEQ ID NO: 142
ChtFATB2f SEQ ID NO: 143
ChtFATB2g SEQ ID NO: 144
ChtFATB2h SEQ ID NO: 145
ChtFATB3a SEQ ID NO: 146
ChtFATB3b SEQ ID NO: 147
ChtFATB3c SEQ ID NO: 148
ChtFATB3d SEQ ID NO: 149
ChtFATB3e SEQ ID NO: 150
ChtFATB3f SEQ ID NO: 151
ChtFATB3g SEQ ID NO: 152
ChsFATB1 SEQ ID NO: 153
ChsFATB2 SEQ ID NO: 154
ChsFatB2b SEQ ID NO: 155
ChsFatB2c SEQ ID NO: 156
ChsFatB2d SEQ ID NO: 157
ChsFATB3 SEQ ID NO: 158
ChsFatb3b SEQ ID NO: 159
ChsFatB3c SEQ ID NO: 160
ChsFATB3d SEQ ID NO: 161
ChsFATB3e SEQ ID NO: 162
ChsFATB3f SEQ ID NO: 163
ChsFATB3g SEQ ID NO: 164
ChsFATB3h SEQ ID NO: 165
ChsFATB3i SEQ ID NO: 166
ChsFATB3j SEQ ID NO: 167
ChsFATB3j:
MVAAEASSALFSVRTPGTSPKPGKFGNWPTSLSVPFKSKSNHNGGFQV KANASARPKANGSAVSLKSGSLDTQEDTSSSSSPPRTFINQLPDWSMLLSAITT VFVAAEKQWTMLDRKSKRPDMLMDPFGVDRVVQDGAVFRQSFSIRSYEIGA DRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKRDLIWVVTKMHIEV NRYPTWGDTIEVNTWVSESGKTGMGRDWLISDFHTGDILIRATSVCAMMNQ KTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPRW NDLDVNQHV NVKYIGWILESVPTEVFETQELCGLTLEYRQECGRDSVLESV TAMDPSKEGDRSLYQHLLRLEDGTDIAKGRTKWRPK AGKTSNGNSIS
Example 2. Cloning and fatty acid analysis of cells transformed with novel FATB genes
[0081] In the example below, we detail the effect of expressing plant oilseed transcriptome-derived, heterologous thioesterases in the UTEX1435 (web.biosci.utexas.edu/utex/) strain, Strain A.
[0082] As in Example 1, RNA was extracted from dried plant seeds and submitted for paired-end sequencing using the Illumina Hiseq 2000 platform. RNA sequence reads were assembled into corresponding seed transcriptomes using the Trinity or Oases packages, and putative thioesterase-containing cDNA contigs were identified by mining transcriptomes for sequences with homology to known thioesterases.
These in silico identified putative thioesterase cDNAs were verified by direct reverse transcription PCR analysis using seed RNA and primer pairs targeting full-length thioesterase cDNAs. The resulting amplified products were cloned and sequenced de novo to confirm authenticity of identified thioesterase genes and to identify sequence variants arising from expression of different gene alleles or diversity of sequences within a population of seeds. The resulting amino acid sequences were subjected to phylogenetic analysis using published full-length (Mayer and Shanklin, 2007) and truncated (THYME database) FatB sequences. The thioesterases that clustered with acyl-ACP FatB thioesterases, which are involved in biosynthesis of C8-C16 fatty acids, were pursued.
Construction of Transforming Vectors Expressing Acyl-ACP FatB Thioesterases
[0083] 27 putative acyl-ACP FatB thioesterases from the species Cinnamomum camphora, Cuphea hyssopifolia, Cuphea PSR23, Cuphea wrightii, Cuphea heterophylla, and Cuphea viscosissima were synthesized in a codon-optimized form to reflect Prototheca moriformis (UTEX 1435) codon usage. Of the 27 genes synthesized, 24 were identified by our transcriptome sequencing efforts and the 3 genes from Cuphea viscosissima were from published sequences in GenBank.
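One simple way to picture codon optimization is back-translation that always picks the host's most frequent codon for each amino acid. The Python sketch below illustrates that strategy only; the disclosure does not state which optimization procedure was used. The usage fractions are placeholders invented for illustration, not Prototheca moriformis values, and cover only a few amino acids. The example peptide MATTSLAS is the translation of the first eight codons of the CcFATB1b coding region as it appears in the pSZ2760 construct sequence.

```python
from __future__ import annotations

# Placeholder codon-usage fractions (hypothetical, NOT measured values).
HYPOTHETICAL_CODON_FRACTIONS: dict[str, dict[str, float]] = {
    "M": {"ATG": 1.00},
    "A": {"GCC": 0.60, "GCG": 0.25, "GCT": 0.10, "GCA": 0.05},
    "T": {"ACC": 0.55, "ACG": 0.30, "ACT": 0.10, "ACA": 0.05},
    "S": {"TCC": 0.35, "AGC": 0.25, "TCG": 0.25,
          "TCT": 0.05, "TCA": 0.05, "AGT": 0.05},
    "L": {"CTG": 0.60, "CTC": 0.25, "CTT": 0.05,
          "TTG": 0.05, "CTA": 0.03, "TTA": 0.02},
    "K": {"AAG": 0.95, "AAA": 0.05},
    "*": {"TGA": 0.70, "TAA": 0.15, "TAG": 0.15},
}


def back_translate(protein: str,
                   table: dict[str, dict[str, float]] | None = None) -> str:
    """Back-translate a protein using the most frequent codon per residue."""
    usage = HYPOTHETICAL_CODON_FRACTIONS if table is None else table
    codons = []
    for residue in protein:
        if residue not in usage:
            raise ValueError(f"no codon usage recorded for {residue!r}")
        choices = usage[residue]
        # Pick the codon with the highest usage fraction for this residue.
        codons.append(max(choices, key=choices.get))
    return "".join(codons)


if __name__ == "__main__":
    print(back_translate("MATTSLAS*"))
```

Always choosing the top codon is the crudest option; sampling codons in proportion to their usage fractions is a common alternative when repeated motifs need to be avoided.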
[0084] Transgenic strains were generated via transformation of the base strain Strain A (Prototheca moriformis, derived from UTEX 1435 by classical mutation and screening for high oil production) with a construct encoding one of the 27 FatB thioesterases. The construct pSZ2760 encoding Cinnamomum camphora (Cc) FATB1b is shown as an example, but identical methods were used to generate each of the remaining 26 constructs encoding the different respective thioesterases. Construct pSZ2760 can be written as 6S::CrTUB2:ScSUC2:CvNR::PmAMT3:CcFATB1b:CvNR::6S. The sequence of the transforming DNA is provided in Table 5 (pSZ2760). The relevant restriction sites in the construct from 5'-3', BspQI, KpnI, AscI, MfeI, EcoRI, SpeI, XhoI, SacI, BspQI, respectively, are indicated in lowercase, bold, and underlined. BspQI sites delimit the 5' and 3' ends of the transforming DNA. Bold, lowercase sequences at the 5' and 3' ends of the construct represent genomic DNA from UTEX 1435 that targets integration to the 6S locus via homologous recombination. Proceeding in the 5' to 3' direction, the selection cassette has the C. reinhardtii β-tubulin promoter driving expression of the S. cerevisiae gene SUC2 (conferring the ability to grow on sucrose) and the Chlorella vulgaris Nitrate Reductase (NR) gene 3' UTR. The promoter is indicated by lowercase, boxed text. The initiator ATG and terminator TGA for ScSUC2 are indicated by bold, uppercase italics, while the coding region is indicated with lowercase italics. The 3' UTR is indicated by lowercase underlined text. The spacer region between the two cassettes is indicated by upper case text. The second cassette, containing the codon-optimized CcFATB1b gene (Table 5; pSZ2760) from Cinnamomum camphora, is driven by the Prototheca moriformis endogenous AMT3 promoter and has the Chlorella vulgaris Nitrate Reductase (NR) gene 3' UTR. In this cassette, the AMT3 promoter is indicated by lowercase, boxed text. The initiator ATG and terminator TGA for the CcFATB1b gene are indicated in bold, uppercase italics, while the coding region is indicated by lowercase italics and the spacer region is indicated by upper case text. The 3' UTR is indicated by lowercase underlined text. The final construct was sequenced to ensure correct reading frame and targeting sequences.
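The shorthand used to write the construct can be read mechanically: double colons separate the homology flanks and the cassettes, and single colons separate promoter, coding sequence, and 3' UTR within a cassette. The Python sketch below parses the pSZ2760 notation on that reading, with the thioesterase gene written as CcFATB1b. The class and function names are hypothetical, and the assumption that every cassette has exactly three parts is drawn only from this one example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Cassette:
    promoter: str
    gene: str
    utr3: str  # 3' UTR element


def parse_construct(notation: str) -> tuple[str, list[Cassette], str]:
    """Split 'flank::prom:gene:utr::...::flank' into flanks and cassettes."""
    parts = notation.split("::")
    if len(parts) < 3:
        raise ValueError("expected two flanks and at least one cassette")
    left_flank, *middle, right_flank = parts
    cassettes = []
    for block in middle:
        fields = block.split(":")
        if len(fields) != 3:
            raise ValueError(f"cassette {block!r} is not promoter:gene:utr")
        cassettes.append(Cassette(*fields))
    return left_flank, cassettes, right_flank


if __name__ == "__main__":
    psz2760 = "6S::CrTUB2:ScSUC2:CvNR::PmAMT3:CcFATB1b:CvNR::6S"
    left, cassettes, right = parse_construct(psz2760)
    print("5' flank:", left)
    for cassette in cassettes:
        print(cassette)
    print("3' flank:", right)
```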
Table 5: pSZ2760 Transforming construct gctcttcgccgccgccactcctgctcgagcgcgcccgcgcgtgcgccgccagcgccttggccttttcgccgcgctcgtgc gcgtcgctgatgtccatcaccaggtccatgaggtctgccttgcgccggctgagccactgcttcgtccgggcggccaagag gagcatgagggaggactcctggtccagggtcctgacgtggtcgcggctctgggagcgggccagcatcatctggctctgc cgcaccgaggccgcctccaactggtcctccagcagccgcagtcgccgccgaccctggcagaggaagacaggtgaggg gggtatgaattgtacagaacaaccacgagccttgtctaggcagaatccctaccagtcatggctttacctggatgacggcctg cgaacagctgtccagcgaccctcgctgccgccgcttctcccgcacgcttctttccagcaccgtgatggcgcgagccagcg ccgcacgctggcgctgcgcttcgccgatctgaggacagtcggggaactctgatcagtctaaacccccttgcgcgttagtgtt gccatcctttgcagaccggtgagagccgacttgttgtgcgccaccccccacaccacctcctcccagaccaattctgtcacct ttttggcgaaggcatcggcctcggcctgcagagaggacagcagtgcccagccgctgggggttggcggatgcacgctca ggtacqctttcttgcgctatgacacttccagcaaaaggtagggcgggctgcgagacggcttcccggcgctgcatgcaaca!
|ccgatgatgcttcgaccccccgaagctccttcggggctgcatgggcgctccgatgccgctccagggcgagcgctgtttaa|
|atagccaggcccccgattgcaaagacattatagcgagctaccaaagccatattcaaacacctagatcactaccacttctaca|
|caggccactcgagcttgtgatcgcactccgctaagggggcgcctcttcctcttcgtttcagtcacaacccgcaaac|ggcgc g^cATGctgctgcaggccttcctgttcctgctggccggcttcgccgccaagatcagcgcctccatgacgaacgagac gtccgaccgccccctggtgcacttcacccccaacaagggctggatgaacgaccccaacggcctgtggtacgacgag aaggacgccaagtggcacctgtacttccagtacaacccgaacgacaccgtctgggggacgcccttgttctggggcca cgccacgtccgacgacctgaccaactgggaggaccagcccatcgccatcgccccgaagcgcaacgactccggcgc cttctccggctccatggtggtggactacaacaacacctccggcttcttcaacgacaccatcgacccgcgccagcgctgc gtggccatctggacctacaacaccccggagtccgaggagcagtacatctcctacagcctggacggcggctacaccttc accgagtaccagaagaaccccgtgctggccgccaactccacccagttccgcgacccgaaggtcttctggtacgagcc ctcccagaagtggatcatgaccgcggccaagtcccaggactacaagatcgagatctactcctccgacgacctgaagt cctggaagctggagtccgcgttcgccaacgagggcttcctcggctaccagtacgagtgccccggcctgatcgaggtcc ccaccgagcaggaccccagcaagtcctactgggtgatgttcatctccatcaaccccggcgccccggccggcggctcct tcaaccagtacttcgtcggcagcttcaacggcacccacttcgaggccttcgacaaccagtcccgcgtggtggacttcgg caaggactactacgccctgcagaccttcttcaacaccgacccgacctacgggagcgccctgggcatcgcgtgggcctc caactgggagtactccgccttcgtgcccaccaacccctggcgctcctccatgtccctcgtgcgcaagttctccctcaaca ccgagtaccaggccaacccggagacggagctgatcaacctgaaggccgagccgatcctgaacatcagcaacgccg gcccctggagccggttcgccaccaacaccacgttgacgaaggccaacagctacaacgtcgacctgtccaacagcac cggcaccctggagttcgagctggtgtacgccgtcaacaccacccagacgatctccaagtccgtgttcgcggacctctcc ctctggttcaagggcctggaggaccccgaggagtacctccgcatgggcttcgaggtgtccgcgtcctccttcttcctgga ccgcgggaacagcaaggtgaagttcgtgaaggagaacccctacttcaccaaccgcatgagcgtgaacaaccagcc cttcaagagcgagaacgacctgtcctactacaaggtgtacggcttgctggaccagaacatcctggagctgtacttcaac gacggcgacgtcgtgtccaccaacacctacttcatgaccaccgggaacgccctgggctccgtgaacatgacgacggg ggtggacaacctgttctacatcgacaagttccaggtgcgcgaggtcaagTGAcaattQQcaQcaQcaQctCQ taQ tatcgacacactctggacgctggtcgtgtgatggactgttgccgccacacttgctgccttgacctgtgaatatccctgccgctt ttatcaaacagcctcagtgtgtttgatcttgtgtgtacgcgcttttgcg
cagcatccccttccctcgtttcatatcgcttgcatcccaaccgcaacttatctacgctgtcctgctatccctcagcgctgctcct gctcctgctcactgcccctcgcacagccttggtttgggctccgcctgtattctcctggtactgcaacctgtaaaccagcactg caatgctgatgcacgggaagtagtgggatgggaacacaaatggaAAGCTGTATAGGGATAAgaattc[gg lccgacaggacgcgcgtcaaaggtgctggtcgtgtatgccctggccggcaggtcgttgctgctgctggttagtgattccgcal laccctgattttggcgtcttattttggcgtggcaaacgctggcgcccgcgagccgggccggcggcgatgcggtgccccacgl
|gctgccggaatccaagggaggcaagagcgcccgggtcagttgaagggctttacgcgcaaggtacagccgctcctgcaa|
Iggctgcgtggtggaattggacgtgcaggtcctgctgaagttcctccaccgcctcaccagcggacaaagcaccggtgtatq
|aggtccgtgtcatccactctaaagaactcgactacgacctactgatggccctagattcttcatcaaaaacgcctgagacactt|
Igcccaggattgaaactccctgaagggaccaccaggggccctgagttgttccttccccccgtggcgagctgccagccaggi lctgtacctgtgatcgaggctggcgggaaaataggcttcgtgtgctcaggtcatgggaggtgcaggacagctcatgaaacgi
|ccaacaatcgcacaattcatgtcaagctaatcagctatttcctcttcacgagctgtaattgtcccaaaattctggtctaccgggg| Igtgatccttcgtgtacgggcccttccctcaaccctaggtatgcgcgcatgcggtcgccgcgcaactcgcgcgagggccga]
|gggtttgggacgggccgtcccgaaatgcagttgcacccggatgcgtggcaccttttttgcgataatttatgcaatggactgct| lctgcaaaattctggctctgtcgccaaccctaggatcagcggcgtaggatttcgtaatcattcgtcctgatggggagctaccgi
|actaccctaatatcagcccgactgcctgacgccagcgtccacttttgtgcacacattccattcgtgcccaagacatttcattgt|
|ggtgcgaagcgtccccagttacgctcacctgtttcccgacctccttactgttctgtcgacagagcgggcccacaggccggt| ^^a^^^gtATGgccaccacctccctggcctccgccttctgctccatgaaggccgtgatgctggcccgcgacg gccgcggcctgaagccccgctcctccgacctgcagctgcgcgccggcaacgcccagacctccctgaagatgatcaac ggcaccaagttctcctacaccgagtccctgaagaagctgcccgactggtccatgctgttcgccgtgatcaccaccatctt ctccgccgccgagaagcagtggaccaacctggagtggaagcccaagcccaaccccccccagctgctggacgacca cttcggcccccacggcctggtgttccgccgcaccttcgccatccgctcctacgaggtgggccccgaccgctccacctcc atcgtggccgtgatgaaccacctgcaggaggccgccctgaaccacgccaagtccgtgggcatcctgggcgacggctt cggcaccaccctggagatgtccaagcgcgacctgatctgggtggtgaagcgcacccacgtggccgtggagcgctacc ccgcctggggcgacaccgtggaggtggagtgctgggtgggcgcctccggcaacaacggccgccgccacgacttcct ggtgcgcgactgcaagaccggcgagatcctgacccgctgcacctccctgtccgtgatgatgaacacccgcacccgcc gcctgtccaagatccccgaggaggtgcgcggcgagatcggccccgccttcatcgacaacgtggccgtgaaggacga ggagatcaagaagccccagaagctgaacgactccaccgccgactacatccagggcggcctgaccccccgctggaa cgacctggacatcaaccagcacgtgaacaacatcaagtacgtggactggatcctggagaccgtgcccgactccatctt cgagtcccaccacatctcctccttcaccatcgagtaccgccgcgagtgcacccgcgactccgtgctgcagtccctgacc accgtgtccggcggctcctccgaggccggcctggtgtgcgagcacctgctgcagctggagggcggctccgaggtgct gcgcgccaagaccgagtggcgccccaagctgtccttccgcggcatctccgtgatccccgccgagtcctccgtgatgga ctacaaggaccacgacggcgactacaaggaccacgacatcgactacaaggacgacgacgacaagTGAc g^, ggcagcagcagctcggatagtatcgacacactctggacgctggtcgtgtgatggactgttgccgccacacttgctgccttg acctgtgaatatccctgccgcttttatcaaacagcctcagtgtgtttgatctt gtgctatttgcgaataccacccccagcatccccttccctcgtttcatatcgcttgcatcccaaccgcaacttatctacgctgtcc tgctatccctcagcgctgctcctgctcctgctcactgcccctcgcacagccttggtttgggctccgcctgtattctcctggtact gcaacctgtaaaccagcactgcaatgctgatgcacgggaagtagtgggatgggaacacaaatggaAAGCTGTAT AGGGATAACAGGGTAATgagctcttgttttccagaaggagttgctccttgagcctttcattctcagcctcgata acctccaaagccgctctaattgtggagggggttcgaatttaaaagcttggaatgttggttcgtgcgtctggaacaagcccag acttgttgctcactgggaaaaggaccatcagctccaaaaaacttgccgctcaaaccgcgtacctctgctttcgcgcaatctgc 
cctgttgaaatcgccaccacattcatattgtgacgcttgagcagtctgtaattgcctcagaatgtggaatcatctgccccctgtg cgagcccatgccaggcatgtcgcgggcgaggacacccgccactcgtacagcagaccattatgctacctcacaatagttca taacagtgaccatatttctcgaagctccccaacgagcacctccatgctctgagtggccaccccccggccctggtgcttgcg gagggcaggtcaaccggcatggggctaccgaaatccccgaccggatcccaccacccccgcgatgggaagaatctctcc ccgggatgtgggcccaccaccagcacaacctgctggcccaggcgagcgtcaaaccataccacacaaatatccttggcat cggccctgaattccttctgccgctctgctacccggtgcttctgtccgaagcaggggttgctagggatcgctccgagtccgca aacccttgtcgcgtggcggggcttgttcgagcttgaagagc
[0085] Constructs encoding the identified heterologous FatB genes, such as CcFATB1b from pSZ2760 in Table 5, were transformed into Strain A and selected for the ability to grow on sucrose. Transformations, cell culture, lipid production and fatty acid analysis were all carried out as previously described. After cultivating on sucrose under low nitrogen conditions to accumulate oil, fatty acid profiles were determined by FAME-GC. The top performer from each transformation, as judged by the ability to produce the highest level of midchain fatty acids, is shown in Table 6.
Table 6: Alteration of Fatty Acid Profiles in S3150 upon Expression of Heterologous FatB Thioesterases
[0086] Many of the acyl-ACP FatB thioesterases were found to exhibit midchain activity when expressed in Prototheca moriformis. For example, expression of CcFATB1b causes an increase in myristate levels from 2% of total fatty acids in the parent, Strain A, to ~15% in the D1670-13 primary transformant. Other examples include CcFATB4, which exhibits an increase in laurate levels from 0% in Strain A to ~33%, and ChsFATB3, which exhibits an increase in myristate levels to ~34%. Although some of the acyl-ACP thioesterases did not exhibit dramatic effects on midchain levels in the current incarnation, efforts will likely develop to optimize some of these constructs.
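Selecting a top performer by midchain fatty acid level, as described in paragraph [0085], amounts to summing the midchain species in each FAME-GC profile and taking the maximum. The Python sketch below assumes that "midchain" means C8:0 through C16:0, following the C8-C16 range given in Example 1. The transformant labels and profile values are hypothetical and are not data from this example.

```python
from __future__ import annotations

# Assumed definition of midchain species (C8-C16, saturated).
MIDCHAIN = ("C8:0", "C10:0", "C12:0", "C14:0", "C16:0")


def midchain_total(profile: dict[str, float]) -> float:
    """Sum the midchain fatty acids (percent of total fatty acids)."""
    return sum(profile.get(species, 0.0) for species in MIDCHAIN)


def top_performer(profiles: dict[str, dict[str, float]]) -> str:
    """Return the transformant with the highest midchain total."""
    if not profiles:
        raise ValueError("no profiles supplied")
    return max(profiles, key=lambda name: midchain_total(profiles[name]))


if __name__ == "__main__":
    # Hypothetical fatty acid profiles (percent of total fatty acids).
    profiles = {
        "transformant_1": {"C14:0": 15.0, "C16:0": 28.0, "C18:1": 45.0},
        "transformant_2": {"C14:0": 9.0, "C16:0": 30.0, "C18:1": 50.0},
    }
    best = top_performer(profiles)
    print(best, midchain_total(profiles[best]))
```

A narrower definition (for example, ranking on a single species such as C14:0) only requires changing the `MIDCHAIN` tuple.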
[0087] Sequences of the Heterologous Acyl-ACP Thioesterases Identified and Transformed into P. moriformis (UTEX 1435)
[0088] A complete listing of relevant sequences for the transforming constructs, such as the deduced amino acid sequence of the encoded acyl-ACP thioesterase, the native CDS coding sequence, the Prototheca moriformis codon-optimized coding sequence, and the nature of the sequence variants examined, is provided as SEQ ID NOS: 1-78.
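As an illustration of how a codon-optimized coding sequence relates to its amino acid sequence, the sketch below back-translates a protein using one codon per residue. The codon table is an assumption inferred from the codon-optimized entries in the sequence listing (for example SEQ ID NO: 6), which appear to use a single preferred codon for each amino acid; the text does not state the optimization procedure, so this should not be read as the method actually used.

```python
# Assumed one-codon-per-residue table, inferred from the codon-optimized
# sequences in the listing; not a documented Prototheca moriformis usage table.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "TCC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def back_translate(protein):
    """Return a coding DNA string using the assumed preferred codon per residue."""
    try:
        return "".join(PREFERRED_CODON[residue] for residue in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]!r}") from None


# First eight residues of the CcFATB4 protein (SEQ ID NO: 4).
prefix = back_translate("MVTTSLAS")
```

The resulting `prefix` can be compared with the start of SEQ ID NO: 6 as a consistency check on the assumed table.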
Example 3. Discovery and Cloning of Additional FATB genes
Additional FATB genes were obtained from seeds as described above. The species and number of FatB genes identified were:
[0089] The thioesterases that clustered with acyl-ACP FatB thioesterases, which are involved in biosynthesis of C8-C16 fatty acids, were pursued. The native, putative plastid-targeting transit peptide sequence is indicated by underlining.
[0090] Construction of Transforming Vectors Expressing Acyl-ACP FatB Thioesterases. The nine putative Acyl-ACP FatB Thioesterases from the species Cuphea calcarata, Cuphea painteri, Cuphea hookeriana, Cuphea avigera var. pulcherrima, Cuphea paucipetala, Cuphea procumbens, and Cuphea ignea were synthesized in a codon-optimized form to reflect UTEX 1435 codon usage. In contrast to the previous example, the new Acyl-ACP FatB thioesterases were synthesized with a modified transit peptide from Chlorella protothecoides (Cp) in place of the native transit peptide. The modified transit peptide derived from the CpSAD1 gene, "CpSAD1tp_trimmed", was synthesized as an in-frame, N-terminal fusion to the FatB acyl-ACP thioesterases in place of the native transit peptide; the resulting sequences are listed below. The novel FatB genes were cloned into Prototheca moriformis as described above. Constructs encoding heterologous FatB genes were transformed into strain S6165 (a descendant of S3150/Strain A) and selected for the ability to grow on sucrose. Transformations, cell culture, lipid production and fatty acid analysis were all carried out as previously described. The results for the nine novel FatB acyl-ACP thioesterases are displayed in the table immediately below.
FA profile of top performer from each transformation (%; primary lipid)
Species | Gene Name | SZ Plasmid | Strain | C8:0 C10:0 C12:0 C14:0 C16:0 C18:0 C18:1 C18:2 C18:3
Cuphea calcarata | CcalcFATB1 | pSZ3764 | S6165; T778; D2508-26 | 1 12 18 29 2 29
Cuphea painteri | CpaiFATB1 | pSZ3838 | S6165; T841; D2796-22 | 17 1 2 18 2 43
Cuphea hookeriana | ChookFATB4 | pSZ3837 | S6165; T788; D2552-18 | 0 2 32 2 54
Cuphea avigera var. pulcherrima | CaFATB1 | pSZ4084 | S6165; T841; D2800-7 | 22 2 15 2 42
Cuphea paucipetala | CpauFATB1 | pSZ3762 | S6165; T778; D2506-46 | 3 28 2 47
Cuphea procumbens | CprocFATB1 | pSZ3929 | S6165; T814; D2675-3 | 3 30 2 50
Cuphea procumbens | CprocFATB2 | pSZ3839 | S6165; T788; D2553-2 | 2 32 3 55
Cuphea procumbens | CprocFATB3 | pSZ3763 | S6165; T778; D2507-29 | 2 28 3 54
Cuphea ignea | CigneaFATB1 | pSZ3930 | S6165; T814; D2676-34 | 4 24 2 51
S6165 (parent strain) | | | | 29 58
acid levels; CigneaFATB1, which exhibits 8% C10:0 and 1% C12:0 fatty acid levels; CcalcFATB1, which exhibits 18% C14:0 and 12% C12:0 levels; and CaFATB1, which exhibits 22% C8:0 and 9% C10:0 fatty acid levels.
[0092] CaFATB1, which exhibits high C8:0 and C10:0 levels, is of particular interest. CaFATB1 arose from two separate contigs that were assembled from the Cuphea avigera var. pulcherrima transcriptome, S17_Cavig_trinity_7406 and S17_Cavig_trinity_7407. Although the two partial contigs exhibit only 17 nucleotides of overlap, we were able to assemble a putative full-length transcript encoding CaFATB1 from the two contigs and then subsequently confirm the existence of the full-length transcript by direct reverse transcription PCR analysis using seed RNA and primer pairs targeting the full-length CaFATB1 thioesterase cDNA. Tjellstrom et al. (2013) disclose the expression of a newly identified fatty acyl-ACP thioesterase from Cuphea pulcherrima that they named "CpuFATB3" (Genbank accession number KC675178). The coding sequence of CpuFATB3 is 100% identical to the CaFATB1 gene we identified and contains one nucleotide difference in the RNA sequence outside the predicted coding region. Tjellstrom et al. (2013) showed that CpuFATB3 produces an average of 4.8% C8:0 when expressed in Arabidopsis, and further requires deletion of two acyl-ACP synthetases, AAE15/16, to produce an average of 9.2% C8:0 with a maximum level of ~12% C8:0. The CaFATB1 gene we identified was codon-optimized for expression in UTEX1435 and generated as a CpSAD1tp_trimmed transit peptide fusion before introduction into S6165. The CpSAD1tp_trimmed:CaFATB1 gene produces an average C8:0 level of 14% and a maximum level of 22% C8:0 without requiring the deletion of endogenous acyl-ACP synthetases.
[0093] Table 7. Amino Acid Sequences of Additional Novel FatB Acyl-ACP Thioesterases. In the appended sequence listings, the native, putative plastid-targeting transit peptide sequence is underlined:
CpauFATB1 (Cuphea paucipetala FATB1) SEQ ID NO: 171
CprocFATB1 (Cuphea procumbens FATB1) SEQ ID NO: 172
CprocFATB2 (Cuphea procumbens FATB2) SEQ ID NO: 173
CprocFATB3 (Cuphea procumbens FATB3) SEQ ID NO: 174
CigneaFATB1 (Cuphea ignea FATB1) SEQ ID NO: 175
CcalcFATB1 (Cuphea calcarata FATB1) SEQ ID NO: 176
ChookFATB4 (Cuphea hookeriana FATB4) SEQ ID NO: 177
CaFATB1 (Cuphea avigera var. pulcherrima FATB1) SEQ ID NO: 178
CpauFATB1 (Cuphea paucipetala FATB1) SEQ ID NO: 179
CprocFATB1 (Cuphea procumbens FATB1) SEQ ID NO: 180
CprocFATB2 (Cuphea procumbens FATB2) SEQ ID NO: 181
CprocFATB3 (Cuphea procumbens FATB3) SEQ ID NO: 182
CigneaFATB1 (Cuphea ignea FATB1) SEQ ID NO: 183
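The assembly of a full-length CaFATB1 transcript from two partial contigs sharing a short overlap can be illustrated with a simple suffix-prefix merge. The sketch below is an assumption about how such a join could be expressed in code, not the assembly procedure actually used; the function name and the toy sequences are hypothetical, and real contigs would also need handling for mismatches and reverse complements.

```python
def merge_by_overlap(left, right, min_overlap=10):
    """Join two contigs where a suffix of `left` exactly matches a prefix of `right`.

    Returns (merged_sequence, overlap_length) for the longest exact overlap of at
    least `min_overlap` bases, or None if no such overlap exists.
    """
    longest = min(len(left), len(right))
    for k in range(longest, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:], k
    return None


# Toy contigs (not real sequence data) sharing a 5-base overlap, "TTGCA".
result = merge_by_overlap("ACGTACGTTGCA", "TTGCAGGCT", min_overlap=4)
```

For an overlap as short as the 17 nucleotides mentioned above, an exact match can arise by chance or from repeats, which is consistent with the text's independent confirmation of the joined transcript by RT-PCR.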
Example 4. FATB consensus sequences: Discovery, cloning and fatty acid profiles
[0094] In the course of testing several new putative midchain FatB thioesterases in UTEX1435, S3150 (Strain A above), we identified several thioesterases with increased C10:0 and C16:0 activity above the background midchain levels found in the strain. We reasoned that a consensus sequence could be obtained for an idealized C10:0 thioesterase and C16:0 thioesterase from aligning the best-performing C10:0 and C16:0 thioesterases. A consensus C10:0 specific thioesterase sequence was generated using the C. palustris FatB1 (CpFATB1), C. PSR23 FatB3 (CuPSR23FATB3), C. viscosissima FatB1 (CvisFATB1), C. glossostoma FatB1 (CgFATB1), and C. carthagenensis FatB2 (CcrFATB2) sequences as inputs, resulting in a C10:0 specific consensus sequence termed JcFATB1/SzFATB1. A consensus C16:0 specific thioesterase sequence was generated using the C. heterophylla FatB3a (ChtFATB3a), C. carthagenensis FatB1 (CcrFATB1), C. viscosissima FatB2 (CvisFATB2), C. hookeriana FatB1 (ChFATB1; AAC48990), C. hyssopifolia FatB2 (ChsFATB2), C. calophylla FatB2 (CcalFATB2; ABB71581), C. hookeriana FatB1-1 (ChFATB1-1; AAC72882), C. lanceolata FatB1 (ClFATB1; CAC19933), and C. wrightii FatB4a (CwFATB4a) sequences as inputs, resulting in a C16:0 specific consensus sequence termed JcFATB2/SzFATB2. The resulting consensus sequences were synthesized, cloned into a vector identical to that used to test other FatB thioesterases, and introduced into S3150 as described above. The consensus amino acid sequences are given as SEQ ID NOs. 106 and 107; the nucleic acid sequences were based on these amino acid sequences using codon optimization for Prototheca moriformis. The transformants were selected and cultivated, and the oil was extracted and analyzed by FAME-GC-FID. The fatty acid profiles obtained are given in the table below.
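Deriving a consensus from aligned thioesterase sequences, as described above, can be sketched with a simple per-column majority rule. The text does not state which consensus rule was applied, so majority voting, the gap handling, and the toy input strings here are all assumptions for illustration.

```python
from collections import Counter


def majority_consensus(aligned, gap="-"):
    """Build a consensus from equal-length aligned sequences by column majority.

    Ties go to the residue seen first in the column; columns whose most common
    symbol is the gap character are left out of the consensus.
    """
    if not aligned:
        raise ValueError("no sequences supplied")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned):
        symbol = Counter(column).most_common(1)[0][0]
        if symbol != gap:
            consensus.append(symbol)
    return "".join(consensus)


# Toy aligned fragments (hypothetical, not taken from the sequence listing).
toy = majority_consensus(["MKTA", "MKSA", "MRTA"])
```

A production workflow would normally start from a multiple sequence alignment produced by a dedicated alignment tool rather than from hand-aligned strings.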
Example 5: Clade analysis
Various novel FATB thioesterases were clustered according to a neighbor-joining algorithm. These were found to form twelve clades as listed in Table 1a. Putative function was assigned based on expression in Prototheca as described above.
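For readers unfamiliar with neighbor joining, the sketch below shows the core of the algorithm: repeatedly joining the pair of nodes that minimizes the Q-criterion and replacing them with a new internal node. It is a minimal illustration under stated assumptions; the text does not specify the distance measure or software used, so the p-distance function, the data layout, and all names here are hypothetical, and branch lengths are not computed.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def neighbor_joining(names, dist):
    """Return the join order as a list of (node, node) pairs.

    `dist` maps frozenset({i, j}) to the distance between nodes i and j.
    """
    nodes = list(names)
    d = dict(dist)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        total = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}

        def q(pair):
            # Q-criterion: the pair with the lowest value is joined next.
            i, j = pair
            return (n - 2) * d[frozenset(pair)] - total[i] - total[j]

        i, j = min(combinations(nodes, 2), key=q)
        new = "(" + i + "," + j + ")"
        for k in nodes:
            if k != i and k != j:
                # Distance from the new internal node to each remaining node.
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - d[frozenset((i, j))]
                )
        nodes = [k for k in nodes if k != i and k != j] + [new]
        joins.append((i, j))
    joins.append((nodes[0], nodes[1]))
    return joins
```

Clades would then be read off the resulting tree; how the twelve clades were delimited is not described in the text and is not attempted here.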
[0095] The described embodiments of the invention are intended to be merely exemplary and numerous variations and modifications will be apparent to those skilled in the art. All such variations and modifications are intended to be within the scope of the present invention.
Sequence Listing
SEQ ID NO: 1
Cinnamomum camphora (Cc) FATB1b variant M25L, M322R, AT367-D368 amino acid sequence
MATTSLASAFCSMKAVMLARDGRGLKPRSSDLQLRAGNAQTSLKMINGTKFSYTESLKKLPD WSMLFAVITTIFSAAEKQWTNLEWKPKPNPPQLLDDHFGPHGLVFRRTFAIRSYEVGPDRSTSI VAVMNHLQEAALNHAKSVGILGDGFGTTLEMSKRDLIWVVKRTHVAVERYPAWGDTVEVE CWVGASGNNGRRHDFLVRDCKTGEILTRCTSLSVMMNTRTRRLSKIPEEVRGEIGPAFIDNVA VKDEEIKKPQKLNDSTADYIQGGLTPRWNDLDINQHVNNIKYVDWILETVPDSIFESHHISSFTI EYRRECTRDSVLQSLTTVSGGSSEAGLVCEHLLQLEGGSEVLRAKTEWRPKLSFRGISVIPAES SV*
SEQ ID NO: 2
Cinnamomum camphora (Cc) FATB1b variant M25L, M322R, AT367-D368 coding DNA sequence
TTAGCTTCTGCTTTCTGCTCGATGAAAGCTGTAATGTTGGCTCGTGATGGCAGGGGCTTGA AACCCAGGAGCAGTGATTTGCAGCTGAGGGCGGGAAATGCACAAACCTCTTTGAAGATGA TCAATGGGACCAAGTTCAGTTACACAGAGAGCTTGAAAAAGTTGCCTGACTGGAGCATGC TCTTTGCAGTGATCACGACCATCTTTTCGGCTGCTGAGAAGCAGTGGACCAATCTAGAGTG GAAGCCGAAGCCGAATCCACCCCAGTTGCTTGATGACCATTTTGGGCCGCATGGGTTAGTT TTCAGGCGCACCTTTGCCATCAGATCGTATGAGGTGGGACCTGACCGCTCCACATCTATAG TGGCTGTTATGAATCACTTGCAGGAGGCTGCACTTAATCATGCGAAGAGTGTGGGAATTCT AGGAGATGGATTCGGTACGACGCTAGAGATGAGTAAGAGAGATCTGATATGGGTTGTGAA ACGCACGCATGTTGCTGTGGAACGGTACCCTGCTTGGGGTGATACTGTTGAAGTAGAGTG CTGGGTTGGTGCATCGGGAAATAATGGCAGGCGCCATGATTTCCTTGTCCGGGACTGCAA AACAGGCGAAATTCTTACAAGATGTACCAGTCTTTCGGTGATGATGAATACAAGGACAAG GAGGTTGTCCAAAATCCCTGAAGAAGTTAGAGGGGAGATAGGGCCTGCATTCATTGATAA TGTGGCTGTCAAGGACGAGGAAATTAAGAAACCACAGAAGCTCAATGACAGCACTGCAG ATTACATCCAAGGAGGATTGACTCCTCGATGGAATGATTTGGATATCAATCAGCACGTTA ACAACATCAAATACGTTGACTGGATTCTTGAGACTGTCCCAGACTCAATCTTTGAGAGTCA TCATATTTCCAGCTTCACTATTGAATACAGGAGAGAGTGCACGAGGGATAGCGTGCTGCA GTCCCTGACCACTGTCTCCGGTGGCTCGTCGGAAGCTGGGTTAGTGTGCGAGCACTTGCTC CAGCTTGAAGGTGGGTCTGAGGTATTGAGGGCAAAAACAGAGTGGAGGCCTAAGCTTAGT TTCAGAGGGATTAGTGTGATACCCGCAGAATCGAGTGTCTAA
SEQ ID NO: 3
Cinnamomum camphora (Cc) FATB1b variant M25L, M322R, AT367-D368 coding DNA sequence codon optimized for Prototheca moriformis
TTAGCTTCTGCTTTCTGCTCGATGAAAGCTGTAATGTTGGCTCGTGATGGCAGGGGCTTGA
AACCCAGGAGCAGTGATTTGCAGCTGAGGGCGGGAAATGCACAAACCTCTTTGAAGATGA
TCAATGGGACCAAGTTCAGTTACACAGAGAGCTTGAAAAAGTTGCCTGACTGGAGCATGC TCTTTGCAGTGATCACGACCATCTTTTCGGCTGCTGAGAAGCAGTGGACCAATCTAGAGTG GAAGCCGAAGCCGAATCCACCCCAGTTGCTTGATGACCATTTTGGGCCGCATGGGTTAGTT TTCAGGCGCACCTTTGCCATCAGATCGTATGAGGTGGGACCTGACCGCTCCACATCTATAG TGGCTGTTATGAATCACTTGCAGGAGGCTGCACTTAATCATGCGAAGAGTGTGGGAATTCT AGGAGATGGATTCGGTACGACGCTAGAGATGAGTAAGAGAGATCTGATATGGGTTGTGAA ACGCACGCATGTTGCTGTGGAACGGTACCCTGCTTGGGGTGATACTGTTGAAGTAGAGTG CTGGGTTGGTGCATCGGGAAATAATGGCAGGCGCCATGATTTCCTTGTCCGGGACTGCAA AACAGGCGAAATTCTTACAAGATGTACCAGTCTTTCGGTGATGATGAATACAAGGACAAG GAGGTTGTCCAAAATCCCTGAAGAAGTTAGAGGGGAGATAGGGCCTGCATTCATTGATAA TGTGGCTGTCAAGGACGAGGAAATTAAGAAACCACAGAAGCTCAATGACAGCACTGCAG ATTAC ATCCAAGGAGGATTGACTCCTCGATGGAATGATTTGGAT ATC AATCAGC ACGTT A ACAACATCAAATACGTTGACTGGATTCTTGAGACTGTCCCAGACTCAATCTTTGAGAGTCA TCATATTTCCAGCTTCACTATTGAATACAGGAGAGAGTGCACGAGGGATAGCGTGCTGCA GTCCCTGACCACTGTCTCCGGTGGCTCGTCGGAAGCTGGGTTAGTGTGCGAGCACTTGCTC CAGCTTGAAGGTGGGTCTGAGGTATTGAGGGCAAAAACAGAGTGGAGGCCTAAGCTTAGT TTCAGAGGGATTAGTGTGATACCCGCAGAATCGAGTGTCTAA
SEQ ID NO: 4
Cinnamomum camphora (Cc) FATB4 amino acid sequence
MVTTSLASAYFSMKAVMLAPDGRGIKPRSSGLQVRAGNERNSCKVINGTKVKDTEGLKGCST LQGQSMLDDHFGLHGLVFRRTFAIRCYEVGPDRSTSIMAVMNHLQEAARNHAESLGLLGDGF GETLEMSKRDLIWWRRTHVAVERYPAWGDTVEVEAWVGASGNTGMRRDFLVRDCKTGHI LTRCTSVSVMMNMRTRRLSKIPQEVRAEIDPLFIEKVAVKEGEIKKLQKLNDSTADYIQGGWT PRWNDLDWQHVNNIIYVGWIFKSVPDSISENHHLSSITLEYRRECTRGNKLQSLTTVCGGSSE AGIICEHLLQLEDGSEVLRARTEWRPKHTDSFQGISERFPQQEPHK
SEQ ID NO: 5
Cinnamomum camphora (Cc) FATB4 coding DNA sequence
ATGGTCACCACCTCTTTAGCTTCCGCTTACTTCTCGATGAAAGCTGTAATGTTGGCTCCTGA CGGCAGGGGCATAAAGCCCAGGAGCAGTGGTTTGCAGGTGAGGGCGGGAAATGAACGAA ACTCTTGCAAGGTGATCAATGGGACCAAGGTCAAAGACACGGAGGGCTTGAAAGGGTGC AGCACGTTGCAAGGCCAGAGCATGCTTGATGACCATTTTGGTCTGCATGGGCTAGTTTTCA GGCGCACCTTTGCAATCAGATGCTATGAGGTTGGACCTGACCGCTCCACATCCATAATGGC TGTTATGAATCACTTGCAGGAAGCTGCACGTAATCATGCGGAGAGTCTGGGACTTCTAGG AGATGGATTCGGTGAGACACTGGAGATGAGTAAGAGAGATCTGATATGGGTTGTGAGACG CACGCATGTTGCTGTGGAACGGTACCCTGCTTGGGGCGATACTGTTGAAGTCGAGGCCTG GGTGGGTGCATCAGGTAACACTGGCATGCGCCGCGATTTCCTTGTCCGCGACTGCAAAAC TGGCCACATTCTTACAAGATGTACCAGTGTTTCAGTGATGATGAATATGAGGACAAGGAG ATTGTCCAAAATTCCCCAAGAAGTTAGAGCGGAGATTGACCCTCTTTTCATTGAAAAGGTT GCTGTCAAGGAAGGGGAAATTAAAAAATTACAGAAGTTGAATGATAGCACTGCAGATTAC ATTCAAGGGGGTTGGACTCCTCGATGGAATGATTTGGATGTCAATCAGCACGTGAACAAT ATCATATACGTTGGCTGGATTTTTAAGAGCGTCCCAGACTCTATCTCTGAGAATCATCATC TTTCTAGCATCACTCTCGAATACAGGAGAGAGTGCACAAGGGGCAACAAGCTGCAGTCCC TGACCACTGTTTGTGGTGGCTCGTCGGAAGCTGGGATCATATGTGAGCACCTACTCCAGCT TGAGGATGGGTCTGAGGTTTTGAGGGC AAGAAC AGAGTGGAGGCCC AAGC ACACCGAT A GTTTCCAAGGCATTAGTGAGAGATTCCCGCAGCAAGAACCGCATAAGTAA
SEQ ID NO: 6
Cinnamomum camphora (Cc) FATB4 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGACCACCTCCCTGGCCTCCGCCTACTTCTCCATGAAGGCCGTGATGCTGGCCCCCG ACGGCCGCGGCATCAAGCCCCGCTCCTCCGGCCTGCAGGTGCGCGCCGGCAACGAGCGCA ACTCCTGCAAGGTGATCAACGGCACCAAGGTGAAGGACACCGAGGGCCTGAAGGGCTGC TCCACCCTGCAGGGCCAGTCCATGCTGGACGACCACTTCGGCCTGCACGGCCTGGTGTTCC GCCGCACCTTCGCCATCCGCTGCTACGAGGTGGGCCCCGACCGCTCCACCTCCATCATGGC CGTGATGAACCACCTGCAGGAGGCCGCCCGCAACCACGCCGAGTCCCTGGGCCTGCTGGG CGACGGCTTCGGCGAGACCCTGGAGATGTCCAAGCGCGACCTGATCTGGGTGGTGCGCCG CACCCACGTGGCCGTGGAGCGCTACCCCGCCTGGGGCGACACCGTGGAGGTGGAGGCCTG GGTGGGCGCCTCCGGCAACACCGGCATGCGCCGCGACTTCCTGGTGCGCGACTGCAAGAC CGGCCACATCCTGACCCGCTGCACCTCCGTGTCCGTGATGATGAACATGCGCACCCGCCGC CTGTCCAAGATCCCCCAGGAGGTGCGCGCCGAGATCGACCCCCTGTTCATCGAGAAGGTG GCCGTGAAGGAGGGCGAGATCAAGAAGCTGCAGAAGCTGAACGACTCCACCGCCGACTA CATCCAGGGCGGCTGGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAA CATCATCTACGTGGGCTGGATCTTCAAGTCCGTGCCCGACTCCATCTCCGAGAACCACCAC CTGTCCTCCATCACCCTGGAGTACCGCCGCGAGTGCACCCGCGGCAACAAGCTGCAGTCC CTGACCACCGTGTGCGGCGGCTCCTCCGAGGCCGGCATCATCTGCGAGCACCTGCTGCAG CTGGAGGACGGCTCCGAGGTGCTGCGCGCCCGCACCGAGTGGCGCCCCAAGCACACCGAC TCCTTCCAGGGCATCTCCGAGCGCTTCCCCCAGCAGGAGCCCCACAAGTGA
SEQ ID NO: 7
Cinnamomum camphora (Cc) FATB3 amino acid sequence
MVAT AAAS AFFPVGAPAT S S ATS AKASMMPDNLDARGIKPKPAS S SGLQVKANAHASPKING SKVSTDTLKGEDTLTSSPAPRTFINQLPDWSMFLAAITTIFLAAEKQWTNLDWKPRRPDMLAD PFGIGRFMQDGLIFRQHFAIRSYEIGADRTASIETLMNHLQETALNHVRSAGLLGDGFGATPEM SRRDLIWVVTRMQVLVDRYPAWGDIVEVETWVGASGKNGMRRDWLVRDSQTGEILTRATSV WVMMNKRTRRLSKLPEEVRGEIGPYFIEDVAIIEEDNRKLQKLNENTADNVRRGLTPRWSDLD VNQHVNNVKYIGWILESAPGSILESHELSCMTLEYRRECGKDSVLQSMTAVSGGGSAAGGSPE S S VECDHLLQLE SGPE WRGRTE WRPKS ANNSRSILEMPAE SL
SEQ ID NO: 8
Cinnamomum camphora (Cc) FATB3 coding DNA sequence
ATGGTTGCCACCGCTGCTGCTTCTGCTTTCTTCCCGGTCGGTGCTCCGGCTACGTCATCTGC AACTTCAGCCAAAGCGTCGATGATGCCTGATAATTTGGATGCCAGAGGCATCAAACCGAA GCCGGCTTCGTCCAGCGGCTTGCAGGTTAAGGCAAATGCCCATGCCTCTCCCAAGATTAAT GGTTCCAAGGTGAGCACGGATACCTTGAAGGGGGAAGACACCTTAACTTCCTCGCCCGCC CCACGGACCTTTATCAACCAATTGCCTGACTGGAGCATGTTCCTTGCTGCCATCACAACTA TTTTCTTGGCTGCCGAGAAGCAGTGGACGAATCTCGACTGGAAGCCCAGAAGACCCGACA TGCTTGCTGACCCGTTTGGCATCGGGAGGTTTATGCAGGATGGGCTGATTTTCAGGCAGCA CTTTGCAATCAGATCTTATGAGATTGGGGCTGATAGAACGGCGTCTATAGAGACTTTAATG AATCACTTGCAGGAGACTGCACTTAATCATGTGAGGAGTGCTGGACTCCTAGGTGATGGA TTTGGTGCGACACCTGAGATGAGTAGAAGAGATCTGATATGGGTTGTAACACGTATGCAG GTTCTTGTGGACCGCTACCCTGCTTGGGGTGATATTGTTGAAGTAGAGACCTGGGTTGGTG CATCTGGAAAAAATGGTATGCGCCGTGATTGGCTTGTTCGGGACAGCCAAACTGGTGAAA TTCTCACACGAGCTACCAGTGTTTGGGTGATGATGAATAAACGGACAAGGCGATTGTCCA AACTTCCTGAAGAAGTTAGAGGGGAAATAGGGCCTTATTTTATAGAAGATGTTGCTATCA TAGAGGAGGACAACAGGAAACTACAGAAGCTCAATGAAAACACTGCTGATAATGTTCGA AGGGGTTTGACTCCTCGCTGGAGTGATCTGGATGTTAATCAGCATGTGAACAATGTCAAAT ACATTGGTTGGATTCTTGAGAGTGCACCAGGATCCATCTTGGAGAGTCATGAGCTTTCCTG CATGACCCTTGAATACAGGAGAGAATGTGGGAAGGACAGTGTGCTGCAGTCAATGACTGC TGTCTCTGGTGGAGGCAGTGC AGC AGGTGGCTC ACC AGAATCT AGCGTTGAGTGTGACC A CTTGCTCCAGCTAGAGAGTGGGCCTGAAGTTGTGAGGGGAAGAACCGAGTGGAGGCCCA AGAGTGCTAATAACTCGAGGAGCATCCTGGAGATGCCGGCCGAGAGC
SEQ ID NO: 9
Cinnamomum camphora (Cc) FATB3 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCACCGCCGCCGCCTCCGCCTTCTTCCCCGTGGGCGCCCCCGCCACCTCCTCCG CCACCTCCGCCAAGGCCTCCATGATGCCCGACAACCTGGACGCCCGCGGCATCAAGCCCA AGCCCGCCTCCTCCTCCGGCCTGCAGGTGAAGGCCAACGCCCACGCCTCCCCCAAGATCA ACGGCTCCAAGGTGTCCACCGACACCCTGAAGGGCGAGGACACCCTGACCTCCTCCCCCG CCCCCCGCACCTTCATCAACCAGCTGCCCGACTGGTCCATGTTCCTGGCCGCCATCACCAC CATCTTCCTGGCCGCCGAGAAGCAGTGGACCAACCTGGACTGGAAGCCCCGCCGCCCCGA CATGCTGGCCGACCCCTTCGGCATCGGCCGCTTCATGCAGGACGGCCTGATCTTCCGCCAG CACTTCGCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCTGA TGAACCACCTGCAGGAGACCGCCCTGAACCACGTGCGCTCCGCCGGCCTGCTGGGCGACG GCTTCGGCGCCACCCCCGAGATGTCCCGCCGCGACCTGATCTGGGTGGTGACCCGCATGC AGGTGCTGGTGGACCGCTACCCCGCCTGGGGCGACATCGTGGAGGTGGAGACCTGGGTGG GCGCCTCCGGCAAGAACGGCATGCGCCGCGACTGGCTGGTGCGCGACTCCCAGACCGGCG AGATCCTGACCCGCGCCACCTCCGTGTGGGTGATGATGAACAAGCGCACCCGCCGCCTGT CCAAGCTGCCCGAGGAGGTGCGCGGCGAGATCGGCCCCTACTTCATCGAGGACGTGGCCA TC ATCGAGGAGGAC AACCGCAAGCTGC AGAAGCTGAACGAGAAC ACCGCCGAC AACGTG CGCCGCGGCCTGACCCCCCGCTGGTCCGACCTGGACGTGAACCAGCACGTGAACAACGTG AAGTACATCGGCTGGATCCTGGAGTCCGCCCCCGGCTCCATCCTGGAGTCCCACGAGCTGT CCTGCATGACCCTGGAGTACCGCCGCGAGTGCGGCAAGGACTCCGTGCTGCAGTCCATGA CCGCCGTGTCCGGCGGCGGCTCCGCCGCCGGCGGCTCCCCCGAGTCCTCCGTGGAGTGCG
ACCACCTGCTGCAGCTGGAGTCCGGCCCCGAGGTGGTGCGCGGCCGCACCGAGTGGCGCC
CCAAGTCCGCCAACAACTCCCGCTCCATCCTGGAGATGCCCGCCGAGTCCCTGTGA
SEQ ID NO: 10
Cuphea hyssopifolia (Chs) FATB1 amino acid sequence
MVATNAAAFSAYTFFLTSPTHGYSSKRLADTQNGYPGTSLKSKSTPPPAAAAARNGALPLLAS ICKCPKKADGSMQLDSSLVFGFQFYIRSYEVGADQTVSIQTVL YLQEAAINHVQSAGYFGDS FGATPEMTKRNLIWVITKMQVLVDRYPAWGDVVQVDTWTCSSGKNSMQRDWFVRDLKTGD IITRASSVWVLMNRLTRKLSKIPEAVLEEAKLFVMNTAPTVDDNRKLPKLDGSSADYVLSGLT PRWSDLDMNQHV VKYIAWILESVPQSIPETHKLSAITVEYRRECGKNSVLQSLTNVSGDGI TCGNSIIECHHLLQLETGPEILLARTEWISKEPGFRGAPIQAEKVYNNK*
SEQ ID NO: 11
Cuphea hyssopifolia (Chs) FATB1 coding DNA sequence
ATGGTTGCCACTAATGCTGCTGCCTTTTCTGCTTATACTTTCTTCCTTACTTCACCAACTCA TGGTTACTCTTCCAAACGTCTCGCCGATACTCAAAATGGTTATCCGGGTACCTCCTTGAAA TCGAAATCCACTCCTCCACCAGCTGCTGCTGCTGCTCGTAACGGTGCATTGCCACTGCTGG CCTCCATCTGCAAATGCCCCAAAAAGGCTGATGGGAGTATGCAACTAGACAGCTCCTTGG TCTTCGGGTTTCAATTTTACATTAGATCATATGAAGTGGGTGCGGATCAAACCGTGTCAAT ACAGACAGTACTCAATTACTTACAGGAGGCAGCCATCAATCATGTTCAGAGTGCTGGCTA TTTTGGTGATAGTTTTGGCGCCACCCCGGAAATGACCAAGAGGAACCTCATCTGGGTTATC ACTAAGATGCAGGTTTTGGTGGATCGCTATCCCGCTTGGGGCGATGTTGTTCAAGTTGATA CATGGACCTGTAGTTCTGGTAAAAACAGCATGCAGCGTGATTGGTTCGTACGGGATCTCA AAACTGGAGATATTATAACAAGAGCCTCGAGCGTGTGGGTGCTGATGAATAGACTCACCA GAAAATTATCAAAAATTCCTGAAGCAGTTCTGGAAGAAGCAAAACTTTTTGTGATGAACA CTGCCCCCACCGTAGATGACAACAGGAAGCTACCAAAGCTGGATGGCAGCAGTGCTGATT ATGTCCTCTCTGGCTTAACTCCTAGATGGAGCGACTTAGATATGAACCAGCATGTCAACAA TGTGAAGTACATAGCCTGGATCCTTGAGAGTGTCCCTCAGAGCATACCGGAGACACACAA GCTGTCAGCGATAACCGTGGAGTACAGGAGAGAATGTGGCAAGAACAGCGTCCTCCAGTC TCTGACCAACGTCTCCGGGGATGGAATCACATGTGGAAACAGTATTATCGAGTGCCACCA TTTGCTTC AACTTGAGACTGGCCCAGAGATTCTACTAGCGCGGACGGAGTGGAT ATCC AA GGAACCTGGGTTCAGGGGAGCTCCAATCCAGGCAGAGAAAGTCTACAACAACAAATAA
SEQ ID NO: 12
Cuphea hyssopifolia (Chs) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCACCAACGCCGCCGCCTTCTCCGCCTACACCTTCTTCCTGACCTCCCCCACCC ACGGCT ACTCCTCC AAGCGCCTGGCCGAC ACCC AGAACGGCTACCCCGGC ACCTCCCTGA AGTCCAAGTCCACCCCCCCCCCCGCCGCCGCCGCCGCCCGCAACGGCGCCCTGCCCCTGCT GGCCTCCATCTGCAAGTGCCCCAAGAAGGCCGACGGCTCCATGCAGCTGGACTCCTCCCT GGTGTTCGGCTTCCAGTTCTACATCCGCTCCTACGAGGTGGGCGCCGACCAGACCGTGTCC ATCCAGACCGTGCTGAACTACCTGCAGGAGGCCGCCATCAACCACGTGCAGTCCGCCGGC TACTTCGGCGACTCCTTCGGCGCCACCCCCGAGATGACCAAGCGCAACCTGATCTGGGTG ATCACCAAGATGCAGGTGCTGGTGGACCGCTACCCCGCCTGGGGCGACGTGGTGCAGGTG GACACCTGGACCTGCTCCTCCGGCAAGAACTCCATGCAGCGCGACTGGTTCGTGCGCGAC CTGAAGACCGGCGACATCATCACCCGCGCCTCCTCCGTGTGGGTGCTGATGAACCGCCTG ACCCGCAAGCTGTCCAAGATCCCCGAGGCCGTGCTGGAGGAGGCCAAGCTGTTCGTGATG AACACCGCCCCCACCGTGGACGACAACCGCAAGCTGCCCAAGCTGGACGGCTCCTCCGCC GACTACGTGCTGTCCGGCCTGACCCCCCGCTGGTCCGACCTGGACATGAACCAGCACGTG AACAACGTGAAGTACATCGCCTGGATCCTGGAGTCCGTGCCCCAGTCCATCCCCGAGACC CACAAGCTGTCCGCCATCACCGTGGAGTACCGCCGCGAGTGCGGCAAGAACTCCGTGCTG CAGTCCCTGACCAACGTGTCCGGCGACGGCATCACCTGCGGCAACTCCATCATCGAGTGC CACCACCTGCTGCAGCTGGAGACCGGCCCCGAGATCCTGCTGGCCCGCACCGAGTGGATC TCCAAGGAGCCCGGCTTCCGCGGCGCCCCCATCCAGGCCGAGAAGGTGTACAACAACAAG TGA
SEQ ID NO: 13
Cuphea hyssopifolia (Chs) FATB2 amino acid sequence
MVAT AAS S AFFPVPSPDAS SRPGKLGNGS S SLSPLKPKLMANGGLQVKANAS APPKINGS SVG LKSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPDMLVDP FGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNDGFGRTLEM YKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASS VWVMMNQKTRRLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLDEKTADSIRKGLTPKWNDL DVNQHV VKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKGSGSQFQ HLLRLEDGGEIVKGRTEWRPKTAGINGPIASGETSPGDSS*
SEQ ID NO: 14
Cuphea hyssopifolia (Chs) FATB2 coding DNA sequence
ATGGTGGCTACCGCTGCAAGTTCAGCATTCTTCCCTGTGCCGTCCCCCGACGCCTCCTCTA GACCTGGAAAGCTCGGCAATGGGTCATCGAGCTTGAGCCCCCTCAAGCCCAAATTGATGG CCAATGGCGGGTTGCAGGTTAAGGCAAACGCCAGTGCCCCTCCTAAGATCAATGGTTCTT CGGTCGGTCTAAAGTCCGGCAGTCTCAAGACTCAGGAAGACACTCCTTCGGCGCCTCCTCC CCGGACTTTTATTAACCAGCTGCCTGATTGGAGTATGCTTCTTGCTGCAATCACTACTGTCT TCTTGGCAGCAGAGAAGCAGTGGATGATGCTTGATTGGAAACCCAAGAGGCCTGACATGC TTGTGGACCCGTTCGGATTGGGAAGGATTGTTCAAGATGGGCTTGTGTTCAGGCAGAATTT TTCGATTAGGTCCTATGAAATAGGCGCTGATCGCACTGCGTCTATAGAGACGGTGATGAA CCACTTGCAGGAAACAGCTCTCAATCATGTTAAGAGTGCTGGGCTTCTTAATGACGGCTTT GGTCGTACTCTTGAGATGTATAAAAGGGACCTTATTTGGGTTGTTGCAAAAATGCAGGTCA TGGTTAACCGCTATCCTACTTGGGGCGACACGGTTGAAGTGAATACTTGGGTTGCCAAGTC AGGGAAAAATGGTATGCGTCGTGATTGGCTCATAAGTGATTGCAATACAGGAGAAATTCT TACTAGAGCATCAAGTGTGTGGGTCATGATGAATCAAAAGACAAGAAGATTGTCAAAAAT TCCAGATGAGGTTCGACATGAGATAGAGCCTCATTTCGTGGACTCTGCTCCCGTCATTGAA GATGATGACCGGAAACTTCCCAAGCTGGATGAGAAGACTGCTGACTCCATCCGCAAGGGT CTAACTCCGAAGTGGAATGACTTGGATGTCAATCAGCACGTCAACAACGTGAAGTACATT GGGTGGATTCTTGAGAGTACTCCACCAGAAGTTCTGGAGACCCAGGAGTTATGTTCCCTTA CCCTGGAATATAGGCGGGAATGCGGAAGGGAGAGCGTGCTGGAGTCCCTCACTGCTGTGG ACCCCTCTGGAAAGGGCTCTGGGTCTC AGTTCC AGC ACCTTCTGCGGCTTGAGGATGGAG GTGAGATTGTGAAGGGGAGAACTGAGTGGCGACCCAAGACTGCAGGAATCAATGGGCCA ATAGCATCCGGGGAGACCTCACCTGGAGACTCTTCTTAG
SEQ ID NO: 15
Cuphea hyssopifolia (Chs) FATB2 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCACCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCTCCCCCGACGCCTCCTCCCG CCCCGGCAAGCTGGGCAACGGCTCCTCCTCCCTGTCCCCCCTGAAGCCCAAGCTGATGGCC AACGGCGGCCTGCAGGTGAAGGCCAACGCCTCCGCCCCCCCCAAGATCAACGGCTCCTCC GTGGGCCTGAAGTCCGGCTCCCTGAAGACCCAGGAGGACACCCCCTCCGCCCCCCCCCCC CGCACCTTCATCAACCAGCTGCCCGACTGGTCCATGCTGCTGGCCGCCATCACCACCGTGT TCCTGGCCGCCGAGAAGCAGTGGATGATGCTGGACTGGAAGCCCAAGCGCCCCGACATGC TGGTGGACCCCTTCGGCCTGGGCCGCATCGTGCAGGACGGCCTGGTGTTCCGCCAGAACTT CTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCGTGATGAA CCACCTGCAGGAGACCGCCCTGAACCACGTGAAGTCCGCCGGCCTGCTGAACGACGGCTT CGGCCGCACCCTGGAGATGTACAAGCGCGACCTGATCTGGGTGGTGGCCAAGATGCAGGT GATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGGTGAACACCTGGGTGGCCAA GTCCGGCAAGAACGGCATGCGCCGCGACTGGCTGATCTCCGACTGCAACACCGGCGAGAT CCTGACCCGCGCCTCCTCCGTGTGGGTGATGATGAACCAGAAGACCCGCCGCCTGTCCAA GATCCCCGACGAGGTGCGCCACGAGATCGAGCCCCACTTCGTGGACTCCGCCCCCGTGAT CGAGGACGACGACCGCAAGCTGCCCAAGCTGGACGAGAAGACCGCCGACTCCATCCGCA AGGGCCTGACCCCC AAGTGGAACGACCTGGACGTGAACC AGC ACGTGAAC AACGTGAAG TACATCGGCTGGATCCTGGAGTCCACCCCCCCCGAGGTGCTGGAGACCCAGGAGCTGTGC TCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGACC GCCGTGGACCCCTCCGGCAAGGGCTCCGGCTCCCAGTTCCAGCACCTGCTGCGCCTGGAG GACGGCGGCGAGATCGTGAAGGGCCGCACCGAGTGGCGCCCCAAGACCGCCGGCATCAA CGGCCCCATCGCCTCCGGCGAGACCTCCCCCGGCGACTCCTCCTGA
SEQ ID NO: 16
Cuphea hyssopifolia (Chs) FATB2b +a.a.248-259 variant amino acid sequence
MVAT AAS S AFFPVPSPDAS SRPGKLGNGS S SLSPLKPKLMANGGLQVKANAS APPKINGS SVG LKSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPDMLVDP FGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNDGFGRTLEM YKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASS KSQIMLPLHYCSVWVMMNQKTRRLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLDEKTADS IRKGLTPKWNDLDVNQHV VKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTA VDPSGKGSGSQFQHLLRLEDGGEIVKGRTEWRPKTAGINGPIASGETSPGDSS*
SEQ ID NO: 17
Cuphea hyssopifolia (Chs) FATB2b+a.a.248-259 variant coding DNA sequence
ATGGTGGCTACCGCTGCAAGTTCAGCATTCTTCCCTGTGCCGTCCCCCGACGCCTCCTCTA GACCTGGAAAGCTCGGCAATGGGTCATCGAGCTTGAGCCCCCTCAAGCCCAAATTGATGG CCAATGGCGGGTTGCAGGTTAAGGCAAACGCCAGTGCCCCTCCTAAGATCAATGGTTCTT CGGTCGGTCTAAAGTCCGGCAGTCTCAAGACTCAGGAAGACACTCCTTCGGCGCCTCCTCC CCGGACTTTTATTAACCAGCTGCCTGATTGGAGTATGCTTCTTGCTGCAATCACTACTGTCT TCTTGGCAGCAGAGAAGCAGTGGATGATGCTTGATTGGAAACCCAAGAGGCCTGACATGC TTGTGGACCCGTTCGGATTGGGAAGGATTGTTCAAGATGGGCTTGTGTTCAGGCAGAATTT TTCGATTAGGTCCTATGAAATAGGCGCTGATCGCACTGCGTCTATAGAGACGGTGATGAA CCACTTGCAGGAAACAGCTCTCAATCATGTTAAGAGTGCTGGGCTTCTTAATGACGGCTTT GGTCGTACTCTTGAGATGTATAAAAGGGACCTTATTTGGGTTGTTGCAAAAATGCAGGTCA TGGTTAACCGCTATCCTACTTGGGGCGACACGGTTGAAGTGAATACTTGGGTTGCCAAGTC AGGGAAAAATGGTATGCGTCGTGATTGGCTCATAAGTGATTGCAATACAGGAGAAATTCT TACTAGAGCATCAAGTAAAAGCCAAATTATGTTACCCTTACATTATTGCAGTGTGTGGGTC ATGATGAATCAAAAGACAAGAAGATTGTCAAAAATTCCAGATGAGGTTCGACATGAGATA GAGCCTCATTTCGTGGACTCTGCTCCCGTCATTGAAGATGATGACCGGAAACTTCCCAAGC TGGATGAGAAGACTGCTGACTCCATCCGCAAGGGTCTAACTCCGAAGTGGAATGACTTGG ATGTC AATCAGCACGTCAACAACGTGAAGTACATTGGGTGGATTCTTGAGAGT ACTCC AC CAGAAGTTCTGGAGACCCAGGAGTTATGTTCCCTTACCCTGGAATATAGGCGGGAATGCG GAAGGGAGAGCGTGCTGGAGTCCCTCACTGCTGTGGACCCCTCTGGAAAGGGCTCTGGGT CTCAGTTCCAGCACCTTCTGCGGCTTGAGGATGGAGGTGAGATTGTGAAGGGGAGAACTG AGTGGCGACCCAAGACTGCAGGAATCAATGGGCCAATAGCATCCGGGGAGACCTCACCTG GAGACTCTTCTTAG
SEQ ID NO: 18
Cuphea hyssopifolia (Chs) FATB2b +a.a.248-259 variant coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCACCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCTCCCCCGACGCCTCCTCCCG CCCCGGCAAGCTGGGCAACGGCTCCTCCTCCCTGTCCCCCCTGAAGCCCAAGCTGATGGCC AACGGCGGCCTGCAGGTGAAGGCCAACGCCTCCGCCCCCCCCAAGATCAACGGCTCCTCC GTGGGCCTGAAGTCCGGCTCCCTGAAGACCCAGGAGGACACCCCCTCCGCCCCCCCCCCC CGCACCTTCATCAACCAGCTGCCCGACTGGTCCATGCTGCTGGCCGCCATCACCACCGTGT TCCTGGCCGCCGAGAAGCAGTGGATGATGCTGGACTGGAAGCCCAAGCGCCCCGACATGC TGGTGGACCCCTTCGGCCTGGGCCGCATCGTGCAGGACGGCCTGGTGTTCCGCCAGAACTT CTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCGTGATGAA CCACCTGCAGGAGACCGCCCTGAACCACGTGAAGTCCGCCGGCCTGCTGAACGACGGCTT CGGCCGCACCCTGGAGATGTACAAGCGCGACCTGATCTGGGTGGTGGCCAAGATGCAGGT GATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGGTGAACACCTGGGTGGCCAA GTCCGGC AAGAACGGCATGCGCCGCGACTGGCTGATCTCCGACTGC AAC ACCGGCGAGAT CCTGACCCGCGCCTCCTCCAAGTCCCAGATCATGCTGCCCCTGCACTACTGCTCCGTGTGG GTGATGATGAACCAGAAGACCCGCCGCCTGTCCAAGATCCCCGACGAGGTGCGCCACGAG ATCGAGCCCCACTTCGTGGACTCCGCCCCCGTGATCGAGGACGACGACCGCAAGCTGCCC AAGCTGGACGAGAAGACCGCCGACTCCATCCGCAAGGGCCTGACCCCCAAGTGGAACGA CCTGGACGTGAACCAGCACGTGAACAACGTGAAGTACATCGGCTGGATCCTGGAGTCCAC CCCCCCCGAGGTGCTGGAGACCCAGGAGCTGTGCTCCCTGACCCTGGAGTACCGCCGCGA GTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGACCGCCGTGGACCCCTCCGGCAAGGGCTC CGGCTCCCAGTTCCAGCACCTGCTGCGCCTGGAGGACGGCGGCGAGATCGTGAAGGGCCG CACCGAGTGGCGCCCCAAGACCGCCGGCATCAACGGCCCCATCGCCTCCGGCGAGACCTC CCCCGGCGACTCCTCCTGA
SEQ ID NO: 19
Cuphea hyssopifolia (Chs) FATB3 amino acid sequence
MVAAEASSALFSVRTPGTSPKPGKFGNWPTSLSVPFKSKSNHNGGFQVKANASARPKANGSA VSLKSGSLDTQEDTS S S S SPPRTFINQLPD WSMLLS AITTVF VAAEKQ WTMLDRKSKRPDMLM DPFGVDRWQDGAVFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEM CKRDLIWWTKMHVEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDCHTGEILIRATSMC AMMNQKTRRF SKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPRWNDLDV NQHVNNVKYIGWILESVPTEVFETQELCGLTLEYRRECGRDSVLESVTAMDPSKEGDRSLYQH LLRLEDGADIAKGRTKWRPKNAGTNGAISTGKTSNGNSIS*
SEQ ID NO: 20
Cuphea hyssopifolia (Chs) FATB3 coding DNA sequence
ATGGTGGCTGCCGAAGCAAGTTCTGCACTCTTCTCCGTTCGAACCCCGGGAACCTCCCCTA AACCCGGGAAGTTCGGGAATTGGCCAACGAGCTTGAGCGTCCCCTTCAAGTCCAAATCAA ACCACAATGGCGGCTTTCAGGTTAAGGCAAACGCCAGTGCCCGTCCTAAGGCTAACGGTT CTGCAGTAAGTCTAAAGTCTGGCAGCCTCGACACTCAGGAGGACACTTCATCGTCGTCCTC TCCTCCTCGGACTTTCATTAACCAGTTGCCCGACTGGAGTATGCTGCTGTCCGCGATCACG ACCGTCTTCGTGGCGGCTGAGAAGCAGTGGACGATGCTTGATCGGAAATCTAAGAGGCCC GACATGCTCATGGACCCGTTTGGGGTTGACAGGGTTGTTCAGGATGGGGCTGTGTTCAGA CAGAGTTTTTCGATTAGGTCTTACGAAATAGGCGCTGATCGAACAGCCTCTATAGAGACG CTGATGAACATCTTCCAGGAAACATCTCTCAATCATTGTAAGAGTATCGGTCTTCTCAATG ACGGCTTTGGTCGTACTCCTGAGATGTGTAAGAGGGACCTCATTTGGGTGGTTACAAAAAT GC ACGTCGAGGTTAATCGCTATCCTACTTGGGGTGATACTATCGAGGTCAAT ACTTGGGTC TCCGAGTCGGGGAAAACCGGTATGGGTCGTGATTGGCTGATAAGTGATTGTCATACAGGA GAAATTCTAATAAGAGCAACGAGCATGTGTGCTATGATGAATCAAAAGACGAGAAGATTC TCAAAATTTCCATATGAGGTTCGACAGGAGTTGGCGCCTCATTTTGTGGACTCTGCTCCTG TCATTGAAGACTATCAAAAATTGCACAAGCTTGATGTGAAGACGGGTGATTCCATTTGCA ATGGCCT AACTCC AAGGTGGAATGACTTGGATGTC AATC AGC ACGTTAAC AATGTGAAGT ACATTGGGTGGATTCTCGAGAGTGTTCCAACGGAAGTTTTCGAGACCCAGGAGCTATGTG GCCTCACCCTTGAGTATAGGCGGGAATGCGGAAGGGACAGTGTGCTGGAGTCCGTGACCG CTATGGATCCATCAAAAGAGGGAGACAGATCTCTGTACCAGCACCTTCTTCGGCTTGAGG ATGGGGCTGATATCGCGAAGGGCAGAACCAAGTGGCGGCCGAAGAATGCAGGAACCAAT GGGGCAATATCAACAGGAAAGACTTCAAATGGAAACTCGATCTCTTAG
SEQ ID NO: 21
Cuphea hyssopifolia (Chs) FATB3 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGAGGCCTCCTCCGCCCTGTTCTCCGTGCGCACCCCCGGCACCTCCCCCA AGCCCGGCAAGTTCGGCAACTGGCCCACCTCCCTGTCCGTGCCCTTCAAGTCCAAGTCCAA CCACAACGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCGCCCCAAGGCCAACGGCTC CGCCGTGTCCCTGAAGTCCGGCTCCCTGGACACCCAGGAGGACACCTCCTCCTCCTCCTCC CCCCCCCGCACCTTCATCAACCAGCTGCCCGACTGGTCCATGCTGCTGTCCGCCATCACCA CCGTGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCTGGACCGCAAGTCCAAGCGCCCCG ACATGCTGATGGACCCCTTCGGCGTGGACCGCGTGGTGCAGGACGGCGCCGTGTTCCGCC AGTCCTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCT GATGAACATCTTCCAGGAGACCTCCCTGAACCACTGCAAGTCCATCGGCCTGCTGAACGA CGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGGTGACCAAGAT GCACGTGGAGGTGAACCGCTACCCCACCTGGGGCGACACCATCGAGGTGAACACCTGGGT GTCCGAGTCCGGCAAGACCGGCATGGGCCGCGACTGGCTGATCTCCGACTGCCACACCGG CGAGATCCTGATCCGCGCCACCTCCATGTGCGCCATGATGAACCAGAAGACCCGCCGCTT CTCCAAGTTCCCCTACGAGGTGCGCCAGGAGCTGGCCCCCCACTTCGTGGACTCCGCCCCC GTGATCGAGGACTACCAGAAGCTGCACAAGCTGGACGTGAAGACCGGCGACTCCATCTGC AACGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGAA GTACATCGGCTGGATCCTGGAGTCCGTGCCCACCGAGGTGTTCGAGACCCAGGAGCTGTG CGGCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGTGAC CGCCATGGACCCCTCCAAGGAGGGCGACCGCTCCCTGTACCAGCACCTGCTGCGCCTGGA GGACGGCGCCGACATCGCCAAGGGCCGCACCAAGTGGCGCCCCAAGAACGCCGGCACCA ACGGCGCCATCTCCACCGGCAAGACCTCCAACGGCAACTCCATCTCCTGA
SEQ ID NO: 22
Cuphea hyssopifolia (Chs) FATB3b (V204I, C239F, E243D, M251V variant) amino acid sequence
MVAAEASSALFSVRTPGTSPKPGKFGNWPTSLSVPFKSKSNHNGGFQVKANASARPKANGSA VSLKSGSLDTQEDTS S S S SPPRTFINQLPD WSMLLS AITTVF VAAEKQ WTMLDRKSKRPDMLM DPFGVDRWQDGAVFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEM CKRDLIWWTKMHIEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDFHTGDILIRATSVC AMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPRWNDLDV NQHVNNVKYIGWILESVPTEVFETQELCGLTLEYRRECGRDSVLESVTAMDPSKEGDRSLYQH LLRLEDGADIAKGRTKWRPKNAGTNGAISTGKTSNGNSIS*
SEQ ID NO: 23
Cuphea hyssopifolia (Chs) FATB3b (V204I, C239F, E243D, M251V variant) coding DNA sequence
ATGGTGGCTGCCGAAGCAAGTTCTGCACTCTTCTCCGTTCGAACCCCGGGAACCTCCCCTA AACCCGGGAAGTTCGGGAATTGGCCAACGAGCTTGAGCGTCCCCTTCAAGTCCAAATCAA ACCACAATGGCGGCTTTCAGGTTAAGGCAAACGCCAGTGCCCGTCCTAAGGCTAACGGTT CTGCAGTAAGTCTAAAGTCTGGCAGCCTCGACACTCAGGAGGACACTTCATCGTCGTCCTC TCCTCCTCGGACTTTCATTAACCAGTTGCCCGACTGGAGTATGCTGCTGTCCGCGATCACG ACCGTCTTCGTGGCGGCTGAGAAGCAGTGGACGATGCTTGATCGGAAATCTAAGAGGCCC GACATGCTCATGGACCCGTTTGGGGTTGACAGGGTTGTTCAGGATGGGGCTGTGTTCAGA C AGAGTTTTTCGATTAGGTCTT ACGAAATAGGCGCTGATCGAACAGCCTCT AT AGAGACG CTGATGAACATCTTCCAGGAAACATCTCTCAATCATTGTAAGAGTATCGGTCTTCTCAATG ACGGCTTTGGTCGTACTCCTGAGATGTGTAAGAGGGACCTCATTTGGGTGGTTACAAAAAT GCACATCGAGGTTAATCGCTATCCTACTTGGGGTGATACTATCGAGGTCAATACTTGGGTC TCCGAGTCGGGGAAAACCGGTATGGGTCGTGATTGGCTGATAAGTGATTTTCATACAGGA GAC ATTCTAAT AAGAGC AACGAGCGTGTGTGCT ATGATGAATC AAAAGACGAGAAGATTC TCAAAATTTCCATATGAGGTTCGACAGGAGTTAGCGCCTCATTTTGTGGACTCTGCTCCAG TCATTGAAGACTATCAAAAATTGCACAAGCTTGATGTGAAGACGGGTGATTCCATTTGCA ATGGCCTAACTCCAAGGTGGAATGACTTGGATGTCAATCAGCACGTTAACAATGTGAAGT ACATTGGGTGGATTCTCGAGAGTGTTCCAACGGAAGTTTTCGAGACCCAGGAGCTATGTG GCCTCACCCTTGAGTATAGGCGGGAATGCGGAAGGGACAGTGTGCTGGAGTCCGTGACCG CTATGGATCCCTCAAAAGAGGGAGACAGATCTCTGTACCAGCACCTTCTTCGGCTTGAGG ATGGGGCTGATATCGCGAAGGGCAGAACCAAGTGGCGGCCGAAGAATGCAGGAACCAAT GGGGCAATATCAACAGGAAAGACTTCAAATGGAAACTCGATCTCTTAG
SEQ ID NO: 24
Cuphea hyssopifolia (Chs) FATB3b (V204I, C239F, E243D, M251V variant) coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGAGGCCTCCTCCGCCCTGTTCTCCGTGCGCACCCCCGGCACCTCCCCCA AGCCCGGCAAGTTCGGCAACTGGCCCACCTCCCTGTCCGTGCCCTTCAAGTCCAAGTCCAA CCACAACGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCGCCCCAAGGCCAACGGCTC CGCCGTGTCCCTGAAGTCCGGCTCCCTGGACACCCAGGAGGACACCTCCTCCTCCTCCTCC CCCCCCCGCACCTTCATCAACCAGCTGCCCGACTGGTCCATGCTGCTGTCCGCCATCACCA CCGTGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCTGGACCGCAAGTCCAAGCGCCCCG ACATGCTGATGGACCCCTTCGGCGTGGACCGCGTGGTGCAGGACGGCGCCGTGTTCCGCC AGTCCTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCT GATGAACATCTTCCAGGAGACCTCCCTGAACCACTGCAAGTCCATCGGCCTGCTGAACGA CGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGGTGACCAAGAT GCACATCGAGGTGAACCGCTACCCCACCTGGGGCGACACCATCGAGGTGAACACCTGGGT GTCCGAGTCCGGCAAGACCGGCATGGGCCGCGACTGGCTGATCTCCGACTTCCACACCGG CGACATCCTGATCCGCGCCACCTCCGTGTGCGCCATGATGAACCAGAAGACCCGCCGCTT CTCCAAGTTCCCCTACGAGGTGCGCCAGGAGCTGGCCCCCCACTTCGTGGACTCCGCCCCC GTGATCGAGGACTACCAGAAGCTGCACAAGCTGGACGTGAAGACCGGCGACTCCATCTGC AACGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGAA GTACATCGGCTGGATCCTGGAGTCCGTGCCCACCGAGGTGTTCGAGACCCAGGAGCTGTG CGGCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGTGAC CGCCATGGACCCCTCCAAGGAGGGCGACCGCTCCCTGTACCAGCACCTGCTGCGCCTGGA GGACGGCGCCGACATCGCCAAGGGCCGCACCAAGTGGCGCCCCAAGAACGCCGGCACCA ACGGCGCCATCTCCACCGGCAAGACCTCCAACGGCAACTCCATCTCCTGA
SEQ ID NO: 25
Cuphea PSR23 (Cu) FATB3 amino acid sequence
M WAAATS AFFP VPAPGT SPKPGKSGNWP S SLSPTFKPKSIPNAGFQVKANAS AHPKANGS AV NLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLTAITTVFVAAEKQWTMLDRKSKRPDMLVD SVGLKCIVRDGLVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGM CKNDLIWVLTKMQIMVNRYPTWGDTVEINTWFSQSGKIGMASDWLISDCNTGEILIRATSVW AMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDQKLHKFDVKTGDSIRKGLTPRWNDLD VNQHVSNVKYIGWILESMPIEVLETQELCSLTVEYRRECGMDSVLESVTAVDPSENGGRSQYK HLLRLEDGTDIVKSRTEWRPKNAGTNGAISTSTAKTSNGNSVS*
SEQ ID NO: 26
Cuphea PSR23 (Cu) FATB3 coding DNA sequence
ATGGTGGTGGCTGCAGCAACTTCTGCATTCTTCCCCGTTCCAGCCCCGGGAACCTCCCCTA AACCCGGGAAGTCCGGCAACTGGCCATCGAGCTTGAGCCCTACCTTCAAGCCCAAGTCAA TCCCCAATGCCGGATTTCAGGTTAAGGCAAATGCCAGTGCCCATCCTAAGGCTAACGGTTC TGCAGTAAATCT AAAGTCTGGC AGCCTC AACACTC AGGAGGACACTTCGTCGTCCCCTCCT CCCCGGGCTTTCCTTAACCAGTTGCCTGATTGGAGTATGCTTCTGACTGCAATCACGACCG TCTTCGTGGCGGCAGAGAAGCAGTGGACTATGCTTGATAGGAAATCTAAGAGGCCTGACA TGCTCGTGGACTCGGTTGGGTTGAAGTGTATTGTTCGGGATGGGCTCGTGTCCAGACAGAG TTTTTTGATTAGATCTTATGAAATAGGCGCTGATCGAACAGCCTCTATAGAGACGCTGATG AACC ACTTGCAGGAAAC ATCTATC AATCATTGT AAGAGTTTGGGTCTTCTCAATGACGGCT TTGGTCGTACTCCTGGGATGTGTAAAAACGACCTCATTTGGGTGCTTACAAAAATGCAGAT CATGGTGAATCGCTACCCAACTTGGGGCGATACTGTTGAGATCAATACCTGGTTCTCTCAG TCGGGGAAAATCGGTATGGCTAGCGATTGGCTAATAAGTGATTGCAACACAGGAGAAATT CTTATAAGAGCAACGAGCGTGTGGGCTATGATGAATCAAAAGACGAGAAGATTCTCAAGA CTTCCATACGAGGTTCGCCAGGAGTTAACGCCTCATTTTGTGGACTCTCCTCATGTCATTG AAGACAATGATCAGAAATTGCATAAGTTTGATGTGAAGACTGGTGATTCCATTCGCAAGG GTCTAACTCCGAGGTGGAACGACTTGGATGTGAATCAGCACGTAAGCAACGTGAAGTACA TTGGGTGGATTCTCGAGAGTATGCCAATAGAAGTTTTGGAGACACAGGAGCTATGCTCTCT CACCGTAGAATATAGGCGGGAATGCGGAATGGACAGTGTGCTGGAGTCCGTGACTGCTGT GGATCCCTCAGAAAATGGAGGCCGGTCTCAGTACAAGCACCTTCTGCGGCTTGAGGATGG GACTGATATCGTGAAGAGCAGAACTGAGTGGCGACCGAAGAATGCAGGAACTAACGGGG CGATATCAACATCAACAGCAAAGACTTCAAATGGAAACTCGGTCTCTTAG
SEQ ID NO: 27
Cuphea PSR23 (Cu) FATB3 coding DNA sequence codon optimized for Prototheca moriformis ATGGTGGTGGCCGCCGCCACCTCCGCCTTCTTCCCCGTGCCCGCCCCCGGCACCTCCCCCA AGCCCGGCAAGTCCGGCAACTGGCCCTCCTCCCTGTCCCCCACCTTCAAGCCCAAGTCCAT CCCCAACGCCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTC CGCCGTGAACCTGAAGTCCGGCTCCCTGAACACCCAGGAGGACACCTCCTCCTCCCCCCCC CCCCGCGCCTTCCTGAACCAGCTGCCCGACTGGTCCATGCTGCTGACCGCCATCACCACCG TGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCTGGACCGCAAGTCCAAGCGCCCCGACA TGCTGGTGGACTCCGTGGGCCTGAAGTGCATCGTGCGCGACGGCCTGGTGTCCCGCCAGT CCTTCCTGATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCTGAT GAACCACCTGCAGGAGACCTCCATCAACCACTGCAAGTCCCTGGGCCTGCTGAACGACGG CTTCGGCCGCACCCCCGGCATGTGCAAGAACGACCTGATCTGGGTGCTGACCAAGATGCA GATCATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGATCAACACCTGGTTCTC CCAGTCCGGCAAGATCGGCATGGCCTCCGACTGGCTGATCTCCGACTGCAACACCGGCGA GATCCTGATCCGCGCCACCTCCGTGTGGGCCATGATGAACCAGAAGACCCGCCGCTTCTCC CGCCTGCCCTACGAGGTGCGCCAGGAGCTGACCCCCCACTTCGTGGACTCCCCCCACGTG ATCGAGGACAACGACCAGAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCATCCGC AAGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGTGAAG TACATCGGCTGGATCCTGGAGTCCATGCCCATCGAGGTGCTGGAGACCCAGGAGCTGTGC TCCCTGACCGTGGAGTACCGCCGCGAGTGCGGCATGGACTCCGTGCTGGAGTCCGTGACC GCCGTGGACCCCTCCGAGAACGGCGGCCGCTCCCAGTACAAGCACCTGCTGCGCCTGGAG GACGGCACCGACATCGTGAAGTCCCGCACCGAGTGGCGCCCCAAGAACGCCGGCACCAA CGGCGCCATCTCCACCTCCACCGCCAAGACCTCCAACGGCAACTCCGTGTCCTGA
SEQ ID NO: 28
Cuphea wrightii (Cw) FATB3 amino acid sequence
MWAAAASSAFFPVPAPRTTPKPGKFGNWPSSLSPPFKPKSNPNGRFQVKANVSPHPKANGSA VSLKSGSLNTLEDPPSSPPPRTFLNQLPDWSRLRTAITTVFVAAEKQFTRLDRKSKRPDMLVDW FGSETIVQDGLVFRERFSIRSYEIGADRTASIETLMNHLQDTSLNHCKSVGLLNDGFGRTSEMC TRDLIWVLTKMQIWNRYPTWGDTVEINSWFSQSGKIGMGRDWLISDCNTGEILVRATSAWA MMNQKTRRFSKLPCEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSICKGLTPGWNDLDV NQHVSNVKYIGWILESMPTEVLETQELCSLTLEYRRECGRESVVESVTSMNPSKVGDRSQYQH LLRLEDGADIMKGRTEWRPKNAGTNRAIST*
SEQ ID NO: 29
Cuphea wrightii (Cw) FATB3 coding DNA sequence
ATGGTGGTGGCTGCTGC AGCAAGTTCTGC ATTCTTCCCTGTTCC AGC ACCTAGAACC ACGC CTAAACCCGGGAAGTTCGGCAATTGGCCATCGAGCTTGAGCCCGCCCTTCAAGCCCAAGT CAAACCCCAATGGTAGATTTCAGGTTAAGGCAAATGTCAGTCCTCATCCTAAGGCTAACG GTTCTGCAGTAAGTCTAAAGTCTGGCAGCCTCAACACTCTGGAGGACCCTCCGTCGTCCCC TCCTCCTCGGACTTTCCTTAACCAGTTGCCTGATTGGAGTAGGCTTCGGACTGCAATCACG ACCGTCTTCGTGGCGGCAGAGAAGC AGTTC ACTAGGCTCGATCGAAAATCT AAGAGGCCT GACATGCTAGTGGACTGGTTTGGGTCAGAGACTATTGTTCAGGATGGGCTCGTGTTCAGA GAGAGATTTTCGATCAGGTCTTACGAAATAGGCGCTGATCGAACAGCCTCTATAGAGACG CTGATGAACCACTTGCAGGACACATCTCTGAATCATTGTAAGAGTGTGGGTCTTCTCAATG ACGGCTTTGGTCGTACCTCGGAGATGTGTACAAGAGACCTCATTTGGGTGCTTACAAAAAT GCAGATCGTGGTGAATCGCTATCCAACTTGGGGCGATACTGTCGAGATCAATAGCTGGTT CTCCCAGTCGGGGAAAATCGGTATGGGTCGCGATTGGCTAATAAGTGATTGCAACACAGG AGAAATTCTTGTAAGAGCAACGAGCGCTTGGGCCATGATGAATCAAAAGACGAGAAGATT CTCAAAACTTCCATGCGAGGTTCGCCAGGAGATAGCGCCTCATTTTGTGGACGCTCCTCCT GTCATTGAAGACAATGATCGGAAATTGCATAAGTTTGATGTGAAGACTGGTGATTCCATTT GCAAGGGTCTAACTCCGGGGTGGAATGACTTGGATGTCAATCAGCACGTAAGCAACGTGA AGTACATTGGGTGGATTCTCGAGAGTATGCCTACAGAAGTTTTGGAGACCCAGGAGCTAT GCTCTCTCACCCTTGAATATAGGCGGGAATGTGGAAGGGAAAGTGTGGTAGAGTCCGTGA CCTCTATGAATCCCTCAAAAGTTGGAGACCGGTCTCAGTACCAACACCTTCTGCGGCTTGA GGATGGGGCTGATATCATGAAGGGCAGAACTGAGTGGAGACCAAAGAATGCAGGAACCA ACCGGGCGATATCAACATGA
SEQ ID NO: 30
Cuphea wrightii (Cw) FATB3 coding DNA sequence codon optimized for Prototheca moriformis ATGGTGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCGCCCCCCGCACCACCC CCAAGCCCGGCAAGTTCGGCAACTGGCCCTCCTCCCTGTCCCCCCCCTTCAAGCCCAAGTC CAACCCCAACGGCCGCTTCCAGGTGAAGGCCAACGTGTCCCCCCACCCCAAGGCCAACGG CTCCGCCGTGTCCCTGAAGTCCGGCTCCCTGAACACCCTGGAGGACCCCCCCTCCTCCCCC CCCCCCCGCACCTTCCTGAACCAGCTGCCCGACTGGTCCCGCCTGCGCACCGCCATCACCA CCGTGTTCGTGGCCGCCGAGAAGCAGTTCACCCGCCTGGACCGCAAGTCCAAGCGCCCCG ACATGCTGGTGGACTGGTTCGGCTCCGAGACCATCGTGCAGGACGGCCTGGTGTTCCGCG AGCGCTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCT GATGAACCACCTGCAGGACACCTCCCTGAACCACTGCAAGTCCGTGGGCCTGCTGAACGA CGGCTTCGGCCGCACCTCCGAGATGTGCACCCGCGACCTGATCTGGGTGCTGACCAAGAT GCAGATCGTGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGATCAACTCCTGGTT CTCCCAGTCCGGCAAGATCGGCATGGGCCGCGACTGGCTGATCTCCGACTGCAACACCGG CGAGATCCTGGTGCGCGCCACCTCCGCCTGGGCCATGATGAACCAGAAGACCCGCCGCTT CTCCAAGCTGCCCTGCGAGGTGCGCCAGGAGATCGCCCCCCACTTCGTGGACGCCCCCCC CGTGATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCAT CTGCAAGGGCCTGACCCCCGGCTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGT GAAGTACATCGGCTGGATCCTGGAGTCCATGCCCACCGAGGTGCTGGAGACCCAGGAGCT GTGCTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGGTGGAGTCCGT GACCTCCATGAACCCCTCCAAGGTGGGCGACCGCTCCCAGTACCAGCACCTGCTGCGCCT GGAGGACGGCGCCGACATCATGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCA CCAACCGCGCCATCTCCACCTGA
SEQ ID NO: 31
Cuphea wrightii (Cw) FATB4a amino acid sequence
MVATAASSAFFPVPSADTSSSRPGKLGSGPSSLSPLKPKSIPNGGLQVKANASAPPKINGSSVGL KSGGFKTQEDSPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPDMLVDPF GLGSIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKIAGLSNDGFGRTPEMYK RDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASSVW VMMNQKTRRLSKIPDEVRNEIEPHFVDSAPWEDDDRKLPKLDENTADSIRKGLTPRWNDLD VNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSAEGYASRFQH LLRLEDGGEIVKARTEWRPKNAGINGWPSEESSPGDFF*
SEQ ID NO: 32
Cuphea wrightii (Cw) FATB4a coding DNA sequence
TTGGTGGCTACCGCTGCAAGTTCTGCATTTTTCCCCGTGCCATCCGCCGACACCTCCTCCTC GAGACCCGGAAAGCTCGGCAGTGGACCATCGAGCTTGAGCCCCCTCAAGCCCAAATCGAT CCCC AATGGCGGCTTGC AGGTT AAGGC AAACGCCAGTGCCCCTCCTAAGATC AATGGTTC CTCGGTCGGTCTAAAGTCGGGCGGTTTCAAGACTCAGGAAGACTCTCCTTCGGCCCCTCCT CCGCGGACTTTTATCAACCAGTTGCCTGATTGGAGTATGCTTCTTGCTGCAATCACTACTG TCTTCTTGGCTGCAGAGAAGCAGTGGATGATGCTTGATTGGAAACCTAAGAGGCCTGACA TGCTCGTGGACCCGTTCGGATTGGGAAGTATTGTTCAGGATGGGCTTGTGTTCAGGCAGAA TTTTTCAATTAGGTCCTACGAAATAGGCGCCGATCGAACTGCGTCTATAGAGACGGTGATG AACCATTTGCAGGAAACAGCTCTCAATCATGTCAAGATTGCTGGGCTTTCTAATGACGGCT TTGGTCGTACTCCTGAGATGTATAAAAGAGACCTTATTTGGGTTGTTGCAAAAATGCAGGT CATGGTTAACCGCTATCCTACTTGGGGTGACACGGTTGAAGTGAATACTTGGGTTGCCAAG TCAGGGAAAAATGGTATGCGTCGTGACTGGCTCATAAGTGATTGCAATACTGGAGAGATT CTTACAAGAGCATCAAGCGTGTGGGTCATGATGAATCAAAAGACAAGAAGATTGTCAAAA ATTCCAGATGAGGTTCGAAATGAGATAGAGCCTCATTTTGTGGACTCTGCTCCCGTCGTTG AAGATGATGATCGGAAACTTCCCAAGCTGGATGAGAACACTGCTGACTCCATCCGCAAGG GTCTAACTCCGAGGTGGAATGACTTGGATGTCAATCAGCACGTCAACAACGTGAAGTACA TCGGATGGATTCTTGAGAGTACTCCACCAGAAGTTCTGGAGACCCAGGAGTTATGCTCCCT GACCCTGGAATAC AGGCGGGAATGTGGAAGGGAGAGCGTGCTGGAGTCCCTC ACTGCTGT CGACCCGTCTGCAGAGGGCTATGCGTCCCGGTTTCAGCACCTTCTGCGGCTTGAGGATGGA GGTGAGATCGTGAAGGCGAGAACTGAGTGGCGACCCAAGAATGCTGGAATCAATGGGGT GGTACCATCCGAGGAGTCCTCACCTGGAGACTTCTTTTAG SEQ ID NO: 33
Cuphea wrightii (Cw) FATB4a coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCACCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCTCCGCCGACACCTCCTCCTC CCGCCCCGGCAAGCTGGGCTCCGGCCCCTCCTCCCTGTCCCCCCTGAAGCCCAAGTCCATC CCCAACGGCGGCCTGCAGGTGAAGGCCAACGCCTCCGCCCCCCCCAAGATCAACGGCTCC TCCGTGGGCCTGAAGTCCGGCGGCTTCAAGACCCAGGAGGACTCCCCCTCCGCCCCCCCC CCCCGCACCTTCATCAACCAGCTGCCCGACTGGTCCATGCTGCTGGCCGCCATCACCACCG TGTTCCTGGCCGCCGAGAAGCAGTGGATGATGCTGGACTGGAAGCCCAAGCGCCCCGACA TGCTGGTGGACCCCTTCGGCCTGGGCTCCATCGTGCAGGACGGCCTGGTGTTCCGCCAGAA CTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCGTGATG AACCACCTGCAGGAGACCGCCCTGAACCACGTGAAGATCGCCGGCCTGTCCAACGACGGC TTCGGCCGCACCCCCGAGATGTACAAGCGCGACCTGATCTGGGTGGTGGCCAAGATGCAG GTGATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGGTGAACACCTGGGTGGCC AAGTCCGGCAAGAACGGCATGCGCCGCGACTGGCTGATCTCCGACTGCAACACCGGCGAG ATCCTGACCCGCGCCTCCTCCGTGTGGGTGATGATGAACCAGAAGACCCGCCGCCTGTCC AAGATCCCCGACGAGGTGCGCAACGAGATCGAGCCCCACTTCGTGGACTCCGCCCCCGTG GTGGAGGACGACGACCGCAAGCTGCCCAAGCTGGACGAGAACACCGCCGACTCCATCCG CAAGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGA AGTACATCGGCTGGATCCTGGAGTCCACCCCCCCCGAGGTGCTGGAGACCCAGGAGCTGT GCTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGA CCGCCGTGGACCCCTCCGCCGAGGGCTACGCCTCCCGCTTCCAGCACCTGCTGCGCCTGGA GGACGGCGGCGAGATCGTGAAGGCCCGCACCGAGTGGCGCCCCAAGAACGCCGGCATCA ACGGCGTGGTGCCCTCCGAGGAGTCCTCCCCCGGCGACTTCTTCTGA
SEQ ID NO: 34
Cuphea wrightii (Cw) FATB4b amino acid sequence
MVATAASSAFFPVPSADTSSSRPGKLGNGPSSLSPLKPKSIPNGGLQVKANASAPPKINGSSVGL KSGSFKTQEDAPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPDMLVDPF GLGSIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKIAGLSSDGFGRTPAMSK RDLIWWAKMQVMVNRYPAWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASSV WVMMNQKTRRLSKIPDEVRNEIEPHFVDSAP WEDDDRKLPKLDENTADSIRKGLTPRWNDL DVNQHVNNVKYIGWILE STPAE VLETQELC SLTLEYRRECGRE S VLESLTAVDP SGEGDGSKF QHLLRLEDGGEIVKARTEWRPKNAGINGWPSEESSPGGDFF*
SEQ ID NO: 35
Cuphea wrightii (Cw) FATB4b coding DNA sequence
TTGGTGGCTACCGCTGCAAGTTCTGCATTTTTCCCCGTACCATCCGCCGACACCTCCTCATC GAGACCCGGAAAGCTCGGCAATGGGCCATCGAGCTTGAGCCCCCTCAAGCCGAAATCGAT CCCCAATGGCGGGTTGCAGGTTAAGGCAAACGCCAGTGCCCCTCCTAAGATCAATGGTTC CTCGGTCGGTCTGAAGTCGGGCAGTTTCAAGACTCAGGAAGACGCTCCTTCGGCCCCTCCT CCTCGGACTTTTATCAACCAGTTGCCTGATTGGAGTATGCTTCTTGCTGCAATCACTACTGT CTTCTTGGCTGCAGAGAAGCAGTGGATGATGCTTGATTGGAAACCTAAGAGGCCTGACAT GCTTGTCGACCCGTTCGGATTGGGAAGTATTGTTCAGGATGGGCTTGTTTTCAGGCAGAAT TTCTCGATTAGGTCCTACGAAATAGGCGCTGATCGCACTGCGTCTATAGAGACGGTGATG AACCATTTGCAGGAAACAGCTCTCAATCATGTTAAGATTGCTGGGCTTTCTAGTGATGGCT TTGGTCGTACTCCTGCGATGTCTAAACGGGACCTCATTTGGGTTGTTGCGAAAATGCAGGT CATGGTTAACCGCTACCCTGCTTGGGGTGACACGGTTGAAGTGAATACTTGGGTTGCCAA GTCAGGGAAAAATGGTATGCGTCGTGACTGGCTCATAAGTGATTGCAACACTGGAGAGAT TCTTACAAGAGCATCAAGCGTGTGGGTCATGATGAATCAAAAGACAAGAAGATTGTCAAA AATTCCAGATGAGGTTCGAAATGAGATAGAGCCTCATTTTGTGGACTCTGCGCCCGTCGTT GAAGACGATGACCGGAAACTTCCCAAGCTGGATGAGAACACTGCTGACTCCATCCGCAAG GGTCT AACTCCGAGGTGGAATGACTTGGATGTC AATC AGC ACGTC AAC AACGTGAAGTAC ATTGGGTGGATTCTTGAGAGTACTCCAGCAGAAGTTCTGGAGACCCAGGAATTATGTTCCC TGACCCTGGAATACAGGCGGGAATGTGGAAGGGAGAGCGTGCTGGAGTCCCTCACTGCTG TAGATCCGTCTGGAGAGGGCGATGGGTCCAAGTTCCAGCACCTTCTGCGGCTTGAGGATG GAGGTGAGATCGTGAAGGCGAGAACTGAGTGGCGACCAAAGAATGCTGGAATCAATGGG GTGGTACCATCCGAGGAGTCCTCACCTGGTGGAGACTTCTTTTAA
SEQ ID NO: 36
Cuphea wrightii (Cw) FATB4b coding DNA sequence codon optimized for Prototheca moriformis ATGGTGGCCACCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCTCCGCCGACACCTCCTCCTC CCGCCCCGGCAAGCTGGGCAACGGCCCCTCCTCCCTGTCCCCCCTGAAGCCCAAGTCCATC CCCAACGGCGGCCTGCAGGTGAAGGCCAACGCCTCCGCCCCCCCCAAGATCAACGGCTCC TCCGTGGGCCTGAAGTCCGGCTCCTTCAAGACCCAGGAGGACGCCCCCTCCGCCCCCCCCC CCCGCACCTTCATCAACCAGCTGCCCGACTGGTCCATGCTGCTGGCCGCCATCACCACCGT GTTCCTGGCCGCCGAGAAGCAGTGGATGATGCTGGACTGGAAGCCCAAGCGCCCCGACAT GCTGGTGGACCCCTTCGGCCTGGGCTCCATCGTGCAGGACGGCCTGGTGTTCCGCCAGAA CTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCGTGATG AACCACCTGCAGGAGACCGCCCTGAACCACGTGAAGATCGCCGGCCTGTCCTCCGACGGC TTCGGCCGCACCCCCGCCATGTCCAAGCGCGACCTGATCTGGGTGGTGGCCAAGATGCAG GTGATGGTGAACCGCTACCCCGCCTGGGGCGACACCGTGGAGGTGAACACCTGGGTGGCC AAGTCCGGCAAGAACGGCATGCGCCGCGACTGGCTGATCTCCGACTGCAACACCGGCGAG ATCCTGACCCGCGCCTCCTCCGTGTGGGTGATGATGAACCAGAAGACCCGCCGCCTGTCC AAGATCCCCGACGAGGTGCGCAACGAGATCGAGCCCCACTTCGTGGACTCCGCCCCCGTG GTGGAGGACGACGACCGCAAGCTGCCCAAGCTGGACGAGAACACCGCCGACTCCATCCG CAAGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGA AGTACATCGGCTGGATCCTGGAGTCCACCCCCGCCGAGGTGCTGGAGACCCAGGAGCTGT GCTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGA CCGCCGTGGACCCCTCCGGCGAGGGCGACGGCTCCAAGTTCCAGCACCTGCTGCGCCTGG AGGACGGCGGCGAGATCGTGAAGGCCCGCACCGAGTGGCGCCCCAAGAACGCCGGCATC AACGGCGTGGTGCCCTCCGAGGAGTCCTCCCCCGGCGGCGACTTCTTCTGA
SEQ ID NO: 37
Cuphea wrightii (Cw) FATB5 amino acid sequence
MVAAAAS S AFF S VPTPGTPPKPGKFGNWP S SLS VPFKPDNGGFHVKANAS AHPKANGS AVNL KSGSLETPPRSFINQLPDLSVLLSKITTVFGAAEKQWKRPGMLVEPFGVDRIFQDGVFFRQSFSI RSYEIGVDRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKRDLIWWTKIQVEVNRYP TWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAMMNQNTRRLSKFPYEVRQE IAPHFVDSAPVIEDDQKLQKLDVKTGDSIRDGLTPRWNDLDVNQHVNNVKYIGWILKSVPIEV FETQELCGVTLEYRRECGRDSVLESVTAMDPAKEGDRCVYQHLLRLEDGADITIGRTEWRPK NAGANGAMS SGKT SNGNCLIEGRGWQPFRWRLIF *
SEQ ID NO: 38
Cuphea wrightii (Cw) FATB5 coding DNA sequence
ATGGTGGCTGCCGCAGCAAGTTCTGCATTCTTCTCTGTTCCAACCCCGGGAACGCCCCCTA AACCCGGGAAGTTCGGTAACTGGCCATCGAGCTTGAGCGTCCCCTTCAAGCCCGACAATG GTGGCTTTCATGTCAAGGCAAACGCCAGTGCCCATCCTAAGGCTAATGGTTCTGCGGTAA ATCTAAAGTCTGGCAGCCTCGAGACTCCTCCTCGGAGTTTCATTAACCAGCTGCCGGACTT GAGTGTGCTTCTGTCCAAAATCACGACTGTCTTCGGGGCGGCTGAGAAGCAGTGGAAGAG GCCCGGCATGCTCGTGGAACCGTTTGGGGTTGACAGGATTTTTCAGGATGGTGTTTTTTTC AGACAGAGTTTTTCTATCAGGTCTTACGAAATAGGCGTTGATCGAACAGCCTCGATAGAG ACACTGATGAACATCTTCCAGGAAACATCTTTGAATCATTGCAAGAGTATCGGTCTTCTCA ACGATGGCTTTGGTCGTACTCCTGAGATGTGTAAGAGGGACCTCATTTGGGTGGTTACGAA AATTCAGGTCGAGGTGAATCGCTATCCTACTTGGGGTGACACTATCGAAGTCAATACTTGG GTCTCGGAGTCGGGGAAAAACGGTATGGGTCGGGATTGGCTGATAAGTGATTGCCGTACT GGAGAGATTCTTATAAGAGCAACGAGCGTGTGGGCGATGATGAATCAAAACACGAGAAG ATTGTCAAAATTTCCATATGAGGTTCGACAGGAGATAGCGCCTCATTTTGTGGACTCTGCT CCTGTC ATTGAAGACGATC AAAAGTTGC AGAAGCTTGATGTGAAGAC AGGTGATTCC ATT CGCGATGGTCTAACTCCGAGATGGAATGACTTGGATGTCAATCAACACGTTAACAATGTG AAGTACATTGGATGGATTCTCAAGAGTGTTCCAATAGAAGTTTTCGAGACACAGGAGCTA TGCGGCGTCACACTTGAATATAGGCGGGAATGCGGAAGGGACAGTGTGCTGGAGTCAGTG ACCGCTATGGATCCAGCAAAAGAGGGAGACCGGTGTGTGTACCAGCACCTTCTTCGGCTT GAGGATGGAGCTGATATCACTATAGGCAGAACCGAGTGGCGGCCGAAGAATGCAGGAGC CAATGGTGCAATGTCATCAGGAAAGACTTCAAATGGAAACTGTCTCATAGAAGGAAGGGG TTGGCAACCTTTCCGAGTTGTGCGTTTAATTTTCTGA
SEQ ID NO: 39
Cuphea wrightii (Cw) FATB5 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCTCCGTGCCCACCCCCGGCACCCCCCCCA AGCCCGGCAAGTTCGGCAACTGGCCCTCCTCCCTGTCCGTGCCCTTCAAGCCCGACAACGG CGGCTTCCACGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTCCGCCGTGAA CCTGAAGTCCGGCTCCCTGGAGACCCCCCCCCGCTCCTTCATCAACCAGCTGCCCGACCTG TCCGTGCTGCTGTCCAAGATCACCACCGTGTTCGGCGCCGCCGAGAAGCAGTGGAAGCGC CCCGGCATGCTGGTGGAGCCCTTCGGCGTGGACCGCATCTTCCAGGACGGCGTGTTCTTCC GCCAGTCCTTCTCCATCCGCTCCTACGAGATCGGCGTGGACCGCACCGCCTCCATCGAGAC CCTGATGAACATCTTCCAGGAGACCTCCCTGAACCACTGCAAGTCCATCGGCCTGCTGAAC GACGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGGTGACCAAG ATCCAGGTGGAGGTGAACCGCTACCCCACCTGGGGCGACACCATCGAGGTGAACACCTGG GTGTCCGAGTCCGGCAAGAACGGCATGGGCCGCGACTGGCTGATCTCCGACTGCCGCACC GGCGAGATCCTGATCCGCGCCACCTCCGTGTGGGCCATGATGAACCAGAACACCCGCCGC CTGTCCAAGTTCCCCTACGAGGTGCGCCAGGAGATCGCCCCCCACTTCGTGGACTCCGCCC CCGTGATCGAGGACGACCAGAAGCTGCAGAAGCTGGACGTGAAGACCGGCGACTCCATC CGCGACGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTG AAGTACATCGGCTGGATCCTGAAGTCCGTGCCCATCGAGGTGTTCGAGACCCAGGAGCTG TGCGGCGTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGTG ACCGCCATGGACCCCGCCAAGGAGGGCGACCGCTGCGTGTACCAGCACCTGCTGCGCCTG GAGGACGGCGCCGACATCACCATCGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCGC CAACGGCGCCATGTCCTCCGGCAAGACCTCCAACGGCAACTGCCTGATCGAGGGCCGCGG CTGGCAGCCCTTCCGCGTGGTGCGCCTGATCTTCTGA
SEQ ID NO: 40
Cuphea heterophylla (Cht) FATB1a amino acid sequence
MVAAAAS S AFF S VPTPGT STKPGNFGNWPS SLS VPFKPE SNHNGGFRVKANAS AHPKANGS A VNLKSGSLETQEDTSSSSPPPRTFIKQLPDWGMLLSKITTVFGAAERQWKRPGMLVEPFGVDRI FQDGVFFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKRDLIWV VTKIQVEVNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAMMNRKT RRLSKFPYEVRQEIAPHF VDSAPVIEDDKKLHKLDVKTGDSIRKGLTPRWNDLDVNQHVNNV KYIGWILKSVPAEVFETQELCGVTLEYRRECGRDSVLESVTAMDTAKEGDRSLYQHLLRLEDG ADITIGRTEWRPKNAGANGAISTGKTSNENSVS*
SEQ ID NO: 41
Cuphea heterophylla (Cht) FATB1a coding DNA sequence
ATGGTGGCTGCCGCAGCAAGTTCTGCATTCTTCTCCGTTCCAACCCCGGGAACCTCCACTA AACCCGGGAACTTCGGCAATTGGCCATCGAGCTTGAGCGTCCCCTTCAAGCCCGAATCAA ACCACAATGGTGGCTTTCGGGTCAAGGCAAACGCCAGTGCTCATCCTAAGGCTAACGGTT CTGCAGTAAATCTAAAGTCTGGCAGCCTCGAGACTCAGGAGGACACTTCATCGTCGTCCC CTCCTCCTCGGACTTTTATTAAGCAGTTGCCCGACTGGGGTATGCTTCTGTCCAAAATCAC GACTGTCTTCGGGGCGGCTGAGAGGCAGTGGAAGAGGCCCGGCATGCTTGTGGAACCGTT TGGGGTTGACAGGATTTTTCAGGATGGGGTTTTTTTCAGACAGAGTTTTTCGATCAGGTCT TACGAAATAGGCGCTGATCGAACAGCCTCAATAGAGACGCTGATGAACATCTTCCAGGAA ACATCTCTGAATCATTGTAAGAGTATCGGTCTTCTCAATGACGGCTTTGGTCGTACTCCTG AGATGTGTAAGAGGGACCTCATTTGGGTGGTTACGAAAATTCAGGTCGAGGTGAATCGCT ATCCT ACTTGGGGTGAT ACTATTGAGGTC AAT ACTTGGGTCTCAGAGTCGGGGAAAAACG GTATGGGTCGTGATTGGCTGATAAGCGATTGCCGTACCGGAGAAATTCTTATAAGAGCAA CGAGCGTGTGGGCTATGATGAATCGAAAGACGAGAAGATTGTCAAAATTTCCATATGAGG TTCGACAGGAGATAGCGCCTCATTTTGTGGACTCTGCTCCTGTCATTGAAGACGATAAAAA ATTGCACAAGCTTGATGTTAAGACGGGTGATTCCATTCGCAAGGGTCTAACTCCAAGGTG GAATGACTTGGATGTCAATCAGCACGTTAACAATGTGAAGTACATTGGGTGGATTCTCAA GAGTGTTCCAGCAGAAGTTTTCGAGACCCAGGAGCTATGCGGAGTCACCCTTGAGTACAG GCGGGAATGTGGAAGGGACAGTGTGCTGGAGTCCGTGACCGCTATGGATACCGCAAAAG AGGGAGACCGGTCTCTGTACCAGCACCTTCTTCGGCTTGAGGATGGGGCTGATATCACCAT AGGCAGAACCGAGTGGCGGCCGAAGAATGCAGGAGCCAATGGGGCAATATCAACAGGAA AGACTTCAAATGAAAACTCTGTCTCTTAG
SEQ ID NO: 42
Cuphea heterophylla (Cht) FATB1a coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCTCCGTGCCCACCCCCGGCACCTCCACCA
AGCCCGGCAACTTCGGCAACTGGCCCTCCTCCCTGTCCGTGCCCTTCAAGCCCGAGTCCAA
CCACAACGGCGGCTTCCGCGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTC CGCCGTGAACCTGAAGTCCGGCTCCCTGGAGACCCAGGAGGACACCTCCTCCTCCTCCCCC CCCCCCCGCACCTTCATCAAGCAGCTGCCCGACTGGGGCATGCTGCTGTCCAAGATCACCA CCGTGTTCGGCGCCGCCGAGCGCCAGTGGAAGCGCCCCGGCATGCTGGTGGAGCCCTTCG GCGTGGACCGCATCTTCCAGGACGGCGTGTTCTTCCGCCAGTCCTTCTCCATCCGCTCCTA CGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCTGATGAACATCTTCCAGGAGAC CTCCCTGAACCACTGCAAGTCCATCGGCCTGCTGAACGACGGCTTCGGCCGCACCCCCGA GATGTGCAAGCGCGACCTGATCTGGGTGGTGACCAAGATCCAGGTGGAGGTGAACCGCTA CCCCACCTGGGGCGACACCATCGAGGTGAACACCTGGGTGTCCGAGTCCGGCAAGAACGG CATGGGCCGCGACTGGCTGATCTCCGACTGCCGCACCGGCGAGATCCTGATCCGCGCCAC CTCCGTGTGGGCCATGATGAACCGCAAGACCCGCCGCCTGTCCAAGTTCCCCTACGAGGT GCGCCAGGAGATCGCCCCCCACTTCGTGGACTCCGCCCCCGTGATCGAGGACGACAAGAA GCTGCACAAGCTGGACGTGAAGACCGGCGACTCCATCCGCAAGGGCCTGACCCCCCGCTG GAACGACCTGGACGTGAACCAGCACGTGAACAACGTGAAGTACATCGGCTGGATCCTGAA GTCCGTGCCCGCCGAGGTGTTCGAGACCCAGGAGCTGTGCGGCGTGACCCTGGAGTACCG CCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGTGACCGCCATGGACACCGCCAAGGA GGGCGACCGCTCCCTGTACCAGCACCTGCTGCGCCTGGAGGACGGCGCCGACATCACCAT CGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCGCCAACGGCGCCATCTCCACCGGCAA GACCTCCAACGAGAACTCCGTGTCCTGA
SEQ ID NO: 43
Cuphea heterophylla (Cht) FATB1b (P16S, T20P, G94S, G105W, S293F, L305F variant) amino acid sequence
MVAAAAS S AFF S VPTSGT SPKPGNFGNWP S SLS VPFKPES SHNGGFQVKANAS AHPKANGS AV NLKSGSLETQEDT S S S SPPPRTFIKQLPD WSMLLSKITTVF WAAERQ WKRPGMLVEPFGVDRIF QDGVFFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKRDLIWW TKIQVEVNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAMMNRKTRR LSKFPYEVRQEIAPHFVDSAPVIEDDKKLHKLDVKTGDFIRKGLTPRWNDFDVNQHVNNVKYI GWILKSVPAEVFETQELCGVTLEYRRECGRDSVLESVTAMDTAKEGDRSLYQHLLRLEDGADI TIGRTEWRPKNAGANGAISTGKTSNENSVS*
SEQ ID NO: 44
Cuphea heterophylla (Cht) FATB1b (P16S, T20P, G94S, G105W, S293F, L305F variant) coding DNA sequence
ATGGTGGCTGCCGCAGCAAGTTCTGCATTCTTCTCCGTTCCAACCTCGGGAACCTCCCCTA AACCCGGGAACTTCGGCAATTGGCCATCGAGCTTGAGCGTCCCCTTCAAGCCCGAATCAA GCCACAATGGTGGCTTTCAGGTCAAGGCAAACGCCAGTGCCCATCCTAAGGCTAACGGTT CTGCAGTAAATCTAAAGTCTGGCAGCCTCGAGACTCAGGAGGACACTTCATCGTCGTCCC CTCCTCCTCGGACTTTTATTAAGC AGTTGCCCGACTGGAGT ATGCTTCTGTCCAAAATC AC GACTGTCTTCTGGGCGGCTGAGAGGCAGTGGAAGAGGCCCGGCATGCTTGTGGAACCGTT TGGGGTTGACAGGATTTTTCAGGATGGGGTTTTTTTCAGACAGAGTTTTTCGATCAGGTCT TACGAAATAGGCGCTGATCGAACAGCCTCAATAGAGACGCTGATGAACATCTTCCAGGAA ACATCTCTGAATCATTGTAAGAGTATCGGTCTTCTCAATGACGGCTTTGGTCGTACTCCTG AGATGTGTAAGAGGGACCTCATTTGGGTGGTTACGAAAATTCAGGTCGAGGTGAATCGCT ATCCTACTTGGGGTGATACTATTGAGGTCAATACTTGGGTCTCAGAGTCGGGGAAAAACG GTATGGGTCGTGATTGGCTGATAAGCGATTGCCGTACCGGAGAAATTCTTATAAGAGCAA CGAGCGTGTGGGCTATGATGAATCGAAAGACGAGAAGATTGTCAAAATTTCCATATGAGG TTCGACAGGAGATAGCGCCTCATTTTGTGGACTCTGCTCCTGTCATTGAAGACGATAAAAA ATTGCACAAGCTTGATGTTAAGACGGGTGATTTCATTCGCAAGGGTCTAACTCCAAGGTG GAATGACTTTGATGTCAATCAGCACGTTAACAATGTGAAGTACATTGGGTGGATTCTCAA GAGTGTTCCAGCAGAAGTTTTCGAGACCCAGGAGCTATGCGGAGTCACCCTTGAGTATAG GCGGGAATGTGGAAGGGACAGTGTGCTGGAGTCCGTGACCGCTATGGATACCGCAAAAG AGGGAGACCGGTCTCTGTACCAGCACCTTCTTCGGCTTGAGGATGGGGCTGATATCACCAT AGGCAGAACCGAGTGGCGGCCGAAGAATGCAGGAGCCAATGGGGCAATATCAACAGGAA AGACTTCAAATGAAAACTCTGTCTCTTAG SEQ ID NO: 45
Cuphea heterophylla (Cht) FATB1b (P16S, T20P, G94S, G105W, S293F, L305F variant) coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCTCCGTGCCCACCTCCGGCACCTCCCCCA AGCCCGGCAACTTCGGCAACTGGCCCTCCTCCCTGTCCGTGCCCTTCAAGCCCGAGTCCTC CCACAACGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTC CGCCGTGAACCTGAAGTCCGGCTCCCTGGAGACCCAGGAGGACACCTCCTCCTCCTCCCCC CCCCCCCGCACCTTCATCAAGCAGCTGCCCGACTGGTCCATGCTGCTGTCCAAGATCACCA CCGTGTTCTGGGCCGCCGAGCGCCAGTGGAAGCGCCCCGGCATGCTGGTGGAGCCCTTCG GCGTGGACCGCATCTTCCAGGACGGCGTGTTCTTCCGCCAGTCCTTCTCCATCCGCTCCTA CGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCTGATGAACATCTTCCAGGAGAC CTCCCTGAACCACTGCAAGTCCATCGGCCTGCTGAACGACGGCTTCGGCCGCACCCCCGA GATGTGCAAGCGCGACCTGATCTGGGTGGTGACCAAGATCCAGGTGGAGGTGAACCGCTA CCCCACCTGGGGCGACACCATCGAGGTGAACACCTGGGTGTCCGAGTCCGGCAAGAACGG CATGGGCCGCGACTGGCTGATCTCCGACTGCCGCACCGGCGAGATCCTGATCCGCGCCAC CTCCGTGTGGGCC ATGATGAACCGCAAGACCCGCCGCCTGTCCAAGTTCCCCT ACGAGGT GCGCCAGGAGATCGCCCCCCACTTCGTGGACTCCGCCCCCGTGATCGAGGACGACAAGAA GCTGCACAAGCTGGACGTGAAGACCGGCGACTTCATCCGCAAGGGCCTGACCCCCCGCTG GAACGACTTCGACGTGAACCAGCACGTGAACAACGTGAAGTACATCGGCTGGATCCTGAA GTCCGTGCCCGCCGAGGTGTTCGAGACCCAGGAGCTGTGCGGCGTGACCCTGGAGTACCG CCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGTGACCGCC ATGGAC ACCGCCAAGGA GGGCGACCGCTCCCTGTACCAGCACCTGCTGCGCCTGGAGGACGGCGCCGACATCACCAT CGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCGCCAACGGCGCCATCTCCACCGGCAA GACCTCCAACGAGAACTCCGTGTCCTGA
SEQ ID NO: 46
Cuphea heterophylla (Cht) FATB2b amino acid sequence
MWAAAASSAFFPVPASGTSPKPGKFGTWLSSSSPSYKPKSNPSGGFQVKANASAHPKANGSA VSLKSGSLNTQEGTSSSPPPRTFLNQLPDWSRLRTAITTVFVAAEKQLTMLDRKSKKPDMHVD WFGLEIIVQDGLVFRESFSIRSYEIGADRTASIETLMNHLQDTSLNHCKSVGLLNDGFGRTPEM CKRDLIWVLTKMQIMVNRYPTWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIRATSIWA MMNQKTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSICKGLTPEWNDLDV NQHVSNVKYIGWILESMPKEVLDTQELCSLTLEYRRECGRDSVLESVTAMDPSKVGDRSQYQ HLLRLEDGTDIMKGRTEWRPKNAGTNGAISTGKTSNGNSVS*
SEQ ID NO: 47
Cuphea heterophylla (Cht) FATB2b coding DNA sequence
ATGGTGGTGGCTGCTGC AGC AAGCTCTGC ATTCTTCCCTGTTCCGGC ATCTGGAACCTCCC CTAAACCCGGGAAGTTCGGGACTTGGCTATCGAGCTCGAGCCCTTCCTACAAGCCCAAGT CAAACCCCAGTGGTGGATTTCAGGTTAAGGCAAATGCCAGTGCTCATCCTAAGGCTAACG GTTCCGCAGTAAGTCTAAAGTCTGGCAGCCTCAACACTCAGGAGGGCACTTCGTCGTCCCC TCCTCCTCGGACTTTCCTTAACCAGTTGCCTGATTGGAGTAGGCTTCGGACTGCAATCACG ACCGTCTTCGTGGCGGCAGAGAAGCAGTTGACTATGCTCGATCGAAAGTCTAAGAAGCCT GACATGCACGTGGACTGGTTTGGGTTGGAGATTATTGTTCAGGATGGGCTCGTGTTCAGAG AGAGTTTTTCGATCAGGTCTTACGAAATAGGCGCTGATCGAACAGCCTCTATAGAAACGTT GATGAACCATTTGCAGGACACATCTTTGAACCATTGTAAGAGTGTGGGTCTTCTCAATGAC GGCTTTGGTCGTACCCCGGAGATGTGTAAAAGGGACCTCATTTGGGTGCTTACAAAAATG CAGATCATGGTGAATCGCTATCCAACTTGGGGCGATACTGTCGAGATCAATAGCTGGTTCT CCCAGTCCGGGAAAATCGGTATGGGTCGCAATTGGCTAATAAGTGATTGCAACACAGGAG AAATTCTTATAAGAGCAACGAGCATTTGGGCCATGATGAATCAAAAGACGAGAAGATTCT CAAAACTTCCAAACGAGGTTCGCCAGGAGATAGCGCCTCATTTTGTGGACGCCCCTCCTGT CATTGAAGACAATGATCGAAAATTGCATAAGTTTGATGTGAAGACTGGTGATTCCATTTGC AAGGGTCTAACACCGGAGTGGAATGACTTGGATGTCAATCAGCACGTAAGCAACGTGAAG TACATTGGGTGGATTCTCGAGAGTATGCCAAAAGAAGTTTTGGACACCCAGGAGCTATGC TCTCTCACCCTTGAATATAGGCGGGAATGCGGAAGGGATAGTGTGCTGGAGTCTGTGACC GCTATGGATCCCTCAAAAGTTGGAGACCGATCTCAGTACCAGCACCTTCTGCGGCTTGAA GATGGGACTGATATCATGAAGGGCAGAACTGAGTGGCGACCAAAGAATGCAGGAACCAA CGGGGCTATATCAACAGGAAAGACTTCAAATGGAAACTCGGTCTCTTAG
SEQ ID NO: 48
Cuphea heterophylla (Cht) FATB2b coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCGCCTCCGGCACCTCCC CCAAGCCCGGCAAGTTCGGCACCTGGCTGTCCTCCTCCTCCCCCTCCTACAAGCCCAAGTC CAACCCCTCCGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGG CTCCGCCGTGTCCCTGAAGTCCGGCTCCCTGAACACCCAGGAGGGCACCTCCTCCTCCCCC CCCCCCCGCACCTTCCTGAACCAGCTGCCCGACTGGTCCCGCCTGCGCACCGCCATCACCA CCGTGTTCGTGGCCGCCGAGAAGCAGCTGACCATGCTGGACCGCAAGTCCAAGAAGCCCG ACATGCACGTGGACTGGTTCGGCCTGGAGATCATCGTGCAGGACGGCCTGGTGTTCCGCG AGTCCTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCT GATGAACCACCTGCAGGACACCTCCCTGAACC ACTGC AAGTCCGTGGGCCTGCTGAACGA CGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGCTGACCAAGAT GCAGATCATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGATCAACTCCTGGTT CTCCCAGTCCGGCAAGATCGGCATGGGCCGCAACTGGCTGATCTCCGACTGCAACACCGG CGAGATCCTGATCCGCGCCACCTCCATCTGGGCCATGATGAACCAGAAGACCCGCCGCTT CTCC AAGCTGCCC AACGAGGTGCGCC AGGAGATCGCCCCCC ACTTCGTGGACGCCCCCCC CGTGATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCAT CTGCAAGGGCCTGACCCCCGAGTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGT GAAGTACATCGGCTGGATCCTGGAGTCCATGCCCAAGGAGGTGCTGGACACCCAGGAGCT GTGCTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGT GACCGCCATGGACCCCTCCAAGGTGGGCGACCGCTCCCAGTACCAGCACCTGCTGCGCCT GGAGGACGGCACCGACATCATGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCA CCAACGGCGCCATCTCCACCGGCAAGACCTCCAACGGCAACTCCGTGTCCTGA
SEQ ID NO: 49
Cuphea heterophylla (Cht) FATB2a (S17P, P21S, T28N, L30P, S33L, G76D, S78P, G137W variant) amino acid sequence
MWAAAASSAFFPVPAPGTTSKPGKFGNWPSSLSPSFKPKSNPNGGFQVKANASAHPKANGS AVSLKSGSLNTKEDTPSSPPPRTFLNQLPDWSRLRTAITTVFVAAEKQLTMLDRKSKKPDMHV DWFGLEIIVQDWLVFRESFSIRSYEIGADRTASIETLMNHLQDTSLNHCKSVGLLNDGFGRTPE MCKRDLIWVLTKMQIMVNRYPTWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIRATSIW AMMNQKTRRF SKLPNEVRQEIAPHF VDAPPLIEDNDRKLHKFDVKTGD SICKGLTPE WNDLD VNQHVSNVKYIGWILESMPKEVLDTQELCSLTLEYRRECGRDSVLESVTAMDPSKVGDRSQY QHLLRLEDGTDIMKGRTEWRPKNAGTNGAISTGKTSNGNSVS*
SEQ ID NO: 50
Cuphea heterophylla (Cht) FATB2a (S17P, P21S, T28N, L30P, S33L, G76D, S78P, G137W variant) coding DNA sequence
ATGGTGGTGGCTGCTGCAGCAAGTTCTGCATTCTTCCCTGTTCCAGCACCTGGAACCACGT CTAAACCCGGGAAGTTCGGCAATTGGCCATCGAGCTTGAGCCCTTCCTTCAAGCCCAAGTC AAACCCCAATGGTGGATTTCAGGTTAAGGCAAATGCCAGCGCTCATCCTAAGGCTAACGG GTCTGCAGTAAGTCTAAAGTCTGGCAGCCTCAACACTAAGGAGGACACTCCGTCGTCCCC TCCTCCTCGGACTTTCCTTAACCAGTTGCCTGATTGGAGTAGGCTTCGGACTGCAATCACG ACCGTCTTCGTGGCGGCAGAGAAGCAGTTGACTATGCTCGATCGAAAGTCTAAGAAGCCT GACATGCACGTGGACTGGTTTGGGTTGGAGATTATTGTTCAGGATTGGCTCGTGTTCAGAG AGAGTTTTTCGATCAGGTCTTACGAAATAGGCGCTGATCGAACAGCCTCTATAGAAACGTT GATGAACCATTTGCAGGACACATCTTTGAACCATTGTAAGAGTGTGGGTCTTCTCAATGAC GGCTTTGGTCGTACCCCGGAGATGTGTAAAAGGGACCTCATTTGGGTGCTTACAAAAATG CAGATCATGGTGAATCGCTATCCAACTTGGGGCGATACTGTCGAGATCAATAGCTGGTTCT CCCAGTCCGGGAAAATCGGTATGGGTCGCAATTGGCTAATAAGTGATTGCAACACAGGAG AAATTCTTATAAGAGCAACGAGCATTTGGGCCATGATGAATCAAAAGACGAGAAGATTCT CAAAACTTCCAAACGAGGTTCGCCAGGAGATAGCTCCTCATTTTGTGGACGCCCCTCCTCT CATTGAAGACAATGATCGAAAATTGCATAAGTTTGATGTGAAGACTGGTGATTCCATTTGC AAGGGTCTAACACCGGAGTGGAATGACTTGGATGTCAATCAGCACGTAAGCAACGTGAAG TACATTGGGTGGATTCTCGAGAGTATGCCAAAAGAAGTTTTGGACACCCAGGAGCTATGC TCTCTCACCCTTGAATATAGGCGGGAATGCGGAAGGGACAGTGTGCTGGAGTCTGTGACC GCTATGGATCCCTCAAAAGTTGGAGACCGATCTCAGTACCAGCACCTTCTGCGGCTTGAA GATGGGACTGATATCATGAAGGGCAGAACTGAGTGGCGACCAAAGAATGCAGGAACCAA CGGGGCGATATCAACAGGAAAGACTTCAAATGGAAACTCGGTCTCTTAG SEQ ID NO: 51
Cuphea heterophylla (Cht) FATB2a (S17P, P21S, T28N, L30P, S33L, G76D, S78P, G137W variant) coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCGCCCCCGGCACCACCT CCAAGCCCGGCAAGTTCGGCAACTGGCCCTCCTCCCTGTCCCCCTCCTTCAAGCCCAAGTC C AACCCC AACGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCC AAGGCC AACGG CTCCGCCGTGTCCCTGAAGTCCGGCTCCCTGAACACCAAGGAGGACACCCCCTCCTCCCCC CCCCCCCGCACCTTCCTGAACCAGCTGCCCGACTGGTCCCGCCTGCGCACCGCCATCACCA CCGTGTTCGTGGCCGCCGAGAAGCAGCTGACCATGCTGGACCGCAAGTCCAAGAAGCCCG ACATGCACGTGGACTGGTTCGGCCTGGAGATCATCGTGCAGGACTGGCTGGTGTTCCGCG AGTCCTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCT GATGAACCACCTGCAGGACACCTCCCTGAACCACTGCAAGTCCGTGGGCCTGCTGAACGA CGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGCTGACCAAGAT GCAGATCATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGATCAACTCCTGGTT CTCCCAGTCCGGCAAGATCGGCATGGGCCGCAACTGGCTGATCTCCGACTGCAACACCGG CGAGATCCTGATCCGCGCCACCTCCATCTGGGCCATGATGAACCAGAAGACCCGCCGCTT CTCCAAGCTGCCCAACGAGGTGCGCCAGGAGATCGCCCCCCACTTCGTGGACGCCCCCCC CCTGATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCAT CTGCAAGGGCCTGACCCCCGAGTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGT GAAGTACATCGGCTGGATCCTGGAGTCCATGCCCAAGGAGGTGCTGGACACCCAGGAGCT GTGCTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGT GACCGCCATGGACCCCTCCAAGGTGGGCGACCGCTCCCAGTACCAGCACCTGCTGCGCCT GGAGGACGGCACCGACATCATGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCA CCAACGGCGCCATCTCCACCGGCAAGACCTCCAACGGCAACTCCGTGTCCTGA
SEQ ID NO: 52
Cuphea heterophylla (Cht) FATB2c (G76D, S78P variant) amino acid sequence
MWAAAASSAFFPVPASGTSPKPGKFGTWLSSSSPSYKPKSNPSGGFQVKANASAHPKANGSA VSLKSGSLNTKEDTPSSPPPRTFLNQLPDWNRLRTAITTVFVAAEKQLTMLDRKSKKPDMHVD WFGLEIIVQDGLVFRESFSIRSYEIGADRTASIETLMNHLQDTSLNHCKSVGLLNDGFGRTPEM CKRDLIWVLTKMQIMVNRYPTWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIRATSIWA MMNQKTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSICKGLTPEWNDLDV NQHVS VKYIGWILESMPKEVLDTQELCSLTLEYRRECGRDSVLESVTAMDPSKVGDRSQYQ HLLRLEDGTDIMKGRTEWRPKNAGTNGAISTGKTSNGNSVS*
SEQ ID NO: 53
Cuphea heterophylla (Cht) FATB2c (G76D, S78P variant) coding DNA sequence
ATGGTGGTGGCTGCTGCAGCAAGCTCTGCATTCTTCCCTGTTCCGGCATCTGGAACCTCCC CTAAACCCGGGAAGTTCGGGACTTGGCTATCGAGCTCGAGCCCTTCCTACAAGCCCAAGT CAAACCCCAGTGGTGGATTTCAGGTTAAGGCAAATGCCAGTGCTCATCCTAAGGCTAACG GTTCCGCAGTAAGTCTAAAGTCTGGCAGCCTCAACACTAAGGAGGACACTCCGTCGTCCC CTCCTCCTCGGACTTTCCTTAACCAGTTGCCTGATTGGAATAGGCTTCGGACTGCAATCAC GACCGTCTTCGTGGCGGCAGAGAAGCAGTTGACTATGCTCGATCGAAAGTCTAAGAAGCC TGACATGCACGTGGACTGGTTTGGGTTGGAGATTATTGTTCAGGATGGGCTCGTGTTCAGA GAGAGTTTTTCGATCAGGTCTTACGAAATAGGCGCTGATCGAACAGCCTCTATAGAAACG TTGATGAACCATTTGCAGGACACATCTTTGAACCATTGTAAGAGTGTGGGTCTTCTCAATG ACGGCTTTGGTCGTACCCCGGAGATGTGTAAAAGGGACCTCATTTGGGTGCTTACAAAAA TGCAGATCATGGTGAATCGCTATCCAACTTGGGGCGATACTGTCGAGATCAATAGCTGGTT CTCCCAGTCCGGGAAAATCGGTATGGGTCGCAATTGGCTAATAAGTGATTGCAACACAGG AGAAATTCTTATAAGAGCAACGAGCATTTGGGCCATGATGAATCAAAAGACGAGAAGATT CTCAAAACTTCCAAACGAGGTTCGCCAGGAGATAGCGCCTCATTTTGTGGACGCCCCTCCT GTCATTGAAGACAATGATCGAAAATTGCATAAGTTTGATGTGAAGACTGGTGATTCCATTT GCAAGGGTCTAACACCGGAGTGGAATGACTTGGATGTCAATCAGCACGTAAGCAACGTGA AGTACATTGGGTGGATTCTCGAGAGTATGCCAAAAGAAGTTTTGGACACCCAGGAGCTAT GCTCTCTCACCCTTGAATATAGGCGGGAATGCGGAAGGGACAGTGTGCTGGAGTCTGTGA CCGCTATGGATCCCTCAAAAGTTGGGGACCGATCTCAGTACCAGCACCTTCTGCGGCTTGA AGATGGGACTGATATCATGAAGGGCAGAACTGAGTGGCGACCAAAGAATGCAGGAACCA ACGGGGCTATATCAACAGGAAAGACTTCAAATGGAAACTCGGTCTCTTAG
SEQ ID NO: 54
Cuphea heterophylla (Cht) FATB2c (G76D, S78P variant) coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCGCCTCCGGCACCTCCC CCAAGCCCGGCAAGTTCGGCACCTGGCTGTCCTCCTCCTCCCCCTCCTACAAGCCCAAGTC CAACCCCTCCGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGG CTCCGCCGTGTCCCTGAAGTCCGGCTCCCTGAACACCAAGGAGGACACCCCCTCCTCCCCC CCCCCCCGC ACCTTCCTGAACCAGCTGCCCGACTGGAACCGCCTGCGCACCGCC ATC ACC A CCGTGTTCGTGGCCGCCGAGAAGCAGCTGACCATGCTGGACCGCAAGTCCAAGAAGCCCG ACATGCACGTGGACTGGTTCGGCCTGGAGATCATCGTGCAGGACGGCCTGGTGTTCCGCG AGTCCTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCT GATGAACCACCTGCAGGACACCTCCCTGAACCACTGCAAGTCCGTGGGCCTGCTGAACGA CGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGCTGACCAAGAT GCAGATCATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGATCAACTCCTGGTT CTCCCAGTCCGGCAAGATCGGCATGGGCCGCAACTGGCTGATCTCCGACTGCAACACCGG CGAGATCCTGATCCGCGCCACCTCCATCTGGGCCATGATGAACCAGAAGACCCGCCGCTT CTCCAAGCTGCCCAACGAGGTGCGCCAGGAGATCGCCCCCCACTTCGTGGACGCCCCCCC CGTGATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCAT CTGCAAGGGCCTGACCCCCGAGTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGT GAAGTACATCGGCTGGATCCTGGAGTCCATGCCCAAGGAGGTGCTGGACACCCAGGAGCT GTGCTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGT GACCGCCATGGACCCCTCCAAGGTGGGCGACCGCTCCCAGTACCAGCACCTGCTGCGCCT GGAGGACGGC ACCGACATCATGAAGGGCCGC ACCGAGTGGCGCCCC AAGAACGCCGGCA CCAACGGCGCCATCTCCACCGGCAAGACCTCCAACGGCAACTCCGTGTCCTGA SEQ ID NO: 55
Cuphea heterophylla (Cht) FATB2d (S21P, T28N, L30P, S33L, G76D, R97L, H124L, W127L, I132S, K258N, C303R, E309G, K334T, T386A variant) amino acid sequence
MWAAAASSAFFPVPAPGTTSKPGKFGNWPSSLSPSFKPKSNPNGGFQVKANASAHPKANGS AVSLKSGSLNTQEDTSSSPPPRTFLNQLPDWSRLLTAISTVFVAAEKQLTMLDRKSKRPDMLV DLFGLESIVQDGLVFRESYSIRSYEIGADRTASIETLMNHLQDTSLNHCKSVGLLNDGFGRTPE MCKRDLIWVLTKMQIMVNRYPTWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIRATSIW AMMNQNTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSIRKGLTPGWNDLD VNQHVSNVKYIGWILESMPTEVLETQELCSLTLEYRRECGRESVLESVTAMNPSKVGDRSQYQ HLLRLEDGADIMKGRTEWRPKNAGTNGAISTGKTSNGNSVS*
SEQ ID NO: 56
Cuphea heterophylla (Cht) FATB2d (S21P, T28N, L30P, S33L, G76D, R97L, H124L, W127L, I132S, K258N, C303R, E309G, K334T, T386A variant) coding DNA sequence
ATGGTGGTGGCTGCTGCAGCAAGTTCTGCATTCTTCCCTGTTCCAGCACCTGGAACCACGT CTAAACCCGGGAAGTTCGGCAATTGGCCATCGAGCTTGAGCCCTTCCTTCAAGCCCAAGTC AAACCCCAATGGTGGATTTCAGGTTAAGGCAAATGCCAGTGCTCATCCTAAGGCTAACGG TTCTGCGGTAAGTCTAAAGTCTGGCAGCCTCAACACTCAGGAGGACACTTCGTCGTCCCCT CCTCCTCGGACATTCCTTAACCAGTTGCCTGATTGGAGTAGGCTTCTGACTGCAATCTCGA CCGTCTTCGTGGCGGCAGAGAAGCAGTTGACTATGCTCGATCGAAAATCTAAGAGGCCTG ACATGCTCGTGGACTTGTTTGGGTTGGAGAGTATTGTTCAGGATGGGCTCGTGTTCAGAGA GAGTTATTCGATCAGGTCTTACGAAATAGGCGCTGATCGAACAGCCTCTATAGAAACGTT GATGAACCATTTGCAGGACACATCTTTGAACCATTGTAAGAGTGTGGGTCTTCTCAATGAC GGCTTTGGTCGTACCCCGGAGATGTGTAAAAGGGACCTCATTTGGGTGCTTACAAAAATG CAGATCATGGTGAATCGCTATCCAACTTGGGGCGATACTGTCGAGATCAATAGCTGGTTCT CCCAGTCCGGGAAAATCGGTATGGGTCGCAATTGGCTAATAAGTGATTGCAACACAGGAG AAATTCTTATAAGAGCAACGAGCATTTGGGCCATGATGAATCAAAATACGAGAAGATTCT CAAAACTTCCAAACGAGGTTCGCCAGGAGATAGCGCCTCATTTTGTTGACGCTCCTCCTGT CATTGAAGACAATGATCGAAAATTGCATAAGTTTGATGTGAAGACTGGTGATTCCATTCG CAAGGGTCTAACTCCGGGGTGGAATGACTTGGATGTCAATCAGCACGTAAGCAACGTGAA GT ACATTGGGTGGATTCTCGAGAGTATGCC AACAGAAGTTTTGGAGACCC AGGAGCTATG CTCTCTCACCCTTGAATATAGGCGGGAATGCGGAAGGGAAAGTGTGCTGGAGTCCGTGAC CGCTATGAATCCCTCAAAAGTTGGAGACCGGTCTCAGTACCAGCACCTTCTACGGCTTGAG GATGGGGCTGATATCATGAAGGGCAGAACTGAGTGGCGACCAAAGAATGCAGGAACCAA CGGGGCGATATCAACAGGAAAGACTTCAAATGGAAACTCGGTCTCTTAG SEQ ID NO: 57
Cuphea heterophylla (Cht) FATB2d (S21P, T28N, L30P, S33L, G76D, R97L, H124L, W127L, I132S, K258N, C303R, E309G, K334T, T386A variant) coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCGCCCCCGGCACCACCT CCAAGCCCGGCAAGTTCGGCAACTGGCCCTCCTCCCTGTCCCCCTCCTTCAAGCCCAAGTC CAACCCCAACGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGG CTCCGCCGTGTCCCTGAAGTCCGGCTCCCTGAACACCCAGGAGGACACCTCCTCCTCCCCC CCCCCCCGCACCTTCCTGAACCAGCTGCCCGACTGGTCCCGCCTGCTGACCGCCATCTCCA CCGTGTTCGTGGCCGCCGAGAAGCAGCTGACCATGCTGGACCGCAAGTCCAAGCGCCCCG ACATGCTGGTGGACCTGTTCGGCCTGGAGTCCATCGTGCAGGACGGCCTGGTGTTCCGCG AGTCCTACTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCT GATGAACCACCTGCAGGACACCTCCCTGAACCACTGCAAGTCCGTGGGCCTGCTGAACGA CGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGCTGACCAAGAT GCAGATCATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGATCAACTCCTGGTT CTCCCAGTCCGGC AAGATCGGC ATGGGCCGCAACTGGCTGATCTCCGACTGCAACACCGG CGAGATCCTGATCCGCGCCACCTCCATCTGGGCCATGATGAACCAGAACACCCGCCGCTT CTCCAAGCTGCCCAACGAGGTGCGCCAGGAGATCGCCCCCCACTTCGTGGACGCCCCCCC CGTGATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCAT CCGCAAGGGCCTGACCCCCGGCTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGT GAAGTACATCGGCTGGATCCTGGAGTCCATGCCCACCGAGGTGCTGGAGACCCAGGAGCT GTGCTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCGT GACCGCCATGAACCCCTCCAAGGTGGGCGACCGCTCCCAGTACCAGCACCTGCTGCGCCT GGAGGACGGCGCCGACATCATGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCA CCAACGGCGCCATCTCCACCGGCAAGACCTCCAACGGCAACTCCGTGTCCTGA
SEQ ID NO: 58
Cuphea heterophylla (Cht) FATB2e (G76D, R97L, H124L, I132S, G152S, H165L, T211N, K258N, C303R, E309G, K334T, T386A variant) amino acid sequence
MWAAAASSAFFPVPASGTSPKPGKFGTWLSSSSPSYKPKSNPSGGFQVKANASAHPKANGSA VSLKSGSLNTQEDTSSSPPPQTFLNQLPDWSRLLTAISTVFVAAEKQLTMLDRKSKRPDMLVD WFGLESIVQDGLVFRESYSIRSYEISADRTASIETVMNLLQETSLNHCKSMGILNDGFGRTPEM CKRDLIWVLTKMQILVNRYPNWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIRATSIWA MMNQNTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSIRKGLTPGWNDLDV NQHVSNVKYIGWILESMPTEVLETQELCSLTLEYRRECGRDSVLESVTAMNPSKVGDRSQYQ HLLRLEDGADIMKGRTEWRPKNAGTNGAISTGKTSNGNSVS*
SEQ ID NO: 59
Cuphea heterophylla (Cht) FATB2e (G76D, R97L, H124L, I132S, G152S, H165L, T211N, K258N, C303R, E309G, K334T, T386A variant) coding DNA sequence
ATGGTGGTGGCTGCTGCAGCAAGCTCTGCATTCTTCCCTGTTCCGGCATCTGGAACCTCCC CTAAACCCGGGAAGTTCGGGACTTGGCTATCGAGCTCGAGCCCTTCCTACAAGCCCAAGT CAAACCCCAGTGGTGGATTTCAGGTTAAGGCAAATGCCAGTGCTCATCCTAAGGCTAACG GTTCTGCAGTAAGTCTAAAGTCTGGCAGCCTCAACACTCAGGAGGACACTTCGTCGTCCCC TCCTCCTCAGACATTCCTTAACCAGTTGCCTGATTGGAGTAGGCTTCTGACAGCAATCTCG ACCGTCTTCGTGGCGGCAGAGAAGCAGTTGACTATGCTCGATCGAAAATCTAAAAGGCCT GACATGCTCGTGGACTGGTTTGGGTTGGAGAGTATTGTTCAGGATGGGCTCGTGTTCAGAG AGAGTTATTCGATCAGGTCTTACGAAATAAGCGCTGATCGAACAGCCTCTATAGAGACGG TGATGAACCTCTTGCAGGAAACATCTCTCAATCATTGTAAGAGTATGGGTATTCTCAATGA CGGCTTTGGTCGTACCCCGGAGATGTGCAAAAGGGACCTCATTTGGGTGCTTACAAAAAT GC AGATCTTGGTGAATCGCT ATCCAAATTGGGGTGATACTGTCGAGATC AATAGCTGGTTC TCCCAGTCCGGGAAAATCGGTATGGGTCGCAATTGGCTAATAAGTGATTGCAACACAGGA GAAATTCTTATAAGAGCAACGAGCATTTGGGCCATGATGAATCAAAATACGAGAAGATTC TCAAAACTTCCAAACGAGGTTCGCCAGGAGATAGCGCCTCATTTTGTTGACGCTCCTCCTG TCATTGAAGACAATGATCGAAAATTGCATAAGTTTGATGTGAAGACTGGTGATTCCATTCG C AAGGGTCTAACTCCGGGGTGGAATGACTTGGATGTC AATC AGC ACGTAAGC AACGTGAA GTACATTGGGTGGATTCTCGAGAGTATGCCAACAGAAGTTTTGGAGACCCAGGAGCTATG CTCTCTCACCCTTGAATATAGGCGGGAATGCGGAAGGGACAGTGTGCTGGAGTCCGTGAC CGCTATGAATCCCTCAAAAGTTGGAGACCGGTCTCAGTACCAGCACCTTCTACGGCTTGAG GATGGGGCTGATATCATGAAGGGCAGAACTGAGTGGCGACCAAAGAATGCAGGAACCAA CGGGGCGATATCAACAGGAAAGACTTCAAATGGAAACTCGGTCTCTTAG
SEQ ID NO: 60
Cuphea heterophylla (Cht) FATB2e (G76D, R97L, H124L, I132S, G152S, H165L, T211N, K258N, C303R, E309G, K334T, T386A variant) coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCGCCTCCGGCACCTCCC CCAAGCCCGGCAAGTTCGGCACCTGGCTGTCCTCCTCCTCCCCCTCCTACAAGCCCAAGTC CAACCCCTCCGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGG CTCCGCCGTGTCCCTGAAGTCCGGCTCCCTGAACACCCAGGAGGACACCTCCTCCTCCCCC CCCCCCCAGACCTTCCTGAACCAGCTGCCCGACTGGTCCCGCCTGCTGACCGCCATCTCCA CCGTGTTCGTGGCCGCCGAGAAGCAGCTGACC ATGCTGGACCGC AAGTCC AAGCGCCCCG ACATGCTGGTGGACTGGTTCGGCCTGGAGTCCATCGTGCAGGACGGCCTGGTGTTCCGCG AGTCCTACTCCATCCGCTCCTACGAGATCTCCGCCGACCGCACCGCCTCCATCGAGACCGT GATGAACCTGCTGCAGGAGACCTCCCTGAACCACTGCAAGTCCATGGGCATCCTGAACGA CGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGCTGACCAAGAT GCAGATCCTGGTGAACCGCTACCCCAACTGGGGCGACACCGTGGAGATCAACTCCTGGTT CTCCCAGTCCGGCAAGATCGGCATGGGCCGCAACTGGCTGATCTCCGACTGCAACACCGG CGAGATCCTGATCCGCGCCACCTCCATCTGGGCCATGATGAACCAGAACACCCGCCGCTT CTCCAAGCTGCCCAACGAGGTGCGCCAGGAGATCGCCCCCCACTTCGTGGACGCCCCCCC CGTGATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCAT CCGCAAGGGCCTGACCCCCGGCTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGT GAAGTACATCGGCTGGATCCTGGAGTCCATGCCCACCGAGGTGCTGGAGACCCAGGAGCT GTGCTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGT GACCGCCATGAACCCCTCCAAGGTGGGCGACCGCTCCCAGTACCAGCACCTGCTGCGCCT GGAGGACGGCGCCGACATCATGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCA CCAACGGCGCCATCTCCACCGGCAAGACCTCCAACGGCAACTCCGTGTCCTGA
SEQ ID NO: 61
Cuphea heterophylla (Cht) FATB2f (R97L, H124L, I132S, G152S, H165L, T211N variant) amino acid sequence
MWAAAASSAFFPVPASGTSPKPGKFGTWLSSSSPSYKPKSNPSGGFQVKANASAHPKANGSA VSLKSGSLNTQEGTSSSPPPRTFLNQLPDWSRLLTAISTVFVAAEKQLTMLDRKSKRPDMLVD WFGLESIVQDGLVFRESYSIRSYEISADRTASIETVMNLLQETSLNHCKSMGILNDGFGRTPEM CKRDLIWVLTKMQILVNRYPNWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIRATSIWA MMNQKTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSICKGLTPEWNDLDV NQHVSNVKYIGWILESMPKEVLDTQELCSLTLEYRRECGRDSVLESVTAMDPSKVGDRSQYQ HLLRLEDGTDIMKGRTEWRPKNAGTNGAISTGKTSNGNSVS*
SEQ ID NO: 62
Cuphea heterophylla (Cht) FATB2f (R97L, H124L, I132S, G152S, H165L, T211N variant) coding DNA sequence
ATGGTGGTGGCTGCTGCAGCAAGCTCTGCATTCTTCCCTGTTCCGGCATCTGGAACCTCCC CTAAACCCGGGAAGTTCGGGACTTGGCTATCGAGCTCGAGCCCTTCCTACAAGCCCAAGT CAAACCCCAGTGGTGGATTTCAGGTTAAAGCAAATGCCAGTGCTCATCCTAAGGCTAACG GTTCCGCAGTAAGTCTAAAGTCTGGCAGCCTCAAC ACTC AGGAGGGC ACTTCGTCGTCCCC TCCTCCTCGGACATTCCTTAACCAGTTGCCTGATTGGAGTAGGCTTCTGACTGCAATCTCG ACCGTCTTCGTGGCGGCAGAGAAGCAGTTGACTATGCTCGATCGAAAATCTAAGAGGCCT GACATGCTCGTGGACTGGTTTGGGTTGGAGAGTATTGTTCAGGATGGGCTCGTGTTCAGAG AGAGTTATTCGATCAGGTCTTACGAAATAAGCGCTGATCGAACAGCCTCTATAGAGACGG TGATGAACCTCTTGC AGGAAACATCTCTC AATC ATTGTAAGAGTATGGGT ATTCTC AATGA CGGCTTTGGTCGTACCCCGGAGATGTGCAAAAGGGACCTCATTTGGGTGCTTACAAAAAT GCAGATCTTGGTGAATCGCTATCCAAATTGGGGTGATACTGTCGAGATCAATAGCTGGTTC TCCCAGTCCGGGAAAATCGGTATGGGTCGCAATTGGCTAATAAGTGATTGCAACACAGGA GAAATTCTTATAAGAGCAACGAGCATTTGGGCCATGATGAATCAAAAGACGAGAAGATTC TCAAAACTTCCAAATGAGGTTCGCCAGGAGATAGCGCCTCATTTTGTGGACGCCCCTCCTG TCATTGAAGACAATGATCGAAAATTGCATAAGTTTGATGTGAAGACTGGTGATTCCATTTG CAAGGGTCTAACACCGGAGTGGAACGACTTGGATGTCAATCAGCACGTAAGCAACGTGAA GTACATTGGGTGGATTCTCGAGAGTATGCCAAAAGAAGTTTTGGACACCCAGGAGCTATG CTCTCTCACCCTTGAATATAGGCGGGAATGCGGAAGGGACAGTGTGCTGGAGTCTGTGAC CGCTATGGATCCCTCAAAAGTTGGAGACCGATCTCAGTACCAGCACCTTCTGCGGCTTGAA GATGGGACTGATATCATGAAGGGCAGAACTGAGTGGCGACCAAAGAATGCAGGAACCAA CGGGGCGATATCAACAGGAAAGACTTCAAATGGAAACTCGGTCTCTTAG
SEQ ID NO: 63
Cuphea heterophylla (Cht) FATB2f (R97L, H124L, I132S, G152S, H165L, T211N variant) coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCGCCTCCGGCACCTCCC CCAAGCCCGGCAAGTTCGGCACCTGGCTGTCCTCCTCCTCCCCCTCCTACAAGCCCAAGTC CAACCCCTCCGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGG CTCCGCCGTGTCCCTGAAGTCCGGCTCCCTGAACACCCAGGAGGGCACCTCCTCCTCCCCC CCCCCCCGCACCTTCCTGAACCAGCTGCCCGACTGGTCCCGCCTGCTGACCGCCATCTCCA CCGTGTTCGTGGCCGCCGAGAAGCAGCTGACCATGCTGGACCGCAAGTCCAAGCGCCCCG ACATGCTGGTGGACTGGTTCGGCCTGGAGTCCATCGTGCAGGACGGCCTGGTGTTCCGCG AGTCCTACTCCATCCGCTCCTACGAGATCTCCGCCGACCGCACCGCCTCCATCGAGACCGT GATGAACCTGCTGCAGGAGACCTCCCTGAACCACTGCAAGTCCATGGGCATCCTGAACGA CGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGCTGACCAAGAT GCAGATCCTGGTGAACCGCTACCCCAACTGGGGCGACACCGTGGAGATCAACTCCTGGTT CTCCCAGTCCGGCAAGATCGGCATGGGCCGCAACTGGCTGATCTCCGACTGCAACACCGG CGAGATCCTGATCCGCGCCACCTCCATCTGGGCCATGATGAACCAGAAGACCCGCCGCTT CTCCAAGCTGCCCAACGAGGTGCGCCAGGAGATCGCCCCCCACTTCGTGGACGCCCCCCC CGTGATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCAT CTGCAAGGGCCTGACCCCCGAGTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGT GAAGTACATCGGCTGGATCCTGGAGTCCATGCCCAAGGAGGTGCTGGACACCCAGGAGCT GTGCTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGT GACCGCCATGGACCCCTCCAAGGTGGGCGACCGCTCCCAGTACCAGCACCTGCTGCGCCT GGAGGACGGCACCGACATCATGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCA CCAACGGCGCCATCTCCACCGGCAAGACCTCCAACGGCAACTCCGTGTCCTGA SEQ ID NO: 64
Cuphea heterophylla (Cht) FATB2g (A6T, A16V, S17P, G76D, R97L, H124L, I132S, S143I, G152S, A157T, H165L, T211N, G414A variant) amino acid sequence
MWAATAS S AFFP VPVPGT SPKPGKFGT WLS S S SPS YKPKSNPSGGFQVKANAS AHPKANGS A VSLKSGSLNTQEDTSSSPPPRTFLNQLPDWSRLLTAISTVFVAAEKQLTMLDRKSKRPDMLVD WFGLESIVQDGLVFREIYSIRSYEISADRTTSIETVMNLLQETSLNHCKSMGILNDGFGRTPEMC KRDLIWVLTKMQILVNRYPNWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIRATSIWAM MNQKTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSICKGLTPEWNDLDVN QHVSNVKYIGWILESMPKEVLDTQELCSLTLEYRRECGRDSVLESVTAMDPSKVGDRSQYQH LLRLEDGTDIMKGRTEWRPKNAGTNGAISTGKTSNANSVS*
SEQ ID NO: 65
Cuphea heterophylla (Cht) FATB2g (A6T, A16V, S17P, G76D, R97L, H124L, I132S, S143I, G152S, A157T, H165L, T211N, G414A variant) coding DNA sequence
ATGGTGGTGGCTGCTACAGCAAGTTCTGCATTCTTCCCTGTTCCTGTACCTGGAACCTCCC CTAAACCCGGAAAGTTCGGGACTTGGCTATCGAGCTCGAGCCCTTCCTACAAGCCCAAGT CAAACCCCAGTGGTGGATTTCAGGTTAAGGCAAATGCCAGTGCTCATCCTAAGGCTAACG GTTCTGCAGTAAGTCTAAAGTCTGGCAGCCTCAACACTCAGGAGGACACTTCGTCGTCCCC TCCTCCTCGGACATTCCTTAACCAGTTGCCTGATTGGAGTAGGCTTCTGACTGCAATCTCG ACCGTCTTCGTGGCGGCAGAGAAGCAGTTGACTATGCTCGATCGAAAATCTAAGAGGCCT GACATGCTCGTGGACTGGTTTGGGTTGGAGAGTATTGTTCAGGATGGGCTCGTGTTCAGAG AGATTTATTCGATCAGGTCTTACGAAATAAGCGCTGATCGAACAACCTCTATAGAGACGG TGATGAACCTCTTGCAGGAAACATCTCTCAATCATTGTAAGAGTATGGGTATTCTCAATGA CGGCTTTGGTCGTACCCCGGAGATGTGCAAAAGGGACCTCATTTGGGTGCTTACAAAAAT GCAGATCTTGGTGAATCGCTATCCAAATTGGGGTGATACTGTCGAGATCAATAGCTGGTTC TCCCAGTCCGGGAAAATCGGTATGGGTCGCAATTGGCTAATAAGTGATTGCAACACAGGA GAAATTCTTATAAGAGCAACGAGCATTTGGGCCATGATGAATCAAAAGACGAGAAGATTC TCAAAACTTCCAAACGAGGTTCGCCAGGAGATAGCGCCTCATTTTGTGGACGCCCCTCCTG TCATTGAAGACAATGATCGAAAATTGCATAAGTTTGATGTGAAGACTGGTGATTCCATTTG CAAGGGTCTAACACCGGAGTGGAATGACTTGGATGTCAATCAGCACGTAAGCAACGTGAA GTACATTGGGTGGATTCTCGAGAGTATGCCAAAAGAAGTTTTGGACACCCAGGAGCTATG CTCTCTC ACCCTTGAATAT AGGCGGGAATGCGGAAGGGACAGTGTGCTGGAGTCTGTGAC CGCTATGGATCCCTCAAAAGTTGGAGACCGATCTCAGTACCAGCACCTTCTGCGGCTTGAA GATGGGACTGATATCATGAAGGGCAGAACTGAGTGGCGACCAAAGAATGCAGGAACCAA CGGGGCGATATCAACAGGAAAGACTTCAAATGCAAACTCGGTCTCTTAG SEQ ID NO: 66
Cuphea heterophylla (Cht) FATB2g (A6T, A16V, S17P, G76D, R97L, H124L, I132S, S143I, G152S, A157T, H165L, T211N, G414A variant) coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGTGGCCGCCACCGCCTCCTCCGCCTTCTTCCCCGTGCCCGTGCCCGGCACCTCCC CCAAGCCCGGCAAGTTCGGCACCTGGCTGTCCTCCTCCTCCCCCTCCTACAAGCCCAAGTC CAACCCCTCCGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGG CTCCGCCGTGTCCCTGAAGTCCGGCTCCCTGAACACCCAGGAGGACACCTCCTCCTCCCCC CCCCCCCGCACCTTCCTGAACCAGCTGCCCGACTGGTCCCGCCTGCTGACCGCCATCTCCA CCGTGTTCGTGGCCGCCGAGAAGCAGCTGACCATGCTGGACCGCAAGTCCAAGCGCCCCG ACATGCTGGTGGACTGGTTCGGCCTGGAGTCCATCGTGCAGGACGGCCTGGTGTTCCGCG AGATCTACTCCATCCGCTCCTACGAGATCTCCGCCGACCGCACCACCTCCATCGAGACCGT GATGAACCTGCTGCAGGAGACCTCCCTGAACCACTGCAAGTCCATGGGCATCCTGAACGA CGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGCTGACCAAGAT GCAGATCCTGGTGAACCGCTACCCCAACTGGGGCGACACCGTGGAGATCAACTCCTGGTT CTCCCAGTCCGGCAAGATCGGCATGGGCCGCAACTGGCTGATCTCCGACTGCAACACCGG CGAGATCCTGATCCGCGCCACCTCCATCTGGGCCATGATGAACCAGAAGACCCGCCGCTT CTCCAAGCTGCCCAACGAGGTGCGCCAGGAGATCGCCCCCCACTTCGTGGACGCCCCCCC CGTGATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCAT CTGCAAGGGCCTGACCCCCGAGTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGT GAAGTACATCGGCTGGATCCTGGAGTCCATGCCCAAGGAGGTGCTGGACACCCAGGAGCT GTGCTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGT GACCGCCATGGACCCCTCCAAGGTGGGCGACCGCTCCCAGTACCAGCACCTGCTGCGCCT GGAGGACGGCACCGACATCATGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCA CCAACGGCGCCATCTCCACCGGCAAGACCTCCAACGCCAACTCCGTGTCCTGA
SEQ ID NO: 67
Cuphea heterophylla (Cht) FATB3a amino acid sequence
MVATAASSAFFPVPSPDTSSRPGKLGNGSSSLRPLKPKFVANAGLQVKANASAPPKINGSSVSL KSCSLKTHEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPDMLVDPF GLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNEGFGRTPEMY KRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASSV WVMMNQKTRKLSKIPDEVRHEIEPHFVDSAPVIEDDDWKLPKLDEKTADSIRKGLTPKWNDL DVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKGFGPQFQ HLLRLEDGGEIVKGRTE WRPKTAGINGTIASGET SPGNS *
SEQ ID NO: 68
Cuphea heterophylla (Cht) FATB3a coding DNA sequence
ATGGTGGCCACCGCTGCAAGTTCTGCATTCTTCCCGGTGCCGTCCCCGGACACCTCCTCTA GACCGGGAAAGCTCGGAAATGGGTCATCAAGCTTGAGGCCCCTCAAGCCCAAATTTGTTG CCAATGCTGGGCTGCAGGTTAAGGCAAACGCCAGTGCCCCTCCTAAGATCAATGGTTCCT CGGTCAGTCTAAAGTCTTGCAGTCTCAAGACTCATGAAGACACTCCTTCAGCTCCTCCTCC GCGGACTTTTATCAACCAGTTGCCTGATTGGAGCATGCTTCTTGCTGCAATCACTACTGTC TTCTTGGCAGCAGAGAAGCAGTGGATGATGCTTGATTGGAAACCAAAGAGGCCTGACATG CTTGTGGACCCGTTCGGATTGGGAAGGATTGTTCAGGATGGGCTTGTGTTCAGGCAGAATT TTTCGATTAGGTCCTATGAAATAGGCGCTGATCGCACTGCATCCATAGAGACGGTGATGA ACCACTTGCAGGAAACGGCTCTCAATCATGTTAAGAGTGCGGGGCTTCTTAATGAAGGCT TTGGTCGTACTCCTGAGATGTATAAAAGGGACCTTATTTGGGTTGTCGCGAAAATGCAGGT CATGGTTAACCGCTATCCTACTTGGGGTGACACGGTTGAAGTGAATACTTGGGTTGCCAAG TCAGGGAAAAATGGTATGCGTCGTGATTGGCTCATAAGTGATTGCAATACAGGAGAAATT CTTACAAGGGCATCAAGTGTGTGGGTCATGATGAATCAAAAGACAAGAAAATTGTCAAAG ATTCCAGATGAGGTTCGGCATGAGATAGAGCCTCATTTTGTGGACTCTGCTCCCGTCATTG AAGACGATGACTGGAAACTTCCCAAGCTGGATGAGAAAACTGCTGACTCCATCCGCAAGG GTCTAACTCCGAAGTGGAATGACTTGGATGTCAATCAGCACGTCAACAACGTGAAGTACA TTGGGTGGATTCTTGAGAGTACTCCACCAGAAGTTCTGGAGACCCAGGAGTTATGTTCCCT TACCCTGGAATACAGGCGGGAATGCGGAAGGGAGAGTGTGCTGGAGTCCCTCACTGCTGT GGACCCCTCTGGAAAGGGCTTTGGGCCCCAGTTTCAGCACCTTCTGAGGCTTGAGGATGG AGGTGAGATCGTAAAGGGGAGAACTGAGTGGCGACCCAAGACTGCAGGTATCAATGGGA CGATTGCATCTGGGGAGACCTCACCTGGAAACTCTTAG SEQ ID NO: 69
Cuphea heterophylla (Cht) FATB3a coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCACCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCTCCCCCGACACCTCCTCCCG CCCCGGCAAGCTGGGCAACGGCTCCTCCTCCCTGCGCCCCCTGAAGCCCAAGTTCGTGGCC AACGCCGGCCTGCAGGTGAAGGCCAACGCCTCCGCCCCCCCCAAGATCAACGGCTCCTCC GTGTCCCTGAAGTCCTGCTCCCTGAAGACCCACGAGGACACCCCCTCCGCCCCCCCCCCCC GCACCTTCATCAACCAGCTGCCCGACTGGTCCATGCTGCTGGCCGCCATCACCACCGTGTT CCTGGCCGCCGAGAAGCAGTGGATGATGCTGGACTGGAAGCCCAAGCGCCCCGACATGCT GGTGGACCCCTTCGGCCTGGGCCGCATCGTGCAGGACGGCCTGGTGTTCCGCCAGAACTT CTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCGTGATGAA CCACCTGCAGGAGACCGCCCTGAACCACGTGAAGTCCGCCGGCCTGCTGAACGAGGGCTT CGGCCGCACCCCCGAGATGTACAAGCGCGACCTGATCTGGGTGGTGGCCAAGATGCAGGT GATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGGTGAACACCTGGGTGGCCAA GTCCGGCAAGAACGGCATGCGCCGCGACTGGCTGATCTCCGACTGCAACACCGGCGAGAT CCTGACCCGCGCCTCCTCCGTGTGGGTGATGATGAACCAGAAGACCCGCAAGCTGTCCAA GATCCCCGACGAGGTGCGCCACGAGATCGAGCCCCACTTCGTGGACTCCGCCCCCGTGAT CGAGGACGACGACTGGAAGCTGCCCAAGCTGGACGAGAAGACCGCCGACTCCATCCGCA AGGGCCTGACCCCCAAGTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGAAG TACATCGGCTGGATCCTGGAGTCCACCCCCCCCGAGGTGCTGGAGACCCAGGAGCTGTGC TCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGACC GCCGTGGACCCCTCCGGCAAGGGCTTCGGCCCCCAGTTCCAGCACCTGCTGCGCCTGGAG GACGGCGGCGAGATCGTGAAGGGCCGCACCGAGTGGCGCCCCAAGACCGCCGGCATCAA CGGCACCATCGCCTCCGGCGAGACCTCCCCCGGCAACTCCTGA
SEQ ID NO: 70
Cuphea heterophylla (Cht) FATB3b (C67G, H72Q, L128F, N179I variant) amino acid sequence
MVATAASSAFFPVPSPDTSSRPGKLGNGSSSLRPLKPKFVANAGLQVKANASAPPKINGSSVSL KSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPDMLVDPF GFGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLIEGFGRTPEMYK RDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASSVW VMMNQKTRKLSKIPDEVRHEIEPHFVDSAPVIEDDDWKLPKLDEKTADSIRKGLTPKWNDLD VNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKGFGPQFQH LLRLEDGGEIVKGRTE WRPKTAGINGTIASGET SPGNS *
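The variant labels in this listing (for example "C67G, H72Q") can be read as single-residue substitutions: original residue, position, replacement residue. The following is a minimal sketch of how such a label could be applied to a parent sequence. It assumes 1-based numbering counted from the initial methionine of the parent (here FATB3a, SEQ ID NO: 67), which is consistent with positions 67 and 72 as printed above. The function name `apply_substitutions` and the constant `FATB3A_PREFIX` are hypothetical and are not part of the application.

```python
import re


def apply_substitutions(parent: str, variants: str) -> str:
    """Apply labels such as 'C67G, H72Q' to a parent amino acid sequence.

    Assumption: positions are 1-based and counted from the first residue
    of `parent`. A ValueError is raised if the residue found at a position
    does not match the residue named in the label.
    """
    residues = list(parent)
    for token in variants.split(","):
        match = re.fullmatch(r"\s*([A-Z])(\d+)([A-Z])\s*", token)
        if match is None:
            raise ValueError(f"unrecognized variant label: {token!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(
                f"expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# First 72 residues of FATB3a as printed in the listing (spaces removed).
FATB3A_PREFIX = (
    "MVATAASSAFFPVPSPDTSSRPGKLGNGSSSLRPLKPKFVANAGLQVKANASAPPKINGSSVSL"
    "KSCSLKTH"
)

variant_prefix = apply_substitutions(FATB3A_PREFIX, "C67G, H72Q")
print(variant_prefix[64:])  # residues 65-72 after substitution
```

The same check can be used in reverse to catch transcription errors: if a label names a residue that is not present at the stated position in the parent, the function raises instead of silently editing the sequence.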
SEQ ID NO: 71
Cuphea heterophylla (Cht) FATB3b (C67G, H72Q, L128F, N179I variant) coding DNA sequence ATGGTGGCCACCGCTGCAAGTTCTGCATTCTTCCCGGTGCCATCCCCGGACACCTCCTCTA GACCGGGAAAGCTCGGAAATGGGTCATCAAGCTTGAGGCCCCTCAAGCCCAAATTTGTTG CCAATGCTGGGCTGCAGGTTAAGGCAAACGCCAGTGCCCCTCCTAAGATCAATGGTTCCT CGGTCAGTCTAAAGTCTGGCAGTCTCAAGACTCAGGAAGACACTCCTTCGGCTCCTCCTCC GCGGACTTTTATCAACCAGTTGCCTGATTGGAGCATGCTTCTTGCTGCAATCACTACTGTC TTCTTGGCAGCAGAGAAGCAGTGGATGATGCTTGATTGGAAACCAAAGAGGCCTGACATG CTTGTGGACCCGTTCGGATTTGGAAGGATTGTTCAGGATGGGCTTGTGTTCAGGCAGAATT TTTCGATTAGGTCCTATGAAATAGGCGCTGATCGCACTGCATCTATAGAGACGGTGATGA ACCACTTGCAGGAAACGGCTCTCAATCATGTTAAGAGTGCGGGGCTTCTTATTGAAGGCTT TGGTCGTACTCCTGAGATGTATAAAAGGGACCTTATTTGGGTTGTCGCGAAAATGCAGGTC ATGGTT AACCGCT ATCCTACTTGGGGTGAC ACGGTTGAAGTGAAT ACTTGGGTTGCC AAGT CAGGGAAAAATGGTATGCGTCGTGATTGGCTCATAAGTGATTGCAATACAGGAGAAATTC TTACTAGAGCATCAAGTGTGTGGGTCATGATGAATCAAAAGACAAGAAAATTGTCAAAGA TTCCAGATGAGGTTCGGCATGAGATAGAGCCTCATTTTGTGGACTCTGCTCCCGTCATTGA AGACGATGACTGGAAACTTCCCAAGCTGGATGAGAAAACTGCTGACTCCATCCGCAAGGG TCTAACTCCGAAGTGGAATGACTTGGATGTCAATCAGCACGTCAACAACGTGAAGTACAT TGGGTGGATTCTTGAGAGTACTCCACCAGAAGTTCTGGAGACCCAGGAGTTATGTTCCCTT ACCCTGGAATACAGGCGGGAATGCGGAAGGGAGAGTGTGCTGGAGTCCCTCACTGCTGTG GACCCCTCTGGAAAGGGCTTTGGGCCCCAGTTTCAGCACCTTCTGAGGCTTGAGGATGGA GGTGAGATCGTAAAGGGGAGAACTGAGTGGCGACCCAAGACTGCAGGTATCAATGGGAC GATTGCATCTGGGGAGACCTCACCTGGAAACTCTTAG
SEQ ID NO: 72
Cuphea heterophylla (Cht) FATB3b (C67G, H72Q, L128F, N179I variant) coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCACCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCTCCCCCGACACCTCCTCCCG CCCCGGCAAGCTGGGCAACGGCTCCTCCTCCCTGCGCCCCCTGAAGCCCAAGTTCGTGGCC AACGCCGGCCTGCAGGTGAAGGCCAACGCCTCCGCCCCCCCCAAGATCAACGGCTCCTCC GTGTCCCTGAAGTCCGGCTCCCTGAAGACCCAGGAGGACACCCCCTCCGCCCCCCCCCCCC GCACCTTCATCAACCAGCTGCCCGACTGGTCCATGCTGCTGGCCGCCATCACCACCGTGTT CCTGGCCGCCGAGAAGCAGTGGATGATGCTGGACTGGAAGCCCAAGCGCCCCGACATGCT GGTGGACCCCTTCGGCTTCGGCCGCATCGTGCAGGACGGCCTGGTGTTCCGCCAGAACTTC TCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCGTGATGAAC CACCTGCAGGAGACCGCCCTGAACCACGTGAAGTCCGCCGGCCTGCTGATCGAGGGCTTC GGCCGCACCCCCGAGATGTACAAGCGCGACCTGATCTGGGTGGTGGCCAAGATGCAGGTG ATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGGTGAACACCTGGGTGGCCAAG TCCGGCAAGAACGGCATGCGCCGCGACTGGCTGATCTCCGACTGCAACACCGGCGAGATC CTGACCCGCGCCTCCTCCGTGTGGGTGATGATGAACCAGAAGACCCGCAAGCTGTCCAAG ATCCCCGACGAGGTGCGCCACGAGATCGAGCCCCACTTCGTGGACTCCGCCCCCGTGATC GAGGACGACGACTGGAAGCTGCCCAAGCTGGACGAGAAGACCGCCGACTCCATCCGCAA GGGCCTGACCCCCAAGTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGAAGT ACATCGGCTGGATCCTGGAGTCCACCCCCCCCGAGGTGCTGGAGACCCAGGAGCTGTGCT CCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGACCG CCGTGGACCCCTCCGGCAAGGGCTTCGGCCCCC AGTTCCAGCACCTGCTGCGCCTGGAGG ACGGCGGCGAGATCGTGAAGGGCCGCACCGAGTGGCGCCCCAAGACCGCCGGCATCAAC GGCACCATCGCCTCCGGCGAGACCTCCCCCGGCAACTCCTGA
SEQ ID NO: 73
Cuphea viscosissima (Cvis) FATB1 amino acid sequence
MVAAAATS AFFP VPAPGT SPKPGKSGNWP S SLSPTFKPKSIPNGGFQVKANAS AHPKANGS AV NLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLTAITTVFVAAEKQWTMLDRKSKRPDMLVD SVGLKSIVRDGLVSRHSFSIRSYEIGADRTASIETLMNHLQETTINHCKSLGLHNDGFGRTPGM CKNDLIWVLTKMQIMVNRYPTWGDTVEINTWFSQSGKIGMASDWLISDCNTGEILIRATSVW AMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDQKLRKFDVKTGDSIRKGLTPRWNDLD VNQHVSNVKYIGWILESMPIEVLETQELCSLTVEYRRECGMDSVLESVTAVDPSENGGRSQYK HLLRLEDGTDIVKSRTEWRPKNAGTNGAISTSTAKTSNGNSVS
SEQ ID NO: 74
Cuphea viscosissima (Cvis) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCACCTCCGCCTTCTTCCCCGTGCCCGCCCCCGGCACCTCCCCCA AGCCCGGCAAGTCCGGCAACTGGCCCTCCTCCCTGTCCCCCACCTTCAAGCCCAAGTCCAT CCCCAACGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTC CGCCGTGAACCTGAAGTCCGGCTCCCTGAACACCCAGGAGGACACCTCCTCCTCCCCCCCC CCCCGCGCCTTCCTGAACCAGCTGCCCGACTGGTCCATGCTGCTGACCGCCATCACCACCG TGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCTGGACCGCAAGTCCAAGCGCCCCGACA TGCTGGTGGACTCCGTGGGCCTGAAGTCCATCGTGCGCGACGGCCTGGTGTCCCGCCACTC CTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCTGATG AACCACCTGCAGGAGACCACCATCAACCACTGCAAGTCCCTGGGCCTGCACAACGACGGC TTCGGCCGCACCCCCGGCATGTGCAAGAACGACCTGATCTGGGTGCTGACCAAGATGCAG ATCATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGATCAACACCTGGTTCTCC CAGTCCGGCAAGATCGGCATGGCCTCCGACTGGCTGATCTCCGACTGCAACACCGGCGAG ATCCTGATCCGCGCCACCTCCGTGTGGGCCATGATGAACCAGAAGACCCGCCGCTTCTCCC GCCTGCCCTACGAGGTGCGCCAGGAGCTGACCCCCCACTTCGTGGACTCCCCCCACGTGAT CGAGGACAACGACCAGAAGCTGCGCAAGTTCGACGTGAAGACCGGCGACTCCATCCGCA AGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGTGAAGT ACATCGGCTGGATCCTGGAGTCCATGCCCATCGAGGTGCTGGAGACCCAGGAGCTGTGCT CCCTGACCGTGGAGTACCGCCGCGAGTGCGGCATGGACTCCGTGCTGGAGTCCGTGACCG CCGTGGACCCCTCCGAGAACGGCGGCCGCTCCCAGTACAAGCACCTGCTGCGCCTGGAGG ACGGCACCGACATCGTGAAGTCCCGCACCGAGTGGCGCCCCAAGAACGCCGGCACCAAC GGCGCCATCTCCACCTCCACCGCCAAGACCTCCAACGGCAACTCCGTGTCCTGA
SEQ ID NO: 75
Cuphea viscosissima (Cvis) FATB2 amino acid sequence
MVATAASSAFFPVPSADTSSRPGKLGNGPSSFSPLKPKSIPNGGLQVKASASAPPKINGSSVGLK SGGLKTHDDAPSAPPPRTFINQLPDWSMLLAAITTAFLAAEKQWMMLDRKPKRLDMLEDPFG LGRWQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKTAGLSNDGFGRTPEMYK RDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASSVW VMMNQKTRKLSKIPDEVRREIEPHFVDSAPVIEDDDRKLPKLDEKSADSIRKGLTPRWNDLDV NQHVNNAKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGEGYGSQFQHL LRLEDGGEIVKGRTE WRPKNAGINGWPSEE S SPGDYS
SEQ ID NO: 76
Cuphea viscosissima (Cvis) FATB2 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCACCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCTCCGCCGACACCTCCTCCC GCCCCGGCAAGCTGGGCAACGGCCCCTCCTCCTTCTCCCCCCTGAAGCCCAAGTCCATCCC CAACGGCGGCCTGCAGGTGAAGGCCTCCGCCTCCGCCCCCCCCAAGATCAACGGCTCCTC CGTGGGCCTGAAGTCCGGCGGCCTGAAGACCCACGACGACGCCCCCTCCGCCCCCCCCCC CCGCACCTTCATCAACCAGCTGCCCGACTGGTCCATGCTGCTGGCCGCCATCACCACCGCC TTCCTGGCCGCCGAGAAGCAGTGGATGATGCTGGACCGCAAGCCCAAGCGCCTGGACATG CTGGAGGACCCCTTCGGCCTGGGCCGCGTGGTGC AGGACGGCCTGGTGTTCCGCC AGAAC TTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCGTGATGA ACCACCTGCAGGAGACCGCCCTGAACCACGTGAAGACCGCCGGCCTGTCCAACGACGGCT TCGGCCGCACCCCCGAGATGTACAAGCGCGACCTGATCTGGGTGGTGGCCAAGATGCAGG TGATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGGTGAACACCTGGGTGGCCA AGTCCGGCAAGAACGGC ATGCGCCGCGACTGGCTGATCTCCGACTGC AAC ACCGGCGAGA TCCTGACCCGCGCCTCCTCCGTGTGGGTGATGATGAACCAGAAGACCCGCAAGCTGTCCA AGATCCCCGACGAGGTGCGCCGCGAGATCGAGCCCCACTTCGTGGACTCCGCCCCCGTGA TCGAGGACGACGACCGCAAGCTGCCCAAGCTGGACGAGAAGTCCGCCGACTCCATCCGCA AGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGCCAAGT ACATCGGCTGGATCCTGGAGTCCACCCCCCCCGAGGTGCTGGAGACCCAGGAGCTGTGCT CCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGACCG CCGTGGACCCCTCCGGCGAGGGCTACGGCTCCCAGTTCCAGCACCTGCTGCGCCTGGAGG ACGGCGGCGAGATCGTGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCATCAAC GGCGTGGTGCCCTCCGAGGAGTCCTCCCCCGGCGACTACTCCTGA SEQ ID NO: 77
Cuphea viscosissima (Cvis) FATB3 amino acid sequence
MVAAAAS S AFF SFPTPGT SPKPGKFGNWP S SLSIPFNPKSNHNGGIQ VKANAS AHPKANGS AVS LKAGSLETQEDTSSPSPPPRTFISQLPDWSMLVSAITTVFVAAEKQWTMLDRKSKRPDVLVEPF VQDGVSFRQSFSIRSYEIGVDRTASIETLMNIFQETSLNHCKSLGLLNDGFGRTPEMCKRDLIW WTKMQIEVNRYPTWGDTIEVTTWVSESGKNGMSRDWLISDCHSGEILIRATSVWAMMNQK TRRLSKIPDEVRQEIVPYFVDSAPVIEDDRKLHKLDVKTGDSIRNGLTPRWNDFDVNQHVNNV KYIAWLLKSVPTEVFETQELCGLTLEYRRECRRDSVLESVTAMDPSKEGDRSLYQHLLRLENG ADIALGRTEWRPKNAGATGAVSTGKTSNGNSVS
SEQ ID NO: 78
Cuphea viscosissima (Cvis) FATB3 coding DNA sequence codon optimized for Prototheca moriformis ATGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCTCCTTCCCCACCCCCGGCACCTCCCCCAA GCCCGGCAAGTTCGGCAACTGGCCCTCCTCCCTGTCCATCCCCTTCAACCCCAAGTCCAAC CACAACGGCGGCATCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTCC GCCGTGTCCCTGAAGGCCGGCTCCCTGGAGACCCAGGAGGACACCTCCTCCCCCTCCCCCC CCCCCCGCACCTTCATCTCCCAGCTGCCCGACTGGTCCATGCTGGTGTCCGCCATCACCAC CGTGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCTGGACCGCAAGTCCAAGCGCCCCGA CGTGCTGGTGGAGCCCTTCGTGCAGGACGGCGTGTCCTTCCGCCAGTCCTTCTCCATCCGC TCCTACGAGATCGGCGTGGACCGCACCGCCTCCATCGAGACCCTGATGAACATCTTCCAG GAGACCTCCCTGAACCACTGCAAGTCCCTGGGCCTGCTGAACGACGGCTTCGGCCGCACC CCCGAGATGTGCAAGCGCGACCTGATCTGGGTGGTGACCAAGATGCAGATCGAGGTGAAC CGCTACCCCACCTGGGGCGACACCATCGAGGTGACCACCTGGGTGTCCGAGTCCGGCAAG AACGGCATGTCCCGCGACTGGCTGATCTCCGACTGCCACTCCGGCGAGATCCTGATCCGC GCCACCTCCGTGTGGGCCATGATGAACCAGAAGACCCGCCGCCTGTCCAAGATCCCCGAC GAGGTGCGCCAGGAGATCGTGCCCTACTTCGTGGACTCCGCCCCCGTGATCGAGGACGAC CGCAAGCTGCACAAGCTGGACGTGAAGACCGGCGACTCCATCCGCAACGGCCTGACCCCC CGCTGGAACGACTTCGACGTGAACCAGCACGTGAACAACGTGAAGTACATCGCCTGGCTG CTGAAGTCCGTGCCCACCGAGGTGTTCGAGACCCAGGAGCTGTGCGGCCTGACCCTGGAG TACCGCCGCGAGTGCCGCCGCGACTCCGTGCTGGAGTCCGTGACCGCCATGGACCCCTCC AAGGAGGGCGACCGCTCCCTGTACCAGCACCTGCTGCGCCTGGAGAACGGCGCCGACATC GCCCTGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCGCCACCGGCGCCGTGTCCACC GGCAAGACCTCCAACGGCAACTCCGTGTCCTGA
SEQ ID NO: 79
Cuphea calcarata (Ccalc) FATB1 amino acid sequence
MVAASASSAFFSVPTPGTSPKPGKFGNWPSSLSVPFKPRSNNSGGFQVKANASAHPKANGSAV SLKSGSLETQEDNSSSSRPPRTFIKQLPDWSMLLSAITTVFVAAEKQWTMFDRKSKRSDMLVD PFVVDRIVQDGVLFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSMGLLYEGFGRTPEMC KRDLIWWTKIHIKVNRYPTWGDTIEVTTWVSESGKNGMGRDWLISDCHTGEILIRATSVWA MMNQTTRRLSKFPYELRQEIAPHFVDSDPVIEDNRRLLNFDVKTGDSIRKGLTPRWNDLDVNQ HVNNVKYIGWILESVPTEVFDTRELCGLTLEYRQECGRGSVLESVTAMDPSKEGDRSLYQHLL RLEDGTDIVKGRTEWRPKNAGTNGPVSTRKTTNGSSVS SEQ ID NO: 80
Cuphea calcarata (Ccalc) FATB1 coding DNA sequence
ATGGTGGCTGCTTCAGCAAGTTCTGCATTCTTCTCCGTCCCAACCCCGGGAACCTCTCCTA AACCCGGGAAGTTCGGCAATTGGCCATCGAGCTTGAGCGTCCCATTCAAGCCCAGATCAA ACAACAGTGGCGGCTTTCAGGTTAAGGCAAACGCCAGTGCTCATCCTAAGGCTAACGGTT CTGCAGTAAGTCTAAAGTCTGGGAGCCTCGAGACTCAGGAGGACAATTCGTCGTCGTCTC GTCCTCCTCGGACTTTCATTAAACAGTTGCCGGACTGGAGTATGCTTCTTTCCGCGATCAC AACCGTCTTCGTGGCGGCTGAGAAGCAGTGGACGATGTTTGATCGGAAATCTAAGAGGTC TGACATGCTCGTGGACCCGTTTGTGGTTGACAGGATTGTTCAGGATGGGGTTCTGTTCAGA CAGAGTTTTTCGATTAGGTCTTACGAAATAGGCGCTGATCGAACAGCCTCTATTGAGACGC TGATGAACATCTTCCAGGAAACATCTCTCAATCATTGTAAGAGTATGGGTCTTCTCTATGA AGGCTTTGGTCGTACTCCTGAGATGTGTAAGAGGGACCTCATTTGGGTGGTTACGAAAAT ACATATCAAGGTGAATCGCTATCCGACTTGGGGTGATACTATCGAGGTCACTACTTGGGTC TCCGAGTCGGGCAAAAACGGTATGGGTCGCGATTGGCTGATAAGTGATTGCCATACAGGA GAAATTCTTATAAGAGCAACGAGTGTGTGGGCTATGATGAATCAAACGACGAGAAGATTG TCGAAATTTCCATATGAGCTTCGACAGGAGATAGCGCCACATTTTGTGGACTCGGATCCTG TCATTGAAGACAATCGAAGATTGCTCAACTTTGATGTGAAGACGGGTGATTCCATTCGCA AGGGTCTAACTCCAAGGTGGAATGACTTGGATGTCAATCAGCACGTTAACAATGTGAAGT ACATTGGGTGGATTCTCGAGAGTGTTCCAACAGAAGTTTTCGATACCCGGGAGCTATGCG GCCTCACCCTTGAGTATAGGCAGGAATGCGGAAGAGGAAGTGTGCTGGAGTCCGTGACCG CTATGGATCCCTCAAAAGAGGGAGACCGGTCTCTGTACCAGCACCTTCTTCGGCTTGAGG ATGGGACTGATATCGTGAAGGGCAGAACCGAGTGGCGGCCAAAGAATGCAGGAACCAAT GGGCCAGTATCAACAAGAAAGACTACAAATGGAAGCTCAGTCTCTTAG
SEQ ID NO: 81
Cuphea calcarata (Ccalc) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCTCCGCCTCCTCCGCCTTCTTCTCCGTGCCCACCCCCGGCACCTCCCCCA AGCCCGGCAAGTTCGGCAACTGGCCCTCCTCCCTGTCCGTGCCCTTCAAGCCCCGCTCCAA CAACTCCGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTC CGCCGTGTCCCTGAAGTCCGGCTCCCTGGAGACCCAGGAGGACAACTCCTCCTCCTCCCGC CCCCCCCGCACCTTCATCAAGCAGCTGCCCGACTGGTCCATGCTGCTGTCCGCCATCACCA CCGTGTTCGTGGCCGCCGAGAAGCAGTGGACCATGTTCGACCGCAAGTCCAAGCGCTCCG ACATGCTGGTGGACCCCTTCGTGGTGGACCGCATCGTGCAGGACGGCGTGCTGTTCCGCC AGTCCTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCT GATGAACATCTTCCAGGAGACCTCCCTGAACCACTGCAAGTCCATGGGCCTGCTGTACGA GGGCTTCGGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGGTGACCAAGAT CCACATCAAGGTGAACCGCTACCCCACCTGGGGCGACACCATCGAGGTGACCACCTGGGT GTCCGAGTCCGGCAAGAACGGCATGGGCCGCGACTGGCTGATCTCCGACTGCCACACCGG CGAGATCCTGATCCGCGCCACCTCCGTGTGGGCCATGATGAACCAGACCACCCGCCGCCT GTCCAAGTTCCCCTACGAGCTGCGCCAGGAGATCGCCCCCCACTTCGTGGACTCCGACCCC GTGATCGAGGACAACCGCCGCCTGCTGAACTTCGACGTGAAGACCGGCGACTCCATCCGC AAGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGAA GTACATCGGCTGGATCCTGGAGTCCGTGCCCACCGAGGTGTTCGACACCCGCGAGCTGTG CGGCCTGACCCTGGAGTACCGCCAGGAGTGCGGCCGCGGCTCCGTGCTGGAGTCCGTGAC CGCCATGGACCCCTCCAAGGAGGGCGACCGCTCCCTGTACCAGCACCTGCTGCGCCTGGA GGACGGCACCGACATCGTGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCACCA ACGGCCCCGTGTCCACCCGCAAGACCACCAACGGCTCCTCCGTGTCCTGA SEQ ID NO: 82
Cuphea painteri (Cpai) FATB1 amino acid sequence
MVAAAATSAFFPVPAPGTSPNPRKFGSWPSSLSPSLPKSIPNGGFQVKANASAHPKANGSAVSL KSGSLNTQENTSSSPPPRTFLHQLPDWSRLLTAITTVFVKSKRPDMHDRKSKRPDMLVDLFGLE SSVQDALVFRQSFSIRSYEIGTDRTASIETLMNHLQETSLNHCKSTGILLDGFGRTLEMCKRELI WWIKMQIQVNRYPAWGDTVEINTRFSRLGKIGMGRDWLISDCNTGEILIRATSEYAMMNQK TRRLSKLPYEVHQEIAPLFVDSPPVIEDNDLKVHKFEVKTGDSIQKGLSPGWNDLDVNQHVSN VKYIGWILESMPTEVLETQELCSLALEYRRECGRDSVLESVTAMDPSKVGGRSQYQHLLRLED GTAIVNGITEWRPKNAGANGAISTGKTSNGNSVS SEQ ID NO: 83
Cuphea painteri (Cpai) FATB1 coding DNA sequence
ATGGTGGCTGCTGCAGCAACTTCTGCATTCTTCCCTGTTCCAGCCCCGGGAACCTCCCCAA ATCCCAGGAAATTCGGAAGTTGGCCATCGAGCTTGAGCCCTTCCTTGCCCAAGTCAATCCC CAATGGCGGATTTCAGGTAAAGGCAAATGCCAGTGCCCATCCGAAGGCTAACGGTTCTGC AGTTAGTCTAAAGTCTGGCAGCCTCAACACTCAGGAGAACACTTCGTCGTCCCCTCCTCCT CGGACTTTCCTTCACCAGTTGCCTGATTGGAGTAGGCTTCTGACTGCAATCACGACCGTGT TCGTGAAATCTAAGAGGCCTGACATGCATGATCGGAAATCTAAGAGGCCTGACATGCTGG TGGACTTGTTTGGGTTGGAAAGTAGTGTTCAGGATGCGCTCGTGTTCAGACAGAGTTTTTC GATTAGGTCTTATGAAATAGGCACTGATCGAACAGCCTCTATAGAGACGCTGATGAACCA CTTGC AGGAAAC ATCTCTCAATC ATTGT AAAAGT ACCGGTATTCTCCTTGACGGCTTCGGT CGTACTCTTGAGATGTGTAAAAGGGAACTCATTTGGGTGGTAATAAAAATGCAAATTCAG GTGAATCGCTATCCAGCATGGGGCGATACTGTCGAGATCAATACCCGGTTCTCCCGGTTGG GGAAAATTGGTATGGGTCGCGATTGGCTAATAAGTGATTGCAACACAGGAGAAATTCTAA TAAGAGCAACGAGCGAGTATGCCATGATGAATCAAAAGACGAGAAGACTCTCAAAACTT CCATACGAGGTTCACCAGGAGATAGCGCCTCTTTTTGTCGACTCTCCTCCTGTGATTGAAG ACAATGATCTGAAAGTGCATAAATTTGAAGTGAAGACTGGTGATTCCATTCAAAAGGGTC TATCCCCGGGGTGGAATGACTTGGATGTCAATCAGCACGTAAGCAACGTGAAGTACATTG GGTGGATTCTCGAGAGTATGCCAACAGAAGTTTTGGAGACCCAGGAGCTATGCTCTCTCG CCCTTGAATATAGGCGGGAATGCGGAAGGGACAGTGTGCTGGAGTCCGTGACCGCAATGG ATCCCTCAAAAGTTGGAGGCCGTTCTCAGTACCAGCACCTTCTGCGGCTTGAGGATGGGA CTGCTATCGTGAACGGCATAACTGAGTGGCGGCCGAAGAATGCAGGAGCTAATGGGGCG ATATCAACGGGAAAGACTTCAAATGGAAACTCGGTCTCTTAG SEQ ID NO: 84
Cuphea painteri (Cpai) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCACCTCCGCCTTCTTCCCCGTGCCCGCCCCCGGCACCTCCCCCA
ACCCCCGCAAGTTCGGCTCCTGGCCCTCCTCCCTGTCCCCCTCCCTGCCCAAGTCCATCCCC
AACGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTCCGCC GTGTCCCTGAAGTCCGGCTCCCTGAACACCCAGGAGAACACCTCCTCCTCCCCCCCCCCCC GCACCTTCCTGCACCAGCTGCCCGACTGGTCCCGCCTGCTGACCGCCATCACCACCGTGTT CGTGAAGTCCAAGCGCCCCGACATGCACGACCGCAAGTCCAAGCGCCCCGACATGCTGGT GGACCTGTTCGGCCTGGAGTCCTCCGTGCAGGACGCCCTGGTGTTCCGCCAGTCCTTCTCC ATCCGCTCCTACGAGATCGGCACCGACCGCACCGCCTCCATCGAGACCCTGATGAACCAC CTGCAGGAGACCTCCCTGAACCACTGCAAGTCCACCGGCATCCTGCTGGACGGCTTCGGC CGCACCCTGGAGATGTGCAAGCGCGAGCTGATCTGGGTGGTGATCAAGATGCAGATCCAG GTGAACCGCTACCCCGCCTGGGGCGACACCGTGGAGATCAACACCCGCTTCTCCCGCCTG GGCAAGATCGGCATGGGCCGCGACTGGCTGATCTCCGACTGCAACACCGGCGAGATCCTG ATCCGCGCCACCTCCGAGTACGCCATGATGAACCAGAAGACCCGCCGCCTGTCCAAGCTG CCCTACGAGGTGCACCAGGAGATCGCCCCCCTGTTCGTGGACTCCCCCCCCGTGATCGAG GACAACGACCTGAAGGTGCACAAGTTCGAGGTGAAGACCGGCGACTCCATCCAGAAGGG CCTGTCCCCCGGCTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGTGAAGTACAT CGGCTGGATCCTGGAGTCCATGCCCACCGAGGTGCTGGAGACCCAGGAGCTGTGCTCCCT GGCCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGTGACCGCCAT GGACCCCTCC AAGGTGGGCGGCCGCTCCC AGT ACC AGC ACCTGCTGCGCCTGGAGGACGG CACCGCCATCGTGAACGGCATCACCGAGTGGCGCCCCAAGAACGCCGGCGCCAACGGCGC CATCTCCACCGGCAAGACCTCCAACGGCAACTCCGTGTCCTGA
SEQ ID NO: 85
Cuphea hookeriana (Chook) FATB4 amino acid sequence
MVAAAATSAFFPVPAPGTSPNPRKFGSWPSSLSPSLPNSIPNGGFQVKANASAHPKANGSAVSL KSGSLNTQENTSSSPPPRTFLHQLPDWSRLLTAITTVFVKSKRPDMHDRKSKRPDMLVDLFGLE SSVQDALVFRQRFSIRSYEIGTDRTASMETLMNHLQETSLNHCKSTGILLDGFGRTLEMCKREL IWWIKMQIQVNRYPAWGDTVEINTRFSRLGKIGMGRDWLISDCNTGEILIRATSEYAMMNQK TRRLSKLPYEVRQEIAPLFVDSPPVIEDNDLKVHKFEVKTGDSIHKGLTPGWNDLDVNQHVNN VKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAMDPSGGGYGSQFQHLLRLEDG GEIVKGRTEWRPKNGVINGWPTGESSPGDYS
SEQ ID NO: 86
Cuphea hookeriana (Chook) FATB4 coding DNA sequence
ATGGTGGCTGCTGCAGCAACTTCTGCATTCTTCCCTGTTCCAGCCCCGGGAACCTCCCCTA ATCCCAGGAAATTCGGAAGTTGGCCATCGAGCTTGAGCCCTTCCTTGCCCAACTCAATCCC CAATGGCGGATTTCAGGTAAAGGCAAATGCCAGTGCCCATCCGAAGGCTAACGGTTCTGC AGTTAGTCTAAAGTCTGGCAGCCTCAACACTCAGGAGAACACTTCGTCGTCCCCTCCTCCT CGGACTTTCCTTCACCAGTTGCCTGATTGGAGTAGGCTTCTGACTGCAATCACGACCGTGT TCGTGAAATCTAAGAGGCCTGACATGCATGATCGGAAATCTAAGAGGCCTGACATGCTGG TGGACTTGTTTGGGTTGGAGAGTAGTGTTCAGGATGCGCTCGTGTTCAGACAGAGATTTTC GATTAGGTCTTATGAAATAGGCACTGATCGAACAGCCTCTATGGAGACGCTGATGAACCA CTTGCAGGAAACATCTCTCAATCATTGTAAAAGTACCGGTATTCTCCTTGACGGCTTCGGT CGTACTCTTGAGATGTGTAAAAGGGAACTCATTTGGGTGGTAATAAAAATGCAGATTCAG GTGAATCGCTATCCAGCATGGGGCGATACTGTCGAGATCAATACCCGGTTCTCCCGGTTGG GGAAAATTGGTATGGGTCGCGATTGGCTAATAAGTGATTGCAACACAGGAGAAATTCTTA TAAGAGCAACGAGCGAGTATGCCATGATGAATCAAAAGACGAGAAGACTCTCAAAACTT CCATACGAGGTTCGCCAGGAGATAGCGCCTCTTTTTGTCGACTCTCCTCCTGTGATTGAAG ACAATGATCTGAAAGTGCATAAATTTGAAGTGAAGACTGGTGATTCCATTCACAAGGGTC TAACTCCGGGGTGGAATGACTTGGATGTCAATCAGCACGTCAACAACGTGAAGTACATCG GGTGGATTCTTGAGAGTACTCCACCAGAAGTTCTGGAGACCCAGGAGTTATGTTCCCTTAC TCTGGAATACAGGCGGGAATGTGGAAGGGAGAGCGTGCTGGAGTCCCTCACTGCTATGGA TCCCTCTGGAGGGGGTTATGGGTCCCAGTTTCAGCACCTTCTGCGGCTTGAGGATGGAGGT GAGATCGTGAAGGGGAGAACCGAGTGGCGACCCAAGAATGGTGTAATCAATGGGGTGGT ACCAACCGGGGAGTCCTCACCTGGAGACTACTCTTAG
SEQ ID NO: 87
Cuphea hookeriana (Chook) FATB4 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCACCTCCGCCTTCTTCCCCGTGCCCGCCCCCGGCACCTCCCCCA ACCCCCGCAAGTTCGGCTCCTGGCCCTCCTCCCTGTCCCCCTCCCTGCCCAACTCCATCCCC AACGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTCCGCC GTGTCCCTGAAGTCCGGCTCCCTGAACACCCAGGAGAACACCTCCTCCTCCCCCCCCCCCC GCACCTTCCTGCACCAGCTGCCCGACTGGTCCCGCCTGCTGACCGCCATCACCACCGTGTT CGTGAAGTCCAAGCGCCCCGACATGCACGACCGCAAGTCCAAGCGCCCCGACATGCTGGT GGACCTGTTCGGCCTGGAGTCCTCCGTGCAGGACGCCCTGGTGTTCCGCCAGCGCTTCTCC ATCCGCTCCTACGAGATCGGCACCGACCGCACCGCCTCCATGGAGACCCTGATGAACCAC CTGCAGGAGACCTCCCTGAACCACTGCAAGTCCACCGGCATCCTGCTGGACGGCTTCGGC CGCACCCTGGAGATGTGCAAGCGCGAGCTGATCTGGGTGGTGATCAAGATGCAGATCCAG GTGAACCGCTACCCCGCCTGGGGCGACACCGTGGAGATCAACACCCGCTTCTCCCGCCTG GGCAAGATCGGCATGGGCCGCGACTGGCTGATCTCCGACTGCAACACCGGCGAGATCCTG ATCCGCGCCACCTCCGAGTACGCCATGATGAACCAGAAGACCCGCCGCCTGTCCAAGCTG CCCTACGAGGTGCGCCAGGAGATCGCCCCCCTGTTCGTGGACTCCCCCCCCGTGATCGAG GAC AACGACCTGAAGGTGC AC AAGTTCGAGGTGAAGACCGGCGACTCC ATCC AC AAGGG CCTGACCCCCGGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGAAGTACAT CGGCTGGATCCTGGAGTCCACCCCCCCCGAGGTGCTGGAGACCCAGGAGCTGTGCTCCCT GACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGACCGCCAT GGACCCCTCCGGCGGCGGCTACGGCTCCCAGTTCCAGCACCTGCTGCGCCTGGAGGACGG CGGCGAGATCGTGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGGCGTGATCAACGGCG TGGTGCCCACCGGCGAGTCCTCCCCCGGCGACTACTCCTGA
SEQ ID NO: 88
Cuphea avigera var. pulcherrima (Ca) FATB1 amino acid sequence
MVAAAASSAFFSVPVPGTSPKPGKFRIWPSSLSPSFKPKPIPNGGLQVKANSRAHPKANGSAVS LKSGSLNTQEDTSSSPPPRTFLHQLPDWSRLLTAITTVFVKSKRPDMHDRKSKRPDMLMDSFG LESIVQEGLEFRQSFSIRSYEIGTDRTASIETLMNYLQETSLNHCKSTGILLDGFGRTPEMCKRDL IWWTKMKIKVNRYPAWGDTVEINTWFSRLGKIGKGRDWLISDCNTGEILIRATSAYATMNQ KTRRLSKLPYEVHQEIAPLFVDSPPVIEDNDLKLHKFEVKTGDSIHKGLTPGWNDLDVNQHVS NVKYIG WILE SMPTEVLETQELC SLALE YRRECGRDS VLE S VTAMDPTKVGGRSQYQHLLRLE DGTDIVKCRTEWRPKNPGANGAISTGKTSNGNSVS
SEQ ID NO: 89
Cuphea avigera var. pulcherrima (Ca) FATB1 coding DNA sequence
ATGGTGGCTGCTGCAGCAAGTTCTGCATTCTTCTCTGTTCCAGTCCCGGGAACCTCTCCTA AACCCGGGAAGTTCAGAATTTGGCCATCGAGCTTGAGCCCTTCCTTCAAGCCCAAGCCGA TCCCCAATGGTGGATTGCAGGTTAAGGCAAATTCCAGGGCACATCCGAAGGCTAACGGTT CTGCAGTTAGTCTAAAGTCTGGCAGCCTCAACACTCAGGAGGACACTTCGTCGTCCCCTCC TCCTCGGACTTTCCTTCACCAGTTGCCTGATTGGAGTAGGCTTCTGACTGCAATCACGACC GTGTTCGTGAAATCTAAGAGGCCTGAC ATGCATGATCGGAAATCTAAGAGGCCTGACATG CTGATGGACTCGTTTGGGTTGGAGAGTATTGTTCAAGAAGGGCTCGAGTTCAGACAGAGT TTTTCGATTAGGTCTTATGAAATAGGCACTGATCGAACAGCCTCTATAGAGACGCTGATGA ACTACTTGCAGGAAACATCTCTCAATCATTGTAAGAGTACCGGTATTCTCCTTGACGGCTT TGGTCGTACTCCTGAGATGTGTAAAAGGGACCTCATTTGGGTGGTAACAAAAATGAAGAT CAAGGTGAATCGCTATCCAGCTTGGGGCGATACTGTCGAGATCAATACCTGGTTCTCCCGG TTGGGGAAAATCGGAAAGGGTCGCGATTGGCTAATAAGTGATTGCAACACAGGAGAAATT CTTATAAGAGCAACGAGCGCGTATGCCACGATGAATCAAAAGACGAGAAGACTCTCAAA ACTTCCATACGAGGTTCACCAGGAGATAGCGCCTCTCTTTGTCGACTCTCCTCCTGTCATT GAAGACAATGATCTGAAATTGCATAAGTTTGAAGTGAAGACTGGTGATTCCATTCACAAG GGTCTAACTCCGGGGTGGAATGACTTGGATGTCAATCAGCACGTAAGCAACGTGAAGTAC ATTGGGTGGATTCTCGAGAGTATGCCAACAGAAGTTTTGGAGACCCAGGAGCTATGCTCT CTCGCCCTTGAATATAGGCGGGAATGCGGAAGGGACAGTGTGCTAGAGTCCGTGACAGCT ATGGATCCCACAAAAGTTGGAGGCCGGTCTCAGTACCAGCACCTTCTGCGACTTGAGGAT GGGACTGATATCGTGAAGTGCAGAACTGAGTGGCGGCCGAAGAATCCAGGAGCTAATGG GGCAATATCAACGGGAAAGACTTCAAATGGAAACTCGGTCTCTTAG
SEQ ID NO: 90
Cuphea avigera var. pulcherrima (Ca) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCTCCGTGCCCGTGCCCGGCACCTCCCCCA AGCCCGGCAAGTTCCGCATCTGGCCCTCCTCCCTGTCCCCCTCCTTCAAGCCCAAGCCCAT CCCCAACGGCGGCCTGCAGGTGAAGGCCAACTCCCGCGCCCACCCCAAGGCCAACGGCTC CGCCGTGTCCCTGAAGTCCGGCTCCCTGAACACCCAGGAGGACACCTCCTCCTCCCCCCCC CCCCGCACCTTCCTGCACCAGCTGCCCGACTGGTCCCGCCTGCTGACCGCCATCACCACCG TGTTCGTGAAGTCCAAGCGCCCCGACATGCACGACCGCAAGTCCAAGCGCCCCGACATGC TGATGGACTCCTTCGGCCTGGAGTCCATCGTGCAGGAGGGCCTGGAGTTCCGCCAGTCCTT CTCCATCCGCTCCTACGAGATCGGCACCGACCGCACCGCCTCCATCGAGACCCTGATGAA CTACCTGCAGGAGACCTCCCTGAACCACTGCAAGTCCACCGGCATCCTGCTGGACGGCTTC GGCCGCACCCCCGAGATGTGCAAGCGCGACCTGATCTGGGTGGTGACCAAGATGAAGATC AAGGTGAACCGCTACCCCGCCTGGGGCGACACCGTGGAGATCAACACCTGGTTCTCCCGC CTGGGCAAGATCGGCAAGGGCCGCGACTGGCTGATCTCCGACTGCAACACCGGCGAGATC CTGATCCGCGCCACCTCCGCCTACGCCACCATGAACCAGAAGACCCGCCGCCTGTCCAAG CTGCCCTACGAGGTGCACCAGGAGATCGCCCCCCTGTTCGTGGACTCCCCCCCCGTGATCG AGGACAACGACCTGAAGCTGCACAAGTTCGAGGTGAAGACCGGCGACTCCATCCACAAG GGCCTGACCCCCGGCTGGAACGACCTGGACGTGAACC AGCACGTGTCC AACGTGAAGT AC ATCGGCTGGATCCTGGAGTCCATGCCCACCGAGGTGCTGGAGACCCAGGAGCTGTGCTCC CTGGCCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCCGTGCTGGAGTCCGTGACCGCC ATGGACCCCACCAAGGTGGGCGGCCGCTCCCAGTACCAGCACCTGCTGCGCCTGGAGGAC GGCACCGACATCGTGAAGTGCCGCACCGAGTGGCGCCCCAAGAACCCCGGCGCCAACGG CGCCATCTCCACCGGCAAGACCTCCAACGGCAACTCCGTGTCC
SEQ ID NO: 91
Cuphea paucipetala (Cpau) FATB1 amino acid sequence
MVAAAASSAFFPVPAPGTSPKPGKSGNWPSSLSPSIKPMSIPNGGFQVKANASAHPKANGSAV NLKSGSLNTQEDTS SSPPPRAFLNQLPDWSMLLTAITTVFVAAEKQWTMRDRKSKRPDMLVD SVGLKSWLDGLVSRQIFSIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGMC KNDLIWVLTKMQIMVNRYPTWGDTVEINTWFSHSGKIGMASDWLITDCNTGEILIRATSVWA MMNQKTRRFSRLPYEVRQELTPHYVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLDV NQHVSNVKYIGWILESMPIEVLETQELCSLTVEYRRECGMDSVLESVTAMDPSEDEGRSQYKH LLRLEDGTDIVKGRTEWRPKNAGTNGAISTAKPSNGNSVS
SEQ ID NO: 92
Cuphea paucipetala (Cpau) FATB1 coding DNA sequence
ATGGTGGCTGCTGCAGCAAGTTCTGCATTCTTCCCTGTTCCAGCCCCCGGAACCTCCCCTA AACCCGGGAAGTCCGGC AACTGGCCATC AAGCTTGAGCCCTTCC ATCAAGCCC ATGTC AA TCCCCAATGGCGGATTTCAGGTTAAGGCAAATGCCAGTGCCCATCCTAAGGCTAACGGTT CTGCAGTAAATCTAAAGTCTGGCAGCCTCAACACTCAGGAGGACACTTCGTCGTCCCCTCC TCCTCGGGCTTTCCTTAACCAGTTGCCTGATTGGAGTATGCTTCTGACTGCAATCACGACC GTCTTCGTGGCGGCAGAGAAGCAGTGGACTATGCGTGATCGGAAATCTAAGAGGCCTGAC ATGCTCGTGGACTCGGTTGGGTTGAAGAGTGTTGTTCTGGATGGGCTCGTGTCCAGACAGA TTTTTTCGATTAGGTCTTATGAAATAGGCGCTGATCGAACTGCCTCTATAGAGACGCTGAT GAACCACTTGCAGGAAACATCTATCAATCATTGTAAGAGTTTGGGTCTTCTCAATGACGGC TTTGGTCGTACTCCTGGGATGTGTAAAAATGACCTCATTTGGGTGCTTACAAAAATGCAGA TCATGGTGAATCGCTACCCAACTTGGGGCGATACTGTTGAGATCAATACCTGGTTCTCCCA TTCGGGGAAAATTGGTATGGCTAGCGATTGGCTAATAACTGATTGCAACACAGGAGAAAT TCTTATAAGAGCAACGAGCGTGTGGGCCATGATGAATCAAAAGACGAGAAGATTCTCAAG ACTTCCATACGAGGTTCGCCAGGAGTTAACGCCTCATTATGTGGACTCTCCTCATGTCATT GAAGATAATGATCGGAAATTGCATAAGTTTGATGTGAAGACTGGTGATTCCATTCGTAAG GGTCTAACTCCGAGGTGGAATGACTTGGATGTCAATCAGCACGTAAGCAACGTGAAGTAC ATTGGGTGGATTCTCGAGAGTATGCCAATAGAAGTTTTGGAGACCCAGGAGCTATGCTCT CTCACCGTTGAATATAGGCGGGAATGCGGAATGGACAGTGTGCTGGAGTCCGTGACTGCT ATGGATCCCTCAGAAGATGAAGGCCGGTCTCAGTACAAGCACCTTCTGCGGCTTGAGGAT GGGACTGACATCGTGAAGGGCAGAACTGAGTGGCGACCGAAGAATGCAGGAACTAACGG GGCGATATCAACAGCAAAGCCTTCAAATGGAAACTCGGTCTCTTAG
SEQ ID NO: 93
Cuphea paucipetala (Cpau) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCGCCCCCGGCACCTCCCCCA AGCCCGGCAAGTCCGGCAACTGGCCCTCCTCCCTGTCCCCCTCCATCAAGCCCATGTCCAT CCCCAACGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTC CGCCGTGAACCTGAAGTCCGGCTCCCTGAACACCCAGGAGGACACCTCCTCCTCCCCCCCC CCCCGCGCCTTCCTGAACCAGCTGCCCGACTGGTCCATGCTGCTGACCGCCATCACCACCG TGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCGCGACCGCAAGTCCAAGCGCCCCGACA TGCTGGTGGACTCCGTGGGCCTGAAGTCCGTGGTGCTGGACGGCCTGGTGTCCCGCCAGA TCTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCTGAT GAACCACCTGCAGGAGACCTCCATCAACCACTGCAAGTCCCTGGGCCTGCTGAACGACGG CTTCGGCCGCACCCCCGGCATGTGCAAGAACGACCTGATCTGGGTGCTGACCAAGATGCA GATCATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGATCAACACCTGGTTCTC CCACTCCGGCAAGATCGGCATGGCCTCCGACTGGCTGATCACCGACTGCAACACCGGCGA GATCCTGATCCGCGCCACCTCCGTGTGGGCCATGATGAACCAGAAGACCCGCCGCTTCTCC CGCCTGCCCTACGAGGTGCGCCAGGAGCTGACCCCCCACTACGTGGACTCCCCCCACGTG ATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCATCCGC AAGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGTGAAG TACATCGGCTGGATCCTGGAGTCCATGCCCATCGAGGTGCTGGAGACCCAGGAGCTGTGC TCCCTGACCGTGGAGTACCGCCGCGAGTGCGGCATGGACTCCGTGCTGGAGTCCGTGACC GCCATGGACCCCTCCGAGGACGAGGGCCGCTCCCAGTACAAGCACCTGCTGCGCCTGGAG GACGGCACCGACATCGTGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCACCAA CGGCGCCATCTCCACCGCCAAGCCCTCCAACGGCAACTCCGTGTCCTGA
SEQ ID NO: 94
Cuphea procumbens (Cproc) FATB1 amino acid sequence
MVAAAASSAFFPAPAPGSSPKPGKSGNWPSSLSPSFKSKSIPYGRFQVKANASAHPKANGSAV NLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVD SVGLKNIVRDGLVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGM CKNDLIWVLTKMQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVW AMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLD VNQHVSNVKYIGWILESMPIEVLEAQELCSLTVEYRRECGMDSVLESVTAVDPSEDGGRSQYN HLLRLEDGTDWKGRTEWRPKNAETNGAISPGNTSNGNSIS
SEQ ID NO: 95
Cuphea procumbens (Cproc) FATB1 coding DNA sequence
ATGGTGGCTGCTGCAGCAAGTTCTGCATTCTTCCCTGCTCCAGCCCCGGGATCCTCACCTA AACCCGGGAAGTCCGGTAATTGGCCATCGAGCTTGAGCCCTTCCTTCAAGTCCAAGTCAAT CCCCTATGGCCGATTTCAGGTTAAGGCAAATGCCAGTGCCCATCCTAAGGCTAACGGTTCT GCAGTAAATCTAAAGTCTGGCAGCCTCAACACTCAGGAGGACACTTCGTCGTCCCCTCCTC CTCGGGCTTTCCTTAACCAGTTGCCTGATTGGAGTATGCTTCTGTCTGCAATCACGACTGT ATTCGTGGCGGCAGAGAAGCAGTGGACTATGCTTGATCGGAAATCTAAGAGGCCTGACAT GCTTGTGGACTCGGTTGGGTTGAAGAATATTGTTCGGGATGGGCTCGTGTCCAGACAGAG TTTTTTGATTAGATCTTATGAAATAGGCGCTGATCGAACAGCTTCTATAGAGACACTGATG AACCACTTGCAGGAAACATCTATCAATCATTGTAAGAGTTTGGGTCTTCTCAATGACGGCT TTGGTCGTACTCCTGGGATGTGTAAAAACGACCTCATTTGGGTGCTTACTAAAATGCAGAT CATGGTGAATCGCTACCCAGCTTGGGGCGATACTGTTGAGATCAATACCTGGTTCTCCCAG TCGGGGAAAATCGGTATGGGTAGCGATTGGCTAATAAGTGATTGCAACACAGGAGAAATT CTTATAAGAGCAACGAGCGTGTGGGCCATGATGAATCAAAAAACGAGAAGATTCTCAAG ACTTCCATACGAGGTTCGCCAGGAGTTAACGCCTCATTTTGTGGACTCTCCTCATGTCATT GAAGACAATGATCGGAAATTGCATAAGTTCGATGTGAAGACTGGTGATTCTATTCGCAAG GGTCTAACTCCGAGGTGGAATGACTTGGATGTCAATCAGCACGTGAGCAACGTGAAGTAC ATTGGGTGGATTCTCGAGAGTATGCCAATAGAAGTTTTGGAGGCCCAGGAACTATGCTCT CTCACCGTTGAATATAGGCGGGAATGCGGAATGGACAGTGTGCTGGAGTCCGTGACTGCT GTAGATCCCTCAGAAGATGGAGGCCGGTCTCAGTACAATCACCTTCTGCGGCTTGAGGAT GGGACTGATGTCGTGAAGGGCAGAACTGAGTGGCGACCGAAGAATGCAGAAACTAACGG GGCGATATCACCAGGAAACACTTCAAATGGAAACTCGATCTCCTAG
SEQ ID NO: 96
Cuphea procumbens (Cproc) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGCCCCCGCCCCCGGCTCCTCCCCCA AGCCCGGCAAGTCCGGCAACTGGCCCTCCTCCCTGTCCCCCTCCTTCAAGTCCAAGTCCAT CCCCTACGGCCGCTTCCAGGTGAAGGCC AACGCCTCCGCCC ACCCC AAGGCC AACGGCTC CGCCGTGAACCTGAAGTCCGGCTCCCTGAACACCCAGGAGGACACCTCCTCCTCCCCCCCC CCCCGCGCCTTCCTGAACCAGCTGCCCGACTGGTCCATGCTGCTGTCCGCCATCACCACCG TGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCTGGACCGCAAGTCCAAGCGCCCCGACA TGCTGGTGGACTCCGTGGGCCTGAAGAACATCGTGCGCGACGGCCTGGTGTCCCGCCAGT CCTTCCTGATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCTGAT GAACCACCTGCAGGAGACCTCCATCAACCACTGCAAGTCCCTGGGCCTGCTGAACGACGG CTTCGGCCGCACCCCCGGCATGTGCAAGAACGACCTGATCTGGGTGCTGACCAAGATGCA GATCATGGTGAACCGCTACCCCGCCTGGGGCGACACCGTGGAGATCAACACCTGGTTCTC CCAGTCCGGCAAGATCGGCATGGGCTCCGACTGGCTGATCTCCGACTGCAACACCGGCGA GATCCTGATCCGCGCCACCTCCGTGTGGGCCATGATGAACCAGAAGACCCGCCGCTTCTCC CGCCTGCCCTACGAGGTGCGCCAGGAGCTGACCCCCCACTTCGTGGACTCCCCCCACGTG ATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCATCCGC AAGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGTGAAG TACATCGGCTGGATCCTGGAGTCCATGCCCATCGAGGTGCTGGAGGCCCAGGAGCTGTGC TCCCTGACCGTGGAGTACCGCCGCGAGTGCGGCATGGACTCCGTGCTGGAGTCCGTGACC GCCGTGGACCCCTCCGAGGACGGCGGCCGCTCCCAGTACAACCACCTGCTGCGCCTGGAG GACGGCACCGACGTGGTGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGAGACCAA CGGCGCCATCTCCCCCGGCAACACCTCCAACGGCAACTCCATCTCCTGA
SEQ ID NO: 97
Cuphea procumbens (Cproc) FATB2 amino acid sequence
MVAAAASSAFFPAPAPGSSPKPGKSGNWPSSLSPSFKSKSIPYGRFQVKANASAHPKANGSAV NLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVD SVGLKNIVRDGLVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGM CKNDLIWVLTKMQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVW AMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLD
VNQHV VKYIGWILESTPPEVLETQELCSLTLEYRQECGRESVLESLTAVDPSGKGFGSQFQH
LLRLEDGGEIVKGRTEWRPKTAGINGAIASGETSPGDF
SEQ ID NO: 98
Cuphea procumbens (Cproc) FATB2 coding DNA sequence
ATGGTGGCTGCTGCAGCAAGTTCTGCATTCTTCCCTGCTCCAGCCCCGGGATCCTCACCTA
AACCCGGGAAGTCCGGTAATTGGCCATCGAGCTTGAGCCCTTCCTTCAAGTCCAAGTCAAT
CCCCTATGGCCGATTTCAGGTTAAGGCAAATGCCAGTGCCCATCCTAAGGCTAACGGTTCT GCAGTAAATCTAAAGTCTGGCAGCCTCAACACTCAGGAGGACACTTCGTCGTCCCCTCCTC CTCGGGCTTTCCTTAACCAGTTGCCTGATTGGAGTATGCTTCTGTCTGCAATCACGACTGT ATTCGTGGCGGCAGAGAAGCAGTGGACTATGCTTGATCGGAAATCTAAGAGGCCTGACAT GCTTGTGGACTCGGTTGGGTTGAAGAATATTGTTCGGGATGGGCTCGTGTCCAGACAGAG TTTTTTGATTAGATCTTATGAAATAGGCGCTGATCGAACAGCTTCTATAGAGACACTGATG AACCACTTGCAGGAAACATCTATCAATCATTGTAAGAGTTTGGGTCTTCTCAATGACGGCT TTGGTCGTACTCCTGGGATGTGTAAAAACGACCTCATTTGGGTGCTTACTAAAATGCAGAT CATGGTGAATCGCTACCCAGCTTGGGGCGATACTGTTGAGATCAATACCTGGTTCTCCCAG TCGGGGAAAATCGGTATGGGTAGCGATTGGCTAATAAGTGATTGCAACACAGGAGAAATT CTTATAAGAGCAACGAGCGTGTGGGCCATGATGAATCAAAAAACGAGAAGATTCTCAAG ACTTCCATACGAGGTTCGCCAGGAGTTAACGCCTCATTTTGTGGACTCTCCTCATGTCATT GAAGACAATGATCGGAAATTGCATAAGTTCGATGTGAAGACTGGTGATTCTATTCGCAAG GGTCTAACTCCGAGGTGGAATGACTTGGATGTCAATCAGCACGTCAACAACGTGAAGTAC ATCGGGTGGATTCTTGAGAGTACTCCACCAGAAGTTCTGGAGACCCAGGAGTTATGTTCCC TTACCCTGGAATACAGGCAGGAATGCGGAAGGGAGAGCGTGCTGGAGTCCCTCACTGCTG TGGACCCCTCTGGAAAGGGCTTTGGGTCCCAGTTCCAACACCTTCTGAGGCTTGAGGATGG AGGTGAGATCGTGAAGGGGAGAACTGAGTGGCGACCCAAGACTGCAGGTATCAATGGGG CGATAGCATCCGGGGAGACCTCACCTGGAGACTTTTAG
SEQ ID NO: 99
Cuphea procumbens (Cproc) FATB2 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGCCCCCGCCCCCGGCTCCTCCCCCA AGCCCGGCAAGTCCGGCAACTGGCCCTCCTCCCTGTCCCCCTCCTTCAAGTCCAAGTCCAT CCCCTACGGCCGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTC CGCCGTGAACCTGAAGTCCGGCTCCCTGAAC ACCCAGGAGGAC ACCTCCTCCTCCCCCCCC CCCCGCGCCTTCCTGAACCAGCTGCCCGACTGGTCCATGCTGCTGTCCGCCATCACCACCG TGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCTGGACCGCAAGTCCAAGCGCCCCGACA TGCTGGTGGACTCCGTGGGCCTGAAGAACATCGTGCGCGACGGCCTGGTGTCCCGCCAGT CCTTCCTGATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCTGAT GAACCACCTGCAGGAGACCTCCATCAACCACTGCAAGTCCCTGGGCCTGCTGAACGACGG CTTCGGCCGCACCCCCGGCATGTGCAAGAACGACCTGATCTGGGTGCTGACCAAGATGCA GATCATGGTGAACCGCTACCCCGCCTGGGGCGACACCGTGGAGATCAACACCTGGTTCTC CCAGTCCGGCAAGATCGGCATGGGCTCCGACTGGCTGATCTCCGACTGCAACACCGGCGA GATCCTGATCCGCGCCACCTCCGTGTGGGCCATGATGAACCAGAAGACCCGCCGCTTCTCC CGCCTGCCCTACGAGGTGCGCCAGGAGCTGACCCCCCACTTCGTGGACTCCCCCCACGTG ATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCATCCGC AAGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGAA GTACATCGGCTGGATCCTGGAGTCCACCCCCCCCGAGGTGCTGGAGACCCAGGAGCTGTG CTCCCTGACCCTGGAGTACCGCCAGGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGAC CGCCGTGGACCCCTCCGGC AAGGGCTTCGGCTCCC AGTTCC AGC ACCTGCTGCGCCTGGA GGACGGCGGCGAGATCGTGAAGGGCCGCACCGAGTGGCGCCCCAAGACCGCCGGCATCA ACGGCGCCATCGCCTCCGGCGAGACCTCCCCCGGCGACTTCTGA
SEQ ID NO: 100
Cuphea procumbens (Cproc) FATB3 amino acid sequence
MVAAAASSAFFPAPAPGSSPKPGKSGNWPSSLSPSFKSKSIPYGRFQVKANASAHPKANGSAV NLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVD SVGLKNIVRDGLVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGM CKNDLIWVLTKMQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVW AMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLD VNQHV VKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGEGGYGSQFQ HLLRLEDGGEIVKGRTEWRPKNAGINGVLPTGE*
SEQ ID NO: 101
Cuphea procumbens (Cproc) FATB3 coding DNA sequence
ATGGTGGCTGCTGCAGCAAGTTCTGCATTCTTCCCTGCTCCAGCCCCGGGATCCTCACCTA AACCCGGGAAGTCCGGTAATTGGCCATCGAGCTTGAGCCCTTCCTTCAAGTCCAAGTCAAT CCCCTATGGCCGATTTCAGGTTAAGGCAAATGCCAGTGCCCATCCTAAGGCTAACGGTTCT GCAGTAAATCTAAAGTCTGGCAGCCTCAACACTCAGGAGGACACTTCGTCGTCCCCTCCTC CTCGGGCTTTCCTTAACCAGTTGCCTGATTGGAGTATGCTTCTGTCTGCAATCACGACTGT ATTCGTGGCGGCAGAGAAGCAGTGGACTATGCTTGATCGGAAATCTAAGAGGCCTGACAT GCTTGTGGACTCGGTTGGGTTGAAGAATATTGTTCGGGATGGGCTCGTGTCCAGACAGAG TTTTTTGATTAGATCTTATGAAATAGGCGCTGATCGAACAGCTTCTATAGAGACACTGATG AACCACTTGCAGGAAACATCTATCAATCATTGTAAGAGTTTGGGTCTTCTCAATGACGGCT TTGGTCGTACTCCTGGGATGTGTAAAAACGACCTCATTTGGGTGCTTACTAAAATGCAGAT CATGGTGAATCGCTACCCAGCTTGGGGCGATACTGTTGAGATCAATACCTGGTTCTCCCAG TCGGGGAAAATCGGTATGGGTAGCGATTGGCTAATAAGTGATTGCAACACAGGAGAAATT CTTATAAGAGCAACGAGCGTGTGGGCCATGATGAATCAAAAAACGAGAAGATTCTCAAG ACTTCCATACGAGGTTCGCCAGGAGTTAACGCCTCATTTTGTGGACTCTCCTCATGTCATT GAAGACAATGATCGGAAATTGCATAAGTTCGATGTGAAGACTGGTGATTCTATTCGCAAG GGTCTAACTCCGAGGTGGAATGACTTGGATGTCAATCAGCACGTCAACAACGTGAAGTAC ATCGGGTGGATTCTTGAGAGTACTCCACCAGAAGTTCTGGAGACCCAGGAGTTATGTTCCC TTACCCTGGAATACAGGCGGGAATGTGGAAGGGAGAGCGTGCTGGAGTCCCTCACTGCTG TGGACCCCTCTGGAGAGGGGGGCTATGGATCCCAGTTTCAGCACCTTCTGCGGCTTGAGG ATGGAGGTGAGATCGTGAAGGGGAGAACTGAGTGGCGACCCAAGAATGCTGGAATCAAT GGGGTGTTACCAACCGGGGAGTAG
SEQ ID NO: 102
Cuphea procumbens (Cproc) FATB3 coding DNA sequence codon optimized for Prototheca moriformis
ATGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGCCCCCGCCCCCGGCTCCTCCCCCA AGCCCGGCAAGTCCGGCAACTGGCCCTCCTCCCTGTCCCCCTCCTTCAAGTCCAAGTCCAT CCCCTACGGCCGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTC CGCCGTGAACCTGAAGTCCGGCTCCCTGAACACCCAGGAGGACACCTCCTCCTCCCCCCCC CCCCGCGCCTTCCTGAACCAGCTGCCCGACTGGTCCATGCTGCTGTCCGCCATCACCACCG TGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCTGGACCGCAAGTCCAAGCGCCCCGACA TGCTGGTGGACTCCGTGGGCCTGAAGAACATCGTGCGCGACGGCCTGGTGTCCCGCCAGT CCTTCCTGATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCTGAT GAACCACCTGCAGGAGACCTCCATCAACCACTGCAAGTCCCTGGGCCTGCTGAACGACGG CTTCGGCCGCACCCCCGGCATGTGCAAGAACGACCTGATCTGGGTGCTGACCAAGATGCA GATCATGGTGAACCGCTACCCCGCCTGGGGCGACACCGTGGAGATCAACACCTGGTTCTC CCAGTCCGGCAAGATCGGCATGGGCTCCGACTGGCTGATCTCCGACTGCAACACCGGCGA GATCCTGATCCGCGCCACCTCCGTGTGGGCCATGATGAACCAGAAGACCCGCCGCTTCTCC CGCCTGCCCTACGAGGTGCGCCAGGAGCTGACCCCCCACTTCGTGGACTCCCCCCACGTG ATCGAGGAC AACGACCGC AAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCC ATCCGC AAGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGAA GTACATCGGCTGGATCCTGGAGTCCACCCCCCCCGAGGTGCTGGAGACCCAGGAGCTGTG CTCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGAC CGCCGTGGACCCCTCCGGCGAGGGCGGCTACGGCTCCCAGTTCCAGCACCTGCTGCGCCT GGAGGACGGCGGCGAGATCGTGAAGGGCCGCACCGAGTGGCGCCCC AAGAACGCCGGC A TCAACGGCGTGCTGCCCACCGGCGAGTGA
SEQ ID NO: 103
Cuphea ignea (Cignea) FATB1 amino acid sequence
PGTSRKTGKFGNWPSSLSPSFKPKSIPNGGFQVKANARAHPKANGSAVSLKSVSLNTQEDTSLS PPPRAFLNQLPDWRMLRTALTTVFVAAEKQWTMLDRKSKRPDMLVDSFGLESIVQEGLVFRQ SFSIRSYEIGIDRTASIETLMNHLQETSLNQCKSAGILHDGFGRTLEMCKRDLIWWTKMQIKV NRYPAWGDTVEISTRFSRLGKIGMGRDWLICDCNTGEILIRATSAYAMMNQKTRRLSKLPNEV RQEIAPLFVDSDPVIEENDMKLHKFEVKTGDSICKGLTPRWSDLDVNQHVS VKYIGWILESM PTEVLETQELCSLALEYRRECGRDSVLESVTSMDPSKVGGWSQYQHLLRLEDGADIVKGRTE WRPKNAGANGAISTGKT
SEQ ID NO: 104
Cuphea ignea (Cignea) FATB1 coding DNA sequence
CCGGGAACCTCACGTAAAACCGGGAAGTTCGGCAATTGGCCATCAAGCTTGAGCCCTTCC TTCAAGCCCAAGTCAATCCCCAATGGCGGATTTCAGGTTAAGGCTAATGCCAGAGCCCAT CCTAAGGCTAACGGTTCTGCAGTAAGTCTAAAGTCTGTCAGCCTCAACACTCAGGAGGAC ACTTCGTTGTCCCCTCCTCCTCGTGCTTTCCTTAACCAGTTGCCTGATTGGAGGATGCTTCG GACTGCACTCACGACCGTCTTTGTGGCGGCAGAGAAGCAGTGGACTATGCTTGATCGGAA ATCTAAGAGGCCTGACATGCTCGTGGACTCGTTTGGGTTGGAGAGTATTGTTCAAGAAGG GCTCGTGTTCAGACAGAGCTTTTCGATTAGGTCTTATGAAATAGGCATTGATCGAACAGCC TCTATAGAGACGCTGATGAACCACTTGCAGGAAACATCTCTCAATCAATGTAAGAGTGCT GGTATTCTCCATGACGGCTTCGGTCGTACTCTTGAGATGTGTAAAAGGGACCTCATTTGGG TTGTTACGAAAATGCAGATCAAGGTGAATCGCTATCCAGCTTGGGGCGATACTGTCGAGA TCAGTACCCGGTTCTCCCGGTTGGGGAAAATCGGTATGGGTCGCGATTGGCTAATATGTGA TTGCAACACAGGAGAAATTCTTATAAGAGCAACGAGCGCGTATGCCATGATGAATCAAAA GACGAGAAGACTCTCAAAACTTCCAAACGAGGTTCGCCAGGAGATAGCGCCTCTTTTTGT GGACTCTGATCCTGTCATTGAAGAAAATGATATGAAATTGCATAAGTTTGAAGTGAAGAC TGGTGATTCCATTTGCAAGGGTCTAACTCCGAGGTGGAGTGACTTGGATGTCAATCAGCAC GTAAGCAACGTGAAGTACATAGGGTGGATTCTCGAGAGTATGCCAACAGAAGTTTTGGAG ACACAGGAGCTATGCTCTCTCGCCCTTGAATATAGGCGGGAATGCGGAAGGGACAGTGTG CTGGAGTCTGTGACCTCT ATGGATCCCTC AAAAGTTGGAGGCTGGTCTC AGT ACC AGC ACC TTCTGCGACTTGAGGATGGGGCGGATATCGTGAAGGGCAGAACTGAGTGGCGGCCGAAG AATGCAGGAGCTAACGGGGCGATATCAACAGGAAAGACTTGA
SEQ ID NO: 105
Cuphea ignea (Cignea) FATB1 coding DNA sequence codon optimized for Prototheca moriformis
CCCGGCACCTCCCGCAAGACCGGCAAGTTCGGCAACTGGCCCTCCTCCCTGTCCCCCTCCT TCAAGCCCAAGTCCATCCCCAACGGCGGCTTCCAGGTGAAGGCCAACGCCCGCGCCCACC CCAAGGCCAACGGCTCCGCCGTGTCCCTGAAGTCCGTGTCCCTGAACACCCAGGAGGACA CCTCCCTGTCCCCCCCCCCCCGCGCCTTCCTGAACCAGCTGCCCGACTGGCGCATGCTGCG CACCGCCCTGACCACCGTGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCTGGACCGCAA GTCCAAGCGCCCCGACATGCTGGTGGACTCCTTCGGCCTGGAGTCCATCGTGCAGGAGGG CCTGGTGTTCCGCCAGTCCTTCTCCATCCGCTCCTACGAGATCGGCATCGACCGCACCGCC TCCATCGAGACCCTGATGAACCACCTGCAGGAGACCTCCCTGAACCAGTGCAAGTCCGCC GGCATCCTGCACGACGGCTTCGGCCGCACCCTGGAGATGTGCAAGCGCGACCTGATCTGG GTGGTGACCAAGATGCAGATCAAGGTGAACCGCTACCCCGCCTGGGGCGACACCGTGGAG ATCTCCACCCGCTTCTCCCGCCTGGGCAAGATCGGCATGGGCCGCGACTGGCTGATCTGCG ACTGCAACACCGGCGAGATCCTGATCCGCGCCACCTCCGCCTACGCCATGATGAACCAGA AGACCCGCCGCCTGTCCAAGCTGCCCAACGAGGTGCGCCAGGAGATCGCCCCCCTGTTCG TGGACTCCGACCCCGTGATCGAGGAGAACGACATGAAGCTGCACAAGTTCGAGGTGAAG ACCGGCGACTCC ATCTGC AAGGGCCTGACCCCCCGCTGGTCCGACCTGGACGTGAACC AG CACGTGTCCAACGTGAAGTACATCGGCTGGATCCTGGAGTCCATGCCCACCGAGGTGCTG GAGACCCAGGAGCTGTGCTCCCTGGCCCTGGAGTACCGCCGCGAGTGCGGCCGCGACTCC GTGCTGGAGTCCGTGACCTCCATGGACCCCTCCAAGGTGGGCGGCTGGTCCCAGTACCAG CACCTGCTGCGCCTGGAGGACGGCGCCGACATCGTGAAGGGCCGCACCGAGTGGCGCCCC AAGAACGCCGGCGCC AACGGCGCC ATCTCC ACCGGC AAGACCTGA
SEQ ID NO: 106
JcFatB1 consensus amino acid sequence
MVAAAASSAFFPVPAPGTSPKPGKSGNWPSSLSPSFKPKSIPNGGFQVKANASAHPKANGSAV NLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLTAITTVFVAAEKQWTMLDRKSKRPDMLVD SVGLKRIVQDGLVSRQSFSIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGM CKNDLIWVLTKMQIMVNRYPTWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVW AMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLD VNQHVS VKYIGWILESMPIEVLETQELCSLTVEYRRECGMDSVLESVTAMDPSENGGRSQYK HLLRLEDGTDIVKGRTEWRPKNAGTNGAISTGKTSNGNSVS*
SEQ ID NO: 107
JcFatB1 consensus DNA sequence codon optimized for Prototheca
ATGGTGGCCGCCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCGCCCCCGGCACCTCCCCCA AGCCCGGCAAGTCCGGCAACTGGCCCTCCTCCCTGTCCCCCTCCTTCAAGCCCAAGTCCAT CCCCAACGGCGGCTTCCAGGTGAAGGCCAACGCCTCCGCCCACCCCAAGGCCAACGGCTC CGCCGTGAACCTGAAGTCCGGCTCCCTGAACACCCAGGAGGACACCTCCTCCTCCCCCCCC CCCCGCGCCTTCCTGAACCAGCTGCCCGACTGGTCCATGCTGCTGACCGCCATCACCACCG TGTTCGTGGCCGCCGAGAAGCAGTGGACCATGCTGGACCGCAAGTCCAAGCGCCCCGACA TGCTGGTGGACTCCGTGGGCCTGAAGCGCATCGTGCAGGACGGCCTGGTGTCCCGCCAGT CCTTCTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCCTGAT GAACCACCTGCAGGAGACCTCCATCAACCACTGCAAGTCCCTGGGCCTGCTGAACGACGG CTTCGGCCGCACCCCCGGCATGTGCAAGAACGACCTGATCTGGGTGCTGACCAAGATGCA GATCATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGATCAACACCTGGTTCTC CCAGTCCGGCAAGATCGGCATGGGCTCCGACTGGCTGATCTCCGACTGCAACACCGGCGA GATCCTGATCCGCGCCACCTCCGTGTGGGCCATGATGAACCAGAAGACCCGCCGCTTCTCC CGCCTGCCCTACGAGGTGCGCCAGGAGCTGACCCCCCACTTCGTGGACTCCCCCCACGTG ATCGAGGACAACGACCGCAAGCTGCACAAGTTCGACGTGAAGACCGGCGACTCCATCCGC AAGGGCCTGACCCCCCGCTGGAACGACCTGGACGTGAACCAGCACGTGTCCAACGTGAAG TACATCGGCTGGATCCTGGAGTCCATGCCCATCGAGGTGCTGGAGACCCAGGAGCTGTGC TCCCTGACCGTGGAGTACCGCCGCGAGTGCGGCATGGACTCCGTGCTGGAGTCCGTGACC GCCATGGACCCCTCCGAGAACGGCGGCCGCTCCCAGTACAAGCACCTGCTGCGCCTGGAG GACGGCACCGACATCGTGAAGGGCCGCACCGAGTGGCGCCCCAAGAACGCCGGCACCAA CGGCGCCATCTCCACCGGCAAGACCTCCAACGGCAACTCCGTGTCCTGA
SEQ ID NO: 108
JcFatB2 consensus amino acid sequence
MVATAASSAFFPVPSPDTSSRPGKLGNGSSSLSPLKPKSVANGGLQVKANASAPPKINGSSVGL KSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPDMLVDPF GLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNDGFGRTPEMY KRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASSV WVMMNQKTRRLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLDEKTADSIRKGLTPKWNDLD VNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKGYGSQFQ HLLRLEDGGEIVKGRTEWRPKTAGINGAIASGETSPGDSS*
SEQ ID NO: 109
JcFatB2 consensus DNA sequence codon optimized for Prototheca
ATGGTGGCCACCGCCGCCTCCTCCGCCTTCTTCCCCGTGCCCTCCCCCGACACCTCCTCCCG CCCCGGCAAGCTGGGCAACGGCTCCTCCTCCCTGTCCCCCCTGAAGCCCAAGTCCGTGGCC AACGGCGGCCTGCAGGTGAAGGCCAACGCCTCCGCCCCCCCCAAGATCAACGGCTCCTCC GTGGGCCTGAAGTCCGGCTCCCTGAAGACCCAGGAGGACACCCCCTCCGCCCCCCCCCCC CGCACCTTCATCAACCAGCTGCCCGACTGGTCCATGCTGCTGGCCGCCATCACCACCGTGT TCCTGGCCGCCGAGAAGCAGTGGATGATGCTGGACTGGAAGCCCAAGCGCCCCGACATGC TGGTGGACCCCTTCGGCCTGGGCCGCATCGTGCAGGACGGCCTGGTGTTCCGCCAGAACTT CTCCATCCGCTCCTACGAGATCGGCGCCGACCGCACCGCCTCCATCGAGACCGTGATGAA CCACCTGCAGGAGACCGCCCTGAACCACGTGAAGTCCGCCGGCCTGCTGAACGACGGCTT CGGCCGCACCCCCGAGATGTACAAGCGCGACCTGATCTGGGTGGTGGCCAAGATGCAGGT GATGGTGAACCGCTACCCCACCTGGGGCGACACCGTGGAGGTGAACACCTGGGTGGCCAA GTCCGGCAAGAACGGCATGCGCCGCGACTGGCTGATCTCCGACTGCAACACCGGCGAGAT CCTGACCCGCGCCTCCTCCGTGTGGGTGATGATGAACCAGAAGACCCGCCGCCTGTCCAA GATCCCCGACGAGGTGCGCCACGAGATCGAGCCCCACTTCGTGGACTCCGCCCCCGTGAT CGAGGACGACGACCGCAAGCTGCCCAAGCTGGACGAGAAGACCGCCGACTCCATCCGCA AGGGCCTGACCCCCAAGTGGAACGACCTGGACGTGAACCAGCACGTGAACAACGTGAAG TACATCGGCTGGATCCTGGAGTCCACCCCCCCCGAGGTGCTGGAGACCCAGGAGCTGTGC TCCCTGACCCTGGAGTACCGCCGCGAGTGCGGCCGCGAGTCCGTGCTGGAGTCCCTGACC GCCGTGGACCCCTCCGGCAAGGGCTACGGCTCCCAGTTCCAGCACCTGCTGCGCCTGGAG GACGGCGGCGAGATCGTGAAGGGCCGCACCGAGTGGCGCCCCAAGACCGCCGGCATCAA CGGCGCCATCGCCTCCGGCGAGACCTCCCCCGGCGACTCCTCCTGA
SEQ ID NO: 110
CuPSR23 FATB3 amino acid sequence
MWAAATSAFFPVPAPGTSPKPGKSGNWPSSLSPTFKPKSIPNAGFQVKANASAHPKA NGSAVNLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLTAITTVFVAAEKQWTMLDRKSKRP DMLVDSVGLKCIVRDGLVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFG RTPGMCKNDLIWVLTKMQIMVNRYPTWGDTVEINTWFSQSGKIGMASDWLISDCNTGEILIR ATSVWAMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDQKLHKFDVKTGDSIRKGLTPR WNDLDVNQHVS VKYIGWILESMPIEVLETQELCSLTVEYRRECGMDSVLESVTAVDPSENG GRSQYKHLLRLEDGTDIVKSRTEWRPKNAGTNGAISTSTAKTSNGNSVS
SEQ ID NO: 111
CuPSR23 FATB3b amino acid sequence
MWAAATSAFFPVPAPGTSPKPGKSGNWPSSLSPTFKPKSIPNAGFQVKANASAHPKA NGSAVNLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLTAITTVFVAAEKQWTMLDRKSKRP DMLVDSVGLKSIVRDGLVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFG RTPGMCKNDLIWVLTKMQIMVNRYPTWGDTVEINTWFSQSGKIGMASDWLISDCNTGEILIR ATSVWAMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDQKLHKFDVKTGDSIRKGLTPR WNDLDVNQHVSNVKYIGWILESMPIEVLETQELCSLTVEYRRECGMDSVLESVTAVDPSENG GRSQYKHLLRLEDGTDIVKSRTEWRPKNAGTNGAISTSTAKTSNGNSAS
SEQ ID NO: 112
CwFATB3 amino acid sequence
MWAAAASSAFFPVPAPRTTPKPGKFGNWPSSLSPPFKPKSNPNGRFQVKANVSPHPK ANGSAVSLKSGSLNTLEDPPSSPPPRTFLNQLPDWSRLRTAITTVFVAAEKQFTRLDRKSKRPD MLVD WFGSETIVQDGLVFRERF SIRS YEIGADRT ASIETLMNHLQDT SLNHCKS VGLLNDGFG RTSEMCTRDLIWVLTKMQIWNRYPTWGDTVEINSWFSQSGKIGMGRDWLISDCNTGEILVR AT S AWAMMNQKTRRF SKLPCEVRQEIAPHF VDAPPVIEDNDRKLHKFDVKTGD SICKGLTPG WNDLDVNQHVSNVKYIGWILESMPTEVLETQELCSLTLEYRRECGRESWESVTSMNPSKVG DRSQYQHLLRLEDGADIMKGRTEWRPKNAGTNRAIST
SEQ ID NO: 113
CwFATB3a amino acid sequence
MWAAAASSAFFPVPAPRTTPKPGKFGNWPSSLSPPFKPKSNPNGRFQVKANVSPHPK ANGSAVSLKSGSLNTLEDPPSSPPPRTFLNQLPDWSRLRTAITTVFVAAEKQFTRLDRKSKRPD MLVD WFGSETIVQDGLVFRERF SIRS YEIGADRT ASIETLMNHLQDT SLNHCKS VGLLNDGFG RTSEMCTRDLIWVLTKMQIWNRYPTWGDTVEINSWFSQSGKIGMGRDWLISDCNTGEILVR AT S AWAMMNQKTRRF SKLPCEVRQEIAPHF VDAPPVIEDNDRKLHKFDVKTGD SICKGLTPG WNDLDVNQHVSNVKYIGWILESMPTEVLETQELCSLTLEYRRECGRESWESVTSMNPSKVG DRSQYQHLLRLEDGADIMKGRTEWRPKNAGTNRAIST
SEQ ID NO: 114
CwFATB3b amino acid sequence
MWAAAASSAFFPVPAPRTTPKPGKFGNWPSSLSPPFKPKSNPNGRFQVKANVSPHPK ANGSAVSLKSGSLNTLEDLPSSPPPRTFLNQLPDWSRLRTAITTVFVAAEKQFTRLDRKSKRPD MLVD WFGSETIVQDGLVFRERF SIRS YEIGADRT ASIETLMNHLQDT SLNHCKS VGLLNDGFG RTSEMCTRDLIWVLTKMQIWNRYPTWGDTVEINSWFSQSGKIGMGRDWLISDCNTGEILVR AT S AWAMMNQKTRRF SKLPCEVRQEIAPHF VDAPPVIEDNDRKLHKFDVKTGD SICKGLTPG WNDLDVNQHVS VKYIGWILEKFWRPRSYALSPLNIGG VEGKVW
SEQ ID NO: 115
CwFATB3c amino acid sequence
MWAAAASSAFFPVPAPRTTPKPGKFGNWPSSLSPPFKPKSNPNGRFQVKANVSPHPK ANGSAVSLKSGSLNTLEDLPSSPPPRTFLNQLPDWSRLRTAITTVFVATEKQFTRLDRKSKRPD MLVD WFGSETIVQDGLVFRERF SIRS YEIGADRT ASIETLMNHLQDT SLNHCKS VGLLNDGFG RTSEMCTRDLIWVLTKMQIWNRYPTWGDTVEINSWFSQSGKIGMGRDWLISDCNTGEILVR AT S AWAMMNQKTRRF SKLPCEVRQEIAPHF VDAPPVIEDNDRKLHKFDVKTGD SICKGLTPG WNDLDVNQHVSNVKYIGWILEKFWRPRSYALSPLNIGGNVEGKVW
SEQ ID NO: 116
CwFATB4a amino acid sequence
MVATAASSAFFPVPSADTSSSRPGKLGSGPSSLSPLKPKSIPNGGLQVKANASAPPKIN GSSVGLKSGGFKTQEDSPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGSIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKIAGLSNDGFGR TPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILT RASSVWVMMNQKTRRLSKIPDEVRNEIEPHFVDSAPWEDDDRKLPKLDENTADSIRKGLTPR WNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSAEGY ASRFQHLLRLEDGGEIVKARTEWRPKNAGINGWPSEESSPGDFF
SEQ ID NO: 117
CwFATB4a.1 amino acid sequence
MVAT AAS S AFFPVPS ADT S S SRPGKLGSGP S SLSPLKPKSIPNGGLQ VKANAS APPKIN GSSVGLKSGGFKTQEDSPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGSIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKIAGLSNDGFGR TPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILT RASSVWVMMNQKTRRLSKIPDEVRNEIEPHFVDSAPWEDDDRKLPKLDENTADSIRKGLTPR WNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSAEGY ASRFQHLLRLEDGGEIVKARTEWRPKNAGINWWPSEESSPGDFF
SEQ ID NO: 118
CwFATB4a.2 amino acid sequence
MVATAASSAFFPVPSADTSSSRPGKLGNGPSSLSPLKPKSIPNGGLQVKANASAPPKIN GSSVGLKSGSFKTQEDAPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGSIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKIAGLSNDGFGR TPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILT RASSVWVMMNQKTRRLSKIPDEVRNEIEPHFVDSAPWEDDDRKLPKLDENTADSIRKGLTPR WNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSAEGY ASRFQHLLRLEDGGEIVKARTEWRPKNAGINGWPSEESSPGDFF
SEQ ID NO: 119
CwFATB4a.3 amino acid sequence
MVATAASSAFFPVPSADTSSSRPGKLGSGPSSLSPLKPKSIPNGGLQVKANASAPPKIN GSSVGLKSGGFKTQEDSPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGSIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKIAGLSNDGFGR TPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILT RASSVWVMMNQKTRRLSKIPDEVRNEIEPHFVDSAPWEDDDRKLPKLDENTADSIRKGLTPR WNDLDVNQHV VKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSAEGY VSRFQHLLRLEDGGEIVKARTEWRPKNAGINGWPSEESSPGDFF
SEQ ID NO: 120
CwFATB4b amino acid sequence
MVATAASSAFFPVPSADTSSSRPGKLGNGPSSLSPLKPKSIPNGGLQVKANASAPPKIN GSSVGLKSGSFKTQEDAPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGSIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKIAGLSSDGFGR TPAMSKRDLIWWAKMQVMVNRYPAWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILT RASSVWVMMNQKTRRLSKIPDEVRNEIEPHFVDSAPWEDDDRKLPKLDENTADSIRKGLTPR WNDLDVNQHVNNVKYIGWILESTPAEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGEGD GSKFQHLLRLEDGGEIVKARTEWRPKNAGINGWPSEESSPGGDFF
SEQ ID NO: 121
CwFATB4b.1 amino acid sequence
MVATAASSAFFPVPSADTSSSRPGKLGSGPSSLSPLKPKSIPNGGLQVKANASAPPKIN GSSVGLKSGSFKTQEDAPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGSIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKIAGLSSDGFGR TPAMSKRDLIWWAKMQVMVNRYPAWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILT RASSVWVMMNQKTRRLSKIPDEVRNEIEPHFVDSAPWEDDDRKLPKLDENTADSIRKGLTPR WNDLDVNQHVNNVKYIGWILESTPAEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGEGD GSKFQHLLRLEDGGEIVKARTEWRPKNAGINGWPSEESSPGGDFF
SEQ ID NO: 122
CwFATB5 amino acid sequence
MVAAAASSAFFSVPTPGTPPKPGKFGNWPSSLSVPFKPDNGGFHVKANASAHPKANG SAVNLKSGSLETPPRSFINQLPDLSVLLSKITTVFGAAEKQWKRPGMLVEPFGVDRIFQDGVFF RQSFSIRSYEIGVDRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKRDLIWWTKIQVE VNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAMMNQNTRRLSKFP YEVRQEIAPHFVDSAPVIEDDQKLQKLDVKTGDSIRDGLTPRWNDLDVNQHVNNVKYIGWIL KSVPIEVFETQELCGVTLEYRRECGRDSVLESVTAMDPAKEGDRCVYQHLLRLEDGADITIGR TEWRPKNAGANGAMSSGKTSNGNCLIEGRGWQPFRWRLIF
SEQ ID NO: 123
CwFATB5a amino acid sequence
MVAAAASSAFFSVPTPGTPPKPGKFGNWPSSLSVPFKPDNGGFHVKANASAHPKANG SAVNLKSGSLETPPRSFINQLPDLSVLLSKITTVFGAAEKQWKRPGMLVEPFGVDRIFQDGFFFR QSFSIRSYEIGVDRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKRDLIWWTKIQVEV NRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAMMNQNTRRLSKFPYE VRQEIAPHFVDSAPVIEDDQKLQKLDVKTGDSIRDGLTPRWNDLDVNQHVNNVKYIGWILKS VPIEVFETQELCGVTLEYRRECGRDSVLESVTAMDPAKEGDRCVYQHLLRLEDGADITIGRTE WRPKNAGANGAMSSGKTSNGNCLIEGRGWQPFRWRLIF
SEQ ID NO: 124
CwFATB5b amino acid sequence
MVAAAASSAFFSVPTPGTPPKPGKFGNWPSSLSVPFKPDNGGFHVKANASAHPKANG SAVNLKSGSLETPPRSFINQLPDLSVLLSKITTVFGAAEKQWKRPGMLVEPFGVDRIFQDGVFF RQSFSIRSYEIGVDRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKRDLIWWTKIQVE VNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAMMNQNTRRLSKFP YEVRQEIAPHFVDSAPVIEDDQKLQKLDVKTGDSIRDGLTPRWNDLDVNQHVNNVKYIGWIL KSVPIEVFETQELCGVTLEYRRECGRDSVLESVTAMDPAKEGDRCVYQHLLWLEDGADITIGR TEWRPKNAGANGAMSSGKTSNGNCLIEGRGWQPFRWRLIF
SEQ ID NO: 125
CwFATB5c amino acid sequence
MVAAAASSAFFSVPTPGTPPKPGKFGNWPSSLSVPFKPDNGGFHVKANASAHPKANG SAVNLKSGSLETPPRSFINQLPDLSVLLSKITTVFGAAEKQWKRPGMLVEPFGVDRIFQDGVFF RQSFSIRSYEIGVDRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKRDLIWWTKIQVE VNRYPIWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAMMNQNTRRLSKFPY EVRQEIAPHFVDSAPVIEDDQKLQKLDVKTGDSIRDGLTPRWNDLDVNQHV NVKYIGWILK SVPIEVFETQELCGVTLEYRRECGRDSVLESVTAMDPAKEGDRCVYQHLLRLEDGADITIGRTE WRPKNAGANGAMS SGKT SNGNCLIEGMGWQPFRWRLIF
SEQ ID NO: 126
CwFATB5.1 amino acid sequence
MVAAAAS S AFF S VPTPGT SPKPGKFRNWPS SLS VPFKPETNHNGGFHIKANAS AHPKA NGSALNLKSGSLETQEDTSLSSPPRTFIKQLPDWSMLLSKITTVFGAAEKQLKRPGMLVEPFGV DRIFQDGVFFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKRDLI WWTKIQVEVNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAMMNQ NTRRLSKFPYEVRQEIAPHFVDSAPVIEDDRKLYKLNVKTGDSIRDGLTPRWNDLDVNQHVNN VKFIGWILKSVPTKVFETQELCGVTLEYRRECGKDSVLESVTAMDPAKEGDRSVYQHLLRLED GADITIGRTE WRPKNAGANE AIS SGKT SNGNS AS
SEQ ID NO: 127
CwFATB5.1a amino acid sequence
MVAAAAS S AFF S VPTPGT SPKPGKFRNWPLSLS VPFKPETNHNGGFHIKANAS AHPKA NGSALNLKSGSLETQEDTSLSSPPRTFIKQLPDWSMLLSKITTVFGAAEKQLKRPGMLVEPFGV DRIFQDGVFFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKRDLI WWTKIQVEVNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAMMNQ NTRRLSKFPYEVRQEIAPHFVDSAPVIEDDRKLYKLNVKTGDSIRDGLTPRWNDLDVNQHVNN VKFIGWILKSVPTKVFETQELCGVTLEYRRECGKDSVLESVTAMDPAKEGDRSVYQHLLRLED GADITIGRTE WRPKNAGANE AIS SGKT SNGNS AS SEQ ID NO: 128
CcFATB2b amino acid sequence
MVTTSLASAYFSMKAVMLAPDGRGIKPRSSGLQVRAGNERNSCKVINGTKVKDTEG LKGCSTLQGQSMLDDHFGLHGLVFRRTFAIRCYEVGPDRSTSIMAVMNHLQEAARNHAESLG LLGDGFGETLEMSKRDLIWWRRTHVAVERYPAWGDTVEVEAWVGASGNTGMRRDFLVRD CKTGHILTRCTSVSVMMNMRTRRLSKIPQEVRAEIDPLFIEKVAVKEGEIKKLQKLNDSTADYI QGGWTPRWNDLDVNQHVNNIIYVGWIFKSVPDSISENHHLSSITLEYRRECIRGNKLQSLTTVC GGSSEAGIICEHLLQLEDGSEVLRARTEWRPKHTDSFQGISERFPQQEPHK
SEQ ID NO: 129
CcFATB3 amino acid sequence
MVATAAASAFFPVGAPATSSATSAKASMMPDNLDARGIKPKPASSSGLQVKANAHA SPKINGSKVSTDTLKGEDTLTSSPAPRTFINQLPDWSMFLAAITTIFLAAEKQWTNLDWKPRRP DMLADPFGIGRFMQDGLIFRQHFAIRSYEIGADRTASIETLMNHLQETALNHVRSAGLLGDGF GATPEMSRRDLIWWTRMQVLVDRYPAWGDIVEVETWVGASGKNGMRRDWLVRDSQTGEI LTRATSVWVMMNKRTRRLSKLPEEVRGEIGPYFIEDVAIIEEDNRKLQKLNENTADNVRRGLT PRWSDLDVNQHVNNVKYIGWILESAPGSILESHELSCMTLEYRRECGKDSVLQSMTAVSGGG SAAGGSPESSVECDHLLQLESGPEWRGRTEWRPKSANNSRSILEMPAESL
SEQ ID NO: 130
CcFATB3b amino acid sequence
MVATAAASAFFPVGAPATSSATSAKASMMPDNLDARGIKPKLASSSGLQVKANAHA SPKINGSKVSTDTLKGEDTLTSSPAPRTFINQLPDWSMFLAAITTIFLAAEKQWTNLDWKPRRP DMLADPFGIGRFMQDGLIFRQHFAIRSYEIGADRTASIETLMNHLQETALNHVRSAGLLGDGF GATPEMSRRDLIWWTRMQVLVDRYPAWGDIVEVETWVGASGKNGMRRDWLVRDSQTGEI LTRATSVWVMMNKRTRRLSKLPEEVRGEIGPYFIEDVAIIEEDNRKLQKLNENTADNVRRGLT PRWSDLDVNQHVNNVKYIGWILESAPGSILESHELSCMTLEYRRECGKDSVLQSMTAVSGGG SAAGGSPESSVECDHLLQLESGPEWRGRTEWRPKSANNSRSILEMPAESL SEQ ID NO: 131
CcFATB3c amino acid sequence
MVATAAASAFFPVGAPATSSATSAKASMMPDNLDARGIKPKPASSSGLQVKANAHA SPKINGSKVSTDTLKGEDTLTSSPAPRTFINQLPDWSMFLAAITTIFLAAEKQWTNLDWKPRRP DMLADPFGIGRFMQDGLIFRQHFAIRSYEIGADRTASIETLMNHLQETALNHVRSAGLLGDGF GATPEMSRRDLIWWTRMQVLVDRYPAWGDIVEVETWVGASGKNGMRRDWLVRDSQTGEI LTRATSVWVMMNKRTRRLSKLPEEVRGEIGPYFIEDVAIIEEDNRKLQKLNENTAD VRRGLT PRWSDLDVNQHV NAKYIGWILESAPGSILESHELSCMTLEYRRECGKDSVLQSMTAVSGGG SAAGGSPESSVECDHLLQLESGPEWRGRTEWRPKSA NSRSILEMPAESL SEQ ID NO: 132
ChtFATB1a amino acid sequence
MVAAAASSAFFSVPTPGTSTKPGNFGNWPSSLSVPFKPESNHNGGFRVKANASAHPK ANGSAVNLKSGSLETQEDTSSSSPPPRTFIKQLPDWGMLLSKITTVFGAAERQWKRPGMLVEP FGVDRIFQDGVFFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCK RDLIWWTKIQVEVNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAM MNRKTRRLSKFPYEVRQEIAPHFVDSAPVIEDDKKLHKLDVKTGDSIRKGLTPRWNDLDVNQ HVNNVKYIGWILKS VPAE VFETQELCGVTLEYRRECGRD S VLE S VTAMDTAKEGDRSLYQHL LRLEDGADITIGRTEWRPKNAGANGAISTGKTSNENSVS
SEQ ID NO: 133
ChtFATB1a.1 amino acid sequence
MVAAAASSAFFSVPTPGTSPKPGNFGNWPSSLSVPFKPESNHNGGFRVKANASAHPK ANGSAVNLKSGSLETQEDTSSSSPPPRTFIKQLPDWGMLLSKITTVFGAAERQWKRPGMLVEP FGVDRIFQDGVFFRHSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCK RDLIWWTKIQVEVNRYPTWGDTIEVNTWVSESGKNGMGRDWLIGDCRTGEILIRATSVWAM MNRKTRRLSKFPYEVRQEIAPHFVDSAPVIEDDKKLHKLDVKTGDSIRKGLTPRWNDLDVNQ HVNNVKYIGWILKS VPAE VFETQELCGVTLEYRRECGRD S VLE S VTAMDTAKEGDRSLYQHL LRLEDGADITIGRTEWRPKNAGANGALSTGKTSNGNSVS
SEQ ID NO: 134
ChtFATB1a.2 amino acid sequence
MVAAAASSAFFSVPTPGTSPKPGNFGNWPSNLSVPFKPESNHNGGFRVKANASAHPK ANGSAVNLKSGSLETQEDTSSSSPPPRTFIKQLPDWGMLLSKITTVFGAAERQWKRPGMLVEP FGVDRIFQDGVFFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCK RDLIWWTKIQVEVNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAM MNRKTRRLSKFPYEVRQEIAPHFVDSAPVIEDDKKLHKLDVKTGDSIRKGLTPRWNDFDVNQ HVNNVKYIGWILKS VPAE VFETQELCGVTLEYRRECGRD S VLE S VTAMDTAKEGDRSLYQHL LRLEDGADITIGRTEWRPKNAGANGAISTGKTSNENSVS
SEQ ID NO: 135
ChtFATB1a.3 amino acid sequence
MVAAAASSAFFSVPTPGTSPKPGNFGNWPSSLSVPFKPESNHNGGFRVKANASAHPK ANGSAVNLKSGSLETQEDTSSSSPPPRTFIKQLPDWGMLLSKITTVFGAAERQWKRPGMLVEP FGVDRIFQDGVFFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCK RDLIWWTKIQVEVNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAM MNRKTRRLSKFPYEVRQEIAPHFVDSAPVIEDDKKLHKLDVKTGDSIRKGLTPRWNDFDVNQ HVNNVKYIGWILKS VPAE VFETQELCGVTLEYRRECGRD S VLE S VTAMDTAKEGDRSLYQHL LRLEDGADITIGRTEWRPKNAGVNGAISTGKTSNENSVS
SEQ ID NO: 136
ChtFATB1a.4 amino acid sequence
MVAAAASSAFFSVPTPGTSPKPGNFGNWPSSLSVPFKPESNHNGGFRVKANASAHPK ANGSAVNLKSGSLETQEDTSSSSPPPRTFIKQLPDWSMLLSKITTVFGAAERQWKRPGMLVEPF GVDRIFQDGVFFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCKR DLIWWTKIQVEVNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAMM NRKTRRLSKFPYEVRQEIAPHFVDSAPVIEDDKKLHKLDVKTGDSIRKGLTPRWNDFDVNQHV VKYIGWILKSVPAEVFETQELCGVTLEYRRECGRDSVLESVTAMDTAKEGDRSLYQHLLR
LEDGADITIGRTEWRPKNAGANGAISTGKTSNENSVS
SEQ ID NO: 137
ChtFATB1b amino acid sequence
MVAAAAS S AFF S VPTSGT SPKPGNFGNWP S SLS VPFKPES SHNGGFQVKANAS AHPK ANGSAVNLKSGSLETQEDTSSSSPPPRTFIKQLPDWSMLLSKITTVFWAAERQWKRPGMLVEP FGVDRIFQDGVFFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFGRTPEMCK RDLIWWTKIQVEVNRYPTWGDTIEVNTWVSESGKNGMGRDWLISDCRTGEILIRATSVWAM MNRKTRRLSKFPYEVRQEIAPHFVDSAPVIEDDKKLHKLDVKTGDFIRKGLTPRWNDFDVNQ HV VKYIGWILKS VPAE VFETQELCGVTLEYRRECGRD S VLE S VTAMDTAKEGDRSLYQHL LRLEDGADITIGRTEWRPKNAGANGAISTGKTSNENSVS
SEQ ID NO: 138
ChtFATB2b amino acid sequence
MWAAAASSAFFPVPASGTSPKPGKFGTWLSSSSPSYKPKSNPSGGFQVKANASAHP KANGSAVSLKSGSLNTQEGTSSSPPPRTFLNQLPDWSRLRTAITTVFVAAEKQLTMLDRKSKK PDMHVDWFGLEIIVQDGLVFRESFSIRSYEIGADRTASIETLMNHLQDTSLNHCKSVGLLNDGF GRTPEMCKRDLIWVLTKMQIMVNRYPTWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIR AT SIWAMMNQKTRRF SKLPNEVRQEIAPHF VDAPPVIEDNDRKLHKFDVKTGD SICKGLTPE W NDLDVNQHVSNVKYIGWILESMPKEVLDTQELCSLTLEYRRECGRDSVLESVTAMDPSKVGD RSQYQHLLRLEDGTDIMKGRTEWRPKNAGTNGAISTGKTSNGNSVS
SEQ ID NO: 139
ChtFATB2a amino acid sequence
MWAAAASSAFFPVPAPGTTSKPGKFGNWPSSLSPSFKPKSNPNGGFQVKANASAHP KANGSAVSLKSGSLNTKEDTPSSPPPRTFLNQLPDWSRLRTAITTVFVAAEKQLTMLDRKSKK PDMHVDWFGLEIIVQDWLVFRESFSIRSYEIGADRTASIETLMNHLQDTSLNHCKSVGLLNDGF GRTPEMCKRDLIWVLTKMQIMVNRYPTWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIR AT SIWAMMNQKTRRF SKLPNEVRQEIAPHF VDAPPLIEDNDRKLHKFDVKTGD SICKGLTPE W NDLDVNQHVSNVKYIGWILESMPKEVLDTQELCSLTLEYRRECGRDSVLESVTAMDPSKVGD RSQYQHLLRLEDGTDIMKGRTE WRPKNAGTNGAISTGKTSNGNSVS
SEQ ID NO: 140
ChtFATB2c amino acid sequence
MWAAAASSAFFPVPASGTSPKPGKFGTWLSSSSPSYKPKSNPSGGFQVKANASAHP KANGSAVSLKSGSLNTKEDTPSSPPPRTFLNQLPDWNRLRTAITTVFVAAEKQLTMLDRKSKK PDMHVDWFGLEIIVQDGLVFRESFSIRSYEIGADRTASIETLMNHLQDTSLNHCKSVGLLNDGF GRTPEMCKRDLIWVLTKMQIMVNRYPTWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIR AT SIWAMMNQKTRRF SKLPNEVRQEIAPHF VDAPPVIEDNDRKLHKFDVKTGD SICKGLTPE W NDLDVNQHVSNVKYIGWILESMPKEVLDTQELCSLTLEYRRECGRDSVLESVTAMDPSKVGD RSQYQHLLRLEDGTDIMKGRTE WRPKNAGTNGAISTGKTSNGNSVS SEQ ID NO: 141
ChtFATB2d amino acid sequence
MWAAAASSAFFPVPAPGTTSKPGKFGNWPSSLSPSFKPKSNPNGGFQVKANASAHP KANGSAVSLKSGSLNTQEDTSSSPPPRTFLNQLPDWSRLLTAISTVFVAAEKQLTMLDRKSKRP DMLVDLFGLESrVQDGLVFRESYSIRSYEIGADRTASIETLMNHLQDTSLNHCKSVGLLNDGFG RTPEMCKRDLIWVLTKMQIMVNRYPT WGDTVEINS WF SQ SGKIGMGRNWLISDCNTGEILIRA TSIWAMMNQNTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSIRKGLTPGWN DLDVNQHVSNVKYIGWILESMPTEVLETQELCSLTLEYRRECGRESVLESVTAMNPSKVGDRS QYQHLLRLEDGADIMKGRTE WRPKNAGTNGAISTGKTSNGNSVS SEQ ID NO: 142
ChtFATB2e amino acid sequence
MWAAAASSAFFPVPASGTSPKPGKFGTWLSSSSPSYKPKSNPSGGFQVKANASAHP KANGSAVSLKSGSLNTQEDTSSSPPPQTFLNQLPDWSRLLTAISTVFVAAEKQLTMLDRKSKRP DMLVDWFGLESIVQDGLVFRESYSIRSYEISADRTASIETVMNLLQETSLNHCKSMGILNDGFG RTPEMCKRDLIWVLTKMQILVNRYPNWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIRA TSIWAMMNQNTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSIRKGLTPGWN DLDVNQHVSNVKYIGWILE SMPTE VLETQELC SLTLE YRRECGRD S VLE S VTAMNPSKVGDRS QYQHLLRLEDGADIMKGRTEWRPKNAGTNGAISTGKTSNGNSVS SEQ ID NO: 143
ChtFATB2f amino acid sequence
MWAAAASSAFFPVPASGTSPKPGKFGTWLSSSSPSYKPKSNPSGGFQVKANASAHP KANGSAVSLKSGSLNTQEGTSSSPPPRTFLNQLPDWSRLLTAISTVFVAAEKQLTMLDRKSKRP DMLVDWFGLESIVQDGLVFRESYSIRSYEISADRTASIETVMNLLQETSLNHCKSMGILNDGFG RTPEMCKRDLIWVLTKMQILVNRYPNWGDTVEINS WF SQSGKIGMGRNWLISDCNTGEILIRA TSIWAMMNQKTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSICKGLTPEWN DLDVNQHVSNVKYIGWILESMPKEVLDTQELCSLTLEYRRECGRDSVLESVTAMDPSKVGDR SQYQHLLRLEDGTDIMKGRTEWRPKNAGTNGAISTGKTSNGNSVS
SEQ ID NO: 144
ChtFATB2g amino acid sequence
MWAATAS S AFFPVPVPGT SPKPGKFGT WLS S S SP S YKPKSNP SGGFQVKANAS AHPK ANGSAVSLKSGSLNTQEDTSSSPPPRTFLNQLPDWSRLLTAISTVFVAAEKQLTMLDRKSKRPD MLVDWFGLESIVQDGLVFREIYSIRSYEISADRTTSIETVMNLLQETSLNHCKSMGILNDGFGRT PEMCKRDLIWVLTKMQILVNRYPNWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIRATSI WAMMNQKTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSICKGLTPEWNDL DVNQHVSNVKYIGWILESMPKEVLDTQELCSLTLEYRRECGRDSVLESVTAMDPSKVGDRSQ YQHLLRLEDGTDIMKGRTEWRPKNAGTNGAISTGKTSNANSVS
SEQ ID NO: 145
ChtFATB2h amino acid sequence
MWAAAASSAFFPVPASGTSPKPGKFGTWLSSSSPSYKPKSNPSGGFQVKANASAHP KANGSAVSLKSGSLNTQEGTSSSPPPRTFLNQLPDWSRLLTAISTVFVAAEKQLTMLDRKSKRP DMLVDWFGLESIVQDGLVFRESYSIRSYEISADRTASIETVMNLLQETSLNHCKSMGILNDGFG RTPEMCKRDLIWVLTKMQILVNRYPNWGDTVEINSWFSQSGKIGMGRNWLISDCNTGEILIRA TSIWAMMNQNTRRFSKLPNEVRQEIAPHFVDAPPVIEDNDRKLHKFDVKTGDSIRKGLTPGWN DLDVNQHVSNVKYIGWILESIPTEVLETQELCSLTLEYRRECGRESVLESVTAMNPSKVGDRSQ YQHLLRLEDGADIMKGRTEWRPKNAGTNGAISTGKTSNGNSVS
SEQ ID NO: 146
ChtFATB3a amino acid sequence
MVAT AAS S AFFP VPSPDT S SRPGKLGNGS S SLRPLKPKF VANAGLQVKANAS APPKIN GSSVSLKSCSLKTHEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNEGFG RTPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEIL TRASSVWVMMNQKTRKLSKIPDEVRHEIEPHFVDSAPVIEDDDWKLPKLDEKTADSIRKGLTP KWNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKG FGPQFQHLLRLEDGGEIVKGRTEWRPKTAGINGTIASGETSPGNS
SEQ ID NO: 147
ChtFATB3b amino acid sequence
MVAT AAS S AFFP VPSPDT S SRPGKLGNGS S SLRPLKPKF VANAGLQVKANAS APPKIN GSSVSLKSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGFGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLIEGFGR TPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILT RASSVWVMMNQKTRKLSKIPDEVRHEIEPHFVDSAPVIEDDDWKLPKLDEKTADSIRKGLTPK WNDLDVNQHV VKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKGF GPQFQHLLRLEDGGEIVKGRTEWRPKTAGINGTIASGETSPGNS
SEQ ID NO: 148
ChtFATB3c amino acid sequence
MVAT AAS S AFFP VPSPDT S SRPGKLGNGS S SLRPLKPKF VANAGLQVKANAS APPKIN GSSVSLKSCSLKTHEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNEGFG RTPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEIL TRASSVWVMMNQKTRKLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLDEKTADSIRKGLTP KWNDLDVNQHV NVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSEKG FGPQFQHLLRLEDGGEIVKGRTEWRPKTAGINGAIAFGETSPGDS
SEQ ID NO: 149
ChtFATB3d amino acid sequence
MVAT AAS S AFFP VPSPDT S SRPGKLGNGS S SLRPLKPKF VANAGLQVKANAS APPKIN
GSSVSLKSCSLKTHEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGRIVQDGLVFRQNFSIRSYEIGADRTASIKTVMNHLQETALNHVKSAGLLNEGFG RTPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEIL TRASSVWVMMNQKTRKLSKIPDEVRHEIEPHFVDSAPVIEDDDWKLPKLDEKTADSIRKGLTP KWNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKG FGPQFQHLLRLEDGGEIVKGRTEWRPKTAGINGTIASGETSPGNS
SEQ ID NO: 150
ChtFATB3e amino acid sequence
MVAT AAS S AFFP VPSPDT S SRPGKLGNGS S SLRPLKPKF VANAGLQVKANAS APPKIN GSSVSLKSGSLKTHEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNEGFG RTPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEIL TRASSVWVMMNQKTRKLSKIPDEVRHEIEPHFVDSAPVIEDDDWKLPKLDEKTADSIRKGLTP KWNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKG FGPQFQHLLRLEDGGEIVKGRTE WRPKTAGINGTIASGETSPGNS
SEQ ID NO: 151
ChtFATB3f amino acid sequence
MVAT AAS S AFFPVPSPDT S SRLGKLGNGS S SLRPLKPKF VANAGLQVKANAS APPKIN GSSVSLKSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MPVDPFGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNEGFG RTPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEIL TRASSVWVMMNQKTRKLSKIPDEVRHEIEPHFVDSAPVIEDDDWKLPKLDEKTADSIRKGLTP KWNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSEKG FGPQFQHLLRLEDGGEIVKGRTE WRPKTAGINGTIASGETSPGNS SEQ ID NO: 152
ChtFATB3g amino acid sequence
MVAT AAS S AFFPVPSPDT S SRAGKLGNGS S SLRPLKPKF VANAGLQVKANAS APPKIN GSSVSLKSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNEGFG RTPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEIL TRASSVWVMMNQKTRKLSKIPDEVRHEIEPHFVDSAPVIEDDDWKLPKLDEKTADSIRKGLTP KWNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKG FGPQFQHLLRLEDGGEIVKGRTE WRPKTAGINGTIASGETSPGNS SEQ ID NO: 153
ChsFATB1 amino acid sequence
MVATNAAAFSAYTFFLTSPTHGYSSKRLADTQNGYPGTSLKSKSTPPPAAAAARNGA LPLLASICKCPKKADGSMQLDSSLVFGFQFYIRSYEVGADQTVSIQTVL YLQEAAINHVQSAG YFGDSFGATPEMTKRNLIWVITKMQVLVDRYPAWGDWQVDTWTCSSGKNSMQRDWFVRD LKTGDIITRASSVWVLMNRLTRKLSKIPEAVLEEAKLFVMNTAPTVDDNRKLPKLDGSSADYV LSGLTPRWSDLDMNQHV VKYIAWILESVPQSIPETHKLSAITVEYRRECGKNSVLQSLT V SGDGITCGNSIIECHHLLQLETGPEILLARTEWISKEPGFRGAPIQAEKVYNNK
SEQ ID NO: 154
ChsFATB2 amino acid sequence
MVATAASSAFFPVPSPDASSRPGKLGNGSSSLSPLKPKLMANGGLQVKANASAPPKIN GSSVGLKSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNDGFG RTLEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEI LTRASSVWVMMNQKTRRLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLDEKTADSIRKGLTP KWNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKG SGSQFQHLLRLEDGGEIVKGRTEWRPKTAGINGPIASGETSPGDSS
SEQ ID NO: 155
ChsFatB2b amino acid sequence
MVATAASSAFFPVPSPDASSRPGKLGNGSSSLSPLKPKLMANGGLQVKANASAPPKIN GSSVGLKSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNDGFG RTLEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEI LTRASSKSQIMLPLHYCSVWVMMNQKTRRLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLD EKTADSIRKGLTPKWNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESV LESLTAVDPSGKGSGSQFQHLLRLEDGGEIVKGRTEWRPKTAGINGPIASGETSPGDSS
SEQ ID NO: 156
ChsFatB2c amino acid sequence
MVATAASSAFFPVPSPDASSRPGKLGNGSSSLSPLKPKLMANGGLQVKANASAPPKIN GSSVGLKSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNDGFG RTLEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEI LTRASSVWVMMNQKTRRLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLDEKTADSIRKGLTP KWNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKG SGSQFQHLMRLEDGGEIVKGRTEWRPKTAGINGPIASGETSPGDSS
SEQ ID NO: 157
ChsFatB2d amino acid sequence
MVATAASSAFFPVPSPDASSRPGKLGNGSSSLSPLKPKLMANGGLQVKANASAPPKIN GSSVGLKSGSLKTQEDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPD MLVDPFGLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNDGFG RTPEMYKRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEIL TRASSVWVMMNQKTRRLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLDEKTADSIRKGLTP KWNDLDVNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGKG SGSQFQHLLRLEDGGEIVKGRTEWRPKTAGINGPIASGETSPGDSS SEQ ID NO: 158
ChsFATB3 amino acid sequence
MVAAEAS S ALF S VRTPGTSPKPGKFGNWPT SLS VPFKSKSNHNGGFQVKANAS ARPK ANGSAVSLKSGSLDTQEDTSSSSSPPRTFINQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRP DMLMDPFGVDRWQDGAVFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFG RTPEMCKRDLIWWTKMHVEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDCHTGEILIR ATSMCAMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPRW NDLDVNQHV VKYIGWILESVPTEVFETQELCGLTLEYRRECGRDSVLESVTAMDPSKEGD RSLYQHLLRLEDGADIAKGRTKWRPKNAGTNGAISTGKTSNGNSIS
SEQ ID NO: 159
ChsFatB3b amino acid sequence
MVAAEAS S ALF S VRTPGTSPKPGKFGNWPT SLS VPFKSKSNHNGGFQVKANAS ARPK
ANGSAVSLKSGSLDTQEDTSSSSSPPRTFINQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRP DMLMDPFGVDRWQDGAVFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFG RTPEMCKRDLIWWTKMHIEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDFHTGDILIR ATSVCAMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPRW NDLDVNQHV VKYIGWILESVPTEVFETQELCGLTLEYRRECGRDSVLESVTAMDPSKEGD RSLYQHLLRLEDGADIAKGRTKWRPKNAGTNGAISTGKTSNGNSIS
SEQ ID NO: 160
ChsFatB3c amino acid sequence
MVAAEAS S ALF S VRTPGTSPKPGKFGNWPT SLS VPFKSKSNHNGGFQVKANAS ARPK ANGSAVSLKSGSLDTQEDTSSSSSPPRTFINQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRP DMLMDPFGVDRWQDGAVFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFG RTPEMCKRDLIWWTKMHVEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDCHTGEILIR ATSMCAMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPRW NDLDVNQHVNNVKYIGWILESVPTEVFETQELCGLTLEYRQECGRDSVLESVTAMDPSKEGD RSLYQHLLRLEDGTDIAKGRTKWRPKNAGKTSNGNSIS
SEQ ID NO: 161
ChsFATB3d amino acid sequence
MVAAEASSALFSVRTPGTSPKPGKFGNWPSSLSVPFKSKSNHNGGFQVKANASARPK ANGSAVSLKSGSLDTQEDASSSSSPPRTFINQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKR SDMLMDPFGVDRWQDGAVFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGF GRTPEMCKRDLIWWTKMHVEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDCHTGEILI RATSMCAMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPR WNDLDVNQHVNNVKYIGWILESVPTEVFETQELCGLTLEYRRECGRDSVLESVTAMDPSKEG DRSLYQHLLRLEDGADIAKGRTKWRPKNAGTNGAISTGKTSNGNSIS SEQ ID NO: 162
ChsFATB3e amino acid sequence
MVAAEASSALFSVRTPGTSPKPGKFGNWPSSLSVPFKSKSNHNGGFQVKANASARPK ANGSAVSLKSGSLDTQEDASSSSSPPRTFINQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKR SDMLMDPFGVDRWQDGWFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGF GRTPEMCKRDLIWWTKMHVEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDCHTGEILI RATSMCAMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPR WNDLDVNQHVNNVKYIGWILESVPTEVFETQELCGLTLEYRRECGRDSVLESVTAMDPSKEG DRSLYQHLLRLEDGADIAKGRTKWRPKNAGTNGAISTGKTSNGNSIS
SEQ ID NO: 163
ChsFATB3f amino acid sequence
MVAAEASSALFSVRTPGTSPKPGKFGNWPSSLSVPFKSKSNHNGGFQVKANASARPK ANGSAVSLKSGSLDTQEDTSSSSSPPRTFINQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRP DMLMDPFGVDRWQDGAVFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFG RTPEMCKRDLIWWTKMHVEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDCHTGEILIR ATSMCAMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPRW NDLDVNQHVNNVKYIGWILESVPTEVFETQELCGLTLEYRRECGRDSVLESVTAMDPSKEGD RSLYQHLLRLEDGADIAKGRTKWRPKNAGTNGAISTGKTSNGNSIS
SEQ ID NO: 164
ChsFATB3g amino acid sequence
MVAAEAS S ALF S VRTPGTSPKPGKFGNWPT SLS VPFKSKSNHNGGFQVKANAS ARPK ANGSAVSLKSGSLDTQEDTSSSSSPPRTFINQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRP DMLMDPFGVDRWQDGAVFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFG RTPEMCKRDLIWWTKMHIEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDFHTGDILIR ATSVCAMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPRW NDLDVNQHV VKYIGWILESVPTEVFETQELCGLTLEYRQECGRDSVLESVTAMDPSKEGD RSLYQHLLRLEDGTDIAKGRTKWRPKNAGKTSNGNSIS
SEQ ID NO: 165
ChsFATB3h amino acid sequence
MVAAEASSALFSVRTPGTSPKPGKFGNWPSSLSVPFKSKSNHNGGFQVKANASARPK ANGSAVSLKSGSLDTQEDASSSSSPPRTFINQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKR SDMLMDPFGVDRWQDGWFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGF GRTPEMCKRDLIWWTKMHIEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDFHTGDILI RATSVCAMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPR WNDLDVNQHV VKYIGWILESVPTEVFETQELCGLTLEYRQECGRDSVLESVTAMDPSKEG DRSLYQHLLRLEDGTDIAKGRTKWRPKNAGKTSNGNSIS
SEQ ID NO: 166
ChsFATB3i amino acid sequence
MVAAEAS S ALF S VRTPGTSPKPGKFGNWPT SLS VPFKSKSNHNGGFQVKANAS ARPK ANGSAVSLKSGSLDTQEDTSSSSSPPRTFINQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRP DMLMDPFGVDRWQDGAVFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFG RTPEMCKRDLIWWTKMHVEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDCHTGEILIR ATSMCAMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPRW NDLDVNQHVNNVKYIGWILESVPTEVFETQELCGLTLEYRRECGGDSVLESVTAMDPSKEGD RSLYQHLLRLEDGADIAKGRTKWRPKNAGTNGAISTGKTSNGNSIS SEQ ID NO: 167
ChsFATB3j amino acid sequence
MVAAEAS S ALF S VRTPGTSPKPGKFGNWPT SLS VPFKSKSNHNGGFQVKANAS ARPK ANGSAVSLKSGSLDTQEDTSSSSSPPRTFINQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRP DMLMDPFGVDRWQDGAVFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSIGLLNDGFG RTPEMCKRDLIWWTKMHIEVNRYPTWGDTIEVNTWVSESGKTGMGRDWLISDFHTGDILIR ATSVCAMMNQKTRRFSKFPYEVRQELAPHFVDSAPVIEDYQKLHKLDVKTGDSICNGLTPRW NDLDVNQHVNNVKYIGWILESVPTEVFETQELCGLTLEYRQECGRDSVLESVTAMDPSKEGD RSLYQHLLRLEDGTDIAKGRTKWRPKNAGKTSNGNSIS
SEQ ID NO: 168
CcalcFATB1 (Cuphea calcarata FATB1)
MVAAAATSAFFPVPAPGTSPNPRKFGSWPSSLSPSLPKSIPNGGFQVKANASAHPKANGSAVSL KSGSLNTQENTSSSPPPRTFLHQLPDWSRLLTAITTVFVKSKRPDMHDRKSKRPDMLVDLFGLE SSVQDALVFRQSFSIRSYEIGTDRTASIETLMNHLQETSLNHCKSTGILLDGFGRTLEMCKRELI WWIKMQIQVNRYPAWGDTVEINTRFSRLGKIGMGRDWLISDCNTGEILIRATSEYAMMNQK TRRLSKLPYEVHQEIAPLFVDSPPVIEDNDLKVHKFEVKTGDSIQKGLSPGWNDLDVNQHVSN VKYIGWILESMPTEVLETQELCSLALEYRRECGRDSVLESVTAMDPSKVGGRSQYQHLLRLED GTAIVNGITEWRPKNAGANGAISTGKTSNGNSVS
SEQ ID NO: 169
ChookFATB4 (Cuphea hookeriana FATB4)
MVAAAATSAFFPVPAPGTSPNPRKFGSWPSSLSPSLPNSIPNGGFQVKANASAHPKANGSAVSL KSGSLNTQENTSSSPPPRTFLHQLPDWSRLLTAITTVFVKSKRPDMHDRKSKRPDMLVDLFGLE SSVQDALVFRQRFSIRSYEIGTDRTASMETLMNHLQETSLNHCKSTGILLDGFGRTLEMCKREL IWWIKMQIQVNRYPAWGDTVEINTRFSRLGKIGMGRDWLISDCNTGEILIRATSEYAMMNQK TRRLSKLPYEVRQEIAPLFVDSPPVIEDNDLKVHKFEVKTGDSIHKGLTPGWNDLDVNQHVNN VKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAMDPSGGGYGSQFQHLLRLEDG GEIVKGRTEWRPKNGVINGWPTGESSPGDYS
SEQ ID NO: 170
CaFATB1 (Cuphea avigera var. pulcherrima FATB1)
MVAAAASSAFFSVPVPGTSPKPGKFRIWPSSLSPSFKPKPIPNGGLQVKANSRAHPKANGSAVS LKSGSLNTQEDTSSSPPPRTFLHQLPDWSRLLTAITTVFVKSKRPDMHDRKSKRPDMLMDSFG LESIVQEGLEFRQSFSIRSYEIGTDRTASIETLMNYLQETSLNHCKSTGILLDGFGRTPEMCKRDL IWWTKMKIKVNRYPAWGDTVEINTWFSRLGKIGKGRDWLISDCNTGEILIRATSAYATMNQ KTRRLSKLPYEVHQEIAPLFVDSPPVIEDNDLKLHKFEVKTGDSIHKGLTPGWNDLDVNQHVS VKYIG WILE SMPTEVLETQELC SLALE YRRECGRDS VLE S VTAMDPTKVGGRSQYQHLLRLE DGTDIVKCRTEWRPKNPGANGAISTGKTSNGNSVS
SEQ ID NO: 171
CpauFATB1 (Cuphea paucipetala FATB1)
MVAAAASSAFFPVPAPGTSPKPGKSGNWPSSLSPSIKPMSIPNGGFQVKANASAHPKANGSAV NLKSGSLNTQEDTS SSPPPRAFLNQLPDWSMLLTAITTVFVAAEKQWTMRDRKSKRPDMLVD SVGLKSWLDGLVSRQIFSIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGMC KNDLIWVLTKMQIMVNRYPTWGDTVEINTWFSHSGKIGMASDWLITDCNTGEILIRATSVWA MMNQKTRRFSRLPYEVRQELTPHYVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLDV NQHVSNVKYIGWILESMPIEVLETQELCSLTVEYRRECGMDSVLESVTAMDPSEDEGRSQYKH LLRLEDGTDIVKGRTEWRPKNAGTNGAISTAKPSNGNSVS
SEQ ID NO: 172
CprocFATB1 (Cuphea procumbens FATB1)
MVAAAASSAFFPAPAPGSSPKPGKSGNWPSSLSPSFKSKSIPYGRFQVKANASAHPKANGSAV NLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVD SVGLKNIVRDGLVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGM CKNDLIWVLTKMQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVW AMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLD VNQHVSNVKYIGWILESMPIEVLEAQELCSLTVEYRRECGMDSVLESVTAVDPSEDGGRSQYN HLLRLEDGTDWKGRTEWRPKNAETNGAISPGNTSNGNSIS SEQ ID NO: 173
CprocFATB2 (Cuphea procumbens FATB2)
MVAAAASSAFFPAPAPGSSPKPGKSGNWPSSLSPSFKSKSIPYGRFQVKANASAHPKANGSAV NLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVD SVGLKNIVRDGLVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGM CKNDLIWVLTKMQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVW AMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLD VNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRQECGRESVLESLTAVDPSGKGFGSQFQH LLRLEDGGEIVKGRTEWRPKTAGINGAIASGETSPGDF
SEQ ID NO: 174
CprocFATB3 (Cuphea procumbens FATB3)
MVAAAASSAFFPAPAPGSSPKPGKSGNWPSSLSPSFKSKSIPYGRFQVKANASAHPKANGSAV NLKSGSLNTQEDTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVD SVGLKNIVRDGLVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGM CKNDLIWVLTKMQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVW AMMNQKTRRFSRLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLD VNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGEGGYGSQFQ HLLRLEDGGEIVKGRTEWRPKNAGINGVLPTGE SEQ ID NO: 175
CigneaFATB1 (Cuphea ignea FATB1)
PGTSRKTGKFGNWPSSLSPSFKPKSIPNGGFQVKANARAHPKANGSAVSLKSVSLNTQEDTSLS PPPRAFLNQLPDWRMLRTALTTVFVAAEKQWTMLDRKSKRPDMLVDSFGLESIVQEGLVFRQ SFSIRSYEIGIDRTASIETLMNHLQETSLNQCKSAGILHDGFGRTLEMCKRDLIWWTKMQIKV NRYPAWGDTVEISTRFSRLGKIGMGRDWLICDCNTGEILIRATSAYAMMNQKTRRLSKLPNEV RQEIAPLFVDSDPVIEENDMKLHKFEVKTGDSICKGLTPRWSDLDVNQHVS VKYIGWILESM PTEVLETQELCSLALEYRRECGRDSVLESVTSMDPSKVGGWSQYQHLLRLEDGADIVKGRTE WRPKNAGANGAISTGKT
SEQ ID NO: 176
CcalcFATB1 (Cuphea calcarata FATB1)
|MATASTFSAFNARCGDLRRSAGSGPRRPARPLPVRAAI[NASAHPKANGSAVSLKSGSLETQED
NS S S SRPPRTFIKQLPD WSMLLS AITTVF VAAEKQ WTMFDRKSKRSDMLVDPF WDRIVQDGV LFRQSFSIRSYEIGADRTASIETLMNIFQETSLNHCKSMGLLYEGFGRTPEMCKRDLIWWTKIH IKVNRYPTWGDTIEVTTWVSESGKNGMGRDWLISDCHTGEILIRATSVWAMMNQTTRRLSKF PYELRQEIAPHFVDSDPVIEDNRRLLNFDVKTGDSIRKGLTPRWNDLDVNQHVNNVKYIGWIL ESVPTEVFDTRELCGLTLEYRQECGRGSVLESVTAMDPSKEGDRSLYQHLLRLEDGTDIVKGR TEWRPKNAGTNGPVSTRKTTNGSSVS
SEQ ID NO: 177
ChookFATB4 (Cuphea hookeriana FATB4)
|MATASTFSAFNARCGDLRRSAGSGPRRPARPLPVRAAI|NASAHPKANGSAVSLKSGSLNTQEN
TSSSPPPRTFLHQLPDWSRLLTAITTVFVKSKRPDMHDRKSKRPDMLVDLFGLESSVQDALVFR
QRF SIRS YEIGTDRT ASMETLMNHLQET SLNHCKSTGILLDGFGRTLEMCKRELI WWIKMQIQ
VNRYPAWGDTVEINTRFSRLGKIGMGRDWLISDCNTGEILIRATSEYAMMNQKTRRLSKLPYE
VRQEIAPLFVDSPPVIEDNDLKVHKFEVKTGDSIHKGLTPGWNDLDVNQHVNNVKYIGWILES
TPPEVLETQELCSLTLEYRRECGRESVLESLTAMDPSGGGYGSQFQHLLRLEDGGEIVKGRTE
WRPKNGVINGWPTGESSPGDYS
SEQ ID NO: 178
CaFATB1 (Cuphea avigera var. pulcherrima FATB1)
IMATASTFSAFNARCGDLRRSAGSGPRRPARPLPVRAA SRAHPKANGSAVSLKSGSLNTQED
TSSSPPPRTFLHQLPDWSRLLTAITTVFVKSKRPDMHDRKSKRPDMLMDSFGLESIVQEGLEFR
QSFSIRSYEIGTDRTASIETLMNYLQETSLNHCKSTGILLDGFGRTPEMCKRDLIWWTKMKIK
VNRYPAWGDTVEINTWFSRLGKIGKGRDWLISDCNTGEILIRATSAYATMNQKTRRLSKLPYE
VHQEIAPLFVDSPPVIEDNDLKLHKFEVKTGDSIHKGLTPGWNDLDVNQHVSNVKYIGWILES
MPTEVLETQELCSLALEYRRECGRDSVLESVTAMDPTKVGGRSQYQHLLRLEDGTDIVKCRTE
WRPKNPGANGAISTGKTSNGNSVS
SEQ ID NO: 179
CpauFATB1 (Cuphea paucipetala FATB1)
|MATASTFSAFNARCGDLRRSAGSGPRRPARPLPVRAAI[NASAHPKANGSAVNLKSGSLNTQE DTSSSPPPRAFLNQLPDWSMLLTAITTVFVAAEKQWTMRDRKSKRPDMLVDSVGLKSWLDG LVSRQIFSIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGMCKNDLIWVLTK MQIMVNRYPTWGDTVEINTWFSHSGKIGMASDWLITDCNTGEILIRATSVWAMMNQKTRRFS RLPYEVRQELTPHYVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLDVNQHVSNVKYI GWILESMPIEVLETQELCSLTVEYRRECGMDSVLESVTAMDPSEDEGRSQYKHLLRLEDGTDI VKGRTEWRPKNAGTNGAISTAKPSNGNSVS SEQ ID NO: 180
CprocFATB1 (Cuphea procumbens FATB1)
|MATASTFSAFNARCGDLRRSAGSGPRRPARPLPVRAAI[NASAHPKANGSAVNLKSGSLNTQE
DTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVDSVGLKNIVRDG LVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGMCKNDLIWVLTK MQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVWAMMNQKTRRFS RLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLDVNQHVS VKYI GWILESMPIEVLEAQELCSLTVEYRRECGMDSVLESVTAVDPSEDGGRSQYNHLLRLEDGTDV VKGRTEWRPKNAETNGAISPGNTSNGNSIS SEQ ID NO: 181
CprocFATB2 (Cuphea procumbens FATB2)
|MATASTFSAFNARCGDLRRSAGSGPRRPARPLPVRAAI[NASAHPKANGSAVNLKSGSLNTQE
DTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVDSVGLKNIVRDG LVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGMCKNDLIWVLTK MQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVWAMMNQKTRRFS RLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLDVNQHVNNVKYI GWILESTPPEVLETQELCSLTLEYRQECGRESVLESLTAVDPSGKGFGSQFQHLLRLEDGGEIV KGRTEWRPKTAGINGAIASGETSPGDF
SEQ ID NO: 182
CprocFATB3 (Cuphea procumbens FATB3)
|MATASTFSAFNARCGDLRRSAGSGPRRPARPLPVRAAI[NASAHPKANGSAVNLKSGSLNTQE
DTSSSPPPRAFLNQLPDWSMLLSAITTVFVAAEKQWTMLDRKSKRPDMLVDSVGLKNIVRDG LVSRQSFLIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGMCKNDLIWVLTK MQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVWAMMNQKTRRFS RLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLDVNQHVNNVKYI GWILESTPPEVLETQELCSLTLEYRRECGRESVLESLTAVDPSGEGGYGSQFQHLLRLEDGGEI VKGRTEWRPKNAGINGVLPTGE
SEQ ID NO: 183
CigneaFATB1 (Cuphea ignea FATB1)
|MATASTFSAFNARCGDLRRSAGSGPRRPARPLPVRAAI[NARAHPKANGSAVSLKSVSLNTQED
TSLSPPPRAFLNQLPDWRMLRTALTTVFVAAEKQWTMLDRKSKRPDMLVDSFGLESIVQEGL
VFRQSFSIRSYEIGIDRTASIETLMNHLQETSLNQCKSAGILHDGFGRTLEMCKRDLIWWTKM
QIKVNRYPAWGDTVEISTRFSRLGKIGMGRDWLICDCNTGEILIRATSAYAMMNQKTRRLSKL
PNEVRQEIAPLFVDSDPVIEENDMKLHKFEVKTGDSICKGLTPRWSDLDVNQHVSNVKYIGWI
LESMPTEVLETQELCSLALEYRRECGRDSVLESVTSMDPSKVGGWSQYQHLLRLEDGADIVK
GRTEWRPKNAGANGAISTGKT
SEQ ID NO: 184
CgFATB1 (Cuphea glossostoma FATB1)
IMVAAAASSAFFPSPAPGSSPKPGNRPSSLSPSFKPKSIPNGAFQVKANASAHPKANGSAVNLKSI
|GSLNTQEDSSSSPSPRAFLNQLPDWSVLLTAITTVFVAAEKQWTMLDRKSKRPDVLVDSVGLK|
SIVQDGLVSRQSFSIRSYEIGADRTASIETLMNHLQETSINHCKSLGLLNDGFGRTPGMCKNDLII
IWVLTKMQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVWAMMNQI
|KTRRFSRLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLDVNQHVS|
PSTVKYIG WILE SMPIEVLETQELC SLT VEYRRECGMD S VLE S VT AMDPSEDGGRSQYNHLLRLE|
DGTDWKGRTEWRPKNAGTNGAISTTKTSNGNSVS
SEQ ID NO: 185
CgFATB1b (Cuphea glossostoma FATB1 C170F, M198T, T374S variant)
MVAAAASSAFFPSPAPGSSPKPGNRPSSLSPSFKPKSIPNGAFQVKANASAHPKANGSAVNLKS GSLNTQEDSSSSPSPRAFLNQLPDWSVLLTAITTVFVAAEKQWTMLDRKSKRPDVLVDSVGLK SIVQDGLVSRQSFSIRSYEIGADRTASIETLMNHLQETSINHFKSLGLLNDGFGRTPGMCKNDLI WVLTKTQIMVNRYPAWGDTVEINTWFSQSGKIGMGSDWLISDCNTGEILIRATSVWAMMNQ KTRRFSRLPYEVRQELTPHFVDSPHVIEDNDRKLHKFDVKTGDSIRKGLTPRWNDLDVNQHVS VKYIG WILE SMPIEVLETQELC SLT VEYRRECGMD S VLE S VS AMDP SEDGGRSQYNHLLRLE DGTDWKGRTEWRPKNAGTNGAISTTKTSNGNSVS
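Variant labels in this listing, such as "C170F, M198T, T374S" or "V138L", use the common wild-type residue, position, replacement residue notation. The following is a minimal sketch of how such a label could be applied to a parent sequence. It assumes 1-based numbering on whatever reference sequence the label refers to; this section does not state whether positions count from the full-length precursor or from the mature protein. The function name `apply_substitutions` and the toy sequence are hypothetical and are not taken from the listing.

```python
import re

def apply_substitutions(seq, subs):
    """Apply point substitutions such as 'C170F' to seq.

    Positions are assumed to be 1-based. The wild-type residue named in
    each label is checked against seq so that a numbering mismatch is
    reported instead of silently producing a wrong variant.
    """
    residues = list(seq)
    for sub in subs:
        m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub.strip())
        if m is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild, pos, new = m.group(1), int(m.group(2)), m.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"{sub}: position outside sequence")
        if residues[pos - 1] != wild:
            raise ValueError(
                f"{sub}: expected {wild} at {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)

# Toy 10-residue sequence (hypothetical, for illustration only).
toy = "MVAACKSLGM"
variant = apply_substitutions(toy, ["C5F", "M10T"])
```

Checking the wild-type residue before substituting is a deliberate choice: extraction gaps in a sequence shift every downstream position, and the check surfaces that shift as an error.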
SEQ ID NO: 186
Umbellularia californica UcFATB3 amino acid sequence
MVATAAASAFFPVGSPATSSATSAKASMMPDNLDARGIKPKPASSSGLQVKANAHASPKINGS KVSTDTLKGEDTLTSSPAPRTFINQLPDWSMFLAAITTIFLAAEKQWTNLDWKPRRPDMLADP FGIGRFMQDGLIFRQHFAIRSYEIGADRTASIETLMNHLQETALNHVRSAGLLGDGFGATPEMS RRDLIWWTRMQVLVDRYPAWGDIVEVETWVGASGKNGMRRDWLVRDSQTGEILTRATSV WVMMNKRTRRLSKIPEEVRGEIGPYFMENVAIIEEDSRKLQKLNENIIEEDSRKLQKLNENTAD NVRRGLTPRWSDLDVNQHVNNVKYIGWILESAPGSILESHELSCMTLEYRRECGKDSVLQSM TVVSGGGSAAGGSPESSVECDHLLQLESGPEWKARTEWRPKSANNPRSILEMPAESS*
SEQ ID NO: 187
Cuphea carthagenensis CCrFATB2c (V138L variant of FATB2)
MVAAAASSAFFPVTTPGTSRKPGKFGNWLSSLSPPFRPKSIPSGGFQVKANASAHPKANGSAV SLKSGSLNTQEDTSSSPPPRAFINQLPDWSMLLTAITTVFVAAEKQWTMLDRKSKRSDMLVDS FGMERIVQDGLLFRQSFSIRSYEIGADRRASIETLMNHLQETSLNHCKSIRLLNEGFGRTPEMCK RDLIWWTRMHIMVNRYPTWGDTVEINTWVSQSGKNGMGRDWLISDCNTGEILIRATSAWA MMNQKTRRLSKLPYEVSQEIAPHFVDSPPVIEDGDRKLHKFDVKTGDSIRKGLTPRWNDLDV NQHVNNVKYIGWILESMPTEVLETHELCFLTLEYRRECGRDSVLESVTAMDPSNEGGRSHYQ HLLRLEDGTDIVKGRTEWRPKNARNIGAISTGKTSNGNPAS*
SEQ ID NO: 188
Cuphea carthagenensis CCrFATB2
MVAAAASSAFFPVTTPGTSRKPGKFGNWLSSLSPPFRPKSIPSGGFQVKANASAHPKANGSAV SLKSGSLNTQEDTSSSPPPRAFINQLPDWSMLLTAITTVFVAAEKQWTMLDRKSKRSDMLVDS FGMERIVQDGLVFRQSFSIRSYEIGADRRASIETLMNHLQETSLNHCKSIRLLNEGFGRTPEMCK RDLIWWTRMHIMVNRYPTWGDTVEINTWVSQSGKNGMGRDWLISDCNTGEILIRATSAWA MMNQKTRRLSKLPYEVSQEIAPHFVDSPPVIEDGDRKLHKFDVKTGDSIRKGLTPRWNDLDV NQHVNNVKYIGWILESMPTEVLETHELCFLTLEYRRECGRDSVLESVTAMDPSNEGGRSHYQ HLLRLEDGTDIVKGRTEWRPKNARNIGAISTGKTSNGNPAS*
SEQ ID NO: 189
CcrFATB2b
MVAAAASSAFFPVTTPGTSRKPGKFGNWLSSLSPPFRPKSIPSGGFQVKANASAHPKANGSAV SLKSGSLNTQEDTSSSPPPRAFINQLPDWSMLLTAITTVFVAAEKQWTMLDRKSKRSDMLVDS FGMERIVQDGLVFRQSFSIRSYEIGADRRASIETLMNHLQETSLNHCKSIRLLNEGFGRTPEMCK RDLIWVFTRMHIMVNRYPTWGDTVEINTWVSQSGKNGMGRDWLISDCNTGEILIRATSAWA MMNQKTRRLSKLPYEVSQEIAPHFVDSPPVIEDGDRKLHKFDVKTGDSIRKGLTPRWNDLDV NQHVNNVKYIGWILESMPTEVLETHELCFLTLEYRRECGRDSVLESVTAMDPSNEGGRSHYQ HLLRLEDGTDIVKGRTE WRPKNARNIGAIPTGKTSNGNPAS * SEQ ID NO: 190
CcrFATB1
MVAT AAS S AFFPVPSPD S S SRPGKLGNGPS SLSPLKPKSTPNGGLQVKANAS APPKINGS S VGL KSSSLKTQDDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPDMLTDPF GLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNDGFGRTPEMY KRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASSV WVMMNQKTRRLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLDEKTADSIRKGLTPKWNDLD VNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGKESVLESLTAVDPSGKGWGSHFQ HLLRLEDGGEIVKGRTEWRPKNAGINGAVAFEETSPGDS*
SEQ ID NO: 191
CcrFATB1b
MVAT AAS S AFFPVPSPD S S SRPGKLGNGPS SLSPLKPKSTPNGGLQVKANAS APPKINGS S VGL KSSSLKTQDDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPDMLTDPF GLGRIAQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNDGFGRTPEMY KRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASSV WVMMNQKTRRLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLDEKTADSIRKGLTPKWNDLD VNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGKESVLESLTAVDPSGKGWGSHFQ HLLRLEDGGEIVKGRTEWRPKNAGINGAVAFEETSPGDS*
SEQ ID NO: 192
CCrFATB1c
MVAT AAS S AFFPVPSPD S S SRPGKLGNGPS SLSPLKPKSTPNGGLQVKANAS APPKINGS S VGL KSSSLKTQDDTPSAPPPRTFINQLPDWSMLLAAITTVFLAAEKQWMMLDWKPKRPDMLTDPF GLGRIVQDGLVFRQNFSIRSYEIGADRTASIETVMNHLQETALNHVKSAGLLNDGFGRTPEMY KRDLIWWAKMQVMVNRYPTWGDTVEVNTWVAKSGKNGMRRDWLISDCNTGEILTRASSV WVMMNQKTRRLSKIPDEVRHEIEPHFVDSAPVIEDDDRKLPKLDEKTADSIRKGLTPKWNDLD VNQHVNNVKYIGWILESTPPEVLETQELCSLTLEYRRECGKESVLESLTAVDPSGKGWGSHFQ HLLRLEDGGEIVKGRTEWRPKNA*

Claims

What is claimed is:
1. A nucleic acid construct comprising a regulatory element and a FatB gene expressing an active acyl-ACP thioesterase operable to produce an altered fatty acid profile in an oil produced by a cell expressing the nucleic acid construct, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 5 of Table 1a, the sequence having at least 94.6% sequence identity with each of SEQ ID NOs: 88, 82, 85, and 103, and optionally wherein the fatty acid of the oil is enriched in C8 and C10 fatty acids.
2. A nucleic acid construct comprising a regulatory element and a FatB gene expressing an active acyl-ACP thioesterase operable to produce an altered fatty acid profile in an oil produced by a cell expressing the nucleic acid construct, wherein the FatB gene expresses a protein having an amino acid sequence falling within one of clades 1-12 of Table 1a.
3. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 1 of Table 1a, the sequence having at least 85.9% sequence identity with each of SEQ ID NOs: 19, 161, 22, and 160, and optionally wherein the fatty acid of the oil is enriched in C14 and C16 fatty acids.
4. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 2 of Table 1a, the sequence having at least 89.5% sequence identity with each of SEQ ID NOs: 134-136, 132, 133, 137, 124, 122, 123, 125, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
5. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 3 of Table 1a, the sequence having at least 92.5% sequence identity with each of SEQ ID NOs: 126 and 127, and optionally wherein the fatty acid of the oil is enriched and C14 fatty acids.
6. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 4 of Table 1a, the sequence having at least 83.8% sequence identity with SEQ ID NO: 79, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
7. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 6 of Table 1a, the sequence having at least 99.9% sequence identity with each of SEQ ID NOs: 111 and 110, and optionally wherein the fatty acid of the oil is enriched in C10 fatty acids.
8. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 7 of Table 1a, the sequence having at least 89.5% sequence identity with each of SEQ ID NOs: 73, 106, 185, 172, 171, 173, 174, and optionally wherein the fatty acid of the oil is enriched in C10 and C12 fatty acids.
9. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 8 of Table 1a, the sequence having at least 85.9% sequence identity with each of SEQ ID NOs: 112, 113, 142, 145, 143, 144, 139, 140, 138, 141, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
10. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 9 of Table 1a, the sequence having at least 83.8% sequence identity with each of SEQ ID NOs: 187-189, and optionally wherein the fatty acid of the oil is enriched in C12 and C14 fatty acids.
11. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 10 of Table 1a, the sequence having at least 95.9% sequence identity with each of SEQ ID NOs: 147, 149, 146, 150, 152, 151, 148, 154, 156, 155, 157, 108, 75, 190, 191, and 192, and optionally wherein the fatty acid of the oil is enriched in C14 and C16 fatty acids.
12. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 11 of Table 1a, the sequence having at least 88.7% sequence identity with SEQ ID NO: 121, and optionally wherein the fatty acid of the oil is enriched in C14 and C16 fatty acids.
13. The nucleic acid construct of claim 2, wherein the FatB gene expresses a protein having an amino acid sequence falling within clade 12 of Table 1a, the sequence having at least 72.8% sequence identity with each of SEQ ID NOs: 129 and 186, and optionally wherein the fatty acid of the oil is enriched in C16 fatty acids.
14. An isolated nucleic acid or recombinant DNA construct comprising a nucleic acid, wherein the nucleic acid has at least 70% sequence identity to any of SEQ ID NOS: 2, 3, 5, 6, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 23, 24, 26, 27, 29, 30, 32, 33, 35, 36, 38, 39, 41, 42, 44, 45, 47, 48, 50, 51, 53, 54, 56, 57, 59, 60, 62, 63, 65, 66, 68, 69, 71, 72, 74, 76, 78, 80, 81, 83, 84, 86, 87, 89, 90, 92, 93, 95, 96, 98, 99, 101, 102, 104, 105, 107, 109 or any equivalent sequences by virtue of the degeneracy of the genetic code.
15. An isolated nucleic acid sequence encoding a protein or a host cell expressing a protein having at least 70% sequence identity to any of SEQ ID NOS: 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58, 61, 64, 67, 70, 73, 75, 77, 79, 82, 85, 88, 91, 94, 97, 100, 103, 106, 108, 110-192 or a fragment thereof having acyl-ACP thioesterase activity.
16. The isolated nucleic acid of claim 15, wherein the protein has acyl-ACP thioesterase activity operable to alter the fatty acid profile of an oil produced by a recombinant cell comprising that sequence.
17. A method of producing a recombinant cell that produces an altered fatty acid profile, the method comprising transforming the cell with a nucleic acid according to any of claims 1-3.
18. A host cell produced by the method of claim 17.
19. The host cell of claim 18, wherein the host cell is selected from a plant cell, a microbial cell, and a microalgal cell.
20. A method for producing an oil or oil-derived product, the method comprising cultivating a host cell of claim 5 or 6, and extracting oil produced thereby, optionally wherein the cultivation is heterotrophic growth on sugar.
21. The method of claim 20, further comprising producing a fatty acid, fuel, chemical, or other oil-derived product from the oil.
22. An oil produced by the method of claim 20, optionally having a fatty acid profile comprising at least 20% C8, C10, C12, C14 or C16 fatty acids.
23. An oil-derived product produced by the method of claim 21.
24. The oil of claim 23, wherein the oil is produced by a microalgae and optionally, lacks C24-alpha sterols.
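
Several of the claims above require a sequence to have at least a stated percent identity with each of a set of reference sequences. The sketch below shows one way such a test could be expressed. It rests on assumptions that are not taken from the claims: identity is computed here as identical columns divided by total columns of a Needleman-Wunsch global alignment with illustrative scores (match +1, mismatch -1, gap -2), the function names `global_identity` and `meets_clade_threshold` are hypothetical, and the inputs are toy 10-residue fragments rather than full-length SEQ ID NO entries. A different alignment tool or scoring scheme may report different percentages.

```python
def global_identity(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2) -> float:
    """Percent identity over a Needleman-Wunsch global alignment (illustrative scoring)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical columns and all columns.
    i, j, same, cols = n, m, 0, 0
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            same += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        cols += 1
    return 100.0 * same / cols if cols else 0.0


def meets_clade_threshold(candidate: str, references: list[str], threshold_pct: float) -> bool:
    """True only if the candidate reaches the threshold against every reference."""
    return all(global_identity(candidate, ref) >= threshold_pct for ref in references)


# Toy fragments for illustration only; real use would pass full-length sequences.
candidate = "GLGRIAQDGL"
references = ["GLGRIVQDGL"]
print(global_identity(candidate, references[0]))            # 90.0
print(meets_clade_threshold(candidate, references, 85.9))   # True
print(meets_clade_threshold(candidate, references, 94.6))   # False
```

The use of `all` mirrors the "with each of" wording: a single reference below the threshold makes the check fail.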
EP14769502.7A 2013-03-15 2014-03-13 Thioesterases and cells for production of tailored oils Withdrawn EP2971024A4 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201361791861P 2013-03-15 2013-03-15
US13/837,996 US9290749B2 (en) 2013-03-15 2013-03-15 Thioesterases and cells for production of tailored oils
US201361917217P 2013-12-17 2013-12-17
PCT/US2014/026644 WO2014151904A1 (en) 2013-03-15 2014-03-13 Thioesterases and cells for production of tailored oils

Publications (2)

Publication Number Publication Date
EP2971024A1 (en) 2016-01-20
EP2971024A4 (en) 2016-11-16

Family

ID=51581068

Family Applications (1)

Application Number Title Priority Date Filing Date
EP14769502.7A Withdrawn EP2971024A4 (en) 2013-03-15 2014-03-13 Thioesterases and cells for production of tailored oils

Country Status (9)

Country Link
EP (1) EP2971024A4 (en)
JP (1) JP2016518112A (en)
KR (1) KR20150128770A (en)
CN (1) CN105143458A (en)
AU (2) AU2014236763B2 (en)
BR (1) BR112015023192A8 (en)
CA (1) CA2904395A1 (en)
MX (1) MX2015011507A (en)
WO (1) WO2014151904A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9816079B2 (en) 2013-01-29 2017-11-14 Terravia Holdings, Inc. Variant thioesterases and methods of use
US9567615B2 (en) 2013-01-29 2017-02-14 Terravia Holdings, Inc. Variant thioesterases and methods of use
US9290749B2 (en) 2013-03-15 2016-03-22 Solazyme, Inc. Thioesterases and cells for production of tailored oils
US9783836B2 (en) 2013-03-15 2017-10-10 Terravia Holdings, Inc. Thioesterases and cells for production of tailored oils
ES2772123T3 (en) 2014-07-24 2020-07-07 Corbion Biotech Inc Thioesterase variants and methods of use
CN107208103A (en) * 2014-09-18 2017-09-26 Terravia Holdings, Inc. Acyl-ACP thioesterases and mutants thereof
US20180142218A1 (en) 2016-10-05 2018-05-24 Terravia Holdings, Inc. Novel acyltransferases, variant thioesterases, and uses thereof
US20230143841A1 (en) 2020-01-16 2023-05-11 Corbion Biotech, Inc. Beta-ketoacyl-acp synthase iv variants

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5298421A (en) 1990-04-26 1994-03-29 Calgene, Inc. Plant medium-chain-preferring acyl-ACP thioesterases and related methods
US5512482A (en) * 1990-04-26 1996-04-30 Calgene, Inc. Plant thioesterases
US5344771A (en) 1990-04-26 1994-09-06 Calgene, Inc. Plant thiosterases
US5639790A (en) 1991-05-21 1997-06-17 Calgene, Inc. Plant medium-chain thioesterases
US5455167A (en) * 1991-05-21 1995-10-03 Calgene Inc. Medium-chain thioesterases in plants
US5654495A (en) 1992-10-30 1997-08-05 Calgene, Inc. Production of myristate in plant cells
US5850022A (en) 1992-10-30 1998-12-15 Calgene, Inc. Production of myristate in plant cells
EP0716708A1 (en) * 1993-09-03 1996-06-19 Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. Medium chain-specific thioesterases
EP0728212A1 (en) * 1993-11-10 1996-08-28 Calgene, Inc. Plant acyl acp thioesterase sequences
US5807893A (en) 1993-11-18 1998-09-15 Voelker; Toni Alois Plant thioesterases and use for modification of fatty acid composition in plant seed oils
WO2008151149A2 (en) 2007-06-01 2008-12-11 Solazyme, Inc. Production of oil in microorganisms
US8981180B2 (en) * 2007-07-09 2015-03-17 Bayer Cropscience N.V. Brassica plant comprising mutant fatty acyl-ACP thioesterase alleles
US7982035B2 (en) 2007-08-27 2011-07-19 Duquesne University Of The Holy Spirit Tricyclic compounds having antimitotic and/or antitumor activity and methods of use thereof
CN103834683A (en) * 2008-06-20 2014-06-04 BASF Plant Science GmbH Plants having enhanced yield-related traits and a method for making the same
MX2011005630A (en) 2008-11-28 2011-09-28 Solazyme Inc Manufacturing of tailored oils in recombinant heterotrophic microorganisms.
CA2758301A1 (en) * 2009-04-10 2010-10-14 Ls9, Inc. Production of fatty acid derivatives
EP2575486B1 (en) 2010-05-28 2021-09-01 Corbion Biotech, Inc. Food compositions comprising tailored oils
SG192594A1 (en) 2011-02-02 2013-09-30 Solazyme Inc Tailored oils produced from recombinant oleaginous microorganisms
US8951762B2 (en) * 2011-07-27 2015-02-10 Iowa State University Research Foundation, Inc. Materials and methods for using an acyl—acyl carrier protein thioesterase and mutants and chimeras thereof in fatty acid synthesis
CN102586350A (en) 2012-01-09 2012-07-18 Beijing University of Chemical Technology Production method for C8:0/C10:0/C12:0/C14:0 medium-chain fatty acid and ethyl ester thereof
KR20150001830A (en) 2012-04-18 2015-01-06 Solazyme, Inc. Tailored oils

Also Published As

Publication number Publication date
BR112015023192A2 (en) 2017-11-21
EP2971024A4 (en) 2016-11-16
CA2904395A1 (en) 2014-09-25
MX2015011507A (en) 2016-04-07
CN105143458A (en) 2015-12-09
AU2014236763B2 (en) 2018-08-23
JP2016518112A (en) 2016-06-23
KR20150128770A (en) 2015-11-18
BR112015023192A8 (en) 2018-01-02
WO2014151904A1 (en) 2014-09-25
AU2014236763A1 (en) 2015-10-01
AU2018267601A1 (en) 2018-12-06

Similar Documents

Publication Publication Date Title
US10557114B2 (en) Thioesterases and cells for production of tailored oils
US10316299B2 (en) Ketoacyl ACP synthase genes and uses thereof
US10125382B2 (en) Acyl-ACP thioesterases and mutants thereof
AU2018267601A1 (en) Thioesterases and cells for production of tailored oils
US20190002934A1 (en) Tailored oils
US20200392470A1 (en) Novel acyltransferases, variant thioesterases, and uses thereof
US9102973B2 (en) Tailored oils
US20160251685A1 (en) Thioesterases and cells for production of tailored oils
US20190382780A1 (en) Novel acyltransferases and methods of using
CA3060515A1 (en) Novel acyltransferases, variant thioesterases, and uses thereof
WO2023212726A2 (en) Regiospecific incorporation of fatty acids in triglyceride oil

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20150921

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAX Request for extension of the european patent (deleted)
RIC1 Information provided on ipc code assigned before grant

Ipc: C12N 9/00 20060101ALI20160630BHEP

Ipc: C12N 5/02 20060101ALI20160630BHEP

Ipc: C12P 7/64 20060101AFI20160630BHEP

Ipc: C12N 5/00 20060101ALI20160630BHEP

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: TERRAVIA HOLDINGS, INC.

A4 Supplementary search report drawn up and despatched

Effective date: 20161013

RIC1 Information provided on ipc code assigned before grant

Ipc: C12N 5/00 20060101ALI20161007BHEP

Ipc: C12P 7/64 20060101AFI20161007BHEP

Ipc: C12N 5/02 20060101ALI20161007BHEP

Ipc: C12N 9/00 20060101ALI20161007BHEP

17Q First examination report despatched

Effective date: 20170711

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: CORBION BIOTECH, INC.

RIN1 Information on inventor provided before grant (corrected)

Inventor name: CASOLARI, JASON

Inventor name: FRANKLIN, SCOTT

Inventor name: RUDENKO, GEORGE N.

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20200603