WO2023056338A1

WO2023056338A1 - Biosynthetic production of vitamin A compounds

Info

Publication number: WO2023056338A1
Application number: PCT/US2022/077235
Authority: WO
Inventors: Wanli Lu; Yisheng WU; Jacob Thomas COURTNEY; Sean Robert JOHNSON; Oliver YU
Original assignee: Conagen Inc.
Priority date: 2021-09-30
Filing date: 2022-09-29
Publication date: 2023-04-06
Also published as: EP4409018A1; CN118339307A; JP2024536212A

Abstract

Provided herein are synthetic/recombinant nucleic acid molecules, nucleic acid constructs, enzymes, fusion enzymes, transformed host cells, and methods for making vitamin A compounds retinal, retinol and retinyl esters.

Description

BIOSYNTHETIC PRODUCTION OF VITAMIN A COMPOUNDS

RELATED APPLICATION

This application claims the benefit under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 63/250,556, filed on September 30, 2021, entitled “BIOSYNTHETIC PRODUCTION OF VITAMIN A COMPOUNDS,” the entire contents of which are incorporated herein by reference.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

The contents of the electronic sequence listing (C149770062WO00-SEQ-ZJG.xml; Size: 92,395 bytes; and Date of Creation: September 21, 2022) are herein incorporated by reference in their entirety.

FIELD OF THE DISCLOSURE

The field of the disclosure relates to methods and processes for the biosynthetic production of a group of vitamin A (VA) compounds that include retinal, retinol, and retinyl esters. More specifically, the present methods and processes include those that make use of microbial host cells that have been transformed to include heterologous nucleic acids encoding a set of enzymes including novel bacteriorhodopsin-related protein-like homolog (Blh) proteins, retinal reductase and acyltransferase.

BACKGROUND

Vitamins are substances that our bodies need and that must be obtained regularly from the diet. Among them, vitamin A refers to a group of unsaturated isoprenoids including retinal, retinol, retinoic acid, and retinyl esters, which are the aldehyde, alcohol, carboxylic acid, and ester forms of vitamin A, respectively. The most common retinyl esters are retinyl acetate and retinyl palmitate, whose structures are retinol esterified with acetate or palmitate, respectively (FIG. 1).

Vitamin A plays essential roles in multiple physiological functions such as embryonic development, adult growth and development, eye development and vision, maintenance of the immune system, and differentiation of epithelial cells. Vitamin A deficiency is a major cause of childhood blindness and is associated with an increased susceptibility to infections and to skin and thyroid disorders. In addition to its use in nutritional supplements, vitamin A also attracts broad interest as an animal feed additive, a raw material for anti-aging cosmetics, a medicament for wrinkle improvement and a pharmaceutical for skin disorders.

The worldwide vitamin A market size is estimated at about 864 million US dollars by 2021. Currently, the chemical synthesis of vitamin A contributes significantly to the market share. Chemical synthesis is not ideal for safe and sustainable production of vitamin A: it entails disadvantages such as increased costs due to the costs of key starting materials, use of toxic precursors, complicated purification processes, generation of undesirable byproducts and potential environmental pollution.

SUMMARY OF THE DISCLOSURE

In the world of prokaryotes, the retinal molecule plays a critical role in the survival of purple bacteria such as the Halobacteria species Halobacterium halobium. The purple bacteria, as well as some other archaea and eubacteria, take advantage of a retinal-based photosynthetic process to generate metabolic energy from sunlight. Bacteriorhodopsin, an integral membrane protein, acts as a proton pump that captures light and then pumps protons out of the cell. The resulting proton gradient can be converted to chemical energy. Retinal is the molecule that is covalently attached to Lys216 of bacteriorhodopsin via a Schiff base linkage, and the cis/trans isomerization of retinal triggered by light absorption leads to a conformational change in bacteriorhodopsin, resulting in light-driven transmembrane proton translocation (FIG. 2). Retinal is abundant in retinal-based photosynthetic microbes. Bacteriorhodopsin-related protein-like homolog (Blh) protein has been reported to catalyze the conversion of beta-carotene (BC) to retinal.

According to the current disclosure, retinal, retinol, retinyl acetate and retinyl palmitate can be reliably produced at a high yield by metabolic engineering and fermentation technology using microbial cell cultures such as the oleaginous yeast Yarrowia lipolytica, baker's yeast Saccharomyces cerevisiae and/or Corynebacterium glutamicum. These microbial cell cultures can synthesize “natural” retinal, retinol and retinyl esters de novo in commercially significant yields. Hence, the new biosynthetic methods provided herein create an unnatural way to sustainably and economically produce naturally derived vitamin A compounds and lessen the burden of chemical synthetic production of vitamin A compounds. More specifically, the present disclosure encompasses methods and compositions for making “natural” retinal, retinol, retinyl acetate and retinyl palmitate by microbial fermentation, where such methods and compositions involve the transformation of a eukaryotic cell with a set of heterologous nucleic acid molecules that encode enzymes including novel bacteriorhodopsin-related protein-like homolog (Blh) proteins, retinal reductase and acyltransferase.

The present disclosure is based, in part, on the finding that certain Blh proteins derived from retinal-based photosynthetic microbes can efficiently catalyze the conversion of beta-carotene to retinal. Compared with the currently widely used beta-carotene 15,15'-dioxygenases derived from fungi, bacteria, algae or animals, the newly discovered Blh proteins displayed higher catalytic efficiency in heterologous hosts that could produce a high content of retinoids. By combining the overexpression of nucleic acid molecules encoding a Blh protein, retinal reductase and acyltransferase with the overexpression of two beta-carotene biosynthetic genes encoding phytoene dehydrogenase (CarB) and bifunctional lycopene cyclase/phytoene synthase (CarRP), as well as a carotenoid precursor biosynthetic gene encoding a fusion protein comprising farnesyl diphosphate synthase (FPPS) fused in-frame to geranylgeranyl diphosphate synthase (GGPPS), and a gene encoding a fatty acid elongation protein (ELO1), the production of retinal, retinol and retinyl esters can be improved, giving higher titers than strategies using alternative beta-carotene 15,15'-dioxygenases. The present disclosure, therefore, provides economical and reliable methods for producing “natural” vitamin A compounds without the disadvantages associated with chemical synthesis.
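The enzyme chain described above can be summarized as an ordered list of conversion steps. The following Python sketch is illustrative only. The enzyme labels come from the text; the intermediate names (IPP/DMAPP, phytoene, lycopene) are assumptions drawn from standard carotenoid biochemistry rather than from this section, and the names `Step`, `PATHWAY`, and `enzymes_between` are hypothetical. ELO1 is left out because it acts on fatty acid elongation rather than on the retinoid backbone.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    enzyme: str
    substrate: str
    product: str


# Ordered conversion steps. Intermediates not named in the text
# (IPP/DMAPP, phytoene, lycopene) are assumptions from general biochemistry.
PATHWAY = [
    Step("FPPS::GGPPS fusion", "IPP/DMAPP", "GGPP"),
    Step("CarRP (phytoene synthase activity)", "GGPP", "phytoene"),
    Step("CarB (phytoene dehydrogenase)", "phytoene", "lycopene"),
    Step("CarRP (lycopene cyclase activity)", "lycopene", "beta-carotene"),
    Step("Blh", "beta-carotene", "retinal"),
    Step("retinal reductase", "retinal", "retinol"),
    Step("acyltransferase", "retinol", "retinyl esters"),
]


def enzymes_between(start: str, end: str) -> List[str]:
    """Return the enzymes needed to convert `start` into `end`, in order."""
    enzymes = []
    current = start
    for step in PATHWAY:
        if current == end:
            break
        if step.substrate == current:
            enzymes.append(step.enzyme)
            current = step.product
    if current != end:
        raise ValueError(f"no route from {start!r} to {end!r} in PATHWAY")
    return enzymes


if __name__ == "__main__":
    print(enzymes_between("beta-carotene", "retinyl esters"))
```

Listing the steps this way makes it easy to see which heterologous genes a host needs for a given target compound: a host that already accumulates beta-carotene needs only the last one, two, or three steps for retinal, retinol, or retinyl esters, respectively.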

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a SAR116 cluster alpha proteobacterium HIMB100 beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 1. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 1. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 1. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 2.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Deltaproteobacteria bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 3. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 3. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 3. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 4.
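The identity thresholds recited in these embodiments can be checked computationally once two sequences have been aligned. This section does not state how identity is to be calculated, so the following Python sketch rests on one common convention as an assumption: identical residues are counted over the columns of a precomputed pairwise alignment, with gaps written as "-". The function names and the toy peptides in the usage lines are hypothetical and are not taken from the sequence listing.

```python
from typing import List

# Amino acid identity thresholds recited in the embodiments above.
THRESHOLDS = (80, 85, 90, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a precomputed pairwise alignment.

    Assumption: both strings have equal length and use '-' for gaps.
    Columns where both sequences have a gap are ignored; a residue opposite
    a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * matches / compared


def thresholds_met(identity: float) -> List[int]:
    """Return every recited threshold that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Toy peptides invented for illustration; one substitution in ten residues.
    reference = "MKTAYIAKQR"
    variant = "MKTAYLAKQR"
    identity = percent_identity(reference, variant)
    print(identity, thresholds_met(identity))
```

Other conventions, such as dividing by the length of the shorter sequence or by the reference length, give different percentages for the same alignment, so any real comparison should state which denominator and alignment tool were used.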

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a SAR116 cluster bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 5. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 5. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 6.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be an Alphaproteobacteria bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 7. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 7. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 7. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 8.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Methylococcaceae bacterium TMED69 beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 9. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 9. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 9. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 10.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be an uncultured marine bacterium HF10_19P19 beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 11. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 11. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 11. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 12.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a SAR116 cluster bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 13. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 13. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 13. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 14.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a SAR116 cluster bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 15. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 15. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 15. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 16.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Candidatus Puniceispirillum sp. beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 17. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 17. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 17. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 18.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be an uncultured marine bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 19. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 19. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 19. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 20.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be an uncultured marine bacterium EB000_55B11 beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 21. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 21. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 21. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 22.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be an Opitutae bacterium Tous-C10FEB beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 23. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 23. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 23. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 24.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Gammaproteobacteria bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 25. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 25. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 25. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 26.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a PS1 clade bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 27. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 27. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 27. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 28.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a SAR116 cluster bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 29. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 29. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 29. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 30.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be an Alphaproteobacteria bacterium TMED87 beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 31. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 31. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 31. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 32.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Euryarchaeota archaeon beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 33. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 33. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 33. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 34.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Gammaproteobacteria bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 35. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 35. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 35. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 36.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Euryarchaeota archaeon beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 37. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 37. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 37. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 38.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Porticoccaceae bacterium beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 39. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 39. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 39. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 40.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Rhodobacter sp. BACL10 MAG-120419-bin15 beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 41. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 41. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 41. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 42.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Cellvibrionales bacterium TMED49 beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 43. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 43. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 43. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 44.

In various embodiments, the enzyme having carotenoid 15,15'-cleavage activity can be a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the enzyme can be a Candidatus Puniceispirillum sp. TMED52 beta-carotene 15,15'-dioxygenase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 45. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 45. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 45. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 46.

In various embodiments, the microorganisms having retinal-producing capability of the present disclosure may be transformed with a gene encoding retinal reductase (RALR). The RALR enzyme can catalyze the conversion from retinal to retinol. For example, the RALR enzyme can be a Proteobacteria NADP+-dependent aldehyde reductase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 47. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 47. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 47. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 48.

In various embodiments, the microorganisms having retinol-producing capability of the present disclosure may be transformed with a gene encoding acyltransferase (AT). The AT enzyme can catalyze the conversion from retinol to retinyl esters that include, but are not limited to, retinyl acetate and retinyl palmitate. For example, the AT enzyme can be a Yarrowia lipolytica CLIB122 YALI0E32769p acyl-CoA:diacylglycerol acyltransferase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 49. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 49. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 49. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 50.

In various embodiments, the microorganisms having retinol-producing capability of the present disclosure may be transformed with a gene encoding acyltransferase (AT). The AT enzyme can catalyze the conversion from retinol to retinyl esters that include, but are not limited to, retinyl acetate and retinyl palmitate. For example, the AT enzyme can be a Mus musculus diacylglycerol O-acyltransferase 1 or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 51. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 51. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 51. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 52.

In various embodiments, the microorganisms having retinol-producing capability of the present disclosure may be transformed with a gene encoding acyltransferase (AT). The AT enzyme can catalyze the conversion from retinol to retinyl esters that include, but are not limited to, retinyl acetate and retinyl palmitate. For example, the AT enzyme can be a Campylobacter concisus lecithin retinol acyltransferase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 53. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 53. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 53. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 54.

In various embodiments, the microorganisms having retinol-producing capability of the present disclosure may be transformed with a gene encoding acyltransferase (AT). The AT enzyme can catalyze the conversion from retinol to retinyl esters that include, but are not limited to, retinyl acetate and retinyl palmitate. For example, the AT enzyme can be a Pontibacillus halophilus lecithin retinol acyltransferase family protein or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 55. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 55. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 55. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 56.

In various embodiments, the microorganisms having retinol-producing capability of the present disclosure may be transformed with a gene encoding acyltransferase (AT). The AT enzyme can catalyze the conversion from retinol to retinyl esters that include, but are not limited to, retinyl acetate and retinyl palmitate. For example, the AT enzyme can be an Aeromonas sp. L_1B5_3 lecithin retinol acyltransferase or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 57. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 57. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 57. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 58.

In various embodiments, the microorganisms having retinyl palmitate-producing capability of the present disclosure may be transformed with a gene encoding elongation of fatty acids protein 1 which belongs to ELO family. The EL.0 family enzyme is involved in fatty acids elongation to further improve the yield of retinyl palmitate. For example, the ELO family enzyme can be a Yarrowia lipolytica CLIB122 YALI0F06754p elongation of fatty acids protein 1 or a functional variant thereof. For example, the enzyme can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 59. In certain embodiments, the enzyme can comprise the amino acid sequence of SEQ ID NO: 59. In other embodiments, the enzyme can consist of the amino acid sequence of SEQ ID NO: 59. Accordingly, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 60. Yet another aspect of the present disclosure can relate to a method of transforming a host cell. The method can include introducing into a host cell any of the present nucleic acid molecules or nucleic acid constructs described in the present disclosure, and selecting or screening for a transformed host cell. The host cell can be a prokaryotic cell or a eukaryotic cell. In some embodiments, the host cell can be microbial cell such as a bacterial cell, a yeast cell, an algal cell, or a fungal cell. In certain embodiments, the host cell can be selected from Yarrowia; Escherichia; Salmonella; Bacillus; Acinetobacter; Streptomyces; Corynebacterium; Methylosinus; Methylomonas; Rhodococcus; Pseudomonas; Rhodobacter; Synechocystis; Saccharomyces; Zygosaccharomyces; Kluyveroumyces; Candida; Hansenula; Debaryomyces; Mucor; Pichia; Torulopsis; Aspergillus; Arthrobotlys; Brevibacteria; Microbacterium; Arthrobacter; Citrobacter; Klebsiella; Pantoea; and Clostridium. 
In particular embodiments, the host cell can be Yarrowia lipolytica.

Yet another aspect of the present disclosure can relate to a recombinant cell that includes any of the present nucleic acid molecules described above. The recombinant cell can be a prokaryotic cell or a eukaryotic cell. In some embodiments, the recombinant cell can be a microbial cell such as a bacterial cell, a yeast cell, an algal cell, or a fungal cell. In certain embodiments, the recombinant cell can be selected from Yarrowia; Escherichia; Salmonella; Bacillus; Acinetobacter; Streptomyces; Corynebacterium; Methylosinus; Methylomonas; Rhodococcus; Pseudomonas; Rhodobacter; Synechocystis; Saccharomyces; Zygosaccharomyces; Kluyveromyces; Candida; Hansenula; Debaryomyces; Mucor; Pichia; Torulopsis; Aspergillus; Arthrobotrys; Brevibacteria; Microbacterium; Arthrobacter; Citrobacter; Klebsiella; Pantoea; and Clostridium. In particular embodiments, the recombinant cell can be Yarrowia lipolytica.

Yet another aspect of the present disclosure can relate to a method for synthesizing vitamin A compounds. The method can include culturing a transformed host cell according to the present teachings in a suitable medium, where the transformed host cell includes any of the present nucleic acid molecules described above, resulting in synthesis of vitamin A compounds by the transformed host cell. In some embodiments, the transformed host cell can produce geranylgeranyl diphosphate (GGPP) via a native mevalonate pathway. In some embodiments, the transformed host cell can be further transformed to overexpress one or more genes involved in the mevalonate pathway. For example, to increase production of beta-carotene, the precursor to the production of vitamin A compounds, the transformed host cell can be transformed to overexpress one or more of hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), farnesyl diphosphate synthase (FPPS), and geranylgeranyl diphosphate synthase (GGPPS). In some embodiments, the transformed host cell can be transformed to overexpress a synthetic nucleic acid molecule that encodes a fusion enzyme including an FPPS fused in-frame to a GGPPS (FPPS::GGPPS). In some embodiments, the transformed host cell can be further transformed to overexpress a phytoene dehydrogenase (CarB) and a bifunctional lycopene cyclase/phytoene synthase (CarRP) to increase the production of beta-carotene, the immediate precursor to retinal, which in turn increases the production of vitamin A compounds.

In one aspect, the present disclosure relates to a method of producing vitamin A compounds, the method comprising incubating a first reaction mixture comprising beta-carotene and a bacteriorhodopsin-related protein-like homolog for a sufficient time to produce retinal. In some embodiments, the bacteriorhodopsin-related protein-like homolog comprises an amino acid sequence that is at least 70% (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) identical to the amino acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45. In some embodiments, the bacteriorhodopsin-related protein-like homolog comprises the amino acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45.
In some embodiments, the method further comprises incubating a second reaction mixture comprising the retinal and a retinal reductase for a sufficient time to produce retinol. In some embodiments, the retinal reductase comprises an amino acid sequence that is at least 70% (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) identical to the amino acid sequence of SEQ ID NO: 47. In some embodiments, the retinal reductase comprises the amino acid sequence of SEQ ID NO: 47. In some embodiments, the method further comprises incubating a third reaction mixture comprising the retinol and an acyltransferase for a sufficient time to produce retinyl ester. In some embodiments, the acyltransferase comprises an amino acid sequence that is at least 70% (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) identical to the amino acid sequence of any one of SEQ ID NOs: 49, 51, 53, 55, and 57. In some embodiments, the acyltransferase comprises the amino acid sequence of any one of SEQ ID NOs: 49, 51, 53, 55, and 57. In some embodiments, the retinyl ester comprises retinyl acetate, retinyl palmitate, or a combination thereof. In some embodiments, the retinyl ester comprises retinyl palmitate. In some embodiments, the third reaction mixture further comprises an elongation of fatty acids protein 1. In some embodiments, the elongation of fatty acids protein 1 comprises an amino acid sequence that is at least 70% (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) identical to SEQ ID NO: 59. In some embodiments, the elongation of fatty acids protein 1 comprises the amino acid sequence of SEQ ID NO: 59. In some embodiments, the method further comprises obtaining beta-carotene.
In some embodiments, the beta-carotene is obtained by incubating geranylgeranyl diphosphate (GGPP) with a phytoene dehydrogenase (CarB) and a bifunctional lycopene cyclase/phytoene synthase (CarRP). In some embodiments, the GGPP is obtained from acetoacetyl-CoA via a mevalonate pathway comprising hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), farnesyl diphosphate synthase (FPPS), and geranylgeranyl diphosphate synthase (GGPPS). In some embodiments, the GGPP is obtained from acetoacetyl-CoA via a mevalonate pathway comprising hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), and a fusion enzyme comprising a farnesyl diphosphate synthase (FPPS) fused to a geranylgeranyl diphosphate synthase (GGPPS). In some embodiments, the method is an in vitro method. In some embodiments, the method is carried out in a host cell. In some embodiments, the host cell is a prokaryotic cell or a eukaryotic cell. In some embodiments, the host cell is a bacterial cell, a yeast cell, an algal cell, or a fungal cell. In some embodiments, the host cell is selected from Yarrowia; Escherichia; Salmonella; Bacillus; Acinetobacter; Streptomyces; Corynebacterium; Methylosinus; Methylomonas; Rhodococcus; Pseudomonas; Rhodobacter; Synechocystis; Saccharomyces; Zygosaccharomyces; Kluyveromyces; Candida; Hansenula; Debaryomyces; Mucor; Pichia; Torulopsis; Aspergillus; Arthrobotrys; Brevibacteria; Microbacterium; Arthrobacter; Citrobacter; Klebsiella; Pantoea; and Clostridium. In some embodiments, the host cell is Yarrowia lipolytica.
In some embodiments, the host cell is transformed with a nucleic acid molecule encoding a bacteriorhodopsin-related protein-like homolog, wherein the nucleic acid molecule comprises a nucleotide sequence that is at least 70% (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100%) identical to the nucleotide sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and 46. In some embodiments, the host cell is further transformed with a nucleic acid molecule encoding a retinal reductase, wherein the nucleic acid molecule comprises a nucleotide sequence that is at least 70% (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100%) identical to the nucleotide sequence of SEQ ID NO: 48. In some embodiments, the host cell is further transformed with a nucleic acid molecule encoding an acyltransferase, wherein the nucleic acid molecule comprises a nucleotide sequence that is at least 70% (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100%) identical to the nucleotide sequence of any one of SEQ ID NOs: 50, 52, 54, 56, and 58. In some embodiments, the host cell is further transformed with a nucleic acid molecule encoding an elongation of fatty acids protein 1, wherein the nucleic acid molecule comprises a nucleotide sequence that is at least 70% (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100%) identical to the nucleotide sequence of SEQ ID NO: 60. In some embodiments, the host cell is further transformed with a nucleic acid molecule encoding a phytoene dehydrogenase (CarB) and/or a nucleic acid molecule encoding a bifunctional lycopene cyclase/phytoene synthase (CarRP). 
In some embodiments, the host cell is further transformed with one or more nucleic acid molecules encoding one or more of hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), farnesyl diphosphate synthase (FPPS), and geranylgeranyl diphosphate synthase (GGPPS). In some embodiments, the host cell is further transformed with one or more nucleic acid molecules encoding one or more of hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), and a fusion enzyme comprising a farnesyl diphosphate synthase (FPPS) fused to a geranylgeranyl diphosphate synthase (GGPPS). In some embodiments, the vitamin A compounds comprise retinal, retinol, retinyl ester, and combinations thereof. In some embodiments, the retinyl ester comprises retinyl acetate, retinyl palmitate, and combination thereof. In some embodiments, the method further comprises isolating the vitamin A compounds. In another aspect, the present disclosure relates to a nucleic acid molecule comprising a nucleotide sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, and 60. In some embodiments, the nucleotide sequence is operably linked to a promoter. In some embodiments, the nucleic acid molecule is a vector, optionally an expression vector. Another aspect of the present disclosure relates to a host transformed with the nucleic acid molecule described herein. Yet another aspect of the present disclosure relates to a host cell transformed with a nucleic acid molecule comprising a nucleotide sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and 46. In some embodiments, the host cell is further transformed with a nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO: 48. 
In some embodiments, the host cell is further transformed with a nucleic acid molecule comprising a nucleotide sequence of any one of SEQ ID NOs: 50, 52, 54, 56, and 58. In some embodiments, the host cell is further transformed with a nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO: 60. In some embodiments, the host cell is further transformed with a nucleic acid molecule encoding a phytoene dehydrogenase (CarB) and/or a nucleic acid molecule encoding a bifunctional lycopene cyclase/phytoene synthase (CarRP). In some embodiments, the host cell is further transformed with one or more nucleic acid molecules encoding one or more of hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), farnesyl diphosphate synthase (FPPS), and geranylgeranyl diphosphate synthase (GGPPS). In some embodiments, the host cell is further transformed with one or more nucleic acid molecules encoding one or more of hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), and a fusion enzyme comprising a farnesyl diphosphate synthase (FPPS) fused to a geranylgeranyl diphosphate synthase (GGPPS). In some embodiments, the host cell is a bacterial cell, a yeast cell, an algal cell, or a fungal cell. In some embodiments, the host cell is selected from Yarrowia; Escherichia; Salmonella; Bacillus; Acinetobacter; Streptomyces; Corynebacterium; Methylosinus; Methylomonas; Rhodococcus; Pseudomonas; Rhodobacter; Synechocystis; Saccharomyces; Zygosaccharomyces; Kluyveromyces; Candida; Hansenula; Debaryomyces; Mucor; Pichia; Torulopsis; Aspergillus; Arthrobotrys; Brevibacteria; Microbacterium; Arthrobacter; Citrobacter; Klebsiella; Pantoea; and Clostridium. In some embodiments, the host cell is Yarrowia lipolytica.
Another aspect of the present disclosure relates to a method of producing vitamin A compounds, the method comprising culturing any one of the host cells described herein.

While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description presented herein are not intended to limit the disclosure to the particular embodiments disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure as defined by the appended claims.

Other features and advantages of this disclosure will become apparent in the following detailed description of preferred embodiments of this disclosure, taken with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present disclosure, which can be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.

FIG.1 shows the structures of exemplary vitamin A compounds and their derivatives, including retinal, retinol, retinoic acid, and retinyl esters (retinyl acetate and retinyl palmitate).

FIG.2 describes a role of the retinal molecule in bacteriorhodopsin-based photosynthesis, representing the light-driven trans-to-cis isomerization of retinal and its spontaneous reversion. H_N⁺ and H_P⁺ are protons on the negative and positive sides of the membrane, respectively.

FIG.3 illustrates a biosynthesis pathway to vitamin A compounds according to the present teachings. The top portion illustrates a mevalonate pathway, which can take place endogenously in a selected host cell. The bottom portion illustrates a heterologous biosynthetic pathway from geranylgeranyl diphosphate (GGPP) to vitamin A compounds including retinal, retinol, and retinyl ester. In the presence of a bacteriorhodopsin-related protein-like homolog (Blh), retinal reductase (RALR), and acyltransferase (AT) enzymes according to the present teachings, beta-carotene is converted to retinal, then to retinol, and then to retinyl ester. The underlined genes are those that are overexpressed in the transformed host cell.

FIGs.4A to 4C show the plasmids generated to overexpress the various genes highlighted in FIG.3. These gene cassettes were integrated into the chromosome of the Yarrowia lipolytica ATCC 90811 strain as described in more detail in Examples 1, 4 and 7.

FIG.5 shows structures and a detailed vitamin A biosynthetic pathway from beta-carotene to vitamin A compounds including retinal, retinoic acid, retinol, and retinyl ester, comprising Blh or beta-carotene 15,15'-dioxygenase (BCDO), retinal dehydrogenase (RALDH), RALR, and lecithin:retinol acyltransferase (LRAT) or acyl-CoA:retinol acyltransferase (ARAT).

FIG.6 shows a phylogenetic tree generated from 23 novel Blh proteins and the functionally known Blh protein from uncultured marine bacterium 66A03 (underlined). The tree is rooted using the BCDO1 gene of Mus musculus as an outgroup.

FIG.7 shows amino acid sequence identities among 23 novel Blh proteins and the functionally known Blh protein from uncultured marine bacterium 66A03.

FIG.8 shows the phenotypes of Yarrowia lipolytica strains having capabilities to produce beta-carotene, retinal, and retinol as described in Example 4.

FIG.9 describes a role of retinol acyltransferase in the production of retinyl esters. Acyl-CoA and palmitoyl-CoA could be utilized as substrates for retinyl acetate and retinyl palmitate, respectively.

FIG.10 shows an HPLC profile and UV spectra showing beta-carotene production in cell culture of a transformed Yarrowia lipolytica strain.

FIG.11A shows an HPLC profile of a retinal standard. FIG.11B shows an HPLC profile of retinal production in cell culture of a transformed Yarrowia lipolytica strain. FIG.11C shows UV spectra of a retinal standard. FIG.11D shows UV spectra of retinal production in cell culture of a transformed Yarrowia lipolytica strain. In FIG.11C and FIG.11D, each line represents either the start timepoint, the peak timepoint, or the end timepoint produced by HPLC analysis.

FIG.12A shows an HPLC profile of a retinol standard. FIG.12B shows an HPLC profile of retinol production in cell culture of a transformed Yarrowia lipolytica strain. FIG.12C shows UV spectra of a retinol standard. FIG.12D shows UV spectra of retinol production in cell culture of a transformed Yarrowia lipolytica strain. In FIG.12C and FIG.12D, each line represents either the start timepoint, the peak timepoint, or the end timepoint produced by HPLC analysis.

FIG.13 compares the production titers of retinal and retinol by Yarrowia lipolytica strains that have been transformed to overexpress 24 different blh genes as described in Example 6.

FIGs.14A to 14B show an HPLC profile (FIG.14A) and UV spectra (FIG.14B) showing retinyl acetate production in cell culture of a transformed Yarrowia lipolytica strain. In FIG.14B, each line represents either the start timepoint, the peak timepoint, or the end timepoint produced by HPLC analysis.

FIGs.15A to 15B show an HPLC profile (FIG.15A) and UV spectra (FIG.15B) showing retinyl palmitate production in cell culture of a transformed Yarrowia lipolytica strain. In FIG.15B, each line represents either the start timepoint, the peak timepoint, or the end timepoint produced by HPLC analysis.

DETAILED DESCRIPTION

The oleaginous yeast Yarrowia lipolytica is one of the most prolific heterologous hosts for carotenoid production due to its large intracellular pool of acetyl-CoA (the starting material of the carotenoid backbone, see FIG.3) and its well-established genetic toolboxes. A number of Yarrowia lipolytica strains were screened, and Yarrowia lipolytica ATCC 90811 was identified as a gifted heterologous host for carotenoid production. Through co-overexpression of the FPPS, GGPPS, CarRP and CarB genes, a highly efficient host strain for beta-carotene production has been generated. Up to 4 g/L of beta-carotene is produced in cell culture of the Yarrowia lipolytica ATCC 90811 strain transformed with a pYconVec-Ura-TEF-carB-TEF-carRP-TEF-GGPPS::FPPS construct (FIG.4A).

As shown in FIG.5, beta-carotene (C₄₀), the immediate precursor for vitamin A biosynthesis in nature, is cleaved at the central double bond (15,15' position), yielding two molecules of retinal (C₂₀). The enzyme that catalyzes the cleavage of beta-carotene is beta-carotene 15,15'-dioxygenase (BCDO). BCDO enzymes derived from fungal or animal sources are currently used for industrial biomanufacturing of vitamin A. More recently, a bacteriorhodopsin-related protein-like homolog (Blh) protein derived from uncultured marine bacterium 66A03 (GenBank: AAY68319.1) has been characterized and utilized for vitamin A production in both Escherichia coli and Saccharomyces cerevisiae platforms, demonstrating a higher catalytic efficiency than the BCDO enzymes utilized in the same study.

A number of Blh functional homologs were discovered with sequence similarity network (SSN) analysis. The homologous blh genes originate from photosynthetic microbes, representing an untapped genetic reservoir for the discovery of novel retinal biosynthetic genes. Using a beta-carotene-producing Yarrowia lipolytica strain as a host, 23 novel Blh proteins were screened for retinal and retinol production.
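
The 1:2 cleavage stoichiometry described above (one C₄₀ beta-carotene giving two C₂₀ retinal molecules) sets a theoretical ceiling on retinal titer for a given beta-carotene titer. The following sketch works through that arithmetic for the 4 g/L figure. It assumes complete conversion with no losses, and uses the standard molecular formulas C40H56 (beta-carotene) and C20H28O (retinal); the result is an upper bound for illustration, not a titer reported in this disclosure. The function names are hypothetical.

```python
# Illustrative mass balance for beta-carotene -> 2 retinal (assumes 100% conversion).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}  # g/mol, standard values

BETA_CAROTENE = {"C": 40, "H": 56}      # C40H56
RETINAL = {"C": 20, "H": 28, "O": 1}    # C20H28O


def molar_mass(formula: dict[str, int]) -> float:
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def max_retinal_titer(beta_carotene_g_per_l: float) -> float:
    """Theoretical maximum retinal titer (g/L) if every beta-carotene is cleaved."""
    moles_per_l = beta_carotene_g_per_l / molar_mass(BETA_CAROTENE)
    return 2 * moles_per_l * molar_mass(RETINAL)  # two retinal per beta-carotene


print(round(molar_mass(BETA_CAROTENE), 2))   # about 536.89 g/mol
print(round(molar_mass(RETINAL), 2))         # about 284.44 g/mol
print(round(max_retinal_titer(4.0), 2))      # about 4.24 g/L
```

The ceiling slightly exceeds the beta-carotene mass because the oxidative cleavage adds one oxygen atom to each retinal molecule.
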
Phylogenetic analysis (a phylogenetic tree of novel and known Blh proteins is shown in FIG.6) and amino acid sequence alignment results (FIG.7) demonstrated that the 23 novel Blh proteins share low identity (<40%) with the known Blh protein from uncultured marine bacterium 66A03.

After the cleavage of beta-carotene into two retinal molecules, retinal could be either oxidized into retinoic acid by retinal dehydrogenase (RALDH) or aldehyde oxidase, or reduced to retinol by retinal reductase (RALR) (FIG.5). Retinal feeding experiments were performed and showed that endogenous promiscuous reductases of Yarrowia lipolytica could catalyze the conversion of retinal to retinol. As retinol could be esterified with a fatty acid or fatty acyl-CoA by acyltransferases into more stable retinyl esters, an additional copy of a codon-optimized Proteobacteria NADP+-dependent aldehyde reductase gene was co-expressed with an oleosin-tag-fused blh gene to modulate the ratio between retinol and retinal. A pYconVec-Leu-TEF-oleosin-blh-TEF-ybbo construct (FIG.4B) was transformed into the Yarrowia lipolytica beta-carotene-producing strain, showing an obvious phenotype change from orange-colored beta-carotene clones to light-yellow-colored vitamin A clones on a yeast synthetic dropout medium plate (FIG.8).

Besides retinal and retinol, the two best-known vitamin A compounds sold on the market are retinyl esters: retinyl acetate and retinyl palmitate. The esterification of retinol improves its stability. Retinyl acetate, used to fortify foods, is generally recognized as safe and recommended for maternal supplementation during pregnancy. Retinyl palmitate, available in oily or dry forms, is used in skin care products and as a treatment for dry eye and vitamin A deficiency. In nature, two different groups of acyltransferases, lecithin:retinol acyltransferase (LRAT) and acyl-CoA:retinol acyltransferase (ARAT), are involved in the conversion of retinol to retinyl esters (FIG.5).
LRAT, which is present in multiple animal tissues, catalyzes the transfer of a fatty acid from the sn-1 position of lecithin (phosphatidylcholine) to retinol. A number of LRAT functional homologs discovered from bacteria were expressed in the Yarrowia lipolytica retinol-producing strain. ARAT uses acyl-CoA or a fatty acyl-CoA, such as palmitoyl-CoA, as an acyl donor for retinol esterification (FIG.9). Studies demonstrated that the neutral lipid synthesis enzyme acyl-CoA:diacylglycerol acyltransferase 1 (DGAT1) functions as the ARAT in murine skin. In addition, Yarrowia lipolytica has a rich acetyl-CoA pool for fatty acid synthesis. A codon-optimized Mus musculus-derived DGAT1 gene and its homologous gene from Yarrowia lipolytica were expressed to convert retinol into retinyl esters. To improve the palmitoyl-CoA substrate supply, a copy of a fatty acid elongation protein (ELO1) gene was cloned into a pYconVec-Lys-TEF-AT-TEF-ELO1 construct for co-expression with a putative retinol acyltransferase (FIG.4C).

As the Examples show, expression of novel blh genes led to significant improvements in the cleavage efficiency at the 15,15' position of beta-carotene to produce retinal. With subsequent expression of aldehyde reductase and acyltransferase genes, high yields of vitamin A compounds can be produced. The present disclosure, therefore, provides an economical and reliable approach for producing “natural” vitamin A compounds that is suitable for commercial scale-up production.

Pathways and Enzymes for Biosynthesis of Vitamin A Compounds

Referring to FIG.3 and FIG.5, biosynthetic production of vitamin A compounds from carbon sources such as glucose or glycerol can be performed via in vivo overexpression of a series of native and heterologous genes in microbial host cells. Isopentenyl diphosphate (IPP) and its isomer dimethylallyl diphosphate (DMAPP) are the C5 building blocks for making geranylgeranyl diphosphate (GGPP), which is the direct precursor of carotenoids.
In plant or fungal host cells, IPP and DMAPP can be generated from the mevalonate (MVA) pathway. Two molecules of acetyl-CoA are condensed into one molecule of acetoacetyl-CoA by acetyl-CoA acetyltransferase (AtoB). Acetoacetyl-CoA is converted into mevalonic acid via the intermediate hydroxymethylglutaryl-CoA (HMG-CoA) by HMG-CoA synthase (HMGS) and HMG-CoA reductase (HMGR), respectively. IPP is then produced from mevalonic acid by three enzymes: mevalonate kinase (MevK), phosphomevalonate kinase (PMK) and phosphomevalonate decarboxylase (PMD). The MVA pathway requires isopentenyl diphosphate isomerase (IPI) to generate DMAPP from IPP. Through the coupling of multiple IPP and DMAPP molecules, GGPP is formed. Through the condensation of two GGPP units catalyzed by the bifunctional lycopene cyclase/phytoene synthase CarRP, phytoene is produced, which is then converted to lycopene by the phytoene dehydrogenase CarB. Lycopene can be cyclized to beta-carotene by CarRP. The cleavage of beta-carotene by a Blh or BCDO enzyme at the 15,15' position results in the production of retinal. Retinal is then reduced to retinol by retinal reductase (RALR). Retinol could be esterified with a fatty acid or fatty acyl-CoA by an acyltransferase (AT) into retinyl esters, such as retinyl acetate and retinyl palmitate.
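
The sequence of conversions described above can be summarized as an ordered list of (substrate, enzyme, product) steps. The sketch below is a simplified, linear rendering for illustration: it collapses the IPP/DMAPP coupling into a single step attributed to IPI, FPPS and GGPPS (the enzymes named elsewhere in this disclosure for that stage), ignores stoichiometry and cofactors, and uses a hypothetical helper name, `enzymes_between`.

```python
# Illustrative, linearized view of the pathway from acetyl-CoA to retinyl ester.
# Each entry is (substrate, enzyme(s), product); stoichiometry is not modeled.
PATHWAY = [
    ("acetyl-CoA", "AtoB", "acetoacetyl-CoA"),
    ("acetoacetyl-CoA", "HMGS", "HMG-CoA"),
    ("HMG-CoA", "HMGR", "mevalonic acid"),
    ("mevalonic acid", "MevK, PMK, PMD", "IPP"),
    ("IPP", "IPI, FPPS, GGPPS", "GGPP"),   # simplification: IPP/DMAPP coupling
    ("GGPP", "CarRP", "phytoene"),
    ("phytoene", "CarB", "lycopene"),
    ("lycopene", "CarRP", "beta-carotene"),
    ("beta-carotene", "Blh or BCDO", "retinal"),
    ("retinal", "RALR", "retinol"),
    ("retinol", "AT", "retinyl ester"),
]


def enzymes_between(start: str, end: str) -> list[str]:
    """Return the enzymes needed, in order, to go from `start` to `end`."""
    substrates = [substrate for substrate, _, _ in PATHWAY]
    products = [product for _, _, product in PATHWAY]
    if start not in substrates:
        raise ValueError(f"unknown starting compound: {start}")
    first = substrates.index(start)
    if end not in products[first:]:
        raise ValueError(f"{end} is not downstream of {start}")
    last = products.index(end, first)
    return [enzyme for _, enzyme, _ in PATHWAY[first:last + 1]]


print(enzymes_between("GGPP", "beta-carotene"))     # ['CarRP', 'CarB', 'CarRP']
print(enzymes_between("beta-carotene", "retinol"))  # ['Blh or BCDO', 'RALR']
```

CarRP appears twice between GGPP and beta-carotene because it is bifunctional, acting first as the phytoene synthase and later as the lycopene cyclase.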

With continued reference to FIG. 3, genes that are overexpressed in the transformed microbial host cells according to the present teachings are shaded. Specifically, these include: FPPS (NCBI RefSeq XP_503599.1); GGPPS (NCBI RefSeq XP_502923.1); CarRP (UniProtKB/Swiss-Prot: Q9UUQ6.1); CarB (GenBank accession No. OAD07725.1); Blh (SEQ ID NOs: 1-46); RALR (SEQ ID NOs: 47 and 48); and AT (SEQ ID NOs: 49-58).
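
The SEQ ID numbering used throughout this disclosure follows a regular pattern: each enzyme's amino acid sequence carries an odd number and the corresponding nucleic acid sequence carries the next even number (for example, SEQ ID NO: 57 and SEQ ID NO: 58 for the Aeromonas acyltransferase). The sketch below encodes that layout as a lookup, including the ELO1 pair (SEQ ID NOs: 59 and 60) described earlier. The dictionary and function names are hypothetical conveniences, not part of the sequence listing.

```python
# Illustrative index of the SEQ ID layout: odd = amino acid, even = nucleic acid.
SEQ_ID_RANGES = {
    "Blh": (1, 46),
    "RALR": (47, 48),
    "AT": (49, 58),
    "ELO1": (59, 60),
}


def protein_seq_ids(enzyme: str) -> list[int]:
    """Odd-numbered (amino acid) SEQ ID NOs for an enzyme class."""
    first, last = SEQ_ID_RANGES[enzyme]
    return list(range(first, last + 1, 2))


def coding_seq_id(protein_id: int) -> int:
    """Nucleic acid SEQ ID NO paired with an amino acid SEQ ID NO."""
    if protein_id % 2 == 0:
        raise ValueError("amino acid sequences carry odd SEQ ID NOs")
    return protein_id + 1


def enzyme_for(seq_id: int) -> str:
    """Enzyme class that a given SEQ ID NO belongs to."""
    for enzyme, (first, last) in SEQ_ID_RANGES.items():
        if first <= seq_id <= last:
            return enzyme
    raise ValueError(f"SEQ ID NO: {seq_id} is outside the indexed ranges")


print(len(protein_seq_ids("Blh")))  # 23 Blh amino acid sequences
print(protein_seq_ids("AT"))        # [49, 51, 53, 55, 57]
print(coding_seq_id(57))            # 58
```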

Methods of Making Retinal Molecule

Methods described herein, in some embodiments, provide for the production of the retinal molecule using novel Blh enzymes and host cells that have been transformed to express such novel Blh enzymes. In various embodiments, the present methods involve cellular systems that include growing cells that have been transformed with a nucleic acid molecule, where such nucleic acid molecule encodes an enzyme having beta-carotene 15,15'-cleavage activity (sometimes referred to herein as a protein containing a “blh_monoox” domain, Accession No. TIGR03753). In some embodiments, the cellular systems can comprise bacterial cells, yeast cells, plant cells that do not naturally produce the retinal molecule, algal cells, bacterial cells and/or fungal cells that do not naturally encode the Blh enzymes described herein. In particular embodiments, the cellular system can comprise growing bacteria and/or yeast cells selected from the group consisting of Yarrowia; Escherichia; Salmonella; Bacillus; Acinetobacter; Streptomyces; Corynebacterium; Methylosinus; Methylomonas; Rhodococcus; Pseudomonas; Rhodobacter; Synechocystis; Saccharomyces; Zygosaccharomyces; Kluyveromyces; Candida; Hansenula; Debaryomyces; Mucor; Pichia; Torulopsis; Aspergillus; Arthrobotrys; Brevibacteria; Microbacterium; Arthrobacter; Citrobacter; Klebsiella; Pantoea; and Clostridium.

Enzymes that exhibit beta-carotene 15,15'-cleavage activity can include various bacteriorhodopsin-related protein-like homolog (Blh) proteins or a beta-carotene 15,15'-dioxygenase (BCDO). For example, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from SAR116 cluster alpha proteobacterium HIMB100 hybrida or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 1. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 1. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 1. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Deltaproteobacteria bacterium or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 3. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 3. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 3. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from SAR116 cluster bacterium or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 5. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 5. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from SAR116 cluster bacterium or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 7.
In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 7. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 7. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Methylococcaceae bacterium TMED69 or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 9. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 9. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 9. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from uncultured marine bacterium HF10_19P19 or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 11. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 11. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 11. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from SAR116 cluster bacterium or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 13. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 13. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 13. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from SAR116 cluster bacterium or a functional variant thereof.
For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 15. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 15. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 15. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Candidatus Puniceispirillum sp. or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 17. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 17. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 17. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from uncultured marine bacterium or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 19. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 19. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 19. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from uncultured marine bacterium EB000_55B11 or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 21. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 21. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 21.
In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Opitutae bacterium Tous-C10FEB or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 23. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 23. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 23. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Gammaproteobacteria bacterium or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 25. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 25. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 25. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from PS1 clade bacterium or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 27. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 27. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 27. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from SAR116 cluster bacterium or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 29. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 29. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 29.
In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Alphaproteobacteria bacterium TMED87 or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 31. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 31. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 31. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Euryarchaeota archaeon or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 33. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 33. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 33. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Gammaproteobacteria bacterium or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 35. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 35. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 35. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Euryarchaeota archaeon or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 37. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 37. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 37.
In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Porticoccaceae bacterium or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 39. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 39. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 39. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Rhodobacter sp. BACL10 MAG-120419-bin15 or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 41. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 41. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 41. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Cellvibrionales bacterium TMED49 or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 43. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 43. In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 43. In other embodiments, the beta-carotene 15,15'-cleavage enzyme can be a Blh (or BCDO) from Candidatus Puniceispirillum sp. TMED52 or a functional variant thereof. For example, the Blh (or BCDO) can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 45. In certain embodiments, the Blh (or BCDO) can comprise the amino acid sequence of SEQ ID NO: 45.
In other embodiments, the Blh (or BCDO) can consist of the amino acid sequence of SEQ ID NO: 45.

Conversion of Retinal to Retinol

Referring to FIG. 3 and FIG. 5, retinal is reduced to retinol by retinal reductase (RALR). In addition to the endogenous promiscuous reductases in Yarrowia lipolytica, a copy of a codon-optimized Proteobacteria NADP⁺-dependent aldehyde reductase gene was overexpressed to improve the efficiency of the retinal-to-retinol reduction. In various embodiments, the RALR can be a Proteobacteria NADP⁺-dependent aldehyde reductase or a functional variant thereof. For example, the RALR can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 47. In certain embodiments, the RALR can comprise the amino acid sequence of SEQ ID NO: 47. In other embodiments, the RALR can consist of the amino acid sequence of SEQ ID NO: 47.

Esterification of Retinol to Retinyl Esters

The esterification of retinol is desirable to produce more stable vitamin A compounds, including retinyl acetate and retinyl palmitate. Referring to FIG. 5, two groups of ATs, LRATs and ARATs, are capable of enzymatically converting retinol to retinyl esters. Currently, the vast majority of patented ATs for the esterification reaction are enzymes derived from fungi, plants, and animals. A number of LRAT functional homologs discovered from bacteria, and acyl-CoA:diacylglycerol acyltransferase 1 (DGAT1) functional homologs derived from animals and fungi, were validated for their retinol esterification activities. In various embodiments, the AT enzyme that catalyzes the conversion from retinol to retinyl esters can be a Yarrowia lipolytica CLIB122 YALI0E32769p acyl-CoA:diacylglycerol acyltransferase or a functional variant thereof. For example, the AT can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 49. In certain embodiments, the AT can comprise the amino acid sequence of SEQ ID NO: 49. In other embodiments, the AT can consist of the amino acid sequence of SEQ ID NO: 49. In various embodiments, the AT enzyme that catalyzes the conversion from retinol to retinyl esters can be a Mus musculus diacylglycerol O-acyltransferase 1 or a functional variant thereof. For example, the AT can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 51. In certain embodiments, the AT can comprise the amino acid sequence of SEQ ID NO: 51. In other embodiments, the AT can consist of the amino acid sequence of SEQ ID NO: 51. In various embodiments, the AT enzyme that catalyzes the conversion from retinol to retinyl esters can be a Campylobacter concisus lecithin retinol acyltransferase or a functional variant thereof. For example, the AT can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 53.
In certain embodiments, the AT can comprise the amino acid sequence of SEQ ID NO: 53. In other embodiments, the AT can consist of the amino acid sequence of SEQ ID NO: 53. In various embodiments, the AT enzyme that catalyzes the conversion from retinol to retinyl esters can be a Pontibacillus halophilus lecithin retinol acyltransferase family protein or a functional variant thereof. For example, the AT can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 55. In certain embodiments, the AT can comprise the amino acid sequence of SEQ ID NO: 55. In other embodiments, the AT can consist of the amino acid sequence of SEQ ID NO: 55. In various embodiments, the AT enzyme that catalyzes the conversion from retinol to retinyl esters can be an Aeromonas sp. L_1B5_3 lecithin retinol acyltransferase or a functional variant thereof. For example, the AT can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 57. In certain embodiments, the AT can comprise the amino acid sequence of SEQ ID NO: 57. In other embodiments, the AT can consist of the amino acid sequence of SEQ ID NO: 57.

The retinol esterification product can be a retinyl ester mix, particularly a mix of long-chain retinyl esters. The composition of the fatty acid or fatty acyl-CoA substrate pool of the Yarrowia lipolytica host is the determining factor for the production of the desired retinyl ester product, retinyl palmitate. Fatty acid elongation protein (ELO1) is involved in medium-chain fatty acid elongation, i.e., the elongation of myristic acid (C14) to palmitic acid (C16). ELO1 is capable of improving the palmitic acid or palmitoyl-CoA substrate supply for the Yarrowia lipolytica host. In various embodiments, the ELO1 enzyme that is involved in medium-chain fatty acid elongation can be a Yarrowia lipolytica CLIB122 YALI0F06754p elongation of fatty acids protein 1 or a functional variant thereof. For example, the ELO1 can comprise an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 59. In certain embodiments, the ELO1 can comprise the amino acid sequence of SEQ ID NO: 59. In other embodiments, the ELO1 can consist of the amino acid sequence of SEQ ID NO: 59.

Nucleic Acid Molecules

Nucleic acid molecules according to the present teachings include synthetic and recombinant nucleic acid molecules having a nucleic acid sequence encoding a bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). In various embodiments, the nucleic acid sequence can comprise a nucleic acid sequence having at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity to SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, or SEQ ID NO: 46.

Cellular Systems

As referred to herein, a cellular system according to the present methods can include any cell or cells that can be used to express the present bacteriorhodopsin-related protein-like homolog (Blh) protein or a beta-carotene 15,15'-dioxygenase (BCDO). Such cellular systems can include, but are not limited to, bacterial cells, yeast cells, plant cells, and animal cells. In some embodiments, the cellular system comprises bacterial cells, yeast cells, or a combination thereof. In some embodiments, the cellular system comprises prokaryotic cells, eukaryotic cells, and combinations thereof.

Bacterial cells of the present disclosure include, without limitation, Escherichia spp., Streptomyces spp., Zymomonas spp., Acetobacter spp., Citrobacter spp., Synechocystis spp., Rhizobium spp., Clostridium spp., Corynebacterium spp., Streptococcus spp., Xanthomonas spp., Lactobacillus spp., Lactococcus spp., Bacillus spp., Alcaligenes spp., Pseudomonas spp., Aeromonas spp., Azotobacter spp., Comamonas spp., Mycobacterium spp., Rhodococcus spp., Gluconobacter spp., Ralstonia spp., Acidithiobacillus spp., Microlunatus spp., Geobacter spp., Geobacillus spp., Arthrobacter spp., Flavobacterium spp., Serratia spp., Saccharopolyspora spp., Thermus spp., Stenotrophomonas spp., Chromobacterium spp., Sinorhizobium spp., Saccharopolyspora spp., Agrobacterium spp., Pantoea spp., and Vibrio natriegens.

Yeast cells of the present disclosure include, without limitation, Saccharomyces spp., Schizosaccharomyces, Hansenula, Candida, Kluyveromyces, Yarrowia, Candida boidinii, and Pichia. According to the current disclosure, yeasts as claimed herein are eukaryotic, single-celled microorganisms classified as members of the fungus kingdom. Yeasts are unicellular organisms that evolved from multicellular ancestors, although some species useful for the current disclosure can develop multicellular characteristics by forming strings of connected budding cells known as pseudohyphae or false hyphae.

Cell Culture

A cell culture refers to any cell or cells that are in a culture. Culturing or incubating is the process in which cells are grown under controlled conditions, typically outside of their natural environment. For example, cells, such as yeast cells, may be grown as a cell suspension in liquid nutrient broth. A cell culture includes, but is not limited to, a bacterial cell culture, a yeast cell culture, a plant cell culture, and an animal cell culture. In some embodiments, the cell culture comprises bacterial cells, yeast cells, or a combination thereof.

A bacterial cell culture of the present disclosure comprises bacterial cells including, but not limited to, Escherichia spp., Streptomyces spp., Zymomonas spp., Acetobacter spp., Citrobacter spp., Synechocystis spp., Rhizobium spp., Clostridium spp., Corynebacterium spp., Streptococcus spp., Xanthomonas spp., Lactobacillus spp., Lactococcus spp., Bacillus spp., Alcaligenes spp., Pseudomonas spp., Aeromonas spp., Azotobacter spp., Comamonas spp., Mycobacterium spp., Rhodococcus spp., Gluconobacter spp., Ralstonia spp., Acidithiobacillus spp., Microlunatus spp., Geobacter spp., Geobacillus spp., Arthrobacter spp., Flavobacterium spp., Serratia spp., Saccharopolyspora spp., Thermus spp., Stenotrophomonas spp., Chromobacterium spp., Sinorhizobium spp., Saccharopolyspora spp., Agrobacterium spp., Pantoea spp., and Vibrio natriegens. A yeast cell culture of the present disclosure comprises yeast cells including, but not limited to, Saccharomyces spp., Schizosaccharomyces, Hansenula, Candida, Kluyveromyces, Yarrowia, Candida boidinii, and Pichia.

In some embodiments, a cell culture as described herein can be an aqueous medium including one or more nutrient substances as known in the art. Such liquid medium can include one or more carbon sources, nitrogen sources, inorganic salts, and/or growth factors. Suitable carbon sources can include glucose, fructose, xylose, sucrose, maltose, lactose, mannitol, sorbitol, glycerol, and corn syrup. Examples of suitable nitrogen sources can include organic and inorganic nitrogen-containing substances such as peptone, corn steep liquor, meat extract, yeast extract, casein, urea, amino acids, ammonium salts, nitrates, and mixtures thereof. Examples of inorganic salts can include phosphates, sulfates, magnesium, sodium, calcium, and potassium salts. The liquid medium also can include one or more vitamins and/or minerals.

In some embodiments, cells are cultured at a temperature of 16°C to 40°C. For example, cells may be cultured at a temperature of 16°C, 17°C, 18°C, 19°C, 20°C, 21°C, 22°C, 23°C, 24°C, 25°C, 26°C, 27°C, 28°C, 29°C, 30°C, 31°C, 32°C, 33°C, 34°C, 35°C, 36°C, 37°C, 38°C, 39°C, or 40°C.

In some embodiments, cells are cultured at a pH range from about 3 to about 9, preferably in the range of from about 4 to about 8. The pH can be regulated by the addition of an inorganic or organic acid or base such as hydrochloric acid, acetic acid, sodium hydroxide, calcium carbonate, or ammonia, or by the addition of a buffer such as phosphate, phthalate, or Tris.

In some embodiments, cells are cultured for a period of 0.5 hours to 96 hours, or more. For example, cells may be cultured for a period of 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, or 72 hours. Typically, cells, such as bacterial cells, are cultured for a period of 12 to 24 hours. In some embodiments, cells are cultured for 12 to 24 hours at a temperature of 37°C. In some embodiments, cells are cultured for 12 to 24 hours at a temperature of 16°C.

In some embodiments, cells are cultured to a density of 1 x 10⁸ (OD₆₀₀ < 1) to 2 x 10¹¹ (OD ~ 200) viable cells/ml cell culture medium. In some embodiments, cells are cultured to a density of 1 x 10⁸, 2 x 10⁸, 3 x 10⁸, 4 x 10⁸, 5 x 10⁸, 6 x 10⁸, 7 x 10⁸, 8 x 10⁸, 9 x 10⁸, 1 x 10⁹, 2 x 10⁹, 3 x 10⁹, 4 x 10⁹, 5 x 10⁹, 6 x 10⁹, 7 x 10⁹, 8 x 10⁹, 9 x 10⁹, 1 x 10¹⁰, 2 x 10¹⁰, 3 x 10¹⁰, 4 x 10¹⁰, 5 x 10¹⁰, 6 x 10¹⁰, 7 x 10¹⁰, 8 x 10¹⁰, 9 x 10¹⁰, 1 x 10¹¹, or 2 x 10¹¹ viable cells/ml.

(Conversion factor: OD 1 = 8 x 10⁸ cells/ml).
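The conversion factor above can be applied as a simple proportionality. The following is a minimal sketch only, assuming that cell density scales linearly with optical density over the whole range (in practice, dense cultures are usually diluted before measurement); the function and constant names are illustrative and are not part of the disclosure.

```python
# Conversion factor stated in the text: OD 1 = 8 x 10^8 cells/ml.
CELLS_PER_ML_PER_OD = 8e8


def od_to_cells_per_ml(od: float) -> float:
    """Estimate viable cells/ml from an optical density reading.

    Assumes a linear relationship between OD and cell density,
    which is an approximation introduced for this sketch.
    """
    if od < 0:
        raise ValueError("optical density cannot be negative")
    return od * CELLS_PER_ML_PER_OD


def cells_per_ml_to_od(cells_per_ml: float) -> float:
    """Inverse conversion: estimate the OD that corresponds to a cell density."""
    if cells_per_ml < 0:
        raise ValueError("cell density cannot be negative")
    return cells_per_ml / CELLS_PER_ML_PER_OD


if __name__ == "__main__":
    for od in (0.125, 1.0, 200.0):
        print(f"OD {od:g} -> {od_to_cells_per_ml(od):.2e} cells/ml")
```

Under this assumption, the lower bound of 1 x 10⁸ cells/ml corresponds to an OD below 1, and an OD of about 200 corresponds to a density on the order of 10¹¹ cells/ml.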

Synthetic Biology

Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described, for example, by Sambrook, J., Fritsch, E. F. and Maniatis, T. MOLECULAR CLONING: A LABORATORY MANUAL, 2nd ed.; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y., 1989 (hereinafter "Maniatis"); and by Silhavy, T. J., Berman, M. L. and Enquist, L. W. EXPERIMENTS WITH GENE FUSIONS; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y., 1984; and Ausubel, F. M. et al., IN CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, published by GREENE PUBLISHING AND WILEY-INTERSCIENCE, 1987; the entirety of each of which is hereby incorporated herein by reference.

Microbial Production Systems

Expression of proteins in transformed host cells is most often carried out in a bacterial or yeast host cell with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such vectors are within the scope of the present disclosure.

In an embodiment, the expression vector includes those genetic elements for expression of the recombinant polypeptide in bacterial cells. The elements for transcription and translation in the bacterial cell can include a promoter, a coding region for the protein complex, and a transcriptional terminator.

Persons of ordinary skill in the art will be aware of the molecular biology techniques available for the preparation of expression vectors. The polynucleotide used for incorporation into the expression vector of the subject technology, as described herein, can be prepared by routine techniques such as polymerase chain reaction (PCR).

A number of molecular biology techniques have been developed to operably link DNA to vectors via complementary cohesive termini. In one embodiment, complementary homopolymer tracts can be added to the nucleic acid molecule to be inserted into the vector DNA. The vector and nucleic acid molecule are then joined by hydrogen bonding between the complementary homopolymeric tails to form recombinant DNA molecules.

In an alternative embodiment, synthetic linkers containing one or more restriction sites are used to operably link the polynucleotide of the subject technology to the expression vector. In an embodiment, the polynucleotide is generated by restriction endonuclease digestion. In an embodiment, the nucleic acid molecule is treated with bacteriophage T4 DNA polymerase or E. coli DNA polymerase I, enzymes that remove protruding 3'-single-stranded termini with their 3'-5'-exonucleolytic activities and fill in recessed 3'-ends with their polymerizing activities, thereby generating blunt-ended DNA segments. The blunt-ended segments are then incubated with a large molar excess of linker molecules in the presence of an enzyme that is able to catalyze the ligation of blunt-ended DNA molecules, such as bacteriophage T4 DNA ligase. Thus, the product of the reaction is a polynucleotide carrying polymeric linker sequences at its ends. These polynucleotides are then cleaved with the appropriate restriction enzyme and ligated to an expression vector that has been cleaved with an enzyme that produces termini compatible with those of the polynucleotide.

Alternatively, a vector having ligation-independent cloning (LIC) sites can be employed. The required PCR-amplified polynucleotide can then be cloned into the LIC vector without restriction digest or ligation (Aslanidis and de Jong, NUCL. ACID. RES. 18, 6069-74 (1990); Haun, et al., BIOTECHNIQUES 13, 515-18 (1992), each of which is incorporated herein by reference to the extent it is consistent herewith).

In an embodiment, in order to isolate and/or modify the polynucleotide of interest for insertion into the chosen plasmid, it is suitable to use PCR. Appropriate primers for use in PCR preparation of the sequence can be designed to isolate the required coding region of the nucleic acid molecule, add restriction endonuclease or LIC sites, and place the coding region in the desired reading frame. In an embodiment, a polynucleotide for incorporation into an expression vector of the subject technology is prepared by the use of PCR using appropriate oligonucleotide primers. The coding region is amplified, whilst the primers themselves become incorporated into the amplified sequence product. In an embodiment, the amplification primers contain restriction endonuclease recognition sites, which allow the amplified sequence product to be cloned into an appropriate vector.
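The primer design described above can be illustrated in code. The following is a minimal sketch, not the procedure used in the disclosure: it builds a forward primer and a reverse primer by prepending a restriction endonuclease recognition site to the sequence that anneals to each end of a coding region. The function names, the two-base 5' clamp, the 18-nucleotide annealing length, and the toy coding region are assumptions chosen for illustration; GAATTC (EcoRI) and GGATCC (BamHI) are used only as familiar example recognition sites, and melting temperature, secondary structure, and internal restriction sites are not checked.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(coding_region: str, fwd_site: str, rev_site: str,
                   clamp: str = "GC", anneal_len: int = 18):
    """Build (forward, reverse) primers that add restriction sites by PCR.

    Each primer is: 5' clamp + recognition site + annealing region.
    The clamp and annealing length are hypothetical defaults.
    """
    coding = coding_region.upper()
    if set(coding) - set("ACGT"):
        raise ValueError("coding region must contain only A, C, G, T")
    if len(coding) < anneal_len:
        raise ValueError("coding region is shorter than the annealing length")
    # Forward primer anneals at the start codon, so the site sits directly
    # upstream of the coding region and the reading frame begins at base 1.
    forward = clamp + fwd_site + coding[:anneal_len]
    # Reverse primer anneals to the opposite strand at the 3' end.
    reverse = clamp + rev_site + reverse_complement(coding[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    # Toy coding region (hypothetical, for illustration only).
    toy_cds = "ATGGCTAGCAAGGGTGAAGAACTGTTTACCGGTGTTTAA"
    fwd, rev = design_primers(toy_cds, fwd_site="GAATTC", rev_site="GGATCC")
    print("forward:", fwd)
    print("reverse:", rev)
```

Because the primers become part of the amplified product, the resulting amplicon carries the two recognition sites at its ends and can be digested and ligated into a vector cut with compatible enzymes.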

The expression vectors can be introduced into plant or microbial host cells by conventional transformation or transfection techniques. Transformation of appropriate cells with an expression vector of the subject technology is accomplished by methods known in the art and typically depends on both the type of vector and cell. Suitable techniques include calcium phosphate or calcium chloride co-precipitation, DEAE-dextran mediated transfection, lipofection, chemoporation or electroporation.

Successfully transformed cells, that is, those cells containing the expression vector, can be identified by techniques well known in the art. For example, cells transfected with an expression vector of the subject technology can be cultured to produce polypeptides described herein. Cells can be examined for the presence of the expression vector DNA by techniques well known in the art.

The host cells can contain a single copy of the expression vector described previously, or alternatively, multiple copies of the expression vector. In some embodiments, the transformed cell can be a bacterial cell, a yeast cell, an algal cell, a fungal cell, a plant cell, an insect cell, or an animal cell. In some embodiments, the cell is a plant cell selected from the group consisting of: a canola plant cell, a rapeseed plant cell, a palm plant cell, a sunflower plant cell, a cotton plant cell, a corn plant cell, a peanut plant cell, a flax plant cell, a sesame plant cell, a soybean plant cell, and a petunia plant cell.

Microbial host cell expression systems and expression vectors containing regulatory sequences that direct high-level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct vectors for expression of the recombinant polypeptide of the subject technology in a microbial host cell. These vectors could then be introduced into appropriate microorganisms via transformation to allow for high-level expression of the recombinant polypeptide of the subject technology. Vectors or cassettes useful for the transformation of suitable microbial host cells are well known in the art. Typically, the vector or cassette contains sequences directing transcription and translation of the relevant polynucleotide, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5' of the polynucleotide which harbors transcriptional initiation controls and a region 3' of the DNA fragment which controls transcriptional termination. It is preferred for both control regions to be derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a host.

Initiation control regions or promoters that are useful to drive expression of the recombinant polypeptide in the desired microbial host cell are numerous and familiar to those skilled in the art. Virtually any promoter capable of driving these genes is suitable for the subject technology, including but not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); TEF (useful for expression in Yarrowia); AOX1 (useful for expression in Pichia); and lac, trp, IPL, IPR, T7, tac, and trc (useful for expression in Escherichia coli).

Termination control regions may also be derived from various genes native to the microbial hosts. A termination site optionally may be included for the microbial hosts described herein.

In plant cells, the expression vectors of the subject technology can include a coding region operably linked to promoters capable of directing expression of the recombinant polypeptide of the subject technology in the desired tissues at the desired stage of development. For reasons of convenience, the polynucleotides to be expressed may comprise promoter sequences and translation leader sequences derived from the same polynucleotide. 3’ non-coding sequences encoding transcription termination signals should also be present. The expression vectors may also comprise one or more introns in order to facilitate polynucleotide expression.

For plant host cells, any combination of any promoter and any terminator capable of inducing expression of a coding region may be used in the vector sequences of the subject technology. Some suitable examples of promoters and terminators include those from nopaline synthase (nos), octopine synthase (ocs) and cauliflower mosaic virus (CaMV) genes. One type of efficient plant promoter that may be used is a high-level plant promoter. Such promoters, in operable linkage with an expression vector of the subject technology, should be capable of promoting the expression of the vector. High-level plant promoters that may be used in the subject technology include the promoter of the small subunit (s) of the ribulose-1,5-bisphosphate carboxylase, for example from soybean (Berry-Lowe et al., J. MOLECULAR AND APP. GEN., 1:483-98 (1982), the entirety of which is hereby incorporated herein to the extent it is consistent herewith), and the promoter of the chlorophyll binding protein. These two promoters are known to be light-induced in plant cells (see, for example, GENETIC ENGINEERING OF PLANTS, AN AGRICULTURAL PERSPECTIVE, A. Cashmore, Plenum, N.Y. (1983), pages 29-38; Coruzzi, G. et al., THE JOURNAL OF BIOLOGICAL CHEMISTRY, 258:1399 (1983); and Dunsmuir, P. et al., JOURNAL OF MOLECULAR AND APPLIED GENETICS, 2:285 (1983), each of which is hereby incorporated herein by reference to the extent they are consistent herewith).

Analysis of Sequence Similarity Using Identity Scoring

As used herein, “sequence identity” refers to the extent to which two optimally aligned polynucleotide or peptide sequences are invariant throughout a window of alignment of components, e.g., nucleotides or amino acids. An “identity fraction” for aligned segments of a test sequence and a reference sequence is the number of identical components which are shared by the two aligned sequences divided by the total number of components in the reference sequence segment, i.e., the entire reference sequence or a smaller defined part of the reference sequence.

As used herein, the term “percent sequence identity” or “percent identity” refers to the percentage of identical nucleotides in a linear polynucleotide sequence of a reference (“query”) polynucleotide molecule (or its complementary strand) as compared to a test (“subject”) polynucleotide molecule (or its complementary strand) when the two sequences are optimally aligned (with appropriate nucleotide insertions, deletions, or gaps totaling less than 20 percent of the reference sequence over the window of comparison). Methods for optimally aligning sequences over a comparison window are well known to those skilled in the art and may be conducted by tools such as the local homology algorithm of Smith and Waterman, the homology alignment algorithm of Needleman and Wunsch, the search for similarity method of Pearson and Lipman, and preferably by computerized implementations of these algorithms such as GAP, BESTFIT, FASTA, and TFASTA available as part of the GCG® Wisconsin Package® (Accelrys Inc., Burlington, MA). An “identity fraction” for aligned segments of a test sequence and a reference sequence is the number of identical components which are shared by the two aligned sequences divided by the total number of components in the reference sequence segment, i.e., the entire reference sequence or a smaller defined part of the reference sequence. Percent sequence identity is represented as the identity fraction multiplied by 100. The comparison of one or more polynucleotide sequences may be to a full-length polynucleotide sequence or a portion thereof, or to a longer polynucleotide sequence. For purposes of this disclosure, “percent identity” may also be determined using BLASTX version 2.0 for translated nucleotide sequences and BLASTN version 2.0 for polynucleotide sequences.
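The identity fraction and percent identity defined above reduce to a short calculation once two sequences have been aligned. The following is a minimal sketch assuming the alignment has already been produced by some external tool and is supplied as two equal-length strings in which "-" marks a gap; the function names and the toy sequences are illustrative assumptions, not part of the disclosure.

```python
def identity_fraction(aligned_test: str, aligned_ref: str, gap: str = "-") -> float:
    """Identical aligned components divided by the number of components
    in the reference sequence segment (gap columns in the reference are
    not components of the reference and are therefore not counted)."""
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")
    test, ref = aligned_test.upper(), aligned_ref.upper()
    ref_components = sum(1 for r in ref if r != gap)
    if ref_components == 0:
        raise ValueError("reference segment contains no components")
    identical = sum(1 for t, r in zip(test, ref) if r != gap and t == r)
    return identical / ref_components


def percent_identity(aligned_test: str, aligned_ref: str, gap: str = "-") -> float:
    """Percent sequence identity: the identity fraction multiplied by 100."""
    return 100.0 * identity_fraction(aligned_test, aligned_ref, gap)


if __name__ == "__main__":
    # Toy alignment (hypothetical): 7 of the 8 reference positions match.
    reference = "ACGTACGT--"
    test = "ACGTTCGTAA"
    print(percent_identity(test, reference))  # 87.5
```

A value computed this way can then be compared against thresholds such as the 80% to 99% identity levels recited for the sequences herein.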

The percent of sequence identity is preferably determined using the “BestFit” or “Gap” program of the Sequence Analysis Software Package™ (Version 10; Genetics Computer Group, Inc., Madison, WI). “Gap” utilizes the algorithm of Needleman and Wunsch (Needleman and Wunsch, JOURNAL OF MOLECULAR BIOLOGY 48:443-53, 1970) to find the alignment of two sequences that maximizes the number of matches and minimizes the number of gaps. “BestFit” performs an optimal alignment of the best segment of similarity between two sequences and inserts gaps to maximize the number of matches using the local homology algorithm of Smith and Waterman (Smith and Waterman, ADVANCES IN APPLIED MATHEMATICS, 2:482-489, 1981; Smith et al., NUCLEIC ACIDS RESEARCH 11:2205-2220, 1983). The percent identity is most preferably determined using the “BestFit” program.
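To make the global alignment idea concrete, the following is a simplified sketch of Needleman-Wunsch dynamic programming. It is not the “Gap” program: it assumes a hypothetical scoring scheme of +1 for a match, -1 for a mismatch, and a linear penalty of -1 per gap position, whereas production tools use substitution matrices and separate gap-opening and gap-extension penalties. When several alignments share the best score, this sketch returns only one of them.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return (aligned_a, aligned_b, score).

    Scoring values are illustrative assumptions, with a linear gap penalty.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    aligned_a, aligned_b, best = needleman_wunsch("ACGT", "AGT")
    print(aligned_a)
    print(aligned_b)
    print("score:", best)
```

The two aligned strings returned here are in the gapped form from which an identity fraction can be counted column by column.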

Useful methods for determining sequence identity are also disclosed in the Basic Local Alignment Search Tool (BLAST) programs, which are publicly available from the National Center for Biotechnology Information (NCBI) at the National Library of Medicine, National Institutes of Health, Bethesda, Md. 20894; see BLAST Manual, Altschul et al., NCBI, NLM, NIH; Altschul et al., J. MOL. BIOL. 215:403-10 (1990). Version 2.0 or higher of the BLAST programs allows the introduction of gaps (deletions and insertions) into alignments; for peptide sequences, BLASTX can be used to determine sequence identity; and, for polynucleotide sequences, BLASTN can be used to determine sequence identity.

As used herein, the term “substantial percent sequence identity” refers to a percent sequence identity of at least about 70% sequence identity, at least about 80% sequence identity, at least about 85% identity, at least about 90% sequence identity, or even greater sequence identity, such as about 98% or about 99% sequence identity. Thus, one embodiment of the disclosure is a polynucleotide molecule that has at least about 70% sequence identity, at least about 80% sequence identity, at least about 85% identity, at least about 90% sequence identity, or even greater sequence identity, such as about 98% or about 99% sequence identity with a polynucleotide sequence described herein.

Identity and Similarity

Identity is the fraction of amino acids that are the same between a pair of sequences after an alignment of the sequences (which can be done using only sequence information or structural information or some other information, but usually it is based on sequence information alone), and similarity is the score assigned based on an alignment using some similarity matrix. The similarity index can be any one of the following: BLOSUM62, PAM250, or GONNET, or any matrix used by one skilled in the art for the sequence alignment of proteins.

Identity is the degree of correspondence between two sub-sequences (no gaps between the sequences). An identity of 25% or higher implies similarity of function, while 18-25% implies similarity of structure or function. Keep in mind that two completely unrelated or random sequences (that are greater than 100 residues) can have higher than 20% identity. Similarity is the degree of resemblance between two sequences when they are compared. This is dependent on their identity.

Explanation of Terms Used Herein:

As used herein, the singular forms “a,” “an,” and “the” include plural references unless the content clearly dictates otherwise.

To the extent that the term “include,” “have,” or the like is used in the description or the claims, such term is intended to be inclusive in a manner similar to the term “comprise” as “comprise” is interpreted when employed as a transitional word in a claim.

The word “exemplary” is used herein to mean "serving as an example, instance, or illustration.” Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.

The term “complementary” is to be given its ordinary and customary meaning to a person of ordinary skill in the art and is used without limitation to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the subject technology also includes isolated nucleic acid fragments that are complementary to the complete sequences as reported in the accompanying Sequence Listing as well as those substantially similar nucleic acid sequences.

The terms “nucleic acid” and “nucleotide” are to be given their respective ordinary and customary meanings to a person of ordinary skill in the art and are used without limitation to refer to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally-occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified or degenerate variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated.

“Coding sequence” is to be given its ordinary and customary meaning to a person of ordinary skill in the art, and is used without limitation to refer to a DNA sequence that encodes for a specific amino acid sequence.

The term “isolated” is to be given its ordinary and customary meaning to a person of ordinary skill in the art, and when used in the context of an isolated nucleic acid or an isolated polypeptide, is used without limitation to refer to a nucleic acid or polypeptide that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated nucleic acid or polypeptide can exist in a purified form or can exist in a non-native environment such as, for example, in a transgenic host cell.

The terms “incubating” and “incubation” as used herein means a process of mixing two or more chemical or biological entities (such as a chemical compound and an enzyme) and allowing them to interact under conditions favorable for producing the desired product.

The term “degenerate variant” refers to a nucleic acid sequence having a residue sequence that differs from a reference nucleic acid sequence by one or more degenerate codon substitutions. Degenerate codon substitutions can be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed base and/or deoxyinosine residues. A nucleic acid sequence and all of its degenerate variants will express the same amino acid or polypeptide.

The terms “polypeptide,” “protein,” and “peptide” are to be given their respective ordinary and customary meanings to a person of ordinary skill in the art; the three terms are sometimes used interchangeably, and are used without limitation to refer to a polymer of amino acids, or amino acid analogs, regardless of its size or function. Although "protein" is often used in reference to relatively large polypeptides, and "peptide" is often used in reference to small polypeptides, usage of these terms in the art overlaps and varies. The term "polypeptide" as used herein refers to peptides, polypeptides, and proteins, unless otherwise noted. The terms "protein," "polypeptide," and "peptide" are used interchangeably herein when referring to a polynucleotide product. Thus, exemplary polypeptides include polynucleotide products, naturally occurring proteins, homologs, orthologs, paralogs, fragments and other equivalents, variants, and analogs of the foregoing.

The terms "polypeptide fragment" and "fragment," when used in reference to a reference polypeptide, are to be given their ordinary and customary meanings to a person of ordinary skill in the art and are used without limitation to refer to a polypeptide in which amino acid residues are deleted as compared to the reference polypeptide itself, but where the remaining amino acid sequence is usually identical to the corresponding positions in the reference polypeptide. Such deletions can occur at the amino-terminus or carboxy-terminus of the reference polypeptide, or alternatively both.

The term "functional fragment" of a polypeptide or protein refers to a peptide fragment that is a portion of the full-length polypeptide or protein, and has substantially the same biological activity, or carries out substantially the same function as the full-length polypeptide or protein (e.g., carrying out the same enzymatic reaction).

The terms "variant polypeptide," "modified amino acid sequence" or "modified polypeptide," which are used interchangeably, refer to an amino acid sequence that is different from the reference polypeptide by one or more amino acids, e.g., by one or more amino acid substitutions, deletions, and/or additions. In an aspect, a variant is a "functional variant" which retains some or all of the ability of the reference polypeptide.

The term "functional variant" further includes conservatively substituted variants. The term "conservatively substituted variant" refers to a peptide having an amino acid sequence that differs from a reference peptide by one or more conservative amino acid substitutions and maintains some or all of the activity of the reference peptide. A "conservative amino acid substitution" is a substitution of an amino acid residue with a functionally similar residue. Examples of conservative substitutions include the substitution of one non-polar (hydrophobic) residue such as isoleucine, valine, leucine or methionine for another; the substitution of one charged or polar (hydrophilic) residue for another such as between arginine and lysine, between glutamine and asparagine, between threonine and serine; the substitution of one basic residue such as lysine or arginine for another; or the substitution of one acidic residue, such as aspartic acid or glutamic acid for another; or the substitution of one aromatic residue, such as phenylalanine, tyrosine, or tryptophan for another. Such substitutions are expected to have little or no effect on the apparent molecular weight or isoelectric point of the protein or polypeptide. The phrase "conservatively substituted variant" also includes peptides wherein a residue is replaced with a chemically-derivatized residue, provided that the resulting peptide maintains some or all of the activity of the reference peptide as described herein.
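The conservative substitution classes listed above can be expressed as a small lookup. The following is a minimal sketch only: the group table is transcribed from the examples in the preceding paragraph using one-letter amino acid codes, the function name `is_conservative` is hypothetical, and the sketch does not test whether a substituted variant retains activity.

```python
# Residue groups transcribed from the examples in the text (one-letter codes).
# This table is illustrative and not an exhaustive definition.
CONSERVATIVE_GROUPS = [
    set("IVLM"),  # non-polar (hydrophobic): isoleucine, valine, leucine, methionine
    set("RK"),    # basic: arginine, lysine
    set("QN"),    # glutamine, asparagine
    set("TS"),    # threonine, serine
    set("DE"),    # acidic: aspartic acid, glutamic acid
    set("FYW"),   # aromatic: phenylalanine, tyrosine, tryptophan
]


def is_conservative(ref: str, alt: str) -> bool:
    """Return True if replacing residue `ref` with `alt` stays within one group."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return False  # identical residues are not a substitution
    return any(ref in group and alt in group for group in CONSERVATIVE_GROUPS)


# Hypothetical examples
print(is_conservative("I", "V"))  # isoleucine -> valine
print(is_conservative("D", "K"))  # aspartic acid -> lysine
```

Whether a given conservatively substituted variant is a "functional variant" still has to be established by measuring activity.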

The term "variant," in connection with the polypeptides of the subject technology, further includes a functionally active polypeptide having an amino acid sequence at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and even 100% identical to the amino acid sequence of a reference polypeptide.

“Percent (%) amino acid sequence identity” with respect to the variant polypeptide sequences of the subject technology refers to the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues of a reference polypeptide, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity.

Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared. For example, the % amino acid sequence identity may be determined using the sequence comparison program NCBI-BLAST2. The NCBI-BLAST2 sequence comparison program may be downloaded from ncbi.nlm.nih.gov. NCBI-BLAST2 uses several search parameters, wherein all of those search parameters are set to default values including, for example, unmask=yes, strand=all, expected occurrences=10, minimum low complexity length=15/5, multi-pass e-value=0.01, constant for multi-pass=25, dropoff for final gapped alignment=25 and scoring matrix=BLOSUM62. In situations where NCBI-BLAST2 is employed for amino acid sequence comparisons, the % amino acid sequence identity of a given amino acid sequence A to, with, or against a given amino acid sequence B (which can alternatively be phrased as a given amino acid sequence A that has or comprises a certain % amino acid sequence identity to, with, or against a given amino acid sequence B) is calculated as follows: 100 times the fraction X/Y where X is the number of amino acid residues scored as identical matches by the sequence alignment program NCBI-BLAST2 in that program’s alignment of A and B, and where Y is the total number of amino acid residues in B. It will be appreciated that where the length of amino acid sequence A is not equal to the length of amino acid sequence B, the % amino acid sequence identity of A to B will not equal the % amino acid sequence identity of B to A.
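The X/Y calculation described above, and its asymmetry when A and B differ in length, can be illustrated with a short sketch. It assumes the alignment has already been produced by an alignment program and is supplied as two equal-length gapped strings; the function name `percent_identity`, the gap character "-", and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return 100 * X / Y for an existing pairwise alignment of A against B.

    X = number of aligned columns in which A and B carry the same residue
        (a gap never counts as a match).
    Y = total number of residues in B (gap characters excluded).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    y = sum(1 for b in aligned_b if b != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical toy alignment: A has 8 residues, B has 7, with 6 identical columns.
seq_a = "MKT-AYIAK"
seq_b = "MKTLAY--K"
print(round(percent_identity(seq_a, seq_b), 1))  # identity of A to B (Y = 7)
print(round(percent_identity(seq_b, seq_a), 1))  # identity of B to A (Y = 8)
```

Because Y is the residue count of the second argument, swapping the arguments changes the result whenever the two sequences have different lengths.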

In this sense, techniques for determining amino acid sequence “similarity” are well known in the art. In general, “similarity” refers to the exact amino acid to amino acid comparison of two or more polypeptides at the appropriate place, where amino acids are identical or possess similar chemical and/or physical properties such as charge or hydrophobicity. A so-termed “percent similarity” may then be determined between the compared polypeptide sequences. Techniques for determining nucleic acid and amino acid sequence identity also are well known in the art and include determining the nucleotide sequence of the mRNA for that gene (usually via a cDNA intermediate) and determining the amino acid sequence encoded therein, and comparing this to a second amino acid sequence. In general, “identity” refers to an exact nucleotide to nucleotide or amino acid to amino acid correspondence of two polynucleotides or polypeptide sequences, respectively. Two or more polynucleotide sequences can be compared by determining their “percent identity”, as can two or more amino acid sequences. The programs available in the Wisconsin Sequence Analysis Package, Version 8 (available from Genetics Computer Group, Madison, Wis.), for example, the GAP program, are capable of calculating both the identity between two polynucleotides and the identity and similarity between two polypeptide sequences, respectively. Other programs for calculating identity or similarity between sequences are known by those skilled in the art.

An amino acid position “corresponding to” a reference position refers to a position that aligns with a reference sequence, as identified by aligning the amino acid sequences. Such alignments can be done by hand or by using well-known sequence alignment programs such as ClustalW2, Blast 2, etc.

Unless specified otherwise, the percent identity of two polypeptide or polynucleotide sequences refers to the percentage of identical amino acid residues or nucleotides across the entire length of the shorter of the two sequences.

The term "homologous" in all its grammatical forms and spelling variations refers to the relationship between polynucleotides or polypeptides that possess a "common evolutionary origin," including polynucleotides or polypeptides from super families and homologous polynucleotides or proteins from different species (Reeck et al., CELL 50:667, 1987). Such polynucleotides or polypeptides have sequence homology, as reflected by their sequence similarity, whether in terms of percent identity or the presence of specific amino acids or motifs at conserved positions. For example, two homologous polypeptides can have amino acid sequences that are at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and even 100% identical.

"Suitable regulatory sequences" is to be given its ordinary and customary meaning to a person of ordinary skill in the art, and is used without limitation to refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3’ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, and polyadenylation recognition sequences.

"Promoter" is to be given its ordinary and customary meaning to a person of ordinary skill in the art, and is used without limitation to refer to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3’ to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters, which cause a gene to be expressed in most cell types at most times, are commonly referred to as "constitutive promoters.” It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

The term "operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

The term "expression" as used herein, is to be given its ordinary and customary meaning to a person of ordinary skill in the art, and is used without limitation to refer to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the subject technology. "Over-expression" refers to the production of a gene product in transgenic or recombinant organisms that exceeds levels of production in normal or non-transformed organisms.

"Transformation" is to be given its ordinary and customary meaning to a person of reasonable skill in the field, and is used without limitation to refer to the transfer of a polynucleotide into a target cell for further expression by that cell. The transferred polynucleotide can be incorporated into the genome or chromosomal DNA of a target cell, resulting in genetically stable inheritance, or it can replicate independently of the host chromosome. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" or "recombinant" or "transformed" organisms.

The terms "transformed," "transgenic," and "recombinant," when used herein in connection with host cells, are to be given their respective ordinary and customary meanings to a person of ordinary skill in the art, and are used without limitation to refer to a cell of a host organism, such as a plant or microbial cell, into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome of the host cell, or the nucleic acid molecule can be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating. Transformed cells, tissues, or subjects are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.

The terms "recombinant," "heterologous," and "exogenous," when used herein in connection with polynucleotides, are to be given their ordinary and customary meanings to a person of ordinary skill in the art, and are used without limitation to refer to a polynucleotide (e.g., a DNA sequence or a gene) that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified through, for example, the use of site-directed mutagenesis or other recombinant techniques. The terms also include non-naturally occurring multiple copies of a naturally occurring DNA sequence. Thus, the terms refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position or form within the host cell in which the element is not ordinarily found.

Similarly, the terms "recombinant," "heterologous," and "exogenous," when used herein in connection with a polypeptide or amino acid sequence, means a polypeptide or amino acid sequence that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, recombinant DNA segments can be expressed in a host cell to produce a recombinant polypeptide.

The terms "plasmid," "vector," and "cassette" are to be given their respective ordinary and customary meanings to a person of ordinary skill in the art and are used without limitation to refer to an extrachromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3' untranslated sequence into a cell. "Transformation cassette" refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell. "Expression cassette" refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure belongs. Although any methods and materials similar to or equivalent to those described herein can be used in the practice or testing of the present disclosure, the preferred materials and methods are described below.

The disclosure will be more fully understood upon consideration of the following nonlimiting Examples. It should be understood that these Examples, while indicating preferred embodiments of the subject technology, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of the subject technology, and without departing from the spirit and scope thereof, can make various changes and modifications of the subject technology to adapt it to various uses and conditions.

EXAMPLES

Example 1: Generation of beta-carotene producing Yarrowia lipolytica strains

In general, the constitutive TEF promoter and the XPR2 terminator were used for the overexpression of individual genes in Yarrowia lipolytica ATCC 90811 host cells. A pUC57-Kan based vector carrying both the TEF promoter and the XPR2 terminator was cloned. Individual genes were cloned in between the TEF promoter and the XPR2 terminator using Gibson assembly. Cassettes containing the TEF promoter, one or more genes for overexpression, and the XPR2 terminator were amplified by PCR and then cloned into high copy number Yarrowia integration vectors such as pYconVec-Ura, pYconVec-Leu, and pYconVec-Lys, which have auxotrophic uracil, leucine and lysine markers, respectively.

In order to boost the carotenoid precursor supply, a gene encoding a fusion protein comprising FPPS fused in-frame to GGPPS was cloned into the pUC57-Kan-TEF-XPR2 vector to generate the pUC57-TEF-GGPPS::FPPS-XPR2 construct. Similarly, two genes responsible for beta-carotene production, encoding CarRP and CarB, were cloned into the pUC57-Kan-TEF-XPR2 vector to generate the pUC57-TEF-CarRP-XPR2 and pUC57-TEF-CarB-XPR2 constructs, respectively. The TEF-GGPPS::FPPS-XPR2 cassette, TEF-CarRP-XPR2 cassette, and TEF-CarB-XPR2 cassette were sequentially cloned into the pYconVec-Ura vector to generate a pYconVec-Ura-TEF-carB-TEF-carRP-TEF-GGPPS::FPPS construct (FIG. 4A). The pYconVec-Ura-TEF-carB-TEF-carRP-TEF-GGPPS::FPPS construct was then used to transform the Yarrowia lipolytica ATCC 90811 host, and transformants were selected on minimal media plates without uracil. The orange-colored clones were Yarrowia lipolytica beta-carotene producers. Twenty orange-colored clones were screened for the capability to produce beta-carotene. The colony with the highest beta-carotene content was named BC-Ura. Thus, the Yarrowia lipolytica host strain for retinal production was generated.

Example 2: Extraction of beta-carotene for HPLC analysis

The beta-carotene producing Yarrowia lipolytica strains were streaked onto minimal media plates without uracil and grown at 30°C. After 4 days of incubation, the orange-colored colonies were grown in 5 ml of YPD liquid medium at 250 rpm and at 30°C for 4 days. 0.2 ml of cell culture was harvested by centrifugation and the cell pellet was resuspended with 0.8 ml of methanol and 0.2 ml of dichloromethane. 0.5 mm diameter glass beads were then added. The Yarrowia lipolytica cells were disrupted for 1 minute in a bead beater homogenizer. Then the mixture was centrifuged for 10 min at 15,000 rpm and the supernatant was collected for HPLC analysis.

HPLC analysis of beta-carotene was performed using an Ultimate 3000 HPLC System (Dionex, Sunnyvale, CA) that included a quaternary pump, a temperature-controlled column compartment, an auto sampler and a UV absorbance detector. A Gemini C18 column (150 mm x 4.6 mm, 3 µm) with guard column was used for the characterization of beta-carotene. A flow rate of 1.0 ml/min was applied, and the mobile phase was composed of isopropanol (A) and acetonitrile (B). The program was 0-5 min 85% B; 5-13 min 85%-60% B; 13-14 min 60%-85% B; 14-16 min 85% B. The detection wavelength was 474 nm for beta-carotene.

As shown in FIG. 10, the Yarrowia lipolytica strain hosting the FPPS::GGPPS, CarRP and CarB genes accumulated beta-carotene (13.5 min peak). The retention time and UV spectrum of the extracted beta-carotene peak were consistent with those of the beta-carotene standard.
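The beta-carotene gradient program can be written as a table of breakpoints and interpolated to give the mobile phase composition at any time. This is a minimal sketch that assumes each ramp is linear between the stated time points, which the text does not specify; the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % solvent B) breakpoints transcribed from the program in the text;
# solvent A is isopropanol and solvent B is acetonitrile.
GRADIENT = [(0.0, 85.0), (5.0, 85.0), (13.0, 60.0), (14.0, 85.0), (16.0, 85.0)]


def percent_b(t_min: float, program=GRADIENT) -> float:
    """Return % B at time t_min, assuming linear ramps between breakpoints."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("gradient program has no segment covering this time")


for t in (0.0, 9.0, 13.0, 16.0):
    b = percent_b(t)
    print(f"t = {t:4.1f} min: {b:.1f}% B, {100.0 - b:.1f}% A")
```

The same table-driven approach would apply to the retinal and retinol program by substituting its breakpoints.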

Example 3: Bioinformatic analysis to search for functional Blh homologs

Hmmsearch (version 3.2.1) was used to find proteins in the UniProt database containing the BCD domain (Pfam accession PF15461), using an e-value cutoff of 0.001. Hits were filtered to remove sequences smaller than 225 amino acids or larger than 800 amino acids. An all-to-all Diamond (v0.9.22.123) search in “sensitive” mode was performed among the sequences using an e-value cutoff of 1e-20 and a maximum of 1000 hits per sequence.
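The length filter applied to the hmmsearch hits can be sketched as follows. This assumes the hit sequences are available as FASTA text, which the source does not state; the helper names `read_fasta` and `filter_by_length` are hypothetical, and only the 225 and 800 residue bounds come from the text.

```python
from typing import Dict, Iterable, Iterator, Tuple

MIN_LEN = 225  # sequences smaller than this are removed (from the text)
MAX_LEN = 800  # sequences larger than this are removed (from the text)


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            header, chunks = (parts[0] if parts else ""), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len: int = MIN_LEN,
                     max_len: int = MAX_LEN) -> Dict[str, str]:
    """Keep only records whose length lies within [min_len, max_len]."""
    return {name: seq for name, seq in records
            if min_len <= len(seq) <= max_len}


# Hypothetical input: synthetic sequences of different lengths.
example = [">too_short", "M" * 100, ">kept", "M" * 300, ">too_long", "M" * 900]
print(sorted(filter_by_length(read_fasta(example))))
```

The filtered set would then serve as both query and database for the all-to-all search.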

The Diamond output was used to generate a sequence similarity network (SSN) where each sequence was a node, and edges were drawn between nodes based on the Diamond results, with an edge drawn between pairs of nodes with an e-value of 1e-30 or lower. The SSN was visualized in Cytoscape (Version 3.7.1), and the node coordinates were generated using the “Prefuse Force Directed Layout” algorithm.
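The construction of the SSN from the pairwise search results can be sketched as below. The sketch assumes the Diamond results are in the 12-column BLAST-style tabular format (query in column 1, subject in column 2, e-value in column 11), which the source does not state. The function names are hypothetical, and connected components are used here only as a simple stand-in for the clusters that were identified visually in Cytoscape.

```python
EVALUE_EDGE_CUTOFF = 1e-30  # edge threshold from the text


def build_ssn(tabular_lines, cutoff: float = EVALUE_EDGE_CUTOFF):
    """Return (nodes, edges); edges maps a sorted ID pair to its best e-value."""
    nodes, edges = set(), {}
    for line in tabular_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        nodes.update((query, subject))
        if query == subject or evalue > cutoff:
            continue  # skip self-hits and hits weaker than the cutoff
        key = tuple(sorted((query, subject)))
        if key not in edges or evalue < edges[key]:
            edges[key] = evalue
    return nodes, edges


def connected_components(nodes, edges):
    """Group nodes that are linked directly or indirectly by edges."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components
```

A node and edge table produced this way could be imported into a network viewer for layout; the clade selection itself still relies on the alignment and phylogenetic tree.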

In the resulting network, there was a clearly distinct cluster of nodes around the node for the Blh homolog from uncultured marine bacterium 66A03. The 151 Blh homologous sequences from this cluster were aligned using Muscle, and a phylogenetic tree was generated using FastTree. From the phylogenetic tree, 9 sequences most similar to the reference sequence (from the clade containing the known Blh homolog from uncultured marine bacterium 66A03) and 1 or 2 sequences from each of other clades were selected for functional characterization.

Example 4: Identifying Blh homologs with 15,15’ beta-carotene cleavage activity

Functional Blh (or BCDO) can catalyze the cleavage of beta-carotene (C40) at the central double bond (15,15’ position) to produce two molecules of retinal (C20) (FIG. 5). Referring to FIG. 3 and FIG. 5, an efficient Blh protein is the determining factor for improving the production yield of vitamin A compounds, and RALR is subsequently capable of converting retinal to retinol.

It has been reported that Blh originating from uncultured marine bacterium 66A03 demonstrated higher activity than BCDOs reported from fungal or animal sources. Based on published literature and sequence similarity network (SSN) analysis, the known Blh homolog (Blh_Q4PN10_UNCMB), 9 Blh homologs that fall into the same clade as the known Blh homolog, and 1-2 Blh homologs from clades far from the known Blh homolog were selected for in vivo screening for more efficient Blh proteins using a Yarrowia lipolytica host. To improve the efficiency of downstream retinal to retinol conversion, a copy of a codon-optimized Proteobacteria NADP⁺-dependent aldehyde reductase gene (ybbo) was cloned into the pUC57-Kan-TEF-XPR2 vector to generate the pUC57-TEF-ybbo-XPR2 construct. The TEF-ybbo-XPR2 cassette was then cloned into the pYconVec-Leu vector to generate a pYconVec-Leu-TEF-ybbo construct. Candidate blh homolog genes were individually cloned into a pUC57-Kan-TEF-oleosin-XPR2 vector. Oleosin is a lipid body compartmentalization signal tag used to improve the activities of Blh proteins. The cassettes containing the TEF promoter, an individual oleosin-fused blh homolog gene and the XPR2 terminator were cloned into the pYconVec-Leu-TEF-ybbo vector to generate individual pYconVec-Leu-TEF-oleosin-blh_homo-TEF-ybbo constructs (FIG. 4B). Each pYconVec-Leu-TEF-oleosin-blh_homo-TEF-ybbo construct was then used to transform the Yarrowia lipolytica beta-carotene producing BC-Ura strain, and transformants were selected on minimal media plates without leucine.

Around 5-7 days after transformation, light-yellow colored VA clones appeared, which could be easily distinguished from orange colored BC clones (FIG. 8). A number of light-yellow colored clones from each plate were picked and screened for their 15,15’ beta-carotene cleavage activities.

Example 5: Extraction and HPLC analysis of retinal and retinol produced by engineered Yarrowia lipolytica strains

A two-phase in situ extraction strategy was adopted to prevent retinal and retinol degradation, and to easily extract retinal and retinol from cell cultures and measure their titers. A number of vegetable oils and chemicals were tested for two-phase in situ extraction of retinal and retinol. Among them, olive oil, canola oil, peanut oil, dioctyl phthalate and dodecane were not harmful to cell growth and demonstrated decent recovery rates.

The light-yellow colored putative retinal and retinol producing Yarrowia lipolytica clones were streaked onto minimal media plates without leucine and grown at 30°C. After 4 days of incubation, the colonies were inoculated into 5 ml of YPD liquid medium with 625 µl (1/8 volume) of dodecane and grown at 250 rpm and at 30°C for 4 days. To analyze retinal and retinol titers, 300-400 µl of the upper dodecane phase was collected and transferred into 1.5 ml Eppendorf tubes. After centrifugation at 14,000 rpm for 10 minutes, 10 µl of the clean dodecane phase was analyzed by HPLC using an Ultimate 3000 HPLC System (Dionex, Sunnyvale, CA) that included a quaternary pump, a temperature-controlled column compartment, an auto sampler and a UV absorbance detector. A Gemini C18 column (150 mm x 4.6 mm, 3 µm) with guard column was used for the characterization of retinal and retinol. A flow rate of 0.6 ml/min and a column temperature of 40 °C were applied, and the mobile phase was composed of acetonitrile/tetrahydrofuran (4:1) (A) and water (B). The program was 0-2 min 95% B; 2-7.2 min 95%-0% B; 7.2-12.5 min 0% B; 12.5-13.5 min 0%-95% B; 13.5-16 min 95% B. The detection wavelengths were 375 nm for retinal and 325 nm for retinol.

As shown in FIGs. 11A to 11D, the Yarrowia lipolytica strain hosting the genes FPPS::GGPPS, CarRP, CarB, oleosin-tagged blh homolog and ybbo accumulated retinal (11.7 min peak), as determined by comparing the retention time (top) and the UV spectrum (bottom) of the extracted retinal against those of the retinal standard. As shown in FIGs. 12A to 12D, the engineered Yarrowia lipolytica strain also accumulated retinol (11.4 min peak), as determined by comparing the retention time (top) and the UV spectrum (bottom) of the extracted retinol against those of the retinol standard.

The authentic standards of retinal and retinol were dissolved in acetone. The standard curves for retinal and retinol were generated based on the 11.7 min peaks (375 nm) and the 11.4 min peaks (325 nm) on the HPLC profiles, respectively. The titers of dodecane-extracted retinal and retinol were calculated based on peak areas and standard curves. The retinoid titer is the sum of the retinal titer and the retinol titer for each sample.
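As a minimal sketch of the titer calculation, the following Python code fits a standard curve and converts sample peak areas into concentrations. It assumes a linear standard curve fitted by ordinary least squares, which the text does not specify. All concentrations, peak areas and function names below are hypothetical placeholders, not measured data.

```python
def fit_line(concentrations, peak_areas):
    """Least-squares fit of peak_area = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def titer_from_area(peak_area, slope, intercept):
    """Invert the standard curve to obtain a concentration from a peak area."""
    return (peak_area - intercept) / slope


# Hypothetical standards (concentration in mg/L, peak area in arbitrary units).
standard_conc = [10.0, 20.0, 40.0, 80.0]
retinal_curve = fit_line(standard_conc, [5.0, 10.0, 20.0, 40.0])   # 375 nm, 11.7 min peak
retinol_curve = fit_line(standard_conc, [8.0, 16.0, 32.0, 64.0])   # 325 nm, 11.4 min peak

# Hypothetical sample peak areas; the retinoid titer is the sum of both titers.
retinal_titer = titer_from_area(15.0, *retinal_curve)
retinol_titer = titer_from_area(24.0, *retinol_curve)
retinoid_titer = retinal_titer + retinol_titer
```

A separate curve is fitted for each compound because retinal and retinol are quantified at different detection wavelengths.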

Example 6: Analysis and comparison of active Blh homologs

Using a beta-carotene-producing Yarrowia lipolytica strain as a host, 23 novel active Blh homologs were identified for retinal and retinol production. Amino acid sequence BLAST results suggest that these Blh homologs likely originated from photosynthetic microbes. A phylogenetic tree generated from the known Blh protein (Blh_Q4PN10_UNCMB) and the 23 novel Blh homologs is shown in FIG. 6. Referring to FIG. 7, amino acid sequence alignment results showed that the 23 novel Blh proteins share low identity (<40%) with the known Blh protein from uncultured marine bacterium 66A03.
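By way of illustration, percent identity between two already-aligned amino acid sequences can be computed as in the following Python sketch. It assumes one common convention, namely identical residues divided by the number of alignment columns that are not gaps in both sequences. Alignment tools differ in the denominator they use, and the text does not state which convention was applied. The toy sequences and the function name are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns that are not gap-only."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no residue columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy aligned fragments (hypothetical, not taken from the sequence listing).
identity = percent_identity("MKT-LV", "MRTALV")  # 4 matches over 6 columns
```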

The retinoid (retinal and retinol) titer comparisons from cell cultures of Yarrowia lipolytica strains hosting the 23 novel blh genes and the known blh gene are summarized in FIG. 13. Although the known Blh protein has higher activity than the currently widely used BCDOs derived from fungal or animal sources, most of the newly discovered Blh homologs (20 of the 23 active Blh homologs) demonstrated higher catalytic activities than the known Blh_Q4PN10_UNCMB. The strongest Blh protein discovered is Blh homolog A0A368EJT9_9PROT (SEQ ID NO: 27) from the PSI clade bacterium, showing a 2.66-fold retinoid titer when Blh_Q4PN10_UNCMB served as the control. The best Yarrowia lipolytica clone hosting the pYconVec-Ura-TEF-carB-TEF-carRP-TEF-GGPPS::FPPS construct and the pYconVec-Leu-TEF-oleosin-blh_A0A368EJT9-TEF-ybbo construct was saved as strain VA-27 for AT screening.
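The fold comparison against the control can be sketched as follows. This Python example assumes that fold change is the ratio of each strain's retinoid titer to the titer of the control strain. The titer values and strain labels are hypothetical placeholders chosen for illustration and are not the data of FIG. 13.

```python
def fold_over_control(titers, control):
    """Return each strain's titer divided by the control strain's titer."""
    reference = titers[control]
    return {name: value / reference
            for name, value in titers.items() if name != control}


# Hypothetical retinoid titers in arbitrary units.
titers = {"control_blh": 10.0, "homolog_A": 26.6, "homolog_B": 8.0}
folds = fold_over_control(titers, "control_blh")

# Strains with a fold value above 1 outperform the control.
improved = sorted(name for name, fold in folds.items() if fold > 1.0)
```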

Example 7: Identifying ATs with retinol esterification activity

Two groups of ATs, LRATs and ARATs, were tested for their activities in converting retinol to retinyl esters. Two DGAT1 genes from animal or fungi, and three LRAT genes from bacteria were cloned separately into pUC57-Kan-TEF-XPR2 vector to generate five different pUC57-TEF-AT-XPR2 constructs. Each candidate AT gene, along with its own TEF promoter and XPR2 terminator, was cloned into pYconVec-Lys vector to generate pYconVec-Lys-TEF-AT constructs. The Yarrowia lipolytica gene encoding fatty acid elongation protein (ELO1) was expressed to enhance the palmitic acid or palmitoyl-CoA substrate supply. The ELO1 gene was cloned into pUC57-Kan-TEF-XPR2 vector to generate the pUC57-TEF-ELO1-XPR2 construct. The cassette containing the TEF promoter, ELO1 gene and the XPR2 terminator was cloned into five different pUC57-TEF-AT-XPR2 constructs to generate five pYconVec-Lys-TEF-AT-TEF-ELO1 constructs (FIG. 4C). Five pUC57-TEF-AT-XPR2 constructs and five pYconVec-Lys-TEF-AT-TEF-ELO1 constructs were separately transformed into the retinoid-producing Yarrowia lipolytica host strain VA-27 and selected on minimal media plates without lysine. Around 5-7 days after transformation, light-yellow colored clones appeared and were screened for their retinol esterification activity.

The cell culture and two-phase in situ extraction methods for retinyl esters were the same as the methods described in Example 5. The dodecane phase was analyzed by HPLC using an Ultimate 3000 HPLC System (Dionex, Sunnyvale, CA) that included a quaternary pump, a temperature-controlled column compartment, an auto sampler and a UV absorbance detector. A Gemini C18 column (250 mm x 4.6 mm, 5 µm) with guard column was used for the characterization of retinyl acetate and retinyl palmitate. A flow rate of 1.2 ml/min and a column temperature of 40 °C were applied, and the mobile phase was composed of methanol and acetonitrile in a 95:5 ratio. The detection wavelength was 325 nm for both retinyl acetate and retinyl palmitate. As shown in FIGs. 14A, 14B, 15A, and 15B, the Yarrowia lipolytica strain hosting the genes FPPS::GGPPS, CarRP, CarB, oleosin-tagged blh_A0A368EJT9, ybbo, and AT, with or without ELO1, accumulated retinyl acetate (4.7 min peak) or retinyl palmitate (21.5 min peak). The retention time and the UV spectrum of the extracted retinyl esters were consistent with those of the authentic standards.

SEQUENCES

SAR116 cluster alpha proteobacterium HIMB100 putative beta-carotene 15,15'-dioxygenase

Amino acid sequence of UniProtKB G5ZZI8 (G5ZZI8_9PROT) (SEQ ID NO: 1):

Codon-optimized nucleotide sequence of UniProtKB G5ZZI8 for Yarrowia lipolytica (SEQ ID

Claims

CLAIMS What is claimed is:

1. A method of producing vitamin A compounds, the method comprising incubating a first reaction mixture comprising beta-carotene and a bacteriorhodopsin-related protein-like homolog for a sufficient time to produce retinal.

2. The method of claim 1, wherein the bacteriorhodopsin-related protein-like homolog comprises an amino acid sequence that is at least 85% identical to the amino acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45.

3. The method of claim 1 or claim 2, wherein the bacteriorhodopsin-related protein-like homolog comprises the amino acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45.

4. The method of any one of claims 1-3, further comprising incubating a second reaction mixture comprising the retinal and a retinal reductase for a sufficient time to produce retinol.

5. The method of claim 4, wherein the retinal reductase comprises an amino acid sequence that is at least 85% identical to the amino acid sequence of SEQ ID NO: 47.

6. The method of claim 4 or claim 5, wherein the retinal reductase comprises the amino acid sequence of SEQ ID NO: 47.

7. The method of any one of claims 4-6, further comprising incubating a third reaction mixture comprising the retinol and an acyltransferase for a sufficient time to produce retinyl ester.

8. The method of claim 7, wherein the acyltransferase comprises an amino acid sequence that is at least 85% identical to the amino acid sequence of any one of SEQ ID NOs: 49, 51, 53, 55, and 57.

9. The method of claim 7 or claim 8, wherein the acyltransferase comprises the amino acid sequence of any one of SEQ ID NOs: 49, 51, 53, 55, and 57.

10. The method of any one of claims 7-9, wherein the retinyl ester comprises retinyl acetate, retinyl palmitate, or combination thereof.

11. The method of any one of claims 7-10, wherein the retinyl ester comprises retinyl palmitate.

12. The method of claim 11, wherein the third reaction mixture further comprises an elongation of fatty acids protein 1.

13. The method of claim 12, wherein the elongation of fatty acids protein 1 comprises an amino acid sequence that is at least 85% identical to SEQ ID NO: 59.

14. The method of claim 13, wherein the elongation of fatty acids protein 1 comprises the amino acid sequence of SEQ ID NO: 59.

15. The method of any one of claims 1-14, further comprising obtaining the beta-carotene.

16. The method of claim 15, wherein the beta-carotene is obtained by incubating geranylgeranyl diphosphate (GGPP) with a phytoene dehydrogenase (CarB) and a bifunctional lycopene cyclase/phytoene synthase (CarRP).

17. The method of claim 16, wherein the GGPP is obtained from acetoacetyl-CoA via a mevalonate pathway comprising hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), farnesyl diphosphate synthase (FPPS), and geranylgeranyl diphosphate synthase (GGPPS).

18. The method of claim 16, wherein the GGPP is obtained from acetoacetyl-CoA via a mevalonate pathway comprising hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), and a fusion enzyme comprising a farnesyl diphosphate synthase (FPPS) fused to a geranylgeranyl diphosphate synthase (GGPPS).

19. The method of any one of claims 1-18, wherein the method is an in vitro method.

20. The method of any one of claims 1-18, wherein the method is carried out in a host cell.

21. The method of claim 20, wherein the host cell is a prokaryotic cell or a eukaryotic cell.

22. The method of claim 20 or claim 21, wherein the host cell is a bacterial cell, a yeast cell, an algal cell, or a fungal cell.

23. The method of any one of claims 20-22, wherein the host cell is selected from Yarrowia; Escherichia; Salmonella; Bacillus; Acinetobacter; Streptomyces; Corynebacterium; Methylosinus; Methylomonas; Rhodococcus; Pseudomonas; Rhodobacter; Synechocystis; Saccharomyces; Zygosaccharomyces; Kluyveromyces; Candida; Hansenula; Debaryomyces; Mucor; Pichia; Torulopsis; Aspergillus; Arthrobotrys; Brevibacteria; Microbacterium; Arthrobacter; Citrobacter; Klebsiella; Pantoea; and Clostridium.

24. The method of any one of claims 20-22, wherein the host cell is Yarrowia lipolytica.

25. The method of any one of claims 20-24, wherein the host cell is transformed with a nucleic acid molecule encoding a bacteriorhodopsin-related protein-like homolog, wherein the nucleic acid molecule comprises a nucleotide sequence that is at least 85% identical to the nucleotide sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and 46.

26. The method of claim 25, wherein the host cell is further transformed with a nucleic acid molecule encoding a retinal reductase, wherein the nucleic acid molecule comprises a nucleotide sequence that is at least 85% identical to the nucleotide sequence of SEQ ID NO: 48.

27. The method of claim 26, wherein the host cell is further transformed with a nucleic acid molecule encoding an acyltransferase, wherein the nucleic acid molecule comprises a nucleotide sequence that is at least 85% identical to the nucleotide sequence of any one of SEQ ID NOs: 50, 52, 54, 56, and 58.

28. The method of claim 27, wherein the host cell is further transformed with a nucleic acid molecule encoding an elongation of fatty acids protein 1, wherein the nucleic acid molecule comprises a nucleotide sequence that is at least 85% identical to the nucleotide sequence of SEQ ID NO: 60.

29. The method of claim 28, wherein the host cell is further transformed with a nucleic acid molecule encoding a phytoene dehydrogenase (CarB) and/or a nucleic acid molecule encoding a bifunctional lycopene cyclase/phytoene synthase (CarRP).

30. The method of claim 29, wherein the host cell is further transformed with one or more nucleic acid molecules encoding one or more of hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), farnesyl diphosphate synthase (FPPS), and geranylgeranyl diphosphate synthase (GGPPS).

31. The method of claim 29, wherein the host cell is further transformed with one or more nucleic acid molecules encoding one or more of hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), and a fusion enzyme comprising a farnesyl diphosphate synthase (FPPS) fused to a geranylgeranyl diphosphate synthase (GGPPS).

32. The method of any one of claims 1-31, wherein the vitamin A compounds comprise retinal, retinol, retinyl ester, and combinations thereof.

33. The method of any one of claims 1-32, wherein the retinyl ester comprises retinyl acetate, retinyl palmitate, and combination thereof.

34. The method of any one of claims 1-33, further comprising isolating the vitamin A compounds.

35. A nucleic acid molecule comprising a nucleotide sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, and 60.

36. The nucleic acid molecule of claim 35, wherein the nucleotide sequence is operably linked to a promoter.

37. The nucleic acid molecule of claim 36, wherein the nucleic acid molecule is a vector, optionally an expression vector.

38. A host cell transformed with the nucleic acid molecule of any one of claims 35-37.

39. A host cell transformed with a nucleic acid molecule comprising a nucleotide sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and 46.

40. The host cell of claim 39, wherein the host cell is further transformed with a nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO: 48.

41. The host cell of claim 40, wherein the host cell is further transformed with a nucleic acid molecule comprising a nucleotide sequence of any one of SEQ ID NOs: 50, 52, 54, 56, and 58.

42. The host cell of claim 40, wherein the host cell is further transformed with a nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO: 60.

43. The host cell of claim 42, wherein the host cell is further transformed with a nucleic acid molecule encoding a phytoene dehydrogenase (CarB) and/or a nucleic acid molecule encoding a bifunctional lycopene cyclase/phytoene synthase (CarRP).

44. The host cell of claim 43, wherein the host cell is further transformed with one or more nucleic acid molecules encoding one or more of hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), farnesyl diphosphate synthase (FPPS), and geranylgeranyl diphosphate synthase (GGPPS).

45. The host cell of claim 43, wherein the host cell is further transformed with one or more nucleic acid molecules encoding one or more of hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), isopentenyl diphosphate isomerase (IPI), and a fusion enzyme comprising a farnesyl diphosphate synthase (FPPS) fused to a geranylgeranyl diphosphate synthase (GGPPS).

46. The host cell of any one of claims 38-45, wherein the host cell is a bacterial cell, a yeast cell, an algal cell, or a fungal cell.

47. The host cell of any one of claims 38-46, wherein the host cell is selected from Yarrowia; Escherichia; Salmonella; Bacillus; Acinetobacter; Streptomyces; Corynebacterium; Methylosinus; Methylomonas; Rhodococcus; Pseudomonas; Rhodobacter; Synechocystis; Saccharomyces; Zygosaccharomyces; Kluyveromyces; Candida; Hansenula; Debaryomyces; Mucor; Pichia; Torulopsis; Aspergillus; Arthrobotrys; Brevibacteria; Microbacterium; Arthrobacter; Citrobacter; Klebsiella; Pantoea; and Clostridium.

48. The host cell of any one of claims 38-47, wherein the host cell is Yarrowia lipolytica.

49. A method of producing vitamin A compounds, the method comprising culturing the host cell of any one of claims 38-48.