EP1088078A2 - Genes for the biosynthesis of epothilones - Google Patents

Genes for the biosynthesis of epothilones

Info

Publication number
EP1088078A2
EP1088078A2 EP99929243A EP99929243A EP1088078A2 EP 1088078 A2 EP1088078 A2 EP 1088078A2 EP 99929243 A EP99929243 A EP 99929243A EP 99929243 A EP99929243 A EP 99929243A EP 1088078 A2 EP1088078 A2 EP 1088078A2
Authority
EP
European Patent Office
Prior art keywords
seq
nucleotides
acids
amino acids
amino
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP99929243A
Other languages
German (de)
English (en)
French (fr)
Inventor
Thomas Schupp
James Madison Ligon
Istvan Molnar
Ross Zirkle
Jörn Görlach
Devon Cyr
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Novartis Pharma GmbH
Novartis AG
Original Assignee
Novartis Erfindungen Verwaltungs GmbH
Novartis AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Novartis Erfindungen Verwaltungs GmbH, Novartis AG
Publication of EP1088078A2

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/52 Genes encoding for enzymes or proenzymes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P17/00 Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms
    • C12P17/18 Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms containing at least two hetero rings condensed among themselves or condensed with a common carbocyclic ring system, e.g. rifamycin
    • C12P17/181 Heterocyclic compounds containing oxygen atoms as the only ring heteroatoms in the condensed system, e.g. Salinomycin, Septamycin
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00 Antineoplastic agents
    • A61P35/04 Antineoplastic agents specific for metastasis
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria

Definitions

  • the present invention relates generally to polyketides and genes for their synthesis.
  • the present invention relates to the isolation and characterization of novel polyketide synthase and non-ribosomal peptide synthetase genes from Sorangium cellulosum that are necessary for the biosynthesis of epothilones A and B.
  • Polyketides are compounds synthesized from two-carbon building blocks, the β-carbon of which always carries a keto group, thus the name polyketide. These compounds include many important antibiotics, immunosuppressants, cancer chemotherapeutic agents, and other compounds possessing a broad range of biological properties. The tremendous structural diversity derives from the different lengths of the polyketide chain, the different side-chains introduced (either as part of the two-carbon building blocks or after the polyketide backbone is formed), and the stereochemistry of such groups. The keto groups may also be reduced to hydroxyls, enoyls, or removed altogether. Each round of two-carbon addition is carried out by a complex of enzymes called the polyketide synthase (PKS) in a manner similar to fatty acid biosynthesis.
  • PKS polyketide synthase
  • Type I proteins are polyfunctional, with several catalytic domains carrying out different enzymatic steps covalently linked together (e.g. PKS for erythromycin, soraphen, rifamycin, and avermectin (MacNeil et al., in Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D. C. pp. 245-256 (1993)); whereas type II proteins are monofunctional (Hutchinson et al., in Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D. C. pp. 203-216 (1993)).
  • NRPSs non-ribosomal polypeptide synthetases
  • NRPSs are multienzymes that are organized in modules. Each module is responsible for the addition (and the additional processing, if required) of one amino acid building block.
  • NRPSs activate amino acids by forming aminoacyl-adenylates, and capture the activated amino acids on thiol groups of phosphopantetheinyl prosthetic groups on peptidyl carrier protein domains.
  • NRPSs modify the amino acids by epimerization, N-methylation, or cyclization if necessary, and catalyse the formation of peptide bonds between the enzyme-bound amino acids.
  • NRPSs are responsible for the biosynthesis of peptide secondary metabolites like cyclosporin, could provide polyketide chain terminator units as in rapamycin, or form mixed systems with PKSs as in yersiniabactin biosynthesis.
  • Epothilones A and B are 16-membered macrocyclic polyketides with an acylcysteine-derived starter unit that are produced by the bacterium Sorangium cellulosum strain So ce90 (Gerth et al., J. Antibiotics 49: 560-563 (1996), incorporated herein by reference).
  • the structure of epothilone A and B wherein R signifies hydrogen (epothilone A) or methyl (epothilone B) is:
  • epothilones have a narrow antifungal spectrum and especially show a high cytotoxicity in animal cell cultures (see, Höfle et al., Patent DE 4138042 (1993), incorporated herein by reference). Of significant importance, epothilones mimic the biological effects of taxol, both in vivo and in cultured cells (Bollag et al., Cancer Research 55: 2325-2333 (1995), incorporated herein by reference). Taxol and taxotere, which stabilize cellular microtubules, are cancer chemotherapeutic agents with significant activity against various human solid tumors (Rowinsky et al., J. Natl. Cancer Inst.
  • epothilone analogs have been synthesized that have a superior cytotoxic activity as compared to epothilone A or epothilone B as demonstrated by their enhanced ability to induce the polymerization and stabilization of microtubules (WO 98/25929, incorporated herein by reference).
  • one object of the present invention is to isolate the genes that are involved in the synthesis of epothilones, particularly the genes that are involved in the synthesis of epothilones A and B in myxobacteria of the Sorangium/Polyangium group, i.e., Sorangium cellulosum strain So ce90.
  • a further object of the invention is to provide a method for the recombinant production of epothilones for application in anticancer formulations.
  • the present invention unexpectedly overcomes the difficulties set forth above to provide for the first time a nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of epothilone.
  • the nucleotide sequence is isolated from a species belonging to Myxobacteria, most preferably Sorangium cellulosum.
  • the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO
  • the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said polypeptide comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids
  • the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 122
  • the present invention provides a nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence is selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246
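
The nucleotide ranges listed above can be read as 1-based, inclusive coordinates on SEQ ID NO:1. The following is a minimal sketch of how such a region might be pulled out of a sequence string. It assumes that "the complement of" a range denotes the reverse complement of that region, which is an interpretation and is not stated in this text. The function name `extract_region` and the toy sequence are hypothetical and do not come from the patent.

```python
# Illustrative only: toy data, not SEQ ID NO:1.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_region(seq: str, start: int, end: int, complement: bool = False) -> str:
    """Return nucleotides start..end of seq (1-based, inclusive).

    If complement is True, return the reverse complement of that region
    (assumption: this is what "the complement of nucleotides X-Y" means).
    """
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates out of range")
    region = seq[start - 1:end]  # convert 1-based inclusive to a Python slice
    if complement:
        region = region.translate(COMPLEMENT)[::-1]
    return region


toy = "ATGCCGTAAGCT"
print(extract_region(toy, 4, 9))                    # CCGTAA
print(extract_region(toy, 4, 9, complement=True))   # TTACGG
```

With inclusive coordinates a range such as 1900-3171 spans 3171 - 1900 + 1 = 1272 nucleotides.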
  • the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of
  • the present invention also provides a chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule of the invention. Further, the present invention provides a recombinant vector comprising such a chimeric gene, wherein the vector is capable of being stably transformed into a host cell. Still further, the present invention provides a recombinant host cell comprising such a chimeric gene, wherein the host cell is capable of expressing the nucleotide sequence that encodes at least one polypeptide necessary for the biosynthesis of an epothilone.
  • the recombinant host cell is a bacterium belonging to the order Actinomycetales, and in a more preferred embodiment the recombinant host cell is a strain of Streptomyces. In other embodiments, the recombinant host cell is any other bacterium amenable to fermentation, such as a pseudomonad or E. coli. Even further, the present invention provides a Bac clone comprising a nucleic acid molecule of the invention, preferably Bac clone pEP015. In another aspect, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes an epothilone synthase domain.
  • the epothilone synthase domain is a β-ketoacyl-synthase (KS) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
  • KS β-ketoacyl-synthase
  • said KS domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
  • the epothilone synthase domain is an acyltransferase (AT) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
  • AT acyltransferase
  • said AT domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
  • the epothilone synthase domain is an enoyl reductase (ER) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
  • ER enoyl reductase
  • said ER domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1,
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
  • the epothilone synthase domain is an acyl carrier protein (ACP) domain, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
  • ACP acyl carrier protein
  • said ACP domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-6
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
  • the epothilone synthase domain is a dehydratase (DH) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
  • DH dehydratase
  • said DH domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
  • the epothilone synthase domain is a β-ketoreductase (KR) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
  • KR β-ketoreductase
  • said KR domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
  • the epothilone synthase domain is a methyltransferase (MT) domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6.
  • said MT domain preferably comprises amino acids 2671-3045 of SEQ ID NO:6.
  • said nucleotide sequence preferably is substantially similar to nucleotides 51534-52657 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of nucleotides 51534-52657 of SEQ ID NO:1.
  • said nucleotide sequence most preferably is nucleotides 51534-52657 of SEQ ID NO:1.
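
The recurring criterion of a "consecutive 20, 25, 30, 35, 40, 45, or 50 base pair portion identical in sequence" to a portion of a reference region can be illustrated as a shared-substring test. The sketch below is illustrative only and is not the patent's method. It assumes plain string identity on a single strand and makes no attempt to check the opposite strand. The function name `shares_identical_portion` and the toy sequences are hypothetical.

```python
def shares_identical_portion(query: str, reference: str, k: int = 20) -> bool:
    """Return True if query and reference share at least one identical run
    of k consecutive bases (single-strand comparison, case-insensitive)."""
    if k <= 0:
        raise ValueError("k must be positive")
    query, reference = query.upper(), reference.upper()
    if len(query) < k or len(reference) < k:
        return False
    # Index every k-base window of the reference for constant-time lookup.
    reference_windows = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(query[i:i + k] in reference_windows
               for i in range(len(query) - k + 1))


# Toy data, not taken from SEQ ID NO:1.
reference = "ACGT" * 10                         # 40 bases
query = "TTTT" + reference[5:25] + "GGGG"       # embeds a 20-base reference window
print(shares_identical_portion(query, reference, k=20))
print(shares_identical_portion("A" * 30, reference, k=20))
```

Passing k=25, 30, 35, 40, 45, or 50 applies the longer thresholds named in the text. A real comparison against a region such as nucleotides 51534-52657 would first slice that region out of the full sequence.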
  • the epothilone synthase domain is a thioesterase (TE) domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7.
  • said TE domain preferably comprises amino acids 2165-2439 of SEQ ID NO:7.
  • said nucleotide sequence preferably is substantially similar to nucleotides 61427-62254 of SEQ ID NO:1.
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of nucleotides 61427-62254 of SEQ ID NO:1.
  • said nucleotide sequence most preferably is nucleotides 61427-62254 of SEQ ID NO:1.
  • the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes a non-nbosomal peptide synthetase, wherein said non-nbosomal peptide synthetase comprises an ammo acid sequence substantially similar to an ammo acid sequence selected from the group consisting of: SEQ ID NO:3, ammo acids 72-81 of SEQ ID NO:3, amino acids 1 18-125 of SEQ ID NO:3, ammo acids 199-212 of SEQ ID NO:3, ammo acids 353-363 of SEQ ID NO:3, amino acids 549- 565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, am o acids 903-912 of SEQ ID NO:3, am o acids 918-940 of SEQ ID NO:3, am o acids
  • said non-nbosomal peptide synthetase preferably comprises an ammo acid sequence selected from the group consisting of: SEQ ID NO:3, ammo acids 72-81 of SEQ ID NO:3, am o acids 118-125 of SEQ ID NO:3, ammo acids 199-212 of SEQ ID NO:3, ammo acids 353-363 of SEQ ID NO:3, ammo acids 549-565 of SEQ ID NO:3, am o acids 588-603 of SEQ ID NO:3, ammo acids 669-684 of SEQ ID NO:3, ammo acids 815-821 of SEQ ID NO:3, ammo acids 868-892 of SEQ ID NO:3, ammo acids 903-912 of SEQ ID NO:3, am o acids 918-940 of SEQ ID NO:3, ammo acids 1268-1274 of SEQ ID NO:3, ammo acids 1285-1297 of SEQ ID NO:3, ammo acids 973-1256 of SEQ ID NO:3, and
  • said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1 , nucleotides 12085-121 14 of SEQ ID NO:
  • nucleotides 12223-12246 of SEQ ID NO 1 nucleotides 12466-12507 of SEQ ID NO 1 , nucleotides 12928-12960 of SEQ ID NO 1 , nucleotides 13516-13566 of SEQ ID NO 1 , nucleotides 13633-13680 of SEQ ID NO: 1 , nucleotides 13876-13923 of SEQ ID NO 1 , nucleotides 14313-14334 of SEQ ID NO 1 , nucleotides 14473-14547 of SEQ ID NO 1 , nucleotides 14578-14607 of SEQ ID NO 1 , nucleotides 14623-14692 of SEQ ID NO 1 , nucleotides 15673-15693 of SEQ ID NO 1 , nucleotides 15724-15762 of SEQ ID NO 1 , nucleotides 14788-15639 of SEQ ID NO: 1 , and nucleotides 157
  • said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 1 1872-16104 of SEQ ID NO:1 , nucleotides 12085-12114 of SEQ ID NO:1 , nucleotides 12223-12246 of SEQ ID NO:1 , nucleotides 12466-12507 of SEQ ID NO:1 , nucleotides
  • nucleotides 13516-13566 of SEQ ID NO:1 nucleotides 13633-13680 of SEQ ID NO 1 , nucleotides 13876-13923 of SEQ ID NO:1 , nucleotides 14313-14334 of SEQ ID NO 1 , nucleotides 14473-14547 of SEQ ID NO:1 , nucleotides 14578-14607 of SEQ ID NO 1 , nucleotides 14623-14692 of SEQ ID NO:1 , nucleotides 15673-15693 of SEQ ID NO 1 , nucleotides 15724-15762 of SEQ ID NO:1 , nucleotides 14788-15639 of SEQ ID NO 1 , and nucleotides 15901 -15924 of SEQ ID NO:1.
  • said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1 , nucleotides 12085- 12114 of SEQ ID NO:1 , nucleotides 12223-12246 of SEQ ID NO:1 , nucleotides 12466- 12507 of SEQ ID NO :1 , nucleotides 12928-12960 of SEQ ID NO: 1 , nucleotides 13516- 13566 of SEQ ID NO :1 , nucleotides 13633-13680 of SEQ ID NO: 1 , nucleotides 13876- 13923 of SEQ ID NO :1 , nucleotides 14313-14334 of SEQ ID NO: 1 , nucleotides 14473- 14547 of SEQ ID NO :1 , nucleotides 14578-14607 of SEQ ID NO.1 , nucleo
  • the present invention further provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs:2-23.
  • the present invention also provides methods for the recombinant production of polyketides such as epothilones in quantities large enough to enable their purification and use in pharmaceutical formulations such as those for the treatment of cancer
  • a specific advantage of these production methods is the chirality of the molecules produced; production in transgenic organisms avoids the generation of populations of racemic mixtures, within which some enantiomers may have reduced activity.
  • the present invention provides a method for heterologous expression of epothilone in a recombinant host, comprising: (a) introducing into a host a chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule of the invention that comprises a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of epothilone; and (b) growing the host in conditions that allow biosynthesis of epothilone in the host.
  • the present invention also provides a method for producing epothilone, comprising: (a) expressing epothilone in a recombinant host by the aforementioned method; and (b) extracting epothilone from the recombinant host.
  • the present invention provides an isolated polypeptide comprising an amino acid sequence that consists of an epothilone synthase domain.
  • the epothilone synthase domain is a β-ketoacyl synthase (KS) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
  • KS β-ketoacyl synthase
  • said KS domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
  • the epothilone synthase domain is an acyltransferase (AT) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
  • AT acyltransferase
  • said AT domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
  • the epothilone synthase domain is an enoyl reductase (ER) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
  • ER enoyl reductase
  • said ER domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
  • the epothilone synthase domain is an acyl carrier protein (ACP) domain, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
  • ACP acyl carrier protein
  • said ACP domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
  • the epothilone synthase domain is a dehydratase (DH) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
  • DH dehydratase
  • said DH domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
  • the epothilone synthase domain is a β-ketoreductase (KR) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
  • KR β-ketoreductase
  • said KR domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
  • the epothilone synthase domain is a methyltransferase (MT) domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6.
  • said MT domain preferably comprises amino acids 2671-3045 of SEQ ID NO:6.
  • the epothilone synthase domain is a thioesterase (TE) domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7.
  • said TE domain preferably comprises amino acids 2165-2439 of SEQ ID NO:7.
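The domain coordinates above are 1-based, inclusive residue ranges. As a hedged illustration, the sketch below collects the preferred coordinates listed for SEQ ID NO:7 and derives the domain order, the domain lengths, and a check that no two ranges overlap. The `Domain` type and the helper names are hypothetical and not part of the source.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    name: str
    start: int  # first residue, 1-based
    end: int    # last residue, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Preferred coordinates on SEQ ID NO:7 as listed in the text (deliberately unsorted).
SEQ_ID_7_DOMAINS = [
    Domain("TE", 2165, 2439),
    Domain("KS", 32, 450),
    Domain("ACP", 2093, 2164),
    Domain("AT", 556, 877),
    Domain("KR", 1810, 2055),
    Domain("DH", 887, 1051),
    Domain("ER", 1478, 1790),
]


def domain_order(domains):
    """Domain names sorted from N-terminus to C-terminus."""
    return [d.name for d in sorted(domains, key=lambda d: d.start)]


def has_overlap(domains) -> bool:
    """True if any two neighbouring domains share residues."""
    ordered = sorted(domains, key=lambda d: d.start)
    return any(a.end >= b.start for a, b in zip(ordered, ordered[1:]))


def extract(protein: str, domain: Domain) -> str:
    """Slice a domain out of a protein sequence given 1-based inclusive coordinates."""
    return protein[domain.start - 1:domain.end]


if __name__ == "__main__":
    print(domain_order(SEQ_ID_7_DOMAINS))
    for d in sorted(SEQ_ID_7_DOMAINS, key=lambda d: d.start):
        print(d.name, d.start, d.end, d.length)
```

Sorting by start position gives the order KS, AT, DH, ER, KR, ACP, TE for the ranges listed for SEQ ID NO:7.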
  • Associated With / Operatively Linked refers to two DNA sequences that are related physically or functionally.
  • a promoter or regulatory DNA sequence is said to be "associated with" a DNA sequence that codes for an RNA or a protein if the two sequences are operatively linked, or situated such that the regulator DNA sequence will affect the expression level of the coding or structural DNA sequence.
  • Chimeric Gene: A recombinant DNA sequence in which a promoter or regulatory DNA sequence is operatively linked to, or associated with, a DNA sequence that codes for an mRNA or which is expressed as a protein, such that the regulator DNA sequence is able to regulate transcription or expression of the associated DNA sequence.
  • the regulator DNA sequence of the chimeric gene is not normally operatively linked to the associated DNA sequence as found in nature.
  • Coding DNA Sequence: A DNA sequence that is translated in an organism to produce a protein.
  • ACP acyl carrier protein
  • KS β-ketosynthase
  • AT acyltransferase
  • KR β-ketoreductase
  • DH dehydratase
  • ER enoylreductase
  • TE thioesterase
  • Epothilones: 16-membered macrocyclic polyketides naturally produced by the bacterium Sorangium cellulosum strain So ce90, which mimic the biological effects of taxol.
  • epothilone refers to the class of polyketides that includes epothilone A and epothilone B, as well as analogs thereof such as those described in WO 98/25929.
  • Epothilone Synthase: A polyketide synthase responsible for the biosynthesis of epothilone.
  • Gene: A defined region that is located within a genome and that, besides the aforementioned coding DNA sequence, comprises other, primarily regulatory, DNA sequences responsible for the control of the expression, that is to say the transcription and translation, of the coding portion.
  • Heterologous DNA Sequence: A DNA sequence not naturally associated with a host cell into which it is introduced, including non-naturally occurring multiple copies of a naturally occurring DNA sequence.
  • Homologous DNA Sequence: A DNA sequence naturally associated with a host cell into which it is introduced.
  • an isolated nucleic acid molecule or an isolated enzyme is a nucleic acid molecule or enzyme that, by the hand of man, exists apart from its native environment and is therefore not a product of nature
  • An isolated nucleic acid molecule or enzyme may exist in a purified form or may exist in a non-native environment such as, for example, a recombinant host cell.
  • Module: A genetic element encoding all of the distinct activities required in a single round of polyketide biosynthesis, i.e., one condensation step and all the β-carbonyl processing steps associated therewith.
  • Each module encodes an ACP, a KS, and an AT activity to accomplish the condensation portion of the biosynthesis, and selected post-condensation activities to effect the β-carbonyl processing.
  • NRPS: A non-ribosomal polypeptide synthetase, which is a complex of enzymatic activities responsible for the incorporation of amino acids into secondary metabolites including, for example, amino acid adenylation, epimerization, N-methylation, cyclization, peptidyl carrier protein, and condensation domains.
  • a functional NRPS is one that catalyzes the incorporation of an amino acid into a secondary metabolite.
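The module definition above can be sketched as a small check: a module must carry KS, AT, and ACP, and the optional processing domains determine how far the β-carbonyl is reduced. The mapping from KR/DH/ER to the reduction state is the general textbook picture of type I PKS processing and is an assumption added here, not a statement from the source; inactive domains and other exceptions are ignored. All names are hypothetical.

```python
REQUIRED = {"KS", "AT", "ACP"}   # condensation activities named in the definition
PROCESSING = ("KR", "DH", "ER")  # optional beta-carbonyl processing activities


def beta_carbon_state(domains) -> str:
    """Classify the expected state of the beta-carbon for one module's domain set."""
    present = set(domains)
    missing = REQUIRED - present
    if missing:
        raise ValueError(f"not a complete module, missing: {sorted(missing)}")
    if "KR" not in present:
        return "keto"                     # no reduction
    if "DH" not in present:
        return "hydroxyl"                 # ketoreduction only
    if "ER" not in present:
        return "double bond"              # reduction plus dehydration
    return "fully reduced (methylene)"    # KR + DH + ER


if __name__ == "__main__":
    print(beta_carbon_state(["KS", "AT", "ACP"]))
    print(beta_carbon_state(["KS", "AT", "KR", "ACP"]))
    print(beta_carbon_state(["KS", "AT", "DH", "ER", "KR", "ACP"]))
```

The last call uses the domain set listed for SEQ ID NO:7 (without the TE domain); the sketch makes no claim about which product that module actually forms.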
  • NRPS Gene: One or more genes encoding NRPSs for producing functional secondary metabolites, e.g., epothilones A and B, when under the direction of one or more compatible control elements.
  • Nucleic Acid Molecule: A linear segment of single- or double-stranded DNA or RNA that can be isolated from any source. In the context of the present invention, the nucleic acid molecule is preferably a segment of DNA.
  • PKS: A polyketide synthase, which is a complex of enzymatic activities (domains) responsible for the biosynthesis of polyketides including, for example, ketoreductase, dehydratase, acyl carrier protein, enoylreductase, ketoacyl ACP synthase, and acyltransferase.
  • a functional PKS is one that catalyzes the synthesis of a polyketide.
  • PKS Genes: One or more genes encoding various polypeptides required for producing functional polyketides, e.g., epothilones A and B, when under the direction of one or more compatible control elements.
  • Substantially Similar: With respect to nucleic acids, a nucleic acid molecule that has at least 60 percent sequence identity with a reference nucleic acid molecule.
  • a substantially similar DNA sequence is at least 80% identical to a reference DNA sequence; in a more preferred embodiment, a substantially similar DNA sequence is at least 90% identical to a reference DNA sequence; and in a most preferred embodiment, a substantially similar DNA sequence is at least 95% identical to a reference DNA sequence.
  • a substantially similar DNA sequence preferably encodes a protein or peptide having substantially the same activity as the protein or peptide encoded by the reference DNA sequence.
  • a substantially similar nucleotide sequence typically hybridizes to a reference nucleic acid molecule, or fragments thereof, under the following conditions: hybridization in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4 pH 7.0, 1 mM EDTA at 50°C; wash with 2X SSC, 1% SDS, at 50°C.
  • SDS sodium dodecyl sulfate
  • a substantially similar amino acid sequence is an amino acid sequence that is at least 90% identical to the amino acid sequence of a reference protein or peptide and has substantially the same activity as the reference protein or peptide.
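The identity thresholds above (60%, 80%, 90%, and 95% for DNA; 90% for amino acid sequences) can be illustrated with a minimal sketch. It assumes the two sequences are already aligned to equal length and counts identity over all alignment columns; the source does not say how identity is computed, so both choices are assumptions, and the activity requirement is not modelled. Function names are hypothetical.

```python
DNA_THRESHOLDS = (95, 90, 80, 60)  # percent-identity tiers named in the text
PROTEIN_THRESHOLD = 90


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical columns between two pre-aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    # A gap character ('-') never counts as a match.
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a.upper(), aligned_b.upper()))
    return 100.0 * matches / len(aligned_a)


def highest_dna_threshold_met(identity: float):
    """Return the strictest DNA tier satisfied, or None if below 60%."""
    for threshold in DNA_THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None


def protein_meets_identity(identity: float) -> bool:
    """Identity part of the amino acid criterion only (activity is not checked)."""
    return identity >= PROTEIN_THRESHOLD


if __name__ == "__main__":
    ident = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(ident, highest_dna_threshold_met(ident))  # 90.0 90
```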
  • Transformation: A process for introducing heterologous nucleic acid into a host cell or organism.
  • Transformed / Transgenic / Recombinant refers to a host organism such as a bacterium into which a heterologous nucleic acid molecule has been introduced.
  • the nucleic acid molecule can be stably integrated into the genome of the host or the nucleic acid molecule can also be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating.
  • Transformed cells, tissues, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.
  • a "non-transformed", "non-transgenic", or "non-recombinant" host refers to a wild-type organism, i.e., a bacterium, which does not contain the heterologous nucleic acid molecule.
  • Nucleotides are indicated by their bases by the following standard abbreviations: adenine (A), cytosine (C), thymine (T), and guanine (G).
  • Amino acids are likewise indicated by the following standard abbreviations: alanine (Ala; A), arginine (Arg; R), asparagine (Asn; N), aspartic acid (Asp; D), cysteine (Cys; C), glutamine (Gln; Q), glutamic acid (Glu; E), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyros
  • SEQ ID NO:1 is the nucleotide sequence of a 68750 bp contig containing 22 open reading frames (ORFs), which comprises the epothilone biosynthesis genes.
  • SEQ ID NO:2 is the protein sequence of a type I polyketide synthase (EPOS A) encoded by epoA (nucleotides 7610-11875 of SEQ ID NO:1).
  • EPOS A type I polyketide synthase
  • SEQ ID NO:3 is the protein sequence of a non-ribosomal peptide synthetase (EPOS P) encoded by epoP (nucleotides 11872-16104 of SEQ ID NO:1).
  • EPOS P non-ribosomal peptide synthetase
  • SEQ ID NO:4 is the protein sequence of a type I polyketide synthase (EPOS B) encoded by epoB (nucleotides 16251-21749 of SEQ ID NO:1).
  • EPOS B type I polyketide synthase
  • SEQ ID NO:5 is the protein sequence of a type I polyketide synthase (EPOS C) encoded by epoC (nucleotides 21746-43519 of SEQ ID NO:1).
  • EPOS C type I polyketide synthase
  • SEQ ID NO:6 is the protein sequence of a type I polyketide synthase (EPOS D) encoded by epoD (nucleotides 43524-54920 of SEQ ID NO:1).
  • EPOS D type I polyketide synthase
  • SEQ ID NO:7 is the protein sequence of a type I polyketide synthase (EPOS E) encoded by epoE (nucleotides 54935-62254 of SEQ ID NO:1).
  • EPOS E type I polyketide synthase
  • SEQ ID NO:8 is the protein sequence of a cytochrome P450 oxygenase homologue (EPOS F) encoded by epoF (nucleotides 62369-63628 of SEQ ID NO:1).
  • EPOS F cytochrome P450 oxygenase homologue
  • SEQ ID NO:9 is a partial protein sequence (partial Orf 1) encoded by orf1 (nucleotides 1-1826 of SEQ ID NO:1).
  • SEQ ID NO:10 is a protein sequence (Orf 2) encoded by orf2 (nucleotides 3171-1900 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:11 is a protein sequence (Orf 3) encoded by orf3 (nucleotides 3415-5556 of SEQ ID NO:1).
  • SEQ ID NO:12 is a protein sequence (Orf 4) encoded by orf4 (nucleotides 5992-5612 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:13 is a protein sequence (Orf 5) encoded by orf5 (nucleotides 6226-6675 of SEQ ID NO:1).
  • SEQ ID NO:14 is a protein sequence (Orf 6) encoded by orf6 (nucleotides 63779-64333 of SEQ ID NO:1).
  • SEQ ID NO:15 is a protein sequence (Orf 7) encoded by orf7 (nucleotides 64290-63853 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:16 is a protein sequence (Orf 8) encoded by orf8 (nucleotides 64363-64920 of SEQ ID NO:1).
  • SEQ ID NO:17 is a protein sequence (Orf 9) encoded by orf9 (nucleotides 64727-64287 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:18 is a protein sequence (Orf 10) encoded by orf10 (nucleotides 65063-65767 of SEQ ID NO:1).
  • SEQ ID NO:19 is a protein sequence (Orf 11) encoded by orf11 (nucleotides 65874-65008 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:20 is a protein sequence (Orf 12) encoded by orf12 (nucleotides 66338-65871 on the reverse complement strand of SEQ ID NO:1).
  • SEQ ID NO:21 is a protein sequence (Orf 13) encoded by orf13 (nucleotides 66667-67137 of SEQ ID NO:1).
  • SEQ ID NO:22 is a protein sequence (Orf 14) encoded by orf14 (nucleotides 67334-68251 of SEQ ID NO:1).
  • SEQ ID NO:23 is a partial protein sequence (partial Orf 15) encoded by orf15 (nucleotides 68346-68750 of SEQ ID NO:1).
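The ORF coordinates above follow two conventions: forward-strand ORFs are given as start-end with start below end, and reverse-strand ORFs are given with the larger coordinate first. As a hedged sketch, the code below records a few of the listed ORFs, derives strand and length, and shows how a reverse-strand ORF would be extracted from a contig string. The helper names are hypothetical, and the contig used in the demonstration is a toy string, not SEQ ID NO:1.

```python
# Coordinates copied from the sequence descriptions above (1-based, inclusive).
ORFS = {
    "epoA": (7610, 11875),
    "epoP": (11872, 16104),
    "epoB": (16251, 21749),
    "epoC": (21746, 43519),
    "epoD": (43524, 54920),
    "epoE": (54935, 62254),
    "epoF": (62369, 63628),
    "orf2": (3171, 1900),   # reverse complement strand
    "orf4": (5992, 5612),   # reverse complement strand
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def strand(start: int, end: int) -> str:
    return "+" if start <= end else "-"


def orf_length(start: int, end: int) -> int:
    return abs(end - start) + 1


def extract(contig: str, start: int, end: int) -> str:
    """Return the coding-strand sequence for either orientation."""
    lo, hi = min(start, end), max(start, end)
    seq = contig[lo - 1:hi].upper()
    if start > end:  # reverse strand: complement, then reverse
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq


if __name__ == "__main__":
    for name, (start, end) in ORFS.items():
        n = orf_length(start, end)
        print(name, strand(start, end), n, "nt,", n // 3, "codons")
    print(extract("ATGCCCTAA", 9, 7))  # toy contig; prints TTA
```

Every ORF in this table has a length divisible by three, and epoE (7320 nt, 2440 codons including the stop) matches the 2439-residue end of the TE domain given for SEQ ID NO:7. The ranges for epoA and epoP share nucleotides 11872-11875.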
  • SEQ ID NO:24 is the universal reverse PCR primer sequence.
  • SEQ ID NO:25 is the universal forward PCR primer sequence.
  • SEQ ID NO:26 is the NH24 end "B" PCR primer sequence.
  • SEQ ID NO:27 is the NH2 end "A" PCR primer sequence.
  • SEQ ID NO:28 is the NH2 end "B" PCR primer sequence.
  • SEQ ID NO:29 is the pEPO15-NH6 end "B" PCR primer sequence.
  • SEQ ID NO:30 is the pEPO15-H2.7 end "A" PCR primer sequence.
  • the genes involved in the biosynthesis of epothilones can be isolated using the techniques according to the present invention.
  • the preferable procedure for the isolation of epothilone biosynthesis genes requires the isolation of genomic DNA from an organism identified as producing epothilones A and B, and the transfer of the isolated DNA on a suitable plasmid or vector to a host organism that does not normally produce the polyketide, followed by the identification of transformed host colonies to which the epothilone-producing ability has been conferred.
  • the exact region of the transforming epothilone-conferring DNA can be more precisely defined.
  • the transforming epothilone-conferring DNA can be cleaved into smaller fragments and the smallest that maintains the epothilone-conferring ability further characterized.
  • a variation of this technique involves the transformation of host DNA into the same host that has had its epothilone-producing ability disrupted by mutagenesis.
  • an epothilone-producing organism is mutated and non-epothilone-producing mutants are isolated. These are then complemented by genomic DNA isolated from the epothilone-producing parent strain.
  • a further example of a technique that can be used to isolate genes required for epothilone biosynthesis is the use of transposon mutagenesis to generate mutants of an epothilone-producing organism that, after mutagenesis, fail to produce the polyketide.
  • the region of the host genome responsible for epothilone production is tagged by the transposon and can be recovered and used as a probe to isolate the native genes from the parent strain.
  • PKS genes that are required for the synthesis of polyketides and that are similar to known PKS genes may be isolated by virtue of their sequence homology to the biosynthetic genes for which the sequence is known, such as those for the biosynthesis of rifamycin or soraphen. Techniques suitable for isolation by homology include standard library screening by DNA hybridization.
  • Preferred for use as a probe molecule is a DNA fragment that is obtainable from a gene or another DNA sequence that plays a part in the synthesis of a known polyketide.
  • a preferred probe molecule comprises a 1.2 kb SmaI DNA fragment encoding the ketosynthase domain of the fourth module of the soraphen PKS (U.S. Patent No. 5,716,849), and a more preferred probe molecule comprises the β-ketoacyl synthase domains from the first and second modules of the rifamycin PKS (Schupp et al., FEMS Microbiology Letters 159: 201-207 (1998)). These can be used to probe a gene library of an epothilone-producing microorganism to isolate the PKS genes responsible for epothilone biosynthesis.
  • biosynthetic genes for epothilones A and B can surprisingly be cloned from a microorganism that produces that polyketide.
  • the cloned PKS genes can be modified and expressed in transgenic host organisms.
  • the isolated epothilone biosynthetic genes can be expressed in heterologous hosts to enable the production of the polyketide with greater efficiency than might be possible from native hosts. Techniques for these genetic manipulations are specific for the different available hosts and are known in the art. For example, heterologous genes can be expressed in Streptomyces and other actinomycetes using techniques such as those described in McDaniel et al., Science 262: 1546-1550 (1993) and Kao et al., Science 265: 509-512 (1994), both of which are incorporated herein by reference.
  • genes responsible for polyketide biosynthesis can also be expressed in other host organisms such as pseudomonads and E. coli. Techniques for these genetic manipulations are specific for the different available hosts and are known in the art.
  • PKS genes have been successfully expressed in E. coli using the pT7-7 vector, which uses the T7 promoter. See Tabor et al., Proc. Natl. Acad. Sci. USA 82: 1074-1078 (1985), incorporated herein by reference.
  • the expression vectors pKK223-3 and pKK223-2 can be used to express heterologous genes in E. coli.
  • for operons encoding multiple ORFs, the simplest procedure is to insert the operon into a vector such as pKK223-3 in transcriptional fusion, allowing the cognate ribosome binding site of the heterologous genes to be used.
  • Techniques for overexpression in gram-positive species such as Bacillus are also known in the art and can be used in the context of this invention (Quax et al., in: Industrial Microorganisms: Basic and Applied Molecular Genetics, Eds. Baltz et al., American Society for Microbiology, Washington (1993)).
  • Other expression systems include yeast and baculovirus expression systems. See, for example, "The Expression of Recombinant Proteins in Yeasts," Sudbery, P. E., Curr. Opin. Biotechnol. 7(5): 517-524 (1996); "Methods for Expressing Recombinant Proteins in Yeast," Mackay, et al., Editor(s): Carey, Paul R., Protein Eng. Des. 105-153, Publisher: Academic, San Diego, Calif. (1996); "Expression of heterologous gene products in yeast," Pichuantes, et al., Editor(s): Cleland, J.
  • Another consideration for expression of PKS genes in heterologous hosts is the requirement of enzymes for posttranslational modification of PKS enzymes by phosphopantetheinylation before they can synthesize polyketides.
  • the enzymes responsible for this modification of type I PKS enzymes, phosphopantetheinyl (P-pant) transferases, are not normally present in many hosts such as E. coli.
  • This problem can be solved by coexpression of a P-pant transferase with the PKS genes in the heterologous host, as described by Kealey et al., Proc. Natl. Acad. Sci. USA 95: 505-509 (1998), incorporated herein by reference.
  • the significant criteria in the choice of host organism are its ease of manipulation, rapidity of growth (i.e. fermentation), possession of the proper molecular machinery for processes such as posttranslational modification, and its lack of susceptibility to the polyketide being overproduced.
  • Most preferred host organisms are actinomycetes such as strains of Streptomyces.
  • Other preferred host organisms are pseudomonads and E. coli.
  • the above-described methods of polyketide production have significant advantages over the technology currently used in the preparation of the compounds. These advantages include the cheaper cost of production, the ability to produce greater quantities of the compounds, and the ability to produce compounds of a preferred biological enantiomer, as opposed to racemic mixtures inevitably generated by organic synthesis.
  • Compounds produced by heterologous hosts can be used in medical (e.g. cancer treatment in the case of epothilones) as well as agricultural applications.
  • Sorangium cellulosum strain 90 (DSM 6773, Deutsche Sammlung von Mikroorganismen und Zellkulturen, Braunschweig) is streaked out and grown (30°C) on an agar plate of SolE medium (0.35% glucose, 0.05% tryptone, 0.15% MgSO4 x 7H2O, 0.05% ammonium sulfate, 0.1% CaCl2, 0.006% K2HPO4, 0.01% sodium dithionite, 0.0008% Fe-EDTA, 1.2% HEPES, 3.5% [vol/vol] supernatant of sterilized stationary S. cellulosum culture) pH ad.
  • pBelobacll contains a gene encoding chloramphenicol resistance, a multiple cloning site in the lacZ gene providing for blue/white selection on appropriate medium, as well as the genes required for the replication and maintenance of the plasmid at one or two copies per cell.
  • the ligation mixture is used to transform Escherichia coli DH10B electrocompetent cells using standard electroporation techniques. Chloramphenicol-resistant recombinant (white, lac mutant) colonies are transferred to a positively charged nylon membrane filter in 384 3X3 grid format. The clones are lysed and the DNA is cross-linked to the filters. The same clones are also preserved as liquid cultures at -80°C.
  • the Bac library filters are probed by standard Southern hybridization procedures
  • the DNA probes used encode β-ketoacyl synthase domains from the first and second modules of the rifamycin polyketide synthase (Schupp et al., FEMS Microbiology Letters 159: 201-207 (1998)).
  • the probe DNAs are generated by PCR with primers flanking each ketosynthase domain using the plasmid pNE95 as the template (pNE95 equals cosmid 2 described in Schupp et al. (1998)).
  • 25 ng of PCR-amplified DNA is isolated from a 0.5% agarose gel and labeled with 32P-dCTP using a random primer labeling kit (Gibco-BRL, Bethesda MD, USA) according to the manufacturer's instructions.
  • Hybridization is at 65°C for 36 hours and membranes are washed at high stringency (3 times with 0.1x SSC and 0.5% SDS for 20 min at 65°C).
  • the labeled blot is exposed on a phosphorescent screen and the signals are detected on a PhosphorImager 445SI (screen and 445SI from Molecular Dynamics). This results in strong hybridization of certain Bac clones to the probes.
  • Bac DNA from the Bac clones of interest is isolated by a typical miniprep procedure.
  • the cells are resuspended in 200 µl lysozyme solution (50 mM glucose, 10 mM EDTA, 25 mM Tris-HCl, 5 mg/ml lysozyme), lysed in 400 µl lysis solution (0.2 N NaOH and 2% SDS), the proteins are precipitated (3.0 M potassium acetate, adjusted to pH 5.2 with acetic acid), and the Bac DNA is precipitated with isopropanol.
  • the DNA is resuspended in 20 µl of nuclease-free distilled water, restricted with BamHI (New England Biolabs, Inc.) and separated on a 0.7% agarose gel.
  • the gel is blotted by Southern hybridization as described above and probed under conditions described above, with a 1.2 kb SmaI DNA fragment encoding the ketosynthase domain of the fourth module of the soraphen polyketide synthase as the probe (see U.S. Patent No. 5,716,849).
  • Five different hybridization patterns are observed.
  • One clone representing each of the five patterns is selected and named pEPO15, pEPO20, pEPO30, pEPO31, and pEPO33, respectively.
  • the DNA of the five selected Bac clones is digested with BamHI and random fragments are subcloned into pBluescript II SK+ (Stratagene) at the BamHI site. Subclones carrying inserts between 2 and 10 kb in size are selected for sequencing of the flanking ends of the inserts and also probed with the 1.2 kb SmaI probe as described above. Subclones that show a high degree of sequence homology to known polyketide synthases and/or strong hybridization to the soraphen ketosynthase domain are used for gene disruption experiments.
  • the BamHI inserts of the subclones generated from the five selected Bac clones as described above are isolated and ligated into the unique BamHI site of plasmid pCIB132 (see U.S. Patent No. 5,716,849).
  • the pCIB132 derivatives carrying the inserts are transformed into Escherichia coli ED8767 containing the helper plasmid pUZ8 (Hedges and Matthew, Plasmid 2: 269-278 (1979)).
  • the transformants are used as donors in conjugation experiments with Sorangium cellulosum BCE28/2 as recipient.
  • the mixed cells are then centrifuged at 4000 rpm for 10 minutes and resuspended in 0.5 ml G51b medium.
  • This cell suspension is then plated as a drop in the center of a plate with SolE agar containing 50 mg/l kanamycin.
  • the cells obtained after incubation for 24 hours at 30°C are harvested and resuspended in 0.8 ml of G51b medium, and 0.1 to 0.3 ml of this suspension is plated out on a selective SolE solid medium containing phleomycin (30 mg/l), streptomycin (300 mg/l), and kanamycin (50 mg/l).
  • the counterselection of the donor Escherichia coli strain takes place with the aid of streptomycin.
  • the colonies that grow on this selective medium after an incubation time of 8-12 days at a temperature of 30°C are isolated with a plastic loop and streaked out and cultivated on the same agar medium for a second round of selection and purification
  • the colony-derived cultures that grow on this selective agar medium after 7 days at a temperature of 30°C are transconjugants of Sorangium cellulosum BCE28/2 that have acquired phleomycin resistance by conjugative transfer of the pCIB132 derivatives carrying the subcloned BamHI fragments.
  • Transconjugant cells grown on about 1 square cm surface of the selective SolE plates of the second round of selection are transferred by a sterile plastic loop into 10 ml of medium G52-H in a 50 ml Erlenmeyer flask. After incubation at 30°C and 180 rpm for 3 days, the culture is transferred into 50 ml of medium G52-H in a 200 ml Erlenmeyer flask.
  • 10 ml of this culture is transferred into 50 ml of medium 23B3 (0.2% glucose, 2% potato starch, 1.6% defatted soya meal, 0.0008% Fe-EDTA sodium salt, 0.5% HEPES (4-(2-hydroxyethyl)-piperazine-1-ethane-sulfonic acid), 2% vol/vol polystyrene resin XAD16 (Rohm & Haas), pH adjusted to 7.8 with NaOH) in a 200 ml Erlenmeyer flask.
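To make the medium composition easier to scale, here is a small sketch that converts the listed percentages for medium 23B3 into amounts for a given volume. It assumes the percentages of solid components are weight per volume (grams per 100 ml), which the text does not state explicitly; the resin is given as vol/vol and is handled separately. The names are hypothetical.

```python
# Percentages as listed for medium 23B3; solids assumed to be % w/v (assumption).
MEDIUM_23B3_PERCENT_WV = {
    "glucose": 0.2,
    "potato starch": 2.0,
    "defatted soya meal": 1.6,
    "Fe-EDTA sodium salt": 0.0008,
    "HEPES": 0.5,
}
RESIN_XAD16_PERCENT_VV = 2.0  # given as vol/vol in the text


def amount_needed(percent: float, volume_ml: float) -> float:
    """Grams (for % w/v) or millilitres (for % v/v) required for volume_ml."""
    return round(percent * volume_ml / 100.0, 6)


def recipe(volume_ml: float) -> dict:
    """Grams of each solid component for the requested culture volume."""
    return {name: amount_needed(p, volume_ml) for name, p in MEDIUM_23B3_PERCENT_WV.items()}


if __name__ == "__main__":
    for name, grams in recipe(50).items():
        print(f"{name}: {grams} g")
    print(f"XAD16 resin: {amount_needed(RESIN_XAD16_PERCENT_VV, 50)} ml")
```

The pH adjustment to 7.8 with NaOH is done empirically and is not part of the calculation.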
  • Quantitative determination of the epothilone produced takes place after incubation of the cultures at 30°C and 180 rpm for 7 days.
  • the complete culture broth is filtered by suction through a 150 ⁇ m nylon filter.
  • the resin remaining on the filter is then resuspended in 10 ml isopropanol and extracted by shaking the suspension at 180 rpm for 1 hour.
  • the content of epothilones A and B in the extract is determined by HPLC with detection at 250 nm using a UV-DAD detector (Waters Symmetry C18 column and a gradient of 0.02% phosphoric acid 60%-0% and acetonitrile 40%-100%).
  • Transconjugants with three different integrated BamHI fragments subcloned from pEPO15, namely transconjugants with the BamHI fragment of plasmid pEPO15-21, transconjugants with the BamHI fragment of plasmid pEPO15-4-5, and transconjugants with the BamHI fragment of plasmid pEPO15-4-1, are tested in the manner described above. HPLC analysis reveals that none of these transconjugants still produces epothilone A or B.
  • epothilones A and B are detectable at a concentration of 2-4 mg/l in transconjugants with integrated BamHI fragments derived from pEPO20, pEPO30, pEPO31, and pEPO33, and in the parental strain BCE28/2.
  • Example 8 Nucleotide Sequence Determination of the Cloned Fragments and
  • Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-21], and the nucleotide sequence of the 2.3-kb BamHI insert in pEPO15-21 is determined. Automated DNA sequencing is done on the double-stranded DNA template by the dideoxynucleotide chain termination method, using Applied Biosystems model 377 sequencers.
  • The primers used are the universal reverse primer (5' GGA AAC AGC TAT GAC CAT G 3' (SEQ ID NO:24)) and the universal forward primer (5' GTA AAA CGA CGG CCA GT 3' (SEQ ID NO:25)).
  • Oligonucleotides designed for the 3' ends of the previously determined sequences are used to extend and join contigs. Both strands are entirely sequenced, and every nucleotide is sequenced at least two times.
  • The nucleotide sequence is compiled using the program Sequencher vers. 3.0 (Gene Codes Corporation), and analyzed using the University of Wisconsin Genetics Computer Group programs.
  • The nucleotide sequence of the 2213-bp insert corresponds to nucleotides 20779-22991 of SEQ ID NO:1.
  • Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-4-1], and the nucleotide sequence of the 3.9-kb BamHI insert in pEPO15-4-1 is determined as described in (A) above.
  • The nucleotide sequence of the 3909-bp insert corresponds to nucleotides 16876-20784 of SEQ ID NO:1.
  • Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-4-5], and the nucleotide sequence of the 2.3-kb BamHI insert in pEPO15-4-5 is determined as described in (A) above.
  • The nucleotide sequence of the 2233-bp insert corresponds to nucleotides 42528-44760 of SEQ ID NO:1.
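  • The insert lengths and coordinates quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive; the function and variable names are illustrative only and are not part of the original disclosure.

```python
# Coordinates on SEQ ID NO:1 and stated insert lengths, copied from the text.
# Assumption: positions are 1-based and inclusive at both ends.
INSERTS = {
    "pEPO15-21": (20779, 22991, 2213),
    "pEPO15-4-1": (16876, 20784, 3909),
    "pEPO15-4-5": (42528, 44760, 2233),
}


def span_length(start: int, end: int) -> int:
    """Length in bp of an inclusive, 1-based interval."""
    return end - start + 1


def check_inserts(inserts=INSERTS) -> dict:
    """Map each insert name to True if its coordinates match the stated length."""
    return {name: span_length(s, e) == n for name, (s, e, n) in inserts.items()}


def overlap(a: tuple, b: tuple) -> int:
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    print(check_inserts())
    # Shared positions between the two neighbouring inserts.
    print(overlap(INSERTS["pEPO15-4-1"][:2], INSERTS["pEPO15-21"][:2]))
```

  • Under these assumptions the inserts of pEPO15-4-1 and pEPO15-21 share six positions, which would be consistent with both sequences being reported inclusive of a common 6-bp BamHI recognition site.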
  • Example 9 Subcloning and Ordering of DNA Fragments from pEPO15 Containing Epothilone Biosynthesis Genes
  • pEPO15 is digested to completion with the restriction enzyme HindIII and the resulting fragments are subcloned into pBluescript II SK- or pNEB193 (New England Biolabs) that has been cut with HindIII and dephosphorylated with calf intestinal alkaline phosphatase.
  • pEPO15-NH1, pEPO15-NH2
  • pEPO15-NH6, pEPO15-NH24
  • pEPO15-H2.7 and pEPO15-H3.0, both based on pBluescript II SK-
  • The BamHI insert of pEPO15-21 is isolated and DIG-labeled (Non-radioactive DNA labeling and detection system, Boehringer Mannheim), and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. A strong hybridization signal is detected for pEPO15-NH24, indicating that pEPO15-21 is contained within pEPO15-NH24.
  • The BamHI insert of pEPO15-4-1 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. Strong hybridization signals are detected for pEPO15-NH24 and pEPO15-H2.7. Nucleotide sequence data generated from one end each of pEPO15-NH24 and pEPO15-H2.7 are also in complete agreement with the previously determined sequence of the BamHI insert of pEPO15-4-1.
  • The BamHI insert of pEPO15-4-5 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. A strong hybridization signal is detected for pEPO15-NH2, indicating that pEPO15-4-5 is contained within pEPO15-NH2.
  • Nucleotide sequence data is generated from both ends of pEPO15-NH2 and from the end of pEPO15-NH24 that does not overlap with pEPO15-4-1.
  • PCR primers NH24 end "B": GTGACTGGCGCCTGGAATCTGCATGAGC (SEQ ID NO:26), NH2 end "A": AGCGGGAGCTTGCTAGACATTCTGTTTC (SEQ ID NO:27), and NH2 end "B": GACGCGCCTCGGGCAGCGCCCCAA (SEQ ID NO:28), pointing towards the HindIII sites, are designed based on these sequences and used in amplification reactions with pEPO15 and, in separate experiments, with Sorangium cellulosum So ce90 genomic DNA as the templates.
  • The HindIII insert of pEPO15-H2.7 is isolated and DIG-labeled as above, and used as a probe in a DNA hybridization experiment at high stringency against pEPO15 digested by NotI.
  • A NotI fragment of about 9 kb in size shows strong hybridization, and is further subcloned into pBluescript II SK- that has been digested with NotI and dephosphorylated with calf intestinal alkaline phosphatase, to yield pEPO15-N9-16.
  • The NotI insert of pEPO15-N9-16 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. Strong hybridization signals are detected for pEPO15-NH6, and also for the expected clones pEPO15-H2.7 and pEPO15-NH24. Nucleotide sequence data is generated from both ends of pEPO15-NH6 and from the end of pEPO15-H2.7 that does not overlap with pEPO15-4-1.
  • PCR primers are designed pointing towards the HindIII sites and used in amplification reactions with pEPO15 and, in separate experiments, with Sorangium cellulosum So ce90 genomic DNA as the templates. Specific amplification is found with primer pair pEPO15-NH6 end "B": CACCGAAGCGTCGATCTGGTCCATC (SEQ ID NO:29) and pEPO15-H2.7 end "A": CGGTCAGATCGACGACGGGCTTTCC (SEQ ID NO:30) with both templates. The amplimers are cloned into pBluescript II SK- and completely sequenced.
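  • Basic properties of the primers listed above (length, GC content, reverse complement) can be tabulated as follows. This is an illustrative sketch only: the sequences are copied from the text, while the helper names are hypothetical and no particular primer-design software is implied.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "SEQ ID NO:26": "GTGACTGGCGCCTGGAATCTGCATGAGC",
    "SEQ ID NO:27": "AGCGGGAGCTTGCTAGACATTCTGTTTC",
    "SEQ ID NO:28": "GACGCGCCTCGGGCAGCGCCCCAA",
    "SEQ ID NO:29": "CACCGAAGCGTCGATCTGGTCCATC",
    "SEQ ID NO:30": "CGGTCAGATCGACGACGGGCTTTCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement, i.e. the opposite strand read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}")
```

  • The high GC content of these primers is in line with what would be expected for a GC-rich template; that remark is an observation on the numbers, not a statement from the original text.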
  • The sequences of the amplimers are identical, and also agree completely with the end sequences of pEPO15-NH6 and pEPO15-H2.7, fused at the HindIII site, establishing that the HindIII fragments of pEPO15-NH6 and pEPO15-H2.7 are, in this order, contiguous.
  • A cosmid DNA library of Sorangium cellulosum So ce90 is generated, using established procedures, in pScosTriplex-II (Ji, et al., Genomics 31: 185-192 (1996)). Briefly, high-molecular weight genomic DNA of Sorangium cellulosum So ce90 is partially digested with the restriction enzyme Sau3AI to provide fragments with average sizes of about 40 kb, and ligated to BamHI and XbaI digested pScosTriplex-II. The ligation mix is packaged with Gigapack III XL (Stratagene) and used to transfect E. coli XL1 Blue MR cells.
  • The cosmid library is screened with the approximately 2.2 kb BamHI - HindIII fragment, derived from the downstream end of the insert of pEPO15-NH2, used as a probe in colony hybridization.
  • A strongly hybridizing clone, named pEPO4E7, is selected.
  • pEPO4E7 DNA is isolated, digested with several restriction endonucleases, and probed in Southern hybridization experiments with the 2.2 kb BamHI - HindIII fragment.
  • A strongly hybridizing NotI fragment of approximately 9 kb in size is selected and subcloned into pBluescript II SK- to yield pEPO4E7-N9-8.
  • End sequencing reveals, however, that the downstream end of the insert of pEPO4E7-N9-8 contains the BamHI - NotI polylinker of pScosTriplex-II, thereby indicating that the genomic DNA insert of pEPO4E7 ends at a Sau3AI site within the extending HindIII - NotI fragment and that the NotI site is derived from pScosTriplex-II.
  • A HindIII - EcoRV fragment of about 13 kb in size is found to strongly hybridize to the probe, and is subcloned into pBluescript II SK- digested with HindIII and HincII to yield pEPO32-HEV15.
  • Oligonucleotide primers are designed based on the downstream end sequence of pEPO15-NH2 and on the upstream (HindIII) end sequence derived from pEPO32-HEV15, and used in sequencing reactions with pEPO4E7-N9-8 as the template.
  • The sequences reveal the existence of a small HindIII fragment (EPO4E7-H0.02) of 24 bp, undetectable in standard restriction analysis, separating the HindIII site at the downstream end of pEPO15-NH2 from the HindIII site at the upstream end of pEPO32-HEV15.
  • The subclone contig described in Example 9 is extended to include the HindIII fragment EPO4E7-H0.02 and the insert of pEPO32-HEV15, and constitutes the inserts of: pEPO15-NH6, pEPO15-H2.7, pEPO15-NH24, pEPO15-NH2, EPO4E7-H0.02 and pEPO32-HEV15, in this order.
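  • How a very small fragment between two closely spaced HindIII sites shows up in sequence data, although it is not seen on a gel, can be illustrated with an in-silico digest. The sketch below assumes a linear molecule and the HindIII recognition sequence AAGCTT cut after the first A; the toy sequence is invented for illustration and is not taken from SEQ ID NO:1.

```python
import re

HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence (A^AGCTT)
CUT_OFFSET = 1           # the cut falls after the first base of the site


def hindiii_fragments(seq: str) -> list:
    """Fragment lengths (bp) from a complete HindIII digest of a linear sequence."""
    seq = seq.upper()
    # A zero-width lookahead reports every site start position.
    cuts = [m.start() + CUT_OFFSET for m in re.finditer(f"(?={HINDIII_SITE})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequence: two HindIII sites whose cut points lie 24 bp apart.
TOY = "G" * 30 + HINDIII_SITE + "GGCC" * 4 + "GG" + HINDIII_SITE + "C" * 40

if __name__ == "__main__":
    print(hindiii_fragments(TOY))  # the middle entry is the small fragment
```

  • In this toy case the middle fragment is 24 bp long, the same size as the fragment EPO4E7-H0.02 described above.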
  • The nucleotide sequence of the subclone contig described in Example 10 is determined as follows. pEPO15-H2.7: Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-H2.7], and the nucleotide sequence of the 2.7-kb HindIII insert in pEPO15-H2.7 is determined. Automated DNA sequencing is done on the double-stranded DNA template by the dideoxynucleotide chain termination method, using Applied Biosystems model 377 sequencers.
  • The primers used are the universal reverse primer (5' GGA AAC AGC TAT GAC CAT G 3' (SEQ ID NO:24)) and the universal forward primer (5' GTA AAA CGA CGG CCA GT 3' (SEQ ID NO:25)).
  • Custom-synthesized oligonucleotides designed for the 3' ends of the previously determined sequences are used to extend and join contigs.
  • The HindIII inserts of these plasmids are isolated, and subjected to random fragmentation using a Hydroshear apparatus (Genomic Instrumentation Services, Inc.) to yield an average fragment size of 1-2 kb.
  • The fragments are end-repaired using T4 DNA Polymerase and Klenow DNA Polymerase enzymes in the presence of deoxynucleotide triphosphates, and phosphorylated with T4 DNA Kinase in the presence of ribo-ATP. Fragments in the size range of 1.5-2.2 kb are isolated from agarose gels, and ligated into pBluescript II SK- that has been cut with EcoRV and dephosphorylated. Random subclones are sequenced using the universal reverse and the universal forward primers.
  • pEPO32-HEV15: pEPO32-HEV15 is digested with HindIII and SspI, the approximately 13.3 kb fragment containing the ~13 kb HindIII - EcoRV insert from So. cellulosum So ce90 and a 0.3 kb HincII - SspI fragment from pBluescript II SK- is isolated, and partially digested with HaeIII to yield fragments with an average size of 1-2 kb. Fragments in the size range of 1.5-2.2 kb are isolated from agarose gels, and ligated into pBluescript II SK- that has been cut with EcoRV and dephosphorylated. Random subclones are sequenced using the universal reverse and the universal forward primers.
  • The chromatograms are analyzed and assembled into contigs with the Phred, Phrap and Consed programs (Ewing, et al., Genome Res. 8(3): 175-185 (1998); Ewing, et al., Genome Res. 8(3): 186-194 (1998); Gordon, et al., Genome Res. 8(3): 195-202 (1998)). Contig gaps are filled, sequence discrepancies are resolved, and low-quality regions are resequenced using custom-designed oligonucleotide primers for sequencing on either the original subclones or selected clones from the random subclone libraries. Both strands are completely sequenced, and every basepair is covered with at least a minimum aggregated Phred score of 40 (confidence level of 99.99%).
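  • The relation between a Phred score of 40 and a confidence level of 99.99% follows from the usual definition of the Phred quality score, Q = -10 log10(P), where P is the estimated probability of a wrong base call. A minimal sketch of the conversion is given below; the function names are illustrative and do not correspond to the interface of the Phred program itself.

```python
import math


def phred_to_error(q: float) -> float:
    """Estimated base-call error probability for Phred quality q."""
    return 10 ** (-q / 10)


def phred_to_accuracy(q: float) -> float:
    """Confidence (1 - error probability) for Phred quality q."""
    return 1 - phred_to_error(q)


def error_to_phred(p: float) -> float:
    """Phred quality corresponding to an error probability p."""
    return -10 * math.log10(p)


def expected_errors(n_bases: int, q: float) -> float:
    """Expected number of wrong bases if every base had exactly quality q."""
    return n_bases * phred_to_error(q)


if __name__ == "__main__":
    print(f"Q40 accuracy: {phred_to_accuracy(40):.4%}")
    # Upper bound for a 68750-bp contig when Q40 is the minimum score.
    print(f"expected errors at Q40: {expected_errors(68750, 40):.3f}")
```

  • Because 40 is stated as the minimum score, the expected-error figure computed this way is an upper bound rather than an estimate of the actual error count.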
  • The nucleotide sequence of the 68750 bp contig is shown as SEQ ID NO:1.
  • Example 12 Nucleotide Sequence Analysis of the Epothilone Biosynthesis Genes
  • SEQ ID NO:1 is found to contain 22 ORFs as detailed below in Table 1:
  • epoA codes for EPOS A (SEQ ID NO:2), a type I polyketide synthase consisting of a single module, and harboring the following domains: β-ketoacyl-synthase (KS) (nucleotides 7643-8920 of SEQ ID NO:1, amino acids 11-437 of SEQ ID NO:2); acyltransferase (AT) (nucleotides 9236-10201 of SEQ ID NO:1, amino acids 543-864 of SEQ ID NO:2); enoyl reductase (ER) (nucleotides 10529-11428 of SEQ ID NO:1, amino acids 974-1273 of SEQ ID NO:2); and acyl carrier protein homologous domain (ACP) (nucleotides 11549-11764 of SEQ ID NO:1, am
  • Sequence comparisons and motif analysis (Haydock, et al. FEBS Lett. 374: 246-248 (1995), Tang, et al., Gene 216: 255-265 (1998)) reveal that the AT encoded by EPOS A is specific for malonyl-CoA.
  • EPOS A should be involved in the initiation of epothilone biosynthesis by loading onto the multienzyme complex the acetate unit that will eventually form part of the 2-methylthiazole ring (C26 and C20).
  • epoP nucleotides 11872-16104 of SEQ ID NO:1
  • EPOS P SEQ ID NO:3
  • EPOS P harbors the following domains:
  • motif K (amino acids 72-81 [FPLTDIQESY] of SEQ ID NO:3, corresponding to nucleotide positions 12085-12114 of SEQ ID NO:1); motif L (amino acids 118-125 [VVARHDML] of SEQ ID NO:3, corresponding to nucleotide positions 12223-12246 of SEQ ID NO:1); motif M (amino acids 199-212 [SIDLINVDLGSLSI] of SEQ ID NO:3, corresponding to nucleotide positions 12466-12507 of SEQ ID NO:1); and motif O (amino acids 353-363 [GDFTSMVLLDI] of SEQ ID NO:3, corresponding to nucleotide positions 12928-12960 of SEQ ID NO:1);
  • motif A (amino acids 549-565 [LTYEELSRRSRRLGARL] of SEQ ID NO:3, corresponding to nucleotide positions 13516-13566 of SEQ ID NO:1); motif B (amino acids 588-603 [VAVLAVLESGAAYVPI] of SEQ ID NO:3, corresponding to nucleotide positions 13633-13680 of SEQ ID NO:1); motif C (amino acids 669-684 [AYVIYTSGSTGLPKGV] of SEQ ID NO:3, corresponding to nucleotide positions 13876-13923 of SEQ ID NO:1); motif D (amino acids 815-821 [SLGGATE] of SEQ ID NO:3, corresponding to nucleotide positions 14313-14334 of SEQ ID NO:1); motif E (amino acids 868-892 [GQLYIGGVGLALGYWRDEEKTRKSF] of
  • PCP peptidyl carrier protein homologous domain
  • EPOS P is involved in the activation of a cysteine by adenylation, binding the activated cysteine as an aminoacyl-S-PCP, forming a peptide bond between the enzyme-bound cysteine and the acetyl-S-ACP supplied by EPOS A, and the formation of the initial thiazoline ring by intramolecular heterocyclization.
  • The unknown domain of EPOS P displays very weak homologies to NAD(P)H oxidases and reductases from Bacillus species. Thus, this unknown domain and/or the ER domain of EPOS A may be involved in the oxidation of the initial 2-methylthiazoline ring to a 2-methylthiazole.
  • epoB (nucleotides 16251-21749 of SEQ ID NO:1) codes for EPOS B (SEQ ID NO:4), a type I polyketide synthase consisting of a single module, and harboring the following domains: KS (nucleotides 16269-17546 of SEQ ID NO:1, amino acids 7-432 of SEQ ID NO:4); AT (nucleotides 17865-18827 of SEQ ID NO:1, amino acids 539-859 of SEQ ID NO:4); dehydratase (DH) (nucleotides 18855-19361 of SEQ ID NO:1, amino acids 869-1037 of SEQ ID NO:4); β-ketoreductase (KR) (nucleotides 20565-21302 of SEQ ID NO:1, amino acids 1439-1684 of SEQ ID NO:4); and ACP (nucleotides 21414
  • The AT of EPOS B is specific for methylmalonyl-CoA.
  • EPOS B should be involved in the first polyketide chain extension by catalysing the Claisen-like condensation of the 2-methyl-4-thiazolecarboxyl-S-PCP starter group with the methylmalonyl-S-ACP, and the concomitant reduction of the β-keto group of C17 to an enoyl.
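  • The nucleotide ranges and amino acid ranges quoted for each domain are related by simple codon arithmetic. The following sketch, which assumes a forward-strand ORF whose first codon begins at the stated ORF start and uses 1-based inclusive coordinates, reproduces the amino acid numbering of the EPOS B domains from their nucleotide positions. The function names are hypothetical.

```python
EPOB_START = 16251  # first nucleotide of epoB on SEQ ID NO:1, from the text

# Domain -> (first nt, last nt, first aa, last aa), all as quoted in the text.
EPOS_B_DOMAINS = {
    "KS": (16269, 17546, 7, 432),
    "AT": (17865, 18827, 539, 859),
    "DH": (18855, 19361, 869, 1037),
    "KR": (20565, 21302, 1439, 1684),
}


def nt_to_aa(orf_start: int, nt_pos: int) -> int:
    """Number of the codon (1-based) that contains nucleotide nt_pos."""
    return (nt_pos - orf_start) // 3 + 1


def aa_range(orf_start: int, nt_first: int, nt_last: int) -> tuple:
    """Inclusive amino acid range covered by an inclusive nucleotide range."""
    return nt_to_aa(orf_start, nt_first), nt_to_aa(orf_start, nt_last)


def in_frame(orf_start: int, nt_first: int) -> bool:
    """True if nt_first is the first base of a codon of the ORF."""
    return (nt_first - orf_start) % 3 == 0


if __name__ == "__main__":
    for name, (n1, n2, a1, a2) in EPOS_B_DOMAINS.items():
        print(name, aa_range(EPOB_START, n1, n2), "stated:", (a1, a2))
```

  • The same check can be applied to any other forward-strand ORF of SEQ ID NO:1 by substituting its start coordinate.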
  • epoC (nucleotides 21746-43519 of SEQ ID NO:1) codes for EPOS C (SEQ ID NO:5), a type I polyketide synthase consisting of 4 modules.
  • The first module harbors a KS (nucleotides 21860-23116 of SEQ ID NO:1, amino acids 39-457 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 23431-24397 of SEQ ID NO:1, amino acids 563-884 of SEQ ID NO:5); a KR (nucleotides 25184-25942 of SEQ ID NO:1, amino acids 1147-1399 of SEQ ID NO:5); and an ACP (nucleotides 26045-26263 of SEQ ID NO:1, amino acids 1434-1506 of SEQ ID NO:5).
  • This module incorporates an acetate extender unit (C14-C13) and reduces the β-keto group at C15 to the hydroxyl group that takes part in the final lactonization of the epothilone macrolactone ring.
  • The second module of EPOS C harbors a KS (nucleotides 26318-27595 of SEQ ID NO:1, amino acids 1524-1950 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 27911-28876 of SEQ ID NO:1, amino acids 2056-2377 of SEQ ID NO:5); a KR (nucleotides 29678-30429 of SEQ ID NO:1, amino acids 2645-2895 of SEQ ID NO:5); and an ACP (nucleotides 30539-30759 of SEQ ID NO:1, amino acids 2932-3005 of SEQ ID NO:5).
  • This module incorporates an acetate extender unit (C12-C11) and reduces the β-keto group at C13 to a hydroxyl group.
  • The nascent polyketide chain of epothilone corresponds to epothilone A.
  • The incorporation of the methyl side chain at C12 in epothilone B would require a post-PKS C-methyltransferase activity.
  • The formation of the epoxide ring at C13-C12 would also require a post-PKS oxidation step.
  • The third module of EPOS C harbors a KS (nucleotides 30815-32092 of SEQ ID NO:1, amino acids 3024-3449 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 32408-33373 of SEQ ID NO:1, amino acids 3555-3876 of SEQ ID NO:5); a DH (nucleotides 33401-33889 of SEQ ID NO:1, amino acids 3886-4048 of SEQ ID NO:5); an ER (nucleotides 35042-35902 of SEQ ID NO:1, amino acids 4433-4719 of SEQ ID NO:5); a KR (nucleotides 35930-36667 of SEQ ID NO:1, amino acids 4729-4974 of SEQ ID NO:5); and an ACP (nucleotides 36773-36991 of SEQ ID NO:1, am
  • This module incorporates a propionate extender unit (C24 and C8-C7) and fully reduces the β-keto group at C9.
  • epoD nucleotides 43524-54920 of SEQ ID NO:1
  • EPOS D SEQ ID NO:6
  • The first module harbors a KS (nucleotides 43626-44885 of SEQ ID NO:1, amino acids 35-454 of SEQ ID NO:6); a methylmalonyl CoA-specific AT (nucleotides 45204-46166 of SEQ ID NO:1, amino acids 561-881 of SEQ ID NO:6); a KR (nucleotides 46950-47702 of SEQ ID NO:1, amino acids 1143-1393 of SEQ ID NO:6); and an ACP (nucleotides 47811-48032 of SEQ ID NO:1, amino acids 1430-1503 of SEQ ID NO:6).
  • This module incorporates a propionate extender unit (C23 and C6-C5) and reduces the β-keto group at C7 to a hydroxyl group.
  • The second module harbors a KS (nucleotides 48087-49361 of SEQ ID NO:1, amino acids 1522-1946 of SEQ ID NO:6); a methylmalonyl CoA-specific AT (nucleotides 49680-50642 of SEQ ID NO:1, amino acids 2053-2373 of SEQ ID NO:6); a DH (nucleotides 50670-51176 of SEQ ID NO:1, amino acids 2383-2551 of SEQ ID NO:6); a methyltransferase (MT, nucleotides 51534-52657 of SEQ ID NO:1, amino acids 2671-3045 of SEQ ID NO:6); a KR (nucleotides 53697-54431 of SEQ
  • This module incorporates a propionate extender unit (C21 or C22 and C4-C3) and reduces the β-keto group at C5 to a hydroxyl group. This reduction is somewhat unexpected, since epothilones contain a keto group at C5. Discrepancies of this kind between the deduced reductive capabilities of PKS modules and the redox state of the corresponding positions in the final polyketide products have, however, been reported in the literature (see, for example, Schwecke, et al., Proc. Natl. Acad. Sci. USA 92: 7839-7843 (1995) and Schupp, et al., FEMS Microbiology Letters 159: 201-207 (1998)).
  • EPOS D is predicted to incorporate a propionate unit into the growing polyketide chain, providing one methyl side chain at C4.
  • This module also contains a methyltransferase domain integrated into the PKS between the DH and the KR domains, in an arrangement similar to the one seen in the HMWP1 yersiniabactin synthase (Gehring, A.M., DeMoll, E., Fetherston, J.D., Mori, I., Mayhew, G.F., Blattner, F.R., Walsh, C.T., and Perry, R.D.: Iron acquisition in plague: modular logic in enzymatic biogenesis of yersiniabactin by Yersinia pestis. Chem. Biol. 5, 573-586, 1998).
  • EPOS E (SEQ ID NO:7), a type I polyketide synthase consisting of one module, harboring a KS (nucleotides 55028-56284 of SEQ ID NO:1, amino acids 32-450 of SEQ ID NO:7); a malonyl CoA-specific AT (nucleotides 56600-57565 of SEQ ID NO:1, amino acids 556-877 of SEQ ID NO:7); a DH (nucleotides 57593-58087 of SEQ ID NO:1, amino acids 887-1051 of SEQ ID NO:7); a probably nonfunctional ER (nucleotides 59366-60304 of SEQ ID NO:1, am
  • The ER domain in this module harbors an active site motif with some highly unusual amino acid substitutions that probably render this domain inactive.
  • The module incorporates an acetate extender unit (C2-C1), and reduces the β-keto at C3 to an enoyl group.
  • Epothilones contain a hydroxyl group at C3, so this reduction also appears to be excessive, as discussed for the second module of EPOS D.
  • The TE domain of EPOS E takes part in the release and cyclization of the grown polyketide chain via lactonization between the carboxyl group of C1 and the hydroxyl group of C15.
  • The deduced protein product (Orf 2, SEQ ID NO:10) of orf2 (nucleotides 3171-1900 on the reverse complement strand of SEQ ID NO:1) shows strong similarities to hypothetical ORFs from Mycobacterium and Streptomyces coelicolor, and more distant similarities to carboxypeptidases and DD-peptidases of different bacteria.
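  • For an ORF located on the reverse complement strand, such as orf2, the coordinates are quoted in descending order and the coding sequence is obtained by reverse-complementing the corresponding stretch of SEQ ID NO:1. A minimal sketch follows, assuming 1-based inclusive coordinates and the convention that start greater than end denotes the reverse strand; the helper names and the short test sequence are illustrative only.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Opposite strand of seq, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def feature_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive feature on either strand."""
    return abs(start - end) + 1


def extract_feature(seq: str, start: int, end: int) -> str:
    """Coding-sense sequence of a feature given 1-based inclusive coordinates.

    If start > end the feature is taken to lie on the reverse complement
    strand, as in 'nucleotides 3171-1900' for orf2.
    """
    if start <= end:
        return seq[start - 1:end]
    return reverse_complement(seq[end - 1:start])


if __name__ == "__main__":
    n = feature_length(3171, 1900)
    print(n, "nt =", n // 3, "codons (including the stop codon, if present)")
```

  • With the coordinates given for orf2, the feature spans 1272 nucleotides, a multiple of three as expected for an open reading frame.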
  • The deduced protein product of orf3 shows homologies to Na/H antiporters of different bacteria. Orf 3 might take part in the export of epothilones from the producer strain. orf4 and orf5 have no homologues in the sequence databanks.
  • epoF codes for EPOS F (SEQ ID NO:8), a deduced protein with strong sequence similarities to cytochrome P450 oxygenases.
  • EPOS F may take part in the adjustment of the redox state of the carbons C12, C5, and/or C3.
  • The deduced protein product of orf14 shows strong similarities to GI:3293544, a hypothetical protein with no proposed function from Streptomyces coelicolor, and also to GI:2654559, the human embryonic lung protein. It is also more distantly related to cation efflux system proteins like GI:2623026 from Methanobacterium thermoautotrophicum, so it might also take part in the export of epothilones from the producing cells.
  • The remaining ORFs (orf6-orf13 and orf15) show no homologies to entries in the sequence databanks.
  • Epothilone synthase genes according to the present invention are expressed in heterologous organisms for the purposes of epothilone production at greater quantities than can be accomplished by fermentation of Sorangium cellulosum.
  • A preferable host for heterologous expression is Streptomyces, e.g. Streptomyces coelicolor, which natively produces the polyketide actinorhodin. Techniques for recombinant PKS gene expression in this host are described in McDaniel et al., Science 262: 1546-1550 (1993) and Kao et al., Science 265: 509-512 (1994).
  • The heterologous host strain is engineered to contain a chromosomal deletion of the actinorhodin (act) gene cluster.
  • Expression plasmids containing the epothilone synthase genes of the invention are constructed by transferring DNA from a temperature-sensitive donor plasmid to a recipient shuttle vector in E. coli (McDaniel et al. (1993) and Kao et al. (1994)), such that the synthase genes are built up by homologous recombination within the vector.
  • The epothilone synthase gene cluster is introduced into the vector by restriction fragment ligation.
  • Following selection, e.g. as described in Kao et al. (1994), DNA from the vector is introduced into the act-minus Streptomyces coelicolor strain according to protocols set forth in Hopwood et al., Genetic Manipulation of Streptomyces. A Laboratory Manual (John Innes Foundation, Norwich, United Kingdom, 1985), incorporated herein by reference.
  • The recombinant Streptomyces strain is grown on R2YE medium (Hopwood et al. (1985)) and produces epothilones.
  • The epothilone synthase genes according to the present invention are expressed in other host organisms such as pseudomonads, Bacillus, yeast, insect cells and/or E. coli.
  • PKS and NRPS genes are preferably expressed in E.
  • The expression vectors pKK223-3 and pKK223-2 are used to express PKS and NRPS genes in E. coli, either in transcriptional or translational fusion, behind the tac or trc promoter.
  • G52 Medium yeast extract, low in salt (BioSpringer, Maison Alfort, France) 2 g/l
  • Cyclodextrins (Fluka, Buchs, Switzerland, or Wacker Chemie, Munich, Germany) in different concentrations are sterilised separately and added to the 1B12 medium prior to seeding.
  • The culture is overseeded every 3-4 days, by adding 50 ml of culture to 450 ml of G52 medium (in a 2 litre Erlenmeyer flask). All experiments and fermentations are carried out by starting with this maintenance culture.
  • Fermentation: Fermentations are carried out on a scale of 10 litres, 100 litres and 500 litres. 20 litre and 100 litre fermentations serve as an intermediate culture step. Whereas the precultures and intermediate cultures are seeded, like the maintenance culture, with 10% (v/v), the main cultures are seeded with 20% (v/v) of the intermediate culture. Important: In contrast to the agitating cultures, the ingredients of the media for the fermentation are calculated on the final culture volume including the inoculum. If, for example, 18 litres of medium + 2 litres of inoculum are combined, then substances for 20 litres are weighed in, but are only mixed with 18 litres.
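  • The weighing rule stated above (media ingredients calculated on the final volume including the inoculum) can be written as a short calculation. The sketch below is illustrative; the function names are hypothetical, and the concentration of 2 g/l used in the example is simply the yeast extract figure given for G52 medium.

```python
def weigh_in(conc_g_per_l: float, medium_l: float, inoculum_l: float) -> float:
    """Grams of a component to weigh in, calculated on the final culture volume."""
    return conc_g_per_l * (medium_l + inoculum_l)


def inoculum_fraction(medium_l: float, inoculum_l: float) -> float:
    """Seeding ratio (v/v) relative to the final culture volume."""
    return inoculum_l / (medium_l + inoculum_l)


if __name__ == "__main__":
    # Example from the text: 18 litres of medium + 2 litres of inoculum.
    grams = weigh_in(2.0, medium_l=18, inoculum_l=2)
    print(f"{grams:.1f} g weighed in for 20 litres, dissolved in 18 litres")
    print(f"seeding ratio: {inoculum_fraction(18, 2):.0%} (v/v)")
```

  • For the 18 + 2 litre example the seeding ratio works out to 10% (v/v), matching the ratio stated for precultures and intermediate cultures.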
  • 100 litres: 90 litres of G52 medium in a fermenter having a total volume of 150 litres are seeded with 10 litres of the 20 litre intermediate culture. Cultivation lasts for 3-4 days, and the conditions are: 30°C, 150 rpm, 0.5 litres of air per litre liquid per min, 0.5 bars excess pressure, no pH control. Main culture: 10 litres, 100 litres or 500 litres:
  • 10 litres: The media substances for 10 litres of 1B12 medium are sterilised in 7 litres of water, then 1 litre of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution is added, and seeded with 2 litres of a 20 litre intermediate culture.
  • The duration of the main culture is 6-7 days, and the conditions are: 30°C, 250 rpm, 0.5 litres of air per litre of liquid per min, 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5 (i.e. no control between pH 7.1 and 8.1).
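  • The pH control described above amounts to a dead band around the set point: nothing is dosed while the pH stays within 7.6 +/- 0.5. A minimal sketch of that decision logic follows; it is an illustration of the stated limits only, not a description of the actual fermenter controller, and the function name and return values are hypothetical.

```python
def ph_action(ph: float, setpoint: float = 7.6, band: float = 0.5) -> str:
    """Return which corrective agent, if any, would be dosed at a given pH.

    'acid' stands for H2SO4 (lowers the pH), 'base' for KOH (raises it),
    and 'none' means the pH lies inside the dead band.
    """
    if ph > setpoint + band:
        return "acid"
    if ph < setpoint - band:
        return "base"
    return "none"


if __name__ == "__main__":
    for value in (6.9, 7.3, 7.6, 8.0, 8.3):
        print(value, "->", ph_action(value))
```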
  • The media substances for 100 litres of 1B12 medium are sterilised in 70 litres of water, then 10 litres of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution are added, and seeded with 20 litres of a 20 litre intermediate culture.
  • The duration of the main culture is 6-7 days, and the conditions are: 30°C, 200 rpm, 0.5 litres air per litre liquid per min, 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5.
  • the chain of seeding for a 100 litre fermentation is shown schematically as follows: maintenance culture (500ml)
  • 500 litres: The media substances for 500 litres of 1B12 medium are sterilised in 350 litres of water, then 50 litres of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution are added, and seeded with 100 litres of a 100 litre intermediate culture.
  • The duration of the main culture is 6-7 days, and the conditions are: 30°C, 120 rpm, 0.5 litres air per litre liquid per min, 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5.
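  • The volumes given for the three main culture scales can be checked for consistency: water, cyclodextrin solution and inoculum should add up to the nominal volume, the inoculum should amount to 20% (v/v), and the cyclodextrin solution should give the same final concentration at each scale. The sketch below assumes that the "10%" solution means 10% w/v, i.e. 100 g/l; that reading and all names used are assumptions made for illustration.

```python
# Volumes in litres for each nominal main-culture scale, copied from the text.
MAIN_CULTURES = {
    10: {"water": 7, "cd_solution": 1, "inoculum": 2},
    100: {"water": 70, "cd_solution": 10, "inoculum": 20},
    500: {"water": 350, "cd_solution": 50, "inoculum": 100},
}

CD_STOCK_G_PER_L = 100.0  # assumption: "10%" solution = 10% w/v = 100 g/l


def summarize(parts: dict) -> dict:
    """Total volume, inoculum ratio and final cyclodextrin concentration."""
    total = sum(parts.values())
    return {
        "total_l": total,
        "inoculum_fraction": parts["inoculum"] / total,
        "cd_g_per_l": parts["cd_solution"] * CD_STOCK_G_PER_L / total,
    }


if __name__ == "__main__":
    for nominal, parts in MAIN_CULTURES.items():
        print(nominal, summarize(parts))
```

  • Under the stated assumption each scale ends up at 10 g/l of cyclodextrin, which agrees with the concentration quoted for the medium in the fermentation examples.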
  • Solvents A: 0.02 % phosphoric acid
  • Epo A 4.30 min
  • Epo B 5.38 min
  • Cyclodextrins are cyclic (α-1,4)-linked oligosaccharides of α-D-glucopyranose with a relatively hydrophobic central cavity and a hydrophilic external surface area.
  • α-cyclodextrin (6), β-cyclodextrin (7), γ-cyclodextrin (8), δ-cyclodextrin (9), ε-cyclodextrin (10), ζ-cyclodextrin (11), η-cyclodextrin (12), and θ-cyclodextrin (13).
  • δ-cyclodextrin and in particular α-cyclodextrin, β-cyclodextrin or γ-cyclodextrin, or mixtures thereof.
  • Cyclodextrin derivatives are primarily derivatives of the above-mentioned cyclodextrins, especially of α-cyclodextrin, β-cyclodextrin or γ-cyclodextrin, primarily those in which one or more up to all of the hydroxy groups (3 per glucose radical) are etherified or esterified.
  • Ethers are primarily alkyl ethers, especially lower alkyl, such as methyl or ethyl ether, also propyl or butyl ether; the aryl-hydroxyalkyl ethers, such as phenyl-hydroxy-lower-alkyl, especially phenyl-hydroxyethyl ether; the hydroxyalkyl ethers, in particular hydroxy-lower-alkyl ethers, especially 2-hydroxyethyl, hydroxypropyl such as 2-hydroxypropyl or hydroxybutyl such as 2-hydroxybutyl ether; the carboxyalkyl ethers, in particular carboxy-lower-alkyl ethers, especially carboxymethyl or carboxyethyl ether; derivatised carboxyalkyl ethers, in particular derivatised carboxy-lower-alkyl ether in which the derivatised carboxy is etherified or amidated carboxy (primarily aminocarbonyl, mono- or di-lower-alkyl-
  • alk is alkyl, especially lower alkyl, and n is a whole number from 2 to 12, especially 2 to 5, in particular 2 or 3; cyclodextrins in which one or more OH groups are etherified with a radical of formula
  • R' is hydrogen, hydroxy, -O-(alk-O)z-H, -O-(alk(-R)-O-)p-H or -O-(alk(-R)-O-)q-alk-CO-Y; alk in all cases is alkyl, especially lower alkyl; m, n, p, q and z are a whole number from 1 to 12, preferably 1 to 5, in particular 1 to 3; and Y is OR1 or NR2R3, wherein R1, R2 and R3, independently of one another, are hydrogen or lower alkyl, or R2 and R3 combined together with the linking nitrogen signify morpholino, piperidino, pyrrolidino or piperazino; or branched cyclodextrins, in which etherifications or acetals with other sugar molecules are present, especially glucosyl-, diglucosyl- (G2-β-cyclodextrin), maltosyl- or dim
  • Mixtures of two or more of the said cyclodextrins and/or cyclodextrin derivatives may also exist.
  • The cyclodextrins or cyclodextrin derivatives are added to the culture medium preferably in a concentration of 0.02 to 10, preferably 0.05 to 5, especially 0.1 to 4, for example 0.1 to 2 percent by weight (w/v).
  • Cyclodextrins or cyclodextrin derivatives are known or may be produced by known processes (see for example US 3,459,731; US 4,383,992; US 4,535,152; US 4,659,696; EP 0 094 157; EP 0 149 197; EP 0 197 571; EP 0 300 526; EP 0 320 032; EP 0 499 322; EP 0 503 710; EP 0 818 469; WO 90/12035; WO 91/11200; WO 93/19061; WO 95/08993; WO 96/14090; GB 2,189,245; DE 3,118,218; DE 3,317,064 and the references mentioned therein, which also refer to the synthesis of cyclodextrins or cyclodextrin derivatives, or also: T.
  • Fermentation is carried out in a 15 litre glass fermenter.
  • The medium contains 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin from Wacker Chemie, Munich, Germany.
  • The progress of fermentation is illustrated in Table 3. Fermentation is ended after 6 days and working up takes place.
  • Fermentation is carried out in a 150 litre fermenter.
  • The medium contains 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin.
  • The progress of fermentation is illustrated in Table 4.
  • The fermentation is harvested after 7 days and worked up.
  • Fermentation is carried out in a 750 litre fermenter.
  • The medium contains 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin.
  • The progress of fermentation is illustrated in Table 5. The fermentation is harvested after 7 days and worked up.
  • Fermentation is carried out in a 15 litre glass fermenter.
  • The medium does not contain any cyclodextrin or other adsorber.
  • The progress of fermentation is illustrated in Table 6. The fermentation is not harvested and worked up.
  • The main part of the epothilones is found in the centrifugate.
  • The centrifuged cell pulp contains < 15% of the determined epothilone portion and is not further processed.
  • The resin is discharged from the centrifuge and washed with 10-15 litres of deionised water.
  • Desorption is effected by stirring the resin twice, each time in portions with 30 litres of isopropanol in 30 litre glass stirring vessels for 30 minutes. Separation of the isopropanol phase from the resin takes place using a suction filter.
  • The isopropanol is then removed from the combined isopropanol phases by adding 15-20 litres of water in a vacuum-operated circulating evaporator (Schmid-Verdampfer) and the resulting water phase of ca. 10 litres is extracted 3x each time with 10 litres of ethyl acetate. Extraction is effected in 30 litre glass stirring vessels.
  • The ethyl acetate extract is concentrated to 3-5 litres in a vacuum-operated circulating evaporator (Schmid-Verdampfer) and afterwards concentrated to dryness in a rotary evaporator (Büchi type) under vacuum.
  • The result is an ethyl acetate extract of 50.2 g.
  • The ethyl acetate extract is dissolved in 500 ml of methanol, the insoluble portions are filtered off using a folded filter, and the solution is added to a 10 kg Sephadex LH 20 column (Pharmacia, Uppsala, Sweden) (column diameter 20 cm, filling level ca. 1.2 m). Elution is effected with methanol as eluant. Epothilones A and B are present predominantly in fractions 21-23 (at a fraction size of 1 litre). These fractions are concentrated to dryness in a vacuum on a rotary evaporator (total weight 9.0 g).
  • Compositions comprising epothilones are used for example in the treatment of cancerous diseases, such as various human solid tumors.
  • Anticancer formulations comprise, for example, an active amount of an epothilone together with one or more organic or inorganic, liquid or solid, pharmaceutically suitable carrier materials.
  • Such formulations are delivered, for example, enterally, nasally, rectally, orally, or parenterally, particularly intramuscularly or intravenously.
  • The dosage of the active ingredient is dependent upon the weight, age, and physical and pharmacokinetic condition of the patient and is further dependent upon the method of delivery.
  • Epothilones mimic the biological effects of taxol.
  • Epothilones may be substituted for taxol in compositions and methods utilizing taxol in the treatment of cancer. See, for example, U.S. Patent Nos. 5,496,804, 5,565,478, and 5,641,803, all of which are incorporated herein by reference.
  • Epothilone B is supplied in individual 2 ml glass vials formulated as 1 mg/1 ml of clear, colorless intravenous concentrate.
  • The substance is formulated in polyethylene glycol 300 (PEG 300) and diluted with 50 or 100 ml 0.9% Sodium Chloride Injection, USP, to achieve the desired final concentration of the drug for infusion. It is administered as a single 30-minute intravenous infusion every 21 days (three-weekly treatment) for six cycles, or as a single 30-minute intravenous infusion every 7 days (weekly treatment).
  • The dose is between about 0.1 and about 6, preferably about 0.1 and about 5 mg/m², more preferably about 0.1 and about 3 mg/m², even more preferably 0.1 and 1.7 mg/m², most preferably about 0.3 and about 1 mg/m²; for three-weekly treatment (treatment every three weeks or every third week) the dose is between about 0.3 and about 18 mg/m², preferably about 0.3 and about 15 mg/m², more preferably about 0.3 and about 12 mg/m², even more preferably about 0.3 and about 7.5 mg/m², still more preferably about 0.3 and about 5 mg/m², most preferably about 1.0 and about 3.0 mg/m².
  • This dose is preferably administered to the human by intravenous (i.v.) administration during 2 to 180 min, preferably 2 to 120 min, more preferably during about 5 to about 30 min, most preferably during about 10 to about 30 min, e.g. during about 30 min.
  • The microorganism identified under I. above was accompanied by: a scientific description
  • The microorganism identified under I. above was received by this International Depositary Authority on (date of the original deposit) and a request to convert the original deposit to a deposit under the Budapest Treaty was received by it on (date of receipt of request for conversion).
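
The infusion figures quoted in the excerpts above (a 1 mg per 1 ml epothilone B concentrate, diluted with 50 or 100 ml of 0.9% sodium chloride, dosed in mg/m²) imply a simple piece of arithmetic, sketched below. This is only an illustration of that arithmetic and not a method taken from the patent: the function name `plan_infusion`, the `Infusion` record, the body surface area value of 1.8 m², and the assumption that concentrate and diluent volumes simply add are all hypothetical.

```python
from dataclasses import dataclass

# From the text: the concentrate is formulated as 1 mg per 1 ml.
CONCENTRATE_MG_PER_ML = 1.0
# From the text: dilution with 50 or 100 ml of 0.9% sodium chloride.
ALLOWED_DILUENT_ML = (50, 100)


@dataclass
class Infusion:
    dose_mg: float           # total drug amount for one infusion
    concentrate_ml: float    # volume of concentrate holding that amount
    final_mg_per_ml: float   # drug concentration in the diluted solution


def plan_infusion(dose_mg_per_m2: float, bsa_m2: float, diluent_ml: float) -> Infusion:
    """Convert a dose in mg/m^2 into amounts for one diluted infusion.

    Assumption (not from the source): the final volume is the plain sum of
    the concentrate volume and the diluent volume.
    """
    if diluent_ml not in ALLOWED_DILUENT_ML:
        raise ValueError("the text only mentions 50 or 100 ml of diluent")
    if dose_mg_per_m2 <= 0 or bsa_m2 <= 0:
        raise ValueError("dose and body surface area must be positive")

    dose_mg = dose_mg_per_m2 * bsa_m2
    concentrate_ml = dose_mg / CONCENTRATE_MG_PER_ML
    final_volume_ml = concentrate_ml + diluent_ml
    return Infusion(dose_mg, concentrate_ml, dose_mg / final_volume_ml)


if __name__ == "__main__":
    # Hypothetical inputs: 1.0 mg/m^2 and a body surface area of 1.8 m^2.
    plan = plan_infusion(dose_mg_per_m2=1.0, bsa_m2=1.8, diluent_ml=100)
    print(f"dose: {plan.dose_mg:.2f} mg")
    print(f"concentrate: {plan.concentrate_ml:.2f} ml")
    print(f"final concentration: {plan.final_mg_per_ml:.4f} mg/ml")
```

Because the concentrate is 1 mg/ml, the dose in mg and the concentrate volume in ml are numerically equal; the choice of 50 or 100 ml of diluent only changes the final concentration, not the amount of drug delivered.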

Landscapes

  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Microbiology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Oncology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Enzymes And Modification Thereof (AREA)
EP99929243A 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones Withdrawn EP1088078A2 (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
US99504 1993-07-30
US9950498A 1998-06-18 1998-06-18
US10163198P 1998-09-24 1998-09-24
US101631P 1998-09-24
US11890699P 1999-02-05 1999-02-05
US118906P 1999-02-05
PCT/EP1999/004171 WO1999066028A2 (en) 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones

Publications (1)

Publication Number Publication Date
EP1088078A2 (en) 2001-04-04

Family

ID=27378840

Family Applications (1)

Application Number Title Priority Date Filing Date
EP99929243A Withdrawn EP1088078A2 (en) 1998-06-18 1999-06-16 Genes for the biosynthesis of epothilones

Country Status (16)

Country Link
EP (1) EP1088078A2 (pl)
JP (3) JP2002518004A (pl)
KR (1) KR100511233B1 (pl)
CN (1) CN100374565C (pl)
AU (1) AU753567B2 (pl)
BR (1) BR9911349A (pl)
CA (1) CA2329774A1 (pl)
HU (1) HUP0102186A3 (pl)
ID (1) ID29128A (pl)
IL (3) IL139735A0 (pl)
NO (2) NO20006195L (pl)
NZ (1) NZ508326A (pl)
PL (1) PL200157B1 (pl)
SK (1) SK19242000A3 (pl)
TR (1) TR200003759T2 (pl)
WO (1) WO1999066028A2 (pl)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69734362T2 (de) 1996-12-03 2006-07-20 Sloan-Kettering Institute For Cancer Research Synthese von epothilonen, zwischenprodukte dazu, analoga und verwendungen davon
FR2775187B1 (fr) 1998-02-25 2003-02-21 Novartis Ag Utilisation de l'epothilone b pour la fabrication d'une preparation pharmaceutique antiproliferative et d'une composition comprenant l'epothilone b comme agent antiproliferatif in vivo
DE19846493A1 (de) * 1998-10-09 2000-04-13 Biotechnolog Forschung Gmbh DNA-Sequenzen für die enzymatische Synthese von Polyketid- oder Heteropolyketidverbindungen
US6410301B1 (en) 1998-11-20 2002-06-25 Kosan Biosciences, Inc. Myxococcus host cells for the production of epothilones
NZ511722A (en) * 1998-11-20 2004-05-28 Kosan Biosciences Inc Recombinant methods and materials for producing epothilone and epothilone derivatives
WO2001053533A2 (en) * 2000-01-21 2001-07-26 Kosan Biosciences, Inc. Method for cloning polyketide synthase genes
KR20070092334A (ko) * 2000-04-28 2007-09-12 코산 바이오사이언시즈, 인코포레이티드 폴리케타이드의 제조방법
US6998256B2 (en) 2000-04-28 2006-02-14 Kosan Biosciences, Inc. Methods of obtaining epothilone D using crystallization and /or by the culture of cells in the presence of methyl oleate
JP2005500974A (ja) 2000-10-13 2005-01-13 ザ ユニバーシテイ オブ ミシシッピー エポシロン類及び関連類似体の合成
US7257562B2 (en) 2000-10-13 2007-08-14 Thallion Pharmaceuticals Inc. High throughput method for discovery of gene clusters
DK1483251T3 (da) 2002-03-12 2010-04-12 Bristol Myers Squibb Co C3-cyano-epothilon-derivater
CA2595594C (en) * 2005-01-31 2012-05-01 Merck & Co., Inc. Upstream and a downstream purification process for large scale production of plasmid dna
WO2012103516A1 (en) 2011-01-28 2012-08-02 Amyris, Inc. Gel-encapsulated microcolony screening
SG194785A1 (en) 2011-05-13 2013-12-30 Amyris Inc Methods and compositions for detecting microbial production of water-immiscible compounds
BR112015002724B1 (pt) 2012-08-07 2022-02-01 Total Marketing Services Método para produzir um composto não catabólico heterólogo, e, composição de fermentação
EP2971027B1 (en) 2013-03-15 2019-01-30 Amyris, Inc. Use of phosphoketolase and phosphotransacetylase for production of acetyl-coenzyme a derived compounds
BR112016002526B1 (pt) 2013-08-07 2021-11-23 Total Marketing Services Método para produção de um composto heterólogo não catabólico, e, composição de fermentação
WO2016210350A1 (en) 2015-06-25 2016-12-29 Amyris, Inc. Maltose dependent degrons, maltose-responsive promoters, stabilization constructs, and their use in production of non-catabolic compounds
CN106916834B (zh) * 2015-12-24 2022-08-05 武汉合生科技有限公司 化合物的生物合成基因簇及其应用
CN111138444B (zh) * 2020-01-08 2022-05-03 山东大学 一组埃博霉素b葡萄糖苷类化合物及其酶法制备与应用

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
HU229833B1 (en) * 1996-11-18 2014-09-29 Biotechnolog Forschung Gmbh Epothilone d production process, and its use as cytostatic as well as phytosanitary agents

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9966028A2 *

Also Published As

Publication number Publication date
BR9911349A (pt) 2001-03-13
JP2006061166A (ja) 2006-03-09
WO1999066028A2 (en) 1999-12-23
IL139735A (en) 2009-06-15
NO20006195L (no) 2001-02-16
KR20010052962A (ko) 2001-06-25
IL190391A0 (en) 2008-11-03
WO1999066028A3 (en) 2000-06-29
SK19242000A3 (sk) 2001-07-10
HUP0102186A2 (hu) 2001-10-28
JP2008092958A (ja) 2008-04-24
NZ508326A (en) 2003-10-31
HUP0102186A3 (en) 2005-10-28
ID29128A (id) 2001-08-02
NO20091055L (no) 2001-02-16
NO20006195D0 (no) 2000-12-06
CA2329774A1 (en) 1999-12-23
PL345579A1 (en) 2001-12-17
KR100511233B1 (ko) 2005-08-31
JP2002518004A (ja) 2002-06-25
AU753567B2 (en) 2002-10-24
PL200157B1 (pl) 2008-12-31
IL139735A0 (en) 2002-02-10
CN100374565C (zh) 2008-03-12
TR200003759T2 (tr) 2001-06-21
CN1305530A (zh) 2001-07-25
AU4611699A (en) 2000-01-05

Similar Documents

Publication Publication Date Title
US6858404B2 (en) Genes for the biosynthesis of epothilones
JP2006061166A (ja) エポチロン生合成用遺伝子
JP4662635B2 (ja) エポチロンおよびエポチロン誘導体を生成するための組換え方法および材料
US6410301B1 (en) Myxococcus host cells for the production of epothilones
AU2001295195B2 (en) Myxococcus host cells for the production of epothilones
AU2001295195A1 (en) Myxococcus host cells for the production of epothilones
TWI770070B (zh) 經修飾之抗真菌鏈黴菌(streptomyces fungicidicus)分離株及其用途
RU2265054C2 (ru) Рекомбинантная клетка-хозяин (варианты) и клон вас
RU2234532C2 (ru) Нуклеиновая кислота (варианты), ее использование для экспрессии эпотилонов, полипептид (варианты), клон бактерий е.coli
CN100359014C (zh) 一类新型埃坡霉素化合物及其制备方法和用途
CN100374566C (zh) 用于epothilone生物合成的基因
KR20130097538A (ko) 해양 미생물 하헬라 제주엔시스의 제주엔올라이드 생합성 유전자 클러스터
MXPA00012342A (en) Genes for the biosynthesis of epothilones
CZ20004693A3 (cs) Izolovaná nukleová kyselina kódující polypeptid účastnící se biosyntézy epothilonu, chimérický gen, vektor a hostitelské buňky obsahující tuto nukleovou kyselinu
AU2007200160A1 (en) Heterologous production of polyketides

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20001109

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

AX Request for extension of the european patent

Free format text: RO PAYMENT 20001109;SI PAYMENT 20001109

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: NOVARTIS-ERFINDUNGEN VERWALTUNGSGESELLSCHAFT M.B.

Owner name: NOVARTIS AG

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: NOVARTIS PHARMA GMBH

Owner name: NOVARTIS AG

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: NOVARTIS PHARMA GMBH

Owner name: NOVARTIS AG

17Q First examination report despatched

Effective date: 20070313

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20110419

REG Reference to a national code

Ref country code: HK

Ref legal event code: WD

Ref document number: 1035557

Country of ref document: HK