EP1208209A1 - Evolution and use of enzymes for combination and medicinal chemistry - Google Patents
Evolution and use of enzymes for combination and medicinal chemistry
Info
- Publication number
- EP1208209A1 (application number EP00959219A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- library
- recombinant
- enzymes
- derivatizing
- organic molecule
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1086—Preparation or screening of expression libraries, e.g. reporter assays
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/52—Genes encoding for enzymes or proenzymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P1/00—Preparation of compounds or compositions, not provided for in groups C12P3/00 - C12P39/00, by using microorganisms or enzymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/44—Preparation of O-glycosides, e.g. glucosides
- C12P19/60—Preparation of O-glycosides, e.g. glucosides having an oxygen of the saccharide radical directly bound to a non-saccharide heterocyclic ring or a condensed ring system containing a non-saccharide heterocyclic ring, e.g. coumermycin, novobiocin
- C12P19/62—Preparation of O-glycosides, e.g. glucosides having an oxygen of the saccharide radical directly bound to a non-saccharide heterocyclic ring or a condensed ring system containing a non-saccharide heterocyclic ring, e.g. coumermycin, novobiocin the hetero ring having eight or more ring members and only oxygen as ring hetero atoms, e.g. erythromycin, spiramycin, nystatin
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/02—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving viable microorganisms
-
- H—ELECTRICITY
- H01—ELECTRIC ELEMENTS
- H01J—ELECTRIC DISCHARGE TUBES OR DISCHARGE LAMPS
- H01J49/00—Particle spectrometers or separator tubes
- H01J49/02—Details
- H01J49/04—Arrangements for introducing or extracting samples to be analysed, e.g. vacuum locks; Arrangements for external adjustment of electron- or ion-optical components
Definitions
- This invention pertains to the field of enzymatic synthesis of combinatorial libraries of organic molecules using evolved enzymes.
- the invention provides libraries of enzymes that, through directed evolution, are capable of biocatalytically synthesizing a multitude of derivatives of organic molecules.
- the libraries of organic molecule derivatives can be screened to identify active compounds, such as antibiotics and other therapeutic reagents, herbicides and pesticides, and the like.
- In the process of drug discovery, optimization of a lead compound represents one of many challenges. Very often, the lead compound lacks some of the pharmacological properties required for a fully functional pharmaceutical, such as high potency, selectivity, low toxicity, bioavailability, and the like. Additional modification of the lead compound is therefore often necessary for achieving an optimized drug that has a complete combination of desired properties.
- the traditional approach to derivatization depends upon a large body of empirical experience to guide the medicinal chemist in the choice of which chemical analogs to synthesize and test. Some compounds are chosen for synthesis, and others are not. Similarly, when combinatorial chemistry is used to generate derivatives of lead compounds, particular building blocks are chosen for parallel synthesis of many analogs. Other building blocks are not.
- Improvement of lead compounds having potential for pharmaceutical use is not the only situation in which derivatization of organic molecules is of interest.
- Organic molecules have many uses, including, for example, pesticides, herbicides, and others.
- Combinatorial synthesis methods have the potential to provide a way to synthesize a wide variety of lead compound derivatives without the need for a priori assumptions as to which derivatives are likely to be most favorable. Instead of synthesizing derivatives individually and testing them, one can make a large number of different derivatives simultaneously. Combinatorial synthesis is useful not only for the derivatization of lead compounds, but also for the synthesis of compounds that are screened to identify those that are worthy of further study as potential lead compounds. However, synthesis of combinatorial libraries of organic molecule derivatives is severely limited because many types of derivatives of organic molecules are difficult or even impossible to synthesize by purely chemical means.
- Enzymes provide a potentially attractive route to the synthesis of chemical compound libraries from which one can identify those compounds that exhibit desired properties. Enzymes can act on mixtures of complex molecules in solution, catalyzing the synthesis of derivatives of the molecules without the production of byproducts. While traditional chemical processes for lead compound derivatization are typically non-selective and require multiple protection and de-protection steps, such steps are not required for enzymatic synthesis. Moreover, enzymes can function under relatively mild conditions that are not destructive to the reaction products. Furthermore, enzymes can carry out several different types of modifications to organic molecules, such as existing and potential lead compounds and other biologically active molecules of interest.
- enzymes can catalyze the addition of a moiety to a compound (e.g., by ester, amide, carbonate, carbamate or glycoside linkage, and the like). Enzymes can also add new functional groups to an organic molecule, or can modify existing functional groups that are present on the compound. Enzymatic biocatalysis can also provide certain further advantages such as substrate-, stereo- and regio-selectivity.
- the present invention provides methods for obtaining a library of organic molecule derivatives.
- the methods involve contacting an organic molecule with one or more members of a library of recombinant derivatizing enzymes and other necessary reactants to form the library of organic molecule derivatives.
- the derivatizing enzymes catalyze a reaction such as: a) modification of one or more functional groups present on the organic molecule; b) addition of a chemical moiety onto one or more functional groups present on the organic molecule; or c) introduction of a new functional group onto the organic molecule.
- the methods are useful for a wide variety of organic molecules, including, for example, those that have pharmacological, herbicide, pesticide, or other activities, or are useful in industrial processes.
- the methods further involve performing one or more additional reactions on the derivatives that are obtained by contact with the derivatizing enzymes.
- the products of the initial reaction serve as intermediates for further reactions.
- the further reactions can involve, for example, contacting the library of organic molecule derivatives with one or more members of a second library of recombinant derivatizing enzymes and other necessary reactants to form a further library of organic molecule derivatives.
- the intermediates can be modified chemically or with other enzymes.
- the libraries of recombinant derivatizing enzymes are obtained, in some embodiments, by (1) recombining at least first and second forms of a nucleic acid that encodes a derivatizing enzyme, wherein the first and second forms differ from each other in two or more nucleotides, to produce a library of recombinant polynucleotides; and (2) expressing the library of recombinant polynucleotides to obtain the library of recombinant derivatizing enzymes.
- the method can further involve (3) recombining at least one recombinant polynucleotide that encodes a member of the library of recombinant derivatizing enzymes with a further form of the nucleic acid that encodes a derivatizing enzyme, which is the same or different from the first and second forms, to produce a further library of recombinant nucleic acids; (4) expressing the further library of recombinant polynucleotides to obtain a further library of recombinant derivatizing enzymes; and (5) repeating (3) and (4), as necessary, until the further library of recombinant derivatizing enzymes contains a desired number of different recombinant derivatizing enzymes.
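The recursive steps (1) through (5) above can be sketched in code. This is only an illustrative simulation under stated assumptions: sequences are plain strings of equal length, "recombination" is reduced to a single random crossover, and the names `count_differences`, `recombine`, and `build_library` are hypothetical and do not come from the source. The source does not prescribe any particular recombination procedure here.

```python
import random


def count_differences(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def recombine(seq_a, seq_b, rng):
    """Hypothetical single-crossover recombination of two equal-length sequences."""
    point = rng.randrange(1, len(seq_a))  # crossover strictly inside the sequence
    return seq_a[:point] + seq_b[point:]


def build_library(first_form, second_form, desired_size, rng, max_rounds=1000):
    """Repeat recombination until the library holds the desired number of
    distinct sequences, or until max_rounds is exhausted."""
    if len(first_form) != len(second_form):
        raise ValueError("this sketch assumes equal-length parental forms")
    if count_differences(first_form, second_form) < 2:
        raise ValueError("parental forms must differ in two or more nucleotides")
    library = {first_form, second_form}  # assumption: parents stay in the library
    for _ in range(max_rounds):
        if len(library) >= desired_size:
            break
        # Steps (3)-(4): recombine a library member with a further form.
        seq_a, seq_b = rng.sample(sorted(library), 2)
        library.add(recombine(seq_a, seq_b, rng))
    return library


library = build_library("AAAAAAAA", "CCCCCCCC", 6, random.Random(0))
print(sorted(library))
```

The `max_rounds` guard is a design choice of this sketch: it keeps the loop from running forever when the requested number of distinct sequences cannot be reached.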
- the invention also provides methods of obtaining an enzyme that catalyzes the synthesis of a desired organic molecule derivative. These methods involve contacting an organic molecule with members of a library of recombinant derivatizing enzymes and other necessary reactants to form a library of organic molecule derivatives; identifying the desired organic molecule derivative in the library of organic molecule derivatives; and identifying the member of the library of recombinant derivatizing enzymes that catalyzes the synthesis of the desired organic molecule derivative.
- the invention provides libraries of organic molecule derivatives.
- the libraries are biocatalytically synthesized by contacting an organic molecule having one or more functional groups with a plurality of members of a library of recombinant derivatizing enzymes that catalyze a reaction such as: a) modification of one or more of the functional groups; b) addition of a chemical moiety onto one or more of the functional groups; or c) introduction of a new functional group.
- Figure 1 shows potential sugar attachment points on vancomycin hydrochloride.
- Figure 2 shows potential sugar attachment points on somatostatin.
- Figure 3 shows potential sugar attachment points on cholic acid.
- Figure 4 shows potential sugar attachment points on L-thyroxine.
- Figure 5 shows potential sugar attachment points on nogalamycin.
- Figure 6 shows potential sugar attachment points on syringaldazine.
- Figure 7 shows potential sugar attachment points on aclarubicin.
- Figure 8 shows potential sugar attachment points on ritodrine hydrochloride.
- Figure 9 shows potential sugar attachment points on rifamycin.
- Figure 10 shows sugar attachment points on ristomycin sulfate. Five additional hydroxyls on the backbone are also shown (but not indicated by arrows); these constitute potential sugar attachment points.
- Figure 11 shows a multi-step chemical methylation of erythromycin A and its analogs.
- Figure 12 shows the reaction catalyzed by S-adenosylmethionine (SAM) dependent methyltransferases.
- Figure 13 shows the specificity of O-methyltransferases that can be shuffled to obtain recombinant enzymes that have 6-OMTase activity using erythromycin and its analogs as substrates.
- Figure 14 shows DNA and protein sequence similarity of the O-methyltransferases that are shuffled to obtain recombinant enzymes that have 6-OMTase activity using erythromycin and its analogs as substrates.
- Figure 15 shows a microtiter plate high-throughput primary screen for the identification of methyltransferases that have novel specificity.
- Figure 16 shows a schematic of the use of erythromycin A 6-O-methyltransferase for the biocatalytic synthesis of clarithromycin.
- Figure 17 shows a secondary assay for a clarithromycin synthase. MS/MS detection of a 590/158 pair identifies methylation of the macrolide ring.
- Figure 18 shows a further secondary assay for a clarithromycin synthase. Phenyl boronate reacts specifically with cis-diols at neutral pH. Only clarithromycin has the 11-12-cis diol that can react to give an 834.5 ion.
- Figure 19 shows a map of the vector pCKZEBB.
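The 834.5 ion mentioned for Figure 18 can be checked with simple mass arithmetic. The sketch below rests on assumptions that the source does not state: that the ion is the singly protonated cyclic boronate ester formed from clarithromycin (C38H69NO13) and phenylboronic acid (C6H7BO2) with loss of two water molecules, and that monoisotopic masses (with boron-11) are the relevant ones. It also shows the mass shift expected when erythromycin A (C37H67NO13) gains one methyl group. The function and variable names are illustrative only.

```python
# Monoisotopic atomic masses in unified atomic mass units.
ATOMIC_MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915, "B": 11.009305}
PROTON_MASS = 1.007276


def monoisotopic_mass(formula):
    """Monoisotopic mass of a molecule given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


CLARITHROMYCIN = {"C": 38, "H": 69, "N": 1, "O": 13}
ERYTHROMYCIN_A = {"C": 37, "H": 67, "N": 1, "O": 13}
PHENYLBORONIC_ACID = {"C": 6, "H": 7, "B": 1, "O": 2}
WATER = {"H": 2, "O": 1}

# Assumed cyclic boronate ester: diol + boronic acid, minus two waters.
boronate_ester = (
    monoisotopic_mass(CLARITHROMYCIN)
    + monoisotopic_mass(PHENYLBORONIC_ACID)
    - 2 * monoisotopic_mass(WATER)
)
boronate_ion_mz = boronate_ester + PROTON_MASS  # assumed [M+H]+ ion

# Methylation replaces H with CH3, a net gain of CH2.
methylation_shift = monoisotopic_mass(CLARITHROMYCIN) - monoisotopic_mass(ERYTHROMYCIN_A)

print(round(boronate_ion_mz, 1))    # 834.5
print(round(methylation_shift, 3))  # 14.016
```

Under these assumptions the computed value agrees with the 834.5 ion cited in the figure description.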
- a “derivatizing enzyme” is an enzyme that can catalyze a reaction on an organic molecule.
- a derivatizing enzyme can modify an existing functional group that is present on the molecule, add a chemical moiety onto a functional group, or add a new functional group to the organic molecule.
- the organic molecules can include both synthetic (including, e.g., non-naturally occurring compounds such as halo-containing compounds and the like) and naturally occurring compounds.
- a “recombinant derivatizing enzyme” is a non-naturally occurring derivatizing enzyme that differs in sequence from a naturally occurring derivatizing enzyme by at least one amino acid residue.
- Recombinant derivatizing enzymes include derivatizing enzymes that are composed of a plurality of blocks of amino acids, which blocks are not contiguous in a naturally occurring enzyme. The blocks are generally of random length.
- a recombinant derivatizing enzyme may be chimeric, thus having portions of its sequence derived from the sequences of at least two different parental enzymes.
- a chimeric recombinant derivatizing enzyme is encoded by a chimeric gene that contains nucleic acid segments derived from at least two distinct parental genes or parental gene segments.
- a parental gene may optionally encode a derivatizing enzyme.
- a “library” refers to a collection of diverse molecules, such as, for example, recombinant derivatizing enzymes and organic compound analogues.
- Libraries of the present invention have at least two distinct member molecules but can vary in size. Typically, invention libraries have at least about 5 distinct members, and more typically at least about 10 distinct member molecules. Larger libraries of the present invention typically have at least about 100 distinct member molecules, sometimes more than about 10,000, or even more than about 100,000. Very large libraries of the present invention can have more than about 1,000,000 members.
- a “functional group” refers to an atom or group of atoms that define the structure of a particular family of organic compounds and determine their properties. Functional groups include, for example, alkenes, alkynes, aromatics, halogens, hydroxyls, ethers, esters, aldehydes, ketones, carboxylic acids, amides, amines, and the like.
- a “lead compound” is a prototype compound that has a desired biological or pharmacological activity, but may have other characteristics that are undesirable.
- the lead compound might be toxic, insoluble, have other biological activities, have less than optimal bioavailability (e.g., properties such as absorption, distribution, metabolism, and excretion, i.e., ADME), or have less than optimal biological activity.
- Nucleic acid refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form.
- the term encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, or non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides.
- analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs), and the like.
- nucleic acid is used interchangeably herein with “gene,” “cDNA,” “mRNA,” “oligonucleotide,” and “polynucleotide.”
- the terms “polypeptide,” “peptide,” and “protein” apply to amino acid polymers in which one or more amino acid residues are an analog or mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
- amino acid refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids.
- Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
- Amino acid analogs refer to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group (e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium). Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
- Amino acid mimetics refer to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that function in a manner similar to a naturally occurring amino acid.
- Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes. "Conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refer to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences.
- degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al, Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al, J Biol Chem. 260:2605-2608 (1985); Rossolini et al, Mol. Cell Probes 8:91-98 (1994)). Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine.
- such nucleic acid variations are "silent variations," which are one species of conservatively modified variations. Every nucleic acid sequence recited herein that encodes a polypeptide also describes every possible silent variation of the nucleic acid.
- AUG (along with GUG in some organisms) is ordinarily the only codon for methionine, and TGG is ordinarily the only codon for tryptophan.
- each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence.
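The codon degeneracy described above can be made concrete with a short sketch that lists synonymous codons and counts the silent variations of a coding sequence. It assumes the standard genetic code only (so the GUG start-codon exception noted above is not modeled), accepts either DNA (T) or RNA (U) spelling, and uses hypothetical helper names `synonymous_codons` and `count_silent_variants`.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO_ACIDS[index]
    for index, codon in enumerate(product(BASES, repeat=3))
}


def normalize(sequence):
    """Upper-case the sequence and write T as U so DNA and RNA inputs both work."""
    return sequence.upper().replace("T", "U")


def synonymous_codons(codon):
    """All codons (including the input) that encode the same amino acid."""
    amino_acid = CODON_TABLE[normalize(codon)]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_silent_variants(coding_sequence):
    """Number of nucleic acid sequences that encode the same polypeptide."""
    sequence = normalize(coding_sequence)
    if len(sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    total = 1
    for start in range(0, len(sequence), 3):
        total *= len(synonymous_codons(sequence[start:start + 3]))
    return total


print(synonymous_codons("GCA"))            # ['GCA', 'GCC', 'GCG', 'GCU']
print(synonymous_codons("TGG"))            # ['UGG']
print(count_silent_variants("AUGGCAUGG"))  # 4
```

The count includes the original sequence itself, so a sequence made only of methionine and tryptophan codons has exactly one encoding under the standard code.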
- as to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alter, add or delete a single amino acid or a small percentage of amino acids in the encoded sequence are "conservatively modified variants" where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.
- shuffling is used herein to indicate recombination between non-identical sequences; in some embodiments shuffling may include crossover via homologous recombination or via non-homologous recombination, such as via cre/lox and/or flp/frt systems.
- Shuffling can be carried out by employing a variety of different formats, including, for example, in vitro and in vivo shuffling formats, in silico shuffling formats, shuffling formats that utilize either double-stranded or single-stranded templates, primer-based shuffling formats, nucleic acid fragmentation-based shuffling formats, and oligonucleotide-mediated shuffling formats, all of which are based on recombination events between non-identical sequences and are described in more detail or referenced herein below, as well as other similar recombination-based formats.
- the present invention provides libraries of recombinant derivatizing enzymes that are useful for generating combinatorial libraries of chemical compounds, in particular organic molecules. Also provided are libraries of organic molecule derivatives that are obtained using the recombinant derivatizing enzyme libraries.
- the libraries of organic molecule derivatives are useful, for example, to identify those derivatives that have a desired biological activity and thus are suitable for testing as lead compounds, e.g., for pharmaceutical or other use, and for creating combinatorial libraries of derivatives of a previously identified lead compound for testing for improved pharmacological or other parameters.
- the chemical compounds are often organic molecules, including synthetic molecules (including, for example, non-naturally occurring compounds) and natural products such as, for example, antibiotics.
- the invention provides several advantages over previously available methods for obtaining libraries of organic molecule derivatives.
- the recombinant library will contain enzymes that exhibit catalytic properties that differ from one another in features such as catalytic rates and constants, stereo-, regio- and enantiomeric specificity, multiplicity of substrate selectivity, product inhibition, stability in a solvent used for biocatalytic synthesis, stability in chemical processes in general, and the like.
- the resulting multitude of different enzymes thus increases the number of different compounds that can be generated by biocatalytic reactions.
- when only one enzyme is used for biocatalysis with a single organic molecule and a single chemical moiety donor, generally only one derivative is generated.
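The scaling argument above can be illustrated by enumerating enzyme, substrate, and donor combinations. This is a rough sketch only: it assumes, as a simplification that is not taken from the source, that each distinct combination yields at most one distinct derivative, so the count is an upper bound for that model rather than a prediction. The names below are placeholders.

```python
from itertools import product


def enumerate_reactions(enzymes, substrates, donors):
    """Every (enzyme, substrate, donor) combination that could be assayed."""
    return list(product(enzymes, substrates, donors))


# One enzyme, one organic molecule, one moiety donor: a single combination.
single = enumerate_reactions(["enzyme_1"], ["molecule_1"], ["donor_1"])
print(len(single))  # 1

# A small library of three enzymes with two molecules and two donors.
reactions = enumerate_reactions(
    ["enzyme_1", "enzyme_2", "enzyme_3"],
    ["molecule_1", "molecule_2"],
    ["donor_1", "donor_2"],
)
print(len(reactions))  # 12
```

In practice different enzymes may give the same product, and one enzyme may give several regioisomers, so the real number of derivatives can fall on either side of this count.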
- DNA shuffling or other methods of recursive recombination are used to generate the libraries of recombinant enzymes. DNA shuffling has proven very effective at improving the level of known activity of a biocatalyst. An additional value of this technology lies in the ability to generate catalytic activities that were previously unknown among wild-type enzymes.
- DNA shuffling of a family of related genes generates functionally diverse gene libraries with different physical properties that span a more complex sequence space than can be found in nature for a particular protein. Since the novel members of these enzyme libraries have never been under selective pressures in an organism, they are unbiased and can be screened for new activities that are rare or non-existent in natural samples. Thus, one can create diverse and complex enzyme libraries that catalyze a spectrum of important chemistries.
- the enzymes can catalyze modifications of functional groups that are present on organic molecules, addition of chemical moieties onto functional groups (e.g., acylation, glycosylation, and methylation), and introduction of new functional groups into the organic molecule (e.g., introduction of hydroxyl groups by oxidation, double bonds by reduction, and the like).
- the enzyme libraries can be used directly to synthesize a multitude of products starting from substrate mixtures, or to synthesize a specific compound starting from a defined substrate set.
- single members of the library of recombinant enzymes can be used to synthesize mixtures of compounds by contacting the members with a mixture of substrates.
- each single member of the library of recombinant derivatizing enzymes can be tested with a defined substrate set to identify enzymes that have new and useful substrate selectivities or other useful features.
- the organic molecule derivatives that are thus synthesized can then be screened to identify those that have a desired property, or can be further modified by one or more additional chemical or enzymatic reactions.
- the recombinant enzymes obtained using the methods of the invention can be used in vitro, or can be expressed by microbial cells that carry out the biocatalysis.
- the microorganisms are modified to express one or more derivatizing enzymes for efficient biocatalytic manufacturing of the derivatized products.
- the microorganisms can include one or more recombinant polynucleotides that encode the improved acyltransferases, glycosyltransferases, oxidases, methyltransferases, or other biocatalytic enzymes, which are then expressed by the microbial cells.
- These polynucleotides can be introduced into organisms that naturally produce the starting substrate of interest.
- a polynucleotide that encodes a recombinant derivatizing enzyme can be introduced into an organism that naturally produces, or has been engineered to produce, a polyketide or other antibiotic.
- the recombinant polynucleotides that encode recombinant derivatizing enzymes of the invention are useful for the in vivo derivatization of organic compounds for which the backbones were previously prepared, for in vivo derivatization of organic compounds in the organism that biosynthesizes the backbone of the organic molecule, and for in vitro use to derivatize a previously prepared organic molecule.
- the invention involves, in some embodiments, creating recombinant libraries of polynucleotides that are then screened to identify those library members that encode an enzyme or other polypeptide that exhibits a desired property, e.g., enhanced enzymatic activity, stereospecificity, regiospecificity and enantiospecificity, reduced susceptibility to inhibitors, processing stability (e.g., solvent stability, pH stability, thermal stability, etc.), and the like.
- the recombinant libraries can be created using any of various methods, including those described herein. For example, a variety of nucleic acid shuffling protocols are available and fully described in the art.
- Shuffling formats that employ single stranded templates are described in "METHODS AND COMPOSITIONS FOR POLYPEPTIDE ENGINEERING,” WO 9827230, by Patten et al.; "SINGLE-STRANDED NUCLEIC ACID TEMPLATE-MEDIATED RECOMBINATION AND NUCLEIC ACID FRAGMENT ISOLATION” by Affholter, USSN 60/186,482 filed March 2, 2000; "METHODS FOR GENERATING HIGHLY DIVERSE LIBRARIES,” WO 0000632; and "METHOD FOR OBTAINING IN VITRO RECOMBINED
- nucleic acids can be recombined in vitro by any of a variety of techniques discussed in the references above, including e.g., DNAse digestion of nucleic acids to be recombined followed by ligation and/or PCR reassembly of the nucleic acids.
- nucleic acids can be recursively recombined in vivo, e.g., by allowing recombination to occur between nucleic acids in cells.
- whole genome recombination methods can be used in which whole genomes of cells or other organisms are recombined, optionally including spiking of the genomic recombination mixtures with desired library components.
- synthetic recombination methods can be used, in which oligonucleotides corresponding to targets of interest are synthesized and reassembled in PCR or ligation reactions which include oligonucleotides which correspond to more than one parental nucleic acid, thereby generating new recombined nucleic acids.
- Oligonucleotides can be made by standard nucleotide addition methods, or can be made by tri-nucleotide synthetic approaches.
- in silico methods of recombination can be effected in which genetic algorithms are used in a computer to recombine sequence strings which correspond to nucleic acid homologues (or even non-homologous sequences).
- the resulting recombined sequence strings are optionally converted into nucleic acids by synthesis of nucleic acids which correspond to the recombined sequences, e.g., in concert with oligonucleotide synthesis/gene reassembly techniques.
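A genetic algorithm that recombines sequence strings in a computer might look like the following minimal sketch. Everything specific in it is an assumption made for illustration: equal-length, pre-aligned strings, a uniform per-position crossover, a caller-supplied `fitness` function standing in for whatever property is being screened, and the hypothetical names `uniform_crossover` and `evolve`. The source does not specify a particular algorithm.

```python
import random


def uniform_crossover(seq_a, seq_b, rng):
    """Child takes each position from one of two aligned parents at random."""
    return "".join(rng.choice(pair) for pair in zip(seq_a, seq_b))


def evolve(parents, fitness, generations, population_size, rng):
    """Recombine sequence strings and keep the highest-scoring ones each round."""
    population = sorted(set(parents))
    if len(population) < 2 or population_size < 2:
        raise ValueError("need at least two distinct parents and population_size >= 2")
    if len({len(seq) for seq in population}) != 1:
        raise ValueError("this sketch assumes equal-length, aligned sequences")
    for _ in range(generations):
        children = []
        for _ in range(population_size):
            seq_a, seq_b = rng.sample(population, 2)
            children.append(uniform_crossover(seq_a, seq_b, rng))
        # Sort alphabetically first so ties in fitness are broken reproducibly.
        candidates = sorted(set(population + children))
        population = sorted(candidates, key=fitness, reverse=True)[:population_size]
    return population


# Hypothetical fitness: number of positions matching an arbitrary target string.
TARGET = "AAACCC"


def match_score(sequence):
    return sum(1 for x, y in zip(sequence, TARGET) if x == y)


best = evolve(["AAAAAA", "CCCCCC"], match_score, generations=20,
              population_size=8, rng=random.Random(1))
print(best[0], match_score(best[0]))
```

Because the current population competes with its children in every round, the best score never decreases from one generation to the next. The resulting strings would still need to be synthesized and tested, as the text above notes.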
- Any of the preceding general recombination formats can be practiced in a reiterative fashion to generate a more diverse set of recombinant nucleic acids.
- the shuffling method employed to prepare polynucleotides encoding recombinant derivatizing enzymes comprises: initiating a polynucleotide amplification process on overlapping segments of a population of variant polynucleotides under conditions whereby one segment serves as a template for extension of another segment, to generate a population of recombinant polynucleotides; and selecting or screening a recombinant polynucleotide for a desired property.
- the overlapping segments can be prepared by a variety of methods, as described or referenced herein, including, for example, chemical synthesis, cleavage or fragmentation, amplification of the population of polynucleotides, and other methods that are well known in the art.
- the shuffling method used to generate the recombinant derivatizing enzymes comprises: hybridizing at least two sets of nucleic acids, wherein a first set of nucleic acids comprises single-stranded nucleic acid templates and a second set of nucleic acids comprises at least one set of nucleic acid fragments; and, elongating, ligating, or both, sequence gaps between the hybridized nucleic acid fragments, to generate at least substantially full-length chimeric nucleic acid sequences that correspond to the single-stranded nucleic acid templates, thereby recombining the set of nucleic acid fragments, and optionally, denaturing the at least substantially full-length chimeric nucleic acid sequences and the single-stranded nucleic acid templates; and separating the at least substantially full-length chimeric nucleic acid sequences from the single-stranded nucleic acid templates by at least one separation technique; and, fragmenting the separated at least substantially full-length chimeric nucleic acid sequences by nu
- nucleic acids of the invention can be recombined (with each other or with related (or even unrelated) nucleic acids) to produce a diverse set of recombinant nucleic acids, including, e.g., sets of homologous nucleic acids.
- any nucleic acids which are produced can be selected for a desired activity.
- this can include testing for and identifying any activity that can be detected in an automatable format, by any of the assays in the art.
- a variety of related (or even unrelated) properties can be assayed for, using any available assay.
- DNA mutagenesis and shuffling provide a robust, widely applicable, means of generating diversity useful for the engineering of proteins, pathways, cells and organisms with improved characteristics.
- In addition to the basic formats described above, it is sometimes desirable to combine shuffling methodologies with other techniques for generating diversity.
- In conjunction with (or separately from) shuffling methods, a variety of diversity generation methods can be practiced and the results (i.e., diverse populations of nucleic acids) screened for in the systems of the invention. Additional diversity can be introduced by mutagenesis methods that are known in the art.
- Mutagenesis methods include, for example, those described in Publ. No. WO98/42727; site-directed mutagenesis (Ling et al. (1997) "Approaches to DNA mutagenesis: an overview" In: Anal Biochem. 254(2): 157-78; Dale et al. (1996) "Oligonucleotide-directed random mutagenesis using the phosphorothioate method.”
- deletion mutagenesis (Eghtedarzadeh and Henikoff (1986) "Use of oligonucleotides to generate large deletions" Nucl. Acids Res. 14: 5115), restriction-selection and restriction-purification (Wells et al. (1986) "Importance of hydrogen-bond formation in stabilizing the transition state of subtilisin" Phil. Trans. R. Soc. Lond. A 317: 415-423), mutagenesis by total gene synthesis (Nambiar et al.
- Kits for mutagenesis are commercially available.
- kits are available from, e.g., Stratagene (e.g., QuikChange site-directed mutagenesis kit; Chameleon double-stranded, site-directed mutagenesis kit), Bio/Can Scientific, Bio-Rad (e.g., using the Kunkel method described above), Boehringer Mannheim Corp., Clontech Laboratories, DNA Technologies, Epicentre Technologies (e.g., 5 prime 3 prime kit); Genpak Inc,
- any of the described shuffling techniques can be used in conjunction with procedures which introduce additional diversity into a genome, e.g. a bacterial genome.
- techniques have been proposed which produce nucleic acid multimers suitable for transformation into a variety of species, including E. coli and B. subtilis (see e.g., Schellenberger U.S. Patent No. 5,756,316).
- When multimers consisting of genes that are divergent with respect to one another (e.g., derived from natural diversity or through application of site directed mutagenesis, error prone PCR, passage through mutagenic bacterial strains, and the like) are transformed into a suitable host, an additional source of nucleic acid diversity for DNA shuffling is introduced.
- Multimers transformed into host species are particularly suitable as substrates for in vivo shuffling protocols.
- a multiplicity of polynucleotides sharing regions of partial sequence similarity can be transformed into a host species and recombined in vivo by the host cell.
- Subsequent rounds of cell division can be used to generate libraries in which each member comprises a single, homogeneous population of monomeric or pooled nucleic acid.
- the monomeric nucleic acid can be recovered by standard techniques and recombined in any of the described shuffling formats.
- Shuffling formats employing chain termination methods have also been proposed (see e.g., U.S. Patent No. 5,965,408).
- double stranded DNAs corresponding to one or more genes sharing regions of sequence similarity are combined and denatured, in the presence or absence of primers specific for the gene.
- the single stranded polynucleotides are then annealed and incubated in the presence of a polymerase and a chain terminating reagent (e.g., uv, gamma or X-ray irradiation; ethidium bromide or other intercalators; DNA binding proteins, such as single strand binding proteins, transcription activating factors, or histones; polycyclic aromatic hydrocarbons; trivalent chromium or a trivalent chromium salt; or abbreviated polymerization mediated by rapid thermocycling; and the like), resulting in the production of partial duplex molecules.
- the partial duplex molecules, e.g., containing partially extended chains, are then denatured and reannealed in subsequent rounds of replication or partial replication, resulting in polynucleotides which share varying degrees of sequence similarity and which are chimeric with respect to the starting population of DNA molecules.
- the products or partial pools of the products can be amplified at one or more stages in the process.
- Polynucleotides produced by a chain termination method, such as described above are suitable substrates for further DNA shuffling according to any of the described formats.
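- By way of illustration only, the partial-extension cycles described above can be sketched as a toy simulation. The sketch assumes pre-aligned templates of equal length and treats each round as a random template choice followed by a short extension; the names `staggered_extension`, `chimeric_pool`, and `max_step` are hypothetical and do not appear in the cited reference.

```python
import random


def staggered_extension(templates, max_step, rng):
    """Build one chimeric strand by repeated partial extension.

    Assumes the templates are pre-aligned strings of equal length, so a
    growing strand can switch templates at any position. This is a
    simplification: real annealing depends on local sequence similarity.
    """
    length = len(templates[0])
    if any(len(t) != length for t in templates):
        raise ValueError("templates must be aligned to the same length")
    pieces, position = [], 0
    while position < length:
        template = rng.choice(templates)   # denature and reanneal to some template
        step = rng.randint(1, max_step)    # extension is terminated after a few bases
        pieces.append(template[position:position + step])
        position += step
    return "".join(pieces)


def chimeric_pool(templates, size, max_step=3, seed=0):
    """Generate `size` chimeric strands with a reproducible random seed."""
    rng = random.Random(seed)
    return [staggered_extension(templates, max_step, rng) for _ in range(size)]
```

- Every strand in the resulting pool has the template length, and each position is inherited from one of the starting templates, which is the sense in which the products are chimeric with respect to the starting population.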
- Biotechnol 17:1205 can be used to generate a shuffled library which can optionally serve as a substrate for one or more rounds of in vitro or in vivo shuffling methods.
- Multispecies expression libraries are, in general, libraries comprising cDNA or genomic sequences from a plurality of species or strains, operably linked to appropriate regulatory sequences, in an expression cassette.
- the cDNA and/or genomic sequences are optionally randomly concatenated to further enhance diversity.
- the vector can be a shuttle vector suitable for transformation and expression in more than one species of host organism, e.g., bacterial species, eukaryotic cells.
- the library is biased by preselecting sequences which encode a protein of interest, or which hybridize to a nucleic acid of interest. Any such libraries can be provided as substrates for any of the shuffling methods herein described.
- preselect or prescreen libraries e.g., an amplified library, a genomic library, a cDNA library, a normalized library, etc.
- substrate nucleic acids e.g., an amplified library, a genomic library, a cDNA library, a normalized library, etc.
- shuffling procedures can also independently have these effects.
- recombined CDRs derived from B cell cDNA libraries can be amplified and assembled into framework regions (e.g., Jirholt et al.
- Desired activities can be identified by any method known in the art.
- WO 99/10539 proposes that gene libraries can be screened by combining extracts from the gene library with components obtained from metabolically rich cells and identifying combinations which exhibit the desired activity. It has also been proposed (e.g., WO 98/58085) that clones with desired activities can be identified by inserting bioactive substrates into samples of the library, and detecting bioactive fluorescence corresponding to the product of a desired activity using a fluorescent analyzer, e.g., a flow cytometry device, a CCD, a fluorometer, or a spectrophotometer.
- Libraries can also be biased towards nucleic acids which have specified characteristics, e.g., hybridization to a selected nucleic acid probe.
- a desired activity (e.g., an enzymatic activity, for example: a lipase, an esterase, a protease, a glycosidase, a glycosyl transferase, a phosphatase, a kinase, an oxygenase, a peroxidase, a hydrolase, a hydratase, a nitrilase, a transaminase, an amidase or an acylase) can be identified from among genomic DNA sequences in the following manner.
- Single stranded DNA molecules from a population of genomic DNA are hybridized to a ligand-conjugated probe.
- the genomic DNA can be derived from either a cultivated or uncultivated microorganism, or from an environmental sample. Alternatively, the genomic DNA can be derived from a multicellular organism, or a tissue derived therefrom.
- Second strand synthesis can be conducted directly from the hybridization probe used in the capture, with or without prior release from the capture medium or by a wide variety of other strategies known in the art.
- the isolated single-stranded genomic DNA population can be fragmented without further cloning and used directly in a shuffling format that employs a single stranded template.
- Some single-stranded template shuffling formats are described in, for example, WO 98/27239, "METHODS AND COMPOSITIONS FOR POLYPEPTIDE ENGINEERING," Patten et al.; "SINGLE-STRANDED NUCLEIC ACID TEMPLATE-MEDIATED RECOMBINATION AND NUCLEIC ACID FRAGMENT ISOLATION" by Affholter, USSN 60/186,482 filed March 2, 2000; "METHODS FOR GENERATING HIGHLY DIVERSE LIBRARIES," WO 00/00632; and "METHOD FOR OBTAINING IN VITRO RECOMBINED POLYNUCLEOTIDE SEQUENCE BANKS AND RESULTING SEQUENCES," WO 00/09679.
- the fragment population derived from the genomic library(ies) is annealed with partial, or, often, approximately full length ssDNA or RNA corresponding to the opposite strand.
- Assembly of complex chimeric genes from this population is then mediated by nuclease-based removal of non-hybridizing fragment ends, polymerization to fill gaps between such fragments, and subsequent single stranded ligation.
- the parental strand can be removed by digestion (if RNA or uracil-containing), magnetic separation under denaturing conditions (if labeled in a manner conducive to such separation) and other available separation purification methods.
- the parental strand is optionally co-purified with the chimeric strands and removed during subsequent screening and processing steps.
- single-stranded molecules are converted to double-stranded DNA (dsDNA) and the dsDNA molecules are bound to a solid support by ligand-mediated binding.
- the selected DNA molecules are released from the support and introduced into a suitable host cell to generate a library enriched for sequences which hybridize to the probe.
- a library produced in this manner provides a desirable substrate for further shuffling using any of the shuffling reactions described herein. It will further be appreciated that any of the above described techniques suitable for enriching a library prior to shuffling can be used to screen the products generated by the methods of DNA shuffling.
- the recombinant libraries are prepared using DNA shuffling.
- the shuffling and screening or selection can be used to "evolve" individual genes, whole plasmids or viruses, multigene clusters, or even whole genomes (Stemmer (1995) Bio/Technology 13:549-553). Reiterative cycles of recombination and screening/selection optionally can be performed to further evolve the nucleic acids of interest. Such techniques do not require the extensive analysis and computation required by conventional methods for polypeptide engineering. Shuffling allows the recombination of large numbers of mutations in a minimum number of screening/selection cycles, in contrast to traditional, pairwise recombination events.
- sequence recombination techniques described herein provide particular advantages in that they provide recombination between mutations in any or all of these, thereby providing a very fast way of exploring the manner in which different combinations of mutations can affect a desired result. In some instances, however, structural and/or functional information is available which, although not required for sequence recombination, provides opportunities for modification of the technique.
- These shuffling methods typically employ at least two variant forms of a starting nucleic acid substrate.
- the variant forms of candidate substrates can show substantial sequence or secondary structural similarity with each other, but they should also differ in at least two positions.
- the initial diversity between forms can be the result of natural variation, e.g., the different variant forms (homologs) are obtained from different individuals or strains of an organism (including geographic variants) or constitute related sequences from the same organism (e.g., allelic variations).
- the initial diversity can be induced, e.g., the second variant form can be generated by error prone transcription, such as an error prone PCR or use of a polymerase which lacks proofreading activity (see, e.g., Liao (1990) Gene 88:107-111), of the first variant form, or, by replication of the first form in a mutator strain, or by the mutagenic process of DNase fragmentation and reassembly by error prone polymerases.
- the initial diversity between substrates is greatly augmented in subsequent steps of recursive sequence recombination.
- the shuffling of a "family" of nucleic acids is used to create the library of recombinant polynucleotides.
- When a family of nucleic acids is shuffled, nucleic acids that encode homologous polypeptides from different strains, species, or gene families, or portions thereof, are used as the different forms of the nucleic acids.
- As genomics provides an increasing amount of sequence information, it is increasingly possible to directly amplify homologs with designed primers. For example, given the sequence of lipase or protease genes from several species, one can design primers for amplification of the homologs. The resulting nucleic acid segments can then be subjected to shuffling.
- Codon modification procedures can be used to modify any derivatizing enzyme encoding nucleic acid herein, e.g., prior to performing DNA shuffling, or codon modification approaches can be used in conjunction with oligonucleotide shuffling procedures as described below.
- Codon modification shuffling involves selecting a first nucleic acid sequence that encodes a first polypeptide sequence or portion thereof. A plurality of codon altered nucleic acid sequences, each of which encode part or all of the first polypeptide, or a modified or related polypeptide, is then selected (e.g., a library of codon altered nucleic acids can be selected in a biological assay which recognizes library components or activities), and the plurality of codon-altered nucleic acid sequences is recombined to produce a target codon altered nucleic acid encoding part or all of a second protein.
- the target codon altered nucleic acid is then screened for a detectable functional or structural property, optionally including comparison to the properties of the first polypeptide and/or related polypeptides.
- the goal of such screening is to identify a polypeptide that has a structural or functional property equivalent or superior to the first polypeptide or related polypeptide.
- a nucleic acid encoding such a polypeptide can be used in essentially any procedure desired, including introducing the target codon altered nucleic acid into a cell, vector, virus (e.g., as a component of a vaccine or immunogenic composition), transgenic organism, or the like.
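- As a non-limiting sketch of how a plurality of codon altered nucleic acid sequences encoding the same polypeptide might be generated computationally, the following uses the standard genetic code and replaces each codon with a randomly chosen synonymous codon. The function names, the use of random sampling, and the example sequence are assumptions made for illustration; the source does not prescribe a particular procedure.

```python
import random
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the residue (or stop, "*") they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_altered(dna, rng):
    """Return a sequence encoding the same polypeptide with resampled codons."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(
        rng.choice(SYNONYMS[CODON_TO_AA[dna[i:i + 3]]])
        for i in range(0, len(dna), 3)
    )


def codon_altered_library(dna, size, seed=0):
    """Generate `size` codon altered variants with a reproducible seed."""
    rng = random.Random(seed)
    return [codon_altered(dna, rng) for _ in range(size)]
```

- Every member of such a library translates to the same polypeptide as the starting sequence, so any difference observed on screening would arise from the nucleic acid sequence rather than from the encoded amino acids.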
- "In silico" shuffling (described in detail in Selifonov and Stemmer in
- genetic operators are used to model recombinational or mutational events which can occur in one or more nucleic acid, e.g., by aligning nucleic acid sequence strings (using standard alignment software, or by manual inspection and alignment) and predicting recombinational outcomes.
- the predicted recombinational outcomes are used to produce corresponding molecules, e.g., by oligonucleotide synthesis and reassembly PCR.
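- For illustration, one very simple genetic operator of the kind described above is a single crossover between two aligned sequence strings. The sketch below enumerates the predicted outcomes for every crossover point and every pair of parents; it assumes the strings have already been aligned to equal length, and the function names are hypothetical.

```python
from itertools import combinations


def single_crossovers(parent_a, parent_b):
    """Enumerate single-crossover chimeras of two aligned, equal-length strings."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    children = set()
    for point in range(1, len(parent_a)):
        children.add(parent_a[:point] + parent_b[point:])
        children.add(parent_b[:point] + parent_a[point:])
    # Outcomes identical to a parent carry no new combination, so drop them.
    return children - {parent_a, parent_b}


def predicted_library(parents):
    """Union of single-crossover outcomes over every pair of parent strings."""
    library = set()
    for a, b in combinations(parents, 2):
        library |= single_crossovers(a, b)
    return library
```

- The resulting set of strings is the in silico counterpart of a recombinant library; each string could then be converted into a physical molecule by the synthesis and reassembly techniques referred to above.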
- oligonucleotide-mediated shuffling (described in Crameri et al.
- oligonucleotides corresponding to a family of related homologous nucleic acids are recombined to produce selectable nucleic acids.
- One advantage of oligonucleotide-mediated recombination is the ability to recombine homologous nucleic acids with low sequence similarity, or even non-homologous nucleic acids.
- In these low homology oligonucleotide shuffling methods, one or more sets of nucleic acid segments are recombined, e.g., with a set of crossover family diversity oligonucleotides.
- Each of these crossover oligonucleotides has a plurality of sequence diversity domains corresponding to a plurality of sequence diversity domains from homologous or non-homologous nucleic acids with low sequence similarity.
- the crossover oligonucleotides, which are derived by comparison to one or more homologous or non-homologous nucleic acids, can hybridize to one or more regions of the nucleic acid segments, facilitating recombination.
- sets of overlapping families of oligonucleotides are hybridized and elongated (e.g., by reassembly PCR), providing a population of recombined nucleic acids, which can be selected for a desired trait or property.
- the sets of overlapping oligonucleotides include a plurality of oligonucleotide member types which have consensus region subsequences derived from a plurality of homologous target nucleic acids.
- the sets of overlapping oligonucleotides are provided by aligning homologous nucleic acid sequences to select conserved regions of sequence identity and regions of sequence diversity.
- a plurality of oligonucleotides are synthesized (serially or in parallel) which correspond to at least one region of sequence diversity.
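- The alignment step described above can be sketched as follows, purely as an example. Given homologous sequences that are already aligned (gaps written as "-"), the code marks each column as conserved or diverse, groups the columns into regions, and lists the distinct subsequences found in a diverse region, which correspond to the oligonucleotide variants one might synthesize. All function names are hypothetical.

```python
def classify_columns(alignment):
    """Return one flag per column: True if all sequences share a non-gap residue."""
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(s) != length for s in alignment):
        raise ValueError("sequences must be aligned to the same length")
    return [
        len({s[i] for s in alignment}) == 1 and alignment[0][i] != "-"
        for i in range(length)
    ]


def regions(flags):
    """Collapse per-column flags into (start, end_exclusive, conserved) runs."""
    runs, start = [], 0
    for i in range(1, len(flags) + 1):
        if i == len(flags) or flags[i] != flags[start]:
            runs.append((start, i, flags[start]))
            start = i
    return runs


def diversity_variants(alignment, start, end):
    """Distinct subsequences observed across the alignment in columns [start, end)."""
    return sorted({s[start:end] for s in alignment})
```

- For the three-sequence alignment `ATGCA`, `ATGGA`, `ATGTA`, columns 0 to 2 and column 4 are conserved, and the single diverse column yields the variants C, G, and T.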
- Sets of segments, or subsets of segments, used in oligonucleotide shuffling approaches can be provided by cleaving one or more homologous nucleic acids (e.g., with a DNase), or, more commonly, by synthesizing a set of oligonucleotides corresponding to a plurality of regions of at least one nucleic acid (typically oligonucleotides corresponding to a full length nucleic acid are provided as members of a set of nucleic acid fragments).
- these segments can be used in conjunction with shuffling families of oligonucleotides, e.g., in one or more recombination reactions to produce recombinant derivatizing enzyme encoding nucleic acids.
- recursive sequence recombination can be employed to achieve still further improvements in a desired property. Sequence recombination can be achieved in many different formats and permutations of formats, which share some common principles. Recursive sequence recombination entails successive cycles of recombination to generate molecular diversity. That is, one creates a family of nucleic acid molecules showing some sequence identity to each other but differing in the presence of mutations. In any given cycle, recombination can occur in vivo or in vitro, intracellular or extracellular.
- diversity resulting from recombination can be augmented in any cycle by applying prior methods of mutagenesis (e.g., error-prone PCR or cassette mutagenesis) to either the substrates or products for recombination.
- a new or improved property or characteristic can be achieved after only a single cycle of in vivo or in vitro recombination, as when using different variant forms of the sequence, such as homologs from different individuals or strains of an organism, or related sequences from the same organism, such as allelic variations.
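- The successive cycles of recombination and screening described above can be illustrated with a minimal simulation. The sketch assumes aligned parent strings, uses uniform crossover as a stand-in for a shuffling reaction, and takes a caller-supplied `score` function as a stand-in for a screen; these choices, and all names, are illustrative assumptions rather than features of any particular shuffling format.

```python
import random


def recombine(parent_a, parent_b, rng):
    """Uniform crossover: each position is drawn from one of two aligned parents."""
    return "".join(rng.choice(pair) for pair in zip(parent_a, parent_b))


def evolve(population, score, cycles, library_size, keep, seed=0):
    """Run recursive cycles of recombination followed by screening.

    Requires at least two starting sequences and keep >= 2. Parents are
    screened together with their offspring, so the best score recorded in
    `history` cannot decrease from one cycle to the next.
    """
    rng = random.Random(seed)
    population = list(population)
    history = []
    for _ in range(cycles):
        library = []
        for _ in range(library_size):
            a, b = rng.sample(population, 2)
            library.append(recombine(a, b, rng))
        # Sort by (score, sequence) so that ties are broken deterministically.
        ranked = sorted(population + library, key=lambda s: (score(s), s), reverse=True)
        population = ranked[:keep]
        history.append(score(population[0]))
    return population, history
```

- In this toy setting, two parents that each carry favorable positions in different halves of the sequence can give rise, within a few cycles, to offspring that combine them, which is the behavior that recursive sequence recombination is intended to exploit.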
- Expression of the recombinant polynucleotides to obtain the recombinant derivatizing enzymes is generally accomplished in cells.
- the libraries of recombinant polynucleotides can be created either in vitro or in vivo, as described in US Patent No. 5,837,458. For in vitro library generation, the recombinant polynucleotides are thus introduced into cells for expression.
- the methods of the invention are applicable to a wide range of derivatizing enzymes that can catalyze the modification of organic molecules of interest.
- Such enzymes can modify the substrates by, for example, adding a functional group to the molecule or by modification of an existing functional group on the molecule. Modifications of interest also include addition of chemical moieties onto functional groups.
- the derivatizing enzymes, in presently preferred embodiments, do not add to the length of the backbone of the organic molecule. Types of reactions of interest are described in, for example, Khmelnitsky et al (1996) Molecular Diversity and Combinatorial Chemistry, Chapter 14, pp. 144-157 (American Chemical Society), as well as Michels et al (1998) Tibtech 16: 210-215.
- enzymes that are enhanced in certain properties that increase the usefulness of the enzymes in the modification of organic compounds such as natural compounds, non-natural compounds (e.g., 5-fluorouracil, azidothymidine, etc.), small molecules, and polymers (e.g., peptides and peptide variants, oligonucleotides/polynucleotides and variants thereof, polyhydroxyalkanoates, polysaccharides, polylactic acid, polylactic-co-glycolic acid, polyethylene glycol, and the like).
- Small molecules employed in the practice of the present invention typically have a molecular weight of less than about 2500 daltons, usually less than about 2000 daltons, and sometimes less than about 1500 daltons.
- libraries can be screened to identify those library members that encode an enzyme that exhibits an improvement, compared to a wild-type enzyme, in a desired property or properties for use in the reaction of interest. For example, one can screen to identify those library members that encode an enzyme that has improved substrate specificity for a particular compound, or improved regioselectivity for a desired functional group on the compound.
- libraries of recombinant derivatizing enzymes are variants of a given wild type gene, into which variation is introduced by diversity generating methods such as those described herein, e.g., shuffling and gene reassembly shuffling processes. Limited but complete diversity can thus be provided around the given sequence with dense sampling.
- the recombination libraries are produced by applying diversity generating methods to several different wild type genes. Limited and incomplete diversity is achieved, which is scattered all over a functional sequence space, as in sparse sampling. This latter technique is preferred when generating new enzyme specificities.
- the recombinant derivatizing enzymes, and libraries thereof can catalyze the modification of an existing functional group that is present on an organic molecule of interest, such as a lead compound.
- derivatizing agents of interest can oxidize or reduce a functional group, hydrolyze a group, or replace one functional group with another.
- Other reactions of interest include lactonization, isomerization, and epimerization.
- a. Hydroxylation
- In some embodiments, a hydrogen in an organic molecule is replaced with a hydroxyl group. This can often result in a profound alteration in biological activity. Hydroxylation is often associated with increased metabolism due to first pass through the liver.
- Introduction of a hydroxyl group in a drug candidate can also confer a more rapid metabolism by the subsequent action of a group transferring enzyme (e.g., enzymes that catalyze methylation, sulfation, phosphorylation and glycosylation).
- the derivatizing enzymes that are useful for introduction of hydroxyl groups are the mono- and dioxygenases.
- a range of monooxygenases known in the art provide appropriate starting points for making libraries of recombinant monooxygenases that are useful in the methods of the invention.
- One useful class of monooxygenases is exemplified by the heme-dependent eukaryotic and bacterial cytochromes P-450. In the presence of oxygen and an intact redox recycle system, P450s exhibit monooxygenase activity. Addition of hydrogen peroxide or other peroxides, however, can be used to circumvent the NAD(P)H requirement (i.e., allowing for peroxidase activity) toward many of the same substrates.
- the P450 monooxygenase gene family is particularly well suited for use of family shuffling to obtain recombinant derivatizing enzymes.
- Approximately 70-80 families of P450 monooxygenases are known, from many different species.
- representative alignments of P450 enzymes can be found in the Appendices of the volume CYTOCHROME P450: STRUCTURE, MECHANISM, AND BIOCHEMISTRY, 2nd Edition (ed. by Paul R. Ortiz de Montellano, Plenum Press, New York, 1995) ("Ortiz de Montellano").
- Streptomyces, in particular, produces P450 monooxygenases that are used in production of natural products such as antibiotics.
- suitable P450 monooxygenase genes for shuffling include the following, each of which is at least 45% identical at the amino acid level: cytochrome p450 monooxygenase (S. venezuelae) AF087022; cytochrome p450 monooxygenase (Sac. erythraea) M83110; cytochrome p450 monooxygenase (Sac. erythraea) M54983; cytochrome p450 monooxygenase (S.
- Other monooxygenase enzymes suitable for introduction of hydroxyl groups and other modifications of organic molecules include those having activities such as alkane oxidation (e.g., hydroxylation, formation of ketones, aldehydes, etc.), alkene epoxidation, aromatic hydroxylation, N-dealkylation (e.g., of alkylamines), S-dealkylation (e.g., of reduced thio-organics), O-dealkylation (e.g.
- alkyl ethers examples include oxidation of aryloxy phenols, conversion of aldehydes to acids, alcohols to aldehydes or ketones, dehydrogenation, decarbonylation, oxidative dehalogenation of haloaromatics and halohydrocarbons, Baeyer-Villiger monooxygenation, modification of cyclosporins, hydroxylation of mevastatin, hydroxylation of erythromycin, N-hydroxylation, sulfoxide formation, or oxygenation of sulfonylureas.
- suitable monooxygenases for use in the invention are described in co-pending, commonly assigned US patent application Ser. No.
- Dioxygenases are another class of derivatizing enzymes that are useful for biocatalytic synthesis of organic molecule derivatives.
- the bacterial arene dioxygenases (ADOs) can oxidize π-bonds to the corresponding vicinal diols.
- ADOs a reducing compound such as NAD(P)H
- these enzymes catalyze the reductive dioxygenation of compounds as diverse as aromatic rings and non-aromatic multiple bonds.
- Arene dioxygenases include, for example, toluene 2,3-dioxygenase, isopropylbenzene 2,3-dioxygenase, benzene-1,2-dioxygenase, biphenyl-2,3-dioxygenase, naphthalene-1,2-dioxygenase, and many homologous and/or functionally similar enzymes.
- Suitable arene dioxygenase-encoding polynucleotides can be obtained from many organisms using cloning methods known to one skilled in the art. The following list provides examples of polynucleotides that encode arene dioxygenases and are suitable for use in the methods of the invention.
- loci are identified by GenBank ID and encode complete or partial protein components of the arene dioxygenases. Suitable loci include, for example: [PSETODC1C] toluene-1,2-dioxygenase; [AF006691], [PJU53507], [PSECUMA], [REU24277] isopropylbenzene-2,3-dioxygenase; [E04215], [PSEBDO] benzene-1,2-dioxygenase; [AEBPHAIF], [CTU47637], [D78322], [D88020], [D88021], [PSEBPHA], [PSEBPHABC], [PSEBPHABCC], [PSU95054], [RERBPHAl], [RGBPHA], [RSU27591] biphenyl-2,3-dioxygenase; [PSU15298] chlorobenzene dioxygenase; [AB004059], [AF010471], [AF0369
- the invention also provides methods in which a library of organic molecule derivatives obtained by contacting the organic molecule with a first library of recombinant derivatizing enzymes is subsequently contacted with a second library of recombinant derivatizing enzymes.
- the enzymes of the second library are often, but not necessarily, those that catalyze the addition of a chemical moiety to a functional group.
- the hydroxylated compound can be modified by chemical or other means that are known to those of skill in the art.
- Halogenases constitute another example of a class of derivatizing enzyme that can be used to obtain libraries of organic molecule derivatives.
- the halogenases generally halogenate aromatic rings that can become part of complex natural or non-natural products and other organic molecules that are of interest as, for example, lead compounds.
- suitable halogenases include the following: halogenase PrnA, PrnB, PrnC (U74493; P. fluorescens), putative halogenase PltM, PltD, PltA (AF081920; P. fluorescens), putative oxygenase/halogenase (Y16952; Amycolatopsis orientalis).
- Although these particular enzymes have less than about 35% amino acid sequence identity, the polynucleotides that encode the enzymes are useful as probes to obtain more closely related halogenases that can be used for DNA shuffling.
- c. Other substitutions
- a sulfur-containing group into an organic compound.
- Thiols, for example, are generally introduced in order to generate a thiolate anion, which has a strong affinity for heavy metals. Often, heavy metals are found in enzyme active sites.
- Derivatizing enzymes that are useful for these embodiments include, for example, the aryl sulfotransferase family. This family of enzymes can be used to transfer a sulfo group onto the aromatic part of an organic molecule.
- the aryl sulfotransferase family includes many members that have very high amino acid sequence identity (>80%), such that they can be readily shuffled together to generate the libraries of recombinant derivatizing enzymes.
- Suitable sulfotransferase genes that can be used for recombination include, for example, arylamine sulfotransferase (U33886; Homo sapiens), phenol sulfotransferase (D85541; Macaca fascicularis), phenol sulfotransferase (D29807; Canis familiaris), phenol sulfotransferase (U34753; Bos taurus), and minoxidil sulfotransferase (L19998; Rattus norvegicus).
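- The amino acid sequence identity figures used herein to judge whether genes can be readily shuffled together can be estimated from an alignment. The following sketch is one possible way to do so, under the assumption that identity is counted over aligned columns in which neither sequence has a gap (other definitions of percent identity exist). The function names, the threshold parameter, and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent identical residues over aligned columns where neither has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return 100.0 * sum(x == y for x, y in compared) / len(compared)


def shuffling_partners(aligned, threshold):
    """List name pairs whose aligned sequences meet the identity threshold.

    `aligned` maps a sequence name to its aligned amino acid string.
    """
    names = sorted(aligned)
    return [
        (first, second)
        for i, first in enumerate(names)
        for second in names[i + 1:]
        if percent_identity(aligned[first], aligned[second]) >= threshold
    ]
```

- With a threshold of 80, for example, only pairs at least as similar as the aryl sulfotransferase family members described above would be returned as candidates for shuffling together.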
- one or more basic groups are substituted for preexisting functional groups.
- the basic groups most typically used in medicinal chemistry are the amines, the amidines, the guanidines, and almost all nitrogen-containing heterocycles. Introduction of such groups into a molecule that already has biological activity has essentially the same solubilizing effect as introduction of an acid function. Amines and basic heterocycles are virtually ubiquitous in successful drugs. One can readily introduce an amine by, for example, use of an acyltransferase or esterase using a bifunctional compound that includes an amine.
- Additional embodiments of the invention provide recombinant derivatizing enzymes, and libraries thereof, that can catalyze the addition of one or more chemical moieties onto functional groups that are present on an organic molecule of interest, such as a lead compound.
- the recombinant derivatizing enzymes of the invention are those that can attach a group to the core functional drug moiety at a position that does not destroy function of the drug. Such attachments can increase the solubility of the drug moiety, as a prodrug, for example.
- the attachment can be either reversible or irreversible.
- Reversible attachments include, for example, attachment of esters, peptides, and glucosides.
- Irreversible attachments include, for example, attachments via O- and N-alkylation.
- Creation of C-C bonds can be achieved using grafted side chains (e.g., dimethylaminoethyl or morpholinoethyl chains) or acidic side chains (e.g., carboxylic, sulfonic, -OSO3H, -PO3H2, -OPO3H2), or with neutral groups (e.g., glyceryl). Larger solubilizing groups can also be added using the enzymes and methods of the invention.
- Nonionizable side chains including, for example, hydroxylated and polyoxymethylenic side chains or diverse glucosides, can also be attached in order to enhance solubility.
- This class of side chains also includes polyethylene glycol derivatives, which are also used for increased solubility as well as sustained release.
- Examples of derivatizing enzymes that are useful for addition of a chemical moiety to a preexisting functional group on a lead compound or other organic molecule are glycosyltransferases, acyltransferases, amidases, N-methyltransferases, phosphotransferases, aryl sulfotransferases, and the like.
- Acyltransferases
- Acylation is one type of modification chemistry that could theoretically provide much diversity in derivatization of organic molecules. Traditional chemical processes for acylation, however, are typically non-selective and require multiple protection and de-protection steps. Enzymatic acylation in organic solvent by acyltransferases, including lipases and proteases, for example, can provide certain advantages such as substrate-, stereo- and regio-selectivity. However, it is unlikely that one could obtain, from a set of naturally occurring acyltransferases, one that will possess the desired variety of substrate-, stereo-, or regio-specificity for any particular organic molecule.
- the present invention provides libraries that contain a multitude of recombinant acyltransferases that can be used to synthesize acylated derivatives of lead compounds and other organic molecules.
- the invention provides libraries of recombinant polynucleotides that encode lipase and protease enzymes, and acyltransferases. These methods involve the creation of libraries of recombinant polynucleotides using as substrates polynucleotides that encode enzymes that can carry out an acylation reaction.
- Such enzymes include, for example, lipases and proteases.
- The reverse reaction of lipases and proteases in organic solvent can transfer various acyl groups onto hydroxyl sites of complex natural products. Those enzymes usually possess broad substrate specificity but low activity.
- One example of a lipase family that is suitable for shuffling includes the following members: Y00557, Vibrio cholerae; D50587, Pseudomonas sp KFCC 10818 (AAD22078), Pseudomonas aeruginosa
- Acinetobacter calcoaceticus (BAA23128), P. aeruginosa (D50587); Acinetobacter calcoaceticus (AF047691); and R. wisconsinensis (U88907 and 2072017), Pseudomonas sp (P26877), Bacillus subtilis (M74101); Bacillus pumilus (A34992); Galactomyces geotrichum (A02813); Candida rugosa (WO 99/14338); and Acinetobacter calcoaceticus (S61927).
- Nucleic acids that are suitable for use as substrates include, for example, galactoside 6-O-acetyltransferase (EC 2.3.1.18); lacA of E. coli B0342 (lacA) or of other organisms (GENBANK loci MG396; D02_orfl52 (lacA); MJ1064 (lacA), MJ1678, MTH1067); serine O-acetyltransferase (EC 2.3.1.30; GENBANK loci B3607 (cysE), HI0606 (cysE), HP1210 (cysE), SLR1348 (cysE)); alcohol O-acetyltransferase (EC 2.3.1.84), from, for example, Saccharomyces cerevisiae (loci YGR177C, YOR377W); arylamine N-acetyltransferase (EC 2.3.1.118, representative GENBANK loci include Q00267, D90786, Z92774, 178931, AF030398, AF008204, AF042740); carnitine O-acetyltransferase (EC 2.3.
- YM8054.01(CAT2) choline O-acetyltransferase (EC 2.3.1.6), e.g., that of mammalian origin; and acetyl CoA:deacetylvindoline 4-O-acetyltransferase (EC 2.3.1.107) (St-Pierre et al. (1998) Plant J. 14: 703-713).
- Suitable acyl donors for the improved enzymes of the invention include, for example, those compounds that can serve as a donor for the particular enzymes.
- Acyl donor substrates include vinyl esters, trifluoroethyl esters and other aliphatic esters, as well as benzyl and fatty acids, and the like. See, e.g., Mozhaev et al. (1998) Tetrahedron 54: 3791-3982, in particular p. 3976.
- Acyl transferase genes that are shuffled are those that encode enzymes which provide transfer of the acetyl group, and use the endogenous pool of acyl-CoA compounds in the cell of the host microbial strain.
- The endogenous pool of acyl-CoA can also be enhanced by introduction of an acyl-CoA ligase, optionally improved by DNA shuffling, into the host microbial strain that carries out the acylation reaction.
- The strain is then supplied with exogenous acetate or other carboxylic acid in the medium, which is then attached to CoA by the acyl ligase.
- Suitable acyl ligases and methods for their optimization are described in co-pending, commonly assigned US patent application Ser. No.
- Compounds of interest for derivatization by acylation include, for example, natural products such as polyketides, flavonoids, peptide antibiotics, and the like, as well as non-naturally occurring compounds. Such compounds find use as, for example, antibiotics, chemotherapeutic agents, and the like.
- the substrate molecules have one or more hydroxyl residues at which acylation can occur.
- Regioselectivity is particularly important for molecules that have multiple functional groups at which acylation can occur.
- the methods of the invention provide a means by which one can obtain an enzyme that acylates the functional group or groups of interest, but not other groups that might otherwise be susceptible to acylation.
- Anticancer drugs, including those that act by disrupting microtubulin dynamics, are among the compounds for which the methods of the invention are useful for developing derivatives of the drugs that have improved properties. These compounds include, for example, colchicine, colcemid, podophyllotoxin, taxol, vinblastine, vincristine, and the like.
- A substrate of interest is epothilone, which is a potent anticancer drug candidate that is currently in the research stage. Selective acylation of two hydroxyl groups on this compound can increase its water solubility.
- the recombinant acyltransferase libraries of the invention can be used to obtain derivatives that are specifically acylated at these positions.
- Additional examples are rapamycin and FK506.
- Acylation of the C-28 hydroxyl group of rapamycin or the undehydrated C-35 hydroxyl of FK506 can be used to separate their immunosuppressive activities from their nerve regenerative activities (Gold, B.G. (1997) Mol Neurobiol. 15: 285-306). It is known that the part of rapamycin or FK506 binding to FKBP (FK binding protein) is responsible for the neuroregenerative activity.
- Acylation can destroy the binding of the FKBP-Rapamycin (or FK506) to the effector protein (calcineurin). Therefore, acylation of the aforementioned hydroxyl groups will disrupt the calcineurin binding. Regioselectivity will play a major role in these modifications, since there are several hydroxyl groups in both molecules.
- the screening of the libraries of recombinant polynucleotides that encode lipases, proteases, or other acylating enzymes, whether obtained by DNA shuffling or other methods as described above, is done most easily in vitro using purified or partially purified enzymes or bacterial or yeast lysates in organic solvent systems, by one or more of the screening methods described below.
- These methods include HPLC, mass-spectrometry, UV/Vis and IR spectroscopy, NMR, and the like.
- Another presently preferred method uses a labeled acyl-donor precursor, e.g. labeled carboxylic acid or its derivative, administered to the cells that express libraries of genes that encode shuffled lipases, proteases, or other acyltransferases.
- the amount of label in the reaction products is measured.
- For hydrophobic reaction products, one can extract the derivatives into a suitable organic solvent, or one can use solid-phase extraction of these compounds by addition of a sufficient amount of hydrophobic porous resin beads (e.g., XAD 1180, XAD-2, -4, -8).
- A scintillating dye can be present in the organic solvent, added to the samples, or chemically incorporated in the bead polymer. The latter constitutes a modification of the scintillation proximity assay method.
- Methods for detecting the regioselectivity of the acylation reactions include, for example, HPLC, and in an HTP modality, flow-through NMR spectroscopy.
- NMR spectroscopy is used for determination of relative amounts of different regiomeric acylated derivatives of the natural products or small molecules; the latter are preferably obtained by action of the enzymes on isotopically (C and/or H) labeled substrates.
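- As an illustration of how relative amounts of regiomeric derivatives might be turned into a regioselectivity figure, the following minimal Python sketch normalizes integrated signal areas. It assumes that each regioisomer gives one resolved signal whose area is proportional to its molar amount; the site labels and area values are hypothetical and are not taken from the source.

```python
def regioisomer_fractions(peak_areas):
    """Convert integrated signal areas (one per regioisomer) into fractions of total product."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total integrated area must be positive")
    return {site: area / total for site, area in peak_areas.items()}


def regioselectivity(peak_areas, target_site):
    """Fraction of all acylated product that is acylated at target_site."""
    return regioisomer_fractions(peak_areas).get(target_site, 0.0)


# Hypothetical integrals for three acylated derivatives of a single substrate.
areas = {"C-28": 8.0, "site-B": 1.5, "site-C": 0.5}
fractions = regioisomer_fractions(areas)
target_fraction = regioselectivity(areas, "C-28")
```

- A library member would then be ranked by `target_fraction`; the same normalization applies to HPLC peak areas if detector response is assumed equal for all regioisomers.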
- Another variation of the NMR technique includes use of isotopically labeled precursors of acyl donor intermediates.
- b. Glycosyltransferases
- Another example of a derivatizing enzyme of interest for generating combinatorial libraries of organic molecule derivatives is the glycosyltransferases.
- Glycosylation can increase bioavailability, reduce toxicity and increase water solubility of organic molecules, including lead compounds. Because glycosylations are difficult to perform chemically, novel sugar containing antibiotics, such as new glycopeptide and glycosylated macrolide antibiotics, are difficult to make.
- The use of glycosyltransferases allows one to accomplish glycosylation of organic acceptor compounds that contain one or more hydroxyl groups. Therefore, with the greater variety in glycosylation ability provided by the recombinant enzyme libraries of the invention, many variants of organic molecules are obtainable.
- new enzymes are provided that can catalyze a variety of previously unavailable glycosylations.
- the recombinant derivatizing enzymes in the libraries of the invention can exhibit changed specificity for both acceptors (e.g., complex natural and synthetic organic molecules) and donors (e.g., different sugars). Increased ability to synthesize aminodeoxy sugars can also be obtained, e.g., by biotransformation.
- new substrates can be accessed, new enzymatic activity can be created and improved; difficult chemical processes can be replaced by biocatalysis, and high scale ups can be accomplished.
- Glycosyltransferases can be evolved using the diversity generating methods described herein, including, for example, shuffling, to generate recombinant glycosyltransferases that exhibit optimal performance with respect to a variety of different reaction parameters.
- Typical reaction parameters include, but are not limited to, specificity of reaction, degree of promiscuity of enzymes and stereochemistry.
- the enzymes are optionally evolved to transfer different nucleotide diphosphate (NDP) sugars and NDP-sugar analogs; to transfer sugars to different acceptor molecules; to attach sugars at different positions compared to naturally occurring enzymes, to possess ambiguity towards positions in multiple site containing acceptors, and to catalyze multiple step-wise glycosylations.
- enzymes can be evolved to generate recombinant derivatizing enzymes that utilize alternative sugars which are optionally synthetic.
- activated sugars such as desoxy and sulfated sugars; non-natural sugars, e.g., nitrosylated, sulfonated, phosphonated, and didesoxy sugars; polyalcohols, e.g., inositol, inositol-phosphates, and inositol phosphonates; other sugar-like structures and compounds and alternative nucleotides.
- Recombinant glycosyltransferases are also optionally used to transfer sugars to alternative sugar receptors, including but not limited to polyketides, non-ribosomal peptides, complex molecules from organic synthesis, and libraries of chemical compounds.
- Other sugar acceptors of interest in the present invention include, but are not limited to, aglycosyl vancomycin hydrochloride (a peptide antibiotic), somatostatin (a growth hormone), insulin and glucagon-release inhibitor, cholic acid (a detergent steroid), nogalamycin (an anti-tumor antibiotic), L-thyroxine (a thyroid hormone), syringaldazine, aclarubicin (an anti-tumor antibiotic and commercial RNA synthesis inhibitor), ritodrine HCl (an adrenergic agonist and smooth muscle relaxant), rifamycin (an antibiotic), and ristomycin sulphate (an antibiotic).
- Each of these compounds has 3-dimensional similarity to vancomycin aglycone, as defined by the molecular dynamics interface with the Available Chemical Database that is available through Chemweb (http://www.chemweb.com/databases). These compounds and their sugar attachment points of interest are shown in Figures 1-10.
- Other natural products of interest for glycosylation include, for example, lovastatin, aglycosyl erythromycin, echinocandin, taxol and cephalexin.
- Any molecule which contains at least one hydroxyl group is optionally glycosylated with an evolved glycosyltransferase.
- Pharmacologically interesting compounds are preferred.
- Sugar acceptors with more than one hydroxy group are optionally glycosylated at only one of the positions. Thus different isomers can be produced by glycosylating at one or the other of the positions.
- compounds with more than one hydroxy group are optionally glycosylated at different positions to a different extent, when NDP sugars are limiting for example.
- compounds are treated multi dimensionally with combinations of NDP-sugars and glycosyltransferases, providing iterative glycosylation.
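- To give a sense of how quickly positional and iterative glycosylation expands the number of possible derivatives, the following Python sketch enumerates glycoforms of an acceptor. It is a counting illustration only, assuming each hydroxyl site is either left unmodified or carries exactly one sugar (no chain extension); the site and sugar names are hypothetical placeholders.

```python
from itertools import product


def enumerate_glycoforms(hydroxyl_sites, sugars):
    """List every assignment of 'no sugar' (None) or one sugar to each site,
    excluding the unmodified acceptor itself."""
    options = [None] + list(sugars)
    glycoforms = []
    for assignment in product(options, repeat=len(hydroxyl_sites)):
        if any(sugar is not None for sugar in assignment):
            glycoforms.append(dict(zip(hydroxyl_sites, assignment)))
    return glycoforms


def count_glycosylated_sites(glycoform):
    """Number of sites that carry a sugar in one glycoform."""
    return sum(1 for sugar in glycoform.values() if sugar is not None)


# Hypothetical acceptor with two hydroxyl groups and two donor sugars.
sites = ["OH-a", "OH-b"]
sugars = ["glucose", "galactose"]
glycoforms = enumerate_glycoforms(sites, sugars)
mono = [g for g in glycoforms if count_glycosylated_sites(g) == 1]
```

- Under these assumptions, n sites and k sugars give (k + 1)^n - 1 glycoforms, which is why regioselective enzymes and limiting NDP-sugar concentrations are useful for steering the outcome.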
- the glycosyltransferases are selected from those which transfer hexose residues from UDP-hexose derivatives.
- Preferred hexoses include, for example, D-glucose, D-galactose and D-N-acetylglucosamine.
- Sugars of interest in attachment using evolved glycosyltransferases include, but are not limited to, the following: UDP-N-acetylgalactosamine, UDP-N-acetylglucosamine, UDP-galactose, UDP-galacturonic acid, UDP-glucuronic acid, UDP-mannose, UDP-xylose, UDP-glucose, TDP-glucose, CDP-glucose, ADP-glucose, ADP-ribose, ADP-mannose, GDP-fucose, GDP-glucose, and GDP-mannose, all of which are available from Sigma (St. Louis, MO).
- Deoxy sugars such as 2-deoxy-D-lyxo-hexose, 2-deoxy-D-arabino-hexose, L-fucose, L-rhamnose, D-mycinose, L-vallarose, D-fucose, D-quinovose, D-rhamnose, D-canarose, D-oliose, D-digitose, D-boivinose, L-oleandrose, chalcose, D-amicetose, L-rhodinose, ascarylose, abequose, paratose, tyvelose, colitose, and the like. These sugars and others are described in Annu.
- the invention provides methods of obtaining recombinant polynucleotides that encode glycosyltransferase enzymes that are enhanced in certain properties that increase more of several known methods.
- The following are illustrative examples of glycosyltransferase-encoding nucleic acids that can be used as source nucleic acids for creation of the recombinant libraries which are then screened to identify those that exhibit an improvement in the glycosylation of organic compounds, such as altered substrate specificity.
- inositol 1-alpha-galactosyltransferase, EC 2.4.1.123; phenol beta-glucosyltransferase, EC 2.4.1.35 (NTU32643, NTU32644); flavone 7-O-beta-glucosyltransferase, EC 2.4.1.81; flavonol 3-O-glucosyltransferase, EC 2.4.1.91 (AB002818, ZMMCCBZ1, AF000372, AF028237, AF078079, D85186, ZMMC2BZ1, VVUFGT); o-dihydroxycoumarin 7-O-glucosyltransferase, EC 2.4.1.104; vitexin beta-glucosyltransferase, EC 2.4.1.105; coniferyl-alcohol glucosyltransferase, EC 2.4.1.111; monoterpenol beta-glucosyltransferase, EC 2.4.1.127
- Glycosyltransferase genes can be found in many microorganisms which one skilled in the art can isolate from various soil, sediment, air and aqueous samples by enrichment culture techniques. Glycosyltransferases specifically isolated from soil bacteria glycosylate several polyketide aglycones, and such glycosylated natural products possess many different biological activities, such as antibiotic and anticancer activities. Genes coding for such enzymes are readily available from the public database. For example, glycosyltransferases (S. antibioticus, AJ002638; Sac erythraea, Y14332; S. venezuelae, AF079762; S. peucetius, L47164; and S. fradiae, X81885).
- Those genes share more than 50% of the amino acid sequence identity and any two or more are thus ideal for shuffling together as a family.
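- One way to check whether candidate family members meet an identity threshold of this kind is to compute pairwise percent identity over an existing alignment. The Python sketch below is illustrative only: it assumes the sequences are already aligned to equal length with `-` as the gap character, it counts identity over columns where neither sequence has a gap (other conventions exist and give different numbers), and the toy sequences and names are hypothetical.

```python
from itertools import combinations


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def pairs_above_threshold(alignment, threshold=50.0):
    """Return name pairs whose identity exceeds the threshold (in percent)."""
    return [
        (name_a, name_b)
        for (name_a, seq_a), (name_b, seq_b) in combinations(alignment.items(), 2)
        if percent_identity(seq_a, seq_b) > threshold
    ]


# Toy, hypothetical aligned fragments (not real glycosyltransferase sequences).
toy_alignment = {
    "enzA": "MKTLLVAG-A",
    "enzB": "MKTLIVAGSA",
    "enzC": "MRSQ-PDGSA",
}
candidates = pairs_above_threshold(toy_alignment, threshold=50.0)
```

- In practice the alignment would come from a dedicated alignment tool, and the threshold would be chosen to match the recombination method being used.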
- glycosyltransferases that are used for initial shuffling are gtfA, gtfB, gtfC, gtfD, and gtfE, from different Amycolatopsis orientalis strains. These genes code for glycosyltransferases that transfer sugar moieties to the aglycons of vancomycin and
- polynucleotides that encode the improved glycosyltransferase enzymes are introduced into microorganisms that are added to the biocatalytic reaction mixture.
- the glycosyltransferase is expressed by a microorganism species other than that from which the glycosyltransferase gene was obtained.
- the glycosyltransferases used in the methods of the invention are optimized by subjecting nucleic acids that encode the enzymes to recombination and subsequent selection to identify those recombinant polynucleotides that encode enzymes having an enhanced property of interest.
- Libraries of recombinant polynucleotides that are subjected to selection or screening to identify those that encode recombinant glycosyltransferases having enhanced properties can be created by application of, for example, the various recombination-based diversity generating methods described herein (such as shuffling), to nucleic acids that encode these enzymes (i.e., the nucleic acids are the substrates for recombination).
- Sources of glycosyltransferase genes that are suitable for use as substrates in the creation of the libraries of recombinant polynucleotides include, for example, the gtf genes from A. orientalis that encode glycosyltransferases that catalyze, e.g., the transfer of glucose to aglycosyl vancomycin. Enzymes that catalyze these reactions are ubiquitous in prokaryotic and eukaryotic organisms.
- glycosyltransferases can be selected from the glycosyltransferase superfamily, aligned with similar homologous sequences, and shuffled against these homologous sequences. Glycosyl transfer reactions are ubiquitous in nature, and one of skill in the art can isolate such genes from a variety of organisms, using one or
- the protein sequences share similarities between 52% (gtfA-gtfD) and 80% (gtfB-gtfE).
- The five published genes can be amplified from different Amycolatopsis orientalis ssp orientalis strains (gtfD and gtfE from ATCC 43490 or ATCC 43491 and gtfA, gtfB, gtfC from NNRL 18098).
- A number of other uncharacterized but related glycosyltransferase genes are optionally PCR amplified from other A. orientalis strains, e.g., ATCC 19795, 21425, 35164, 15165, 15166, 39444, 43333, 53550, and 53630, and cloned into a suitable cloning and expression vector.
- Further genes can be amplified from the balhimycin producer Amycolatopsis mediterranei DSM5908 (Pelzer et al. (1997) J. Biotechnol. 57: 115-128), and from other Amycolatopsis strains.
- The expression of gtf-encoded proteins in E. coli can be tested by either SDS-PAGE and Coomassie stain and/or, if a detection tag was added, by
- gtfB and gtfE clones and several clones of other genes, e.g., gtfA, gtfC, gtfD and the like, expressing a polypeptide chain of the desired size are used to generate PCR products of the gtf genes in the context of a screening vector.
- DNase I fragments of each PCR product are generated and reassembled, e.g., by a variety of shuffling methods as described above.
- The fragment size is between 25 base pairs and 250 base pairs, but this size is easily determined experimentally by methods well known in the art.
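- The fragmentation and size-selection step can be pictured with a small in silico model. The Python sketch below is a toy illustration, not a model of DNase I kinetics: it assumes cuts are placed by drawing uniformly distributed fragment lengths within the 25 to 250 base pair window, and all function names and the test sequence are hypothetical.

```python
import random


def random_fragments(sequence, min_len=25, max_len=250, rng=None):
    """Cut a sequence left to right into pieces of random length.

    The final piece may be shorter than min_len because it is whatever remains.
    """
    rng = rng or random.Random()
    fragments = []
    position = 0
    while position < len(sequence):
        size = rng.randint(min_len, max_len)  # inclusive on both ends
        fragments.append(sequence[position:position + size])
        position += size
    return fragments


def size_select(fragments, min_len=25, max_len=250):
    """Keep only fragments inside the desired size window."""
    return [frag for frag in fragments if min_len <= len(frag) <= max_len]


# Hypothetical 1200 bp PCR product; a seeded generator makes the run repeatable.
pcr_product = "ACGT" * 300
fragments = random_fragments(pcr_product, rng=random.Random(0))
selected = size_select(fragments)
```

- Changing `min_len` and `max_len` is the in silico counterpart of determining the preferred fragment size experimentally.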
- Methyltransferases are another example of a derivatizing enzyme of interest that can add a chemical moiety onto a functional group present on a lead compound or other organic molecule.
- SAM: S-adenosylmethionine
- MTs: S-adenosylmethionine-dependent methyltransferases
- SAM carries an activated methyl group that is efficiently transferred to nucleophiles having a broad range of chemical reactivity. Transfer of the activated methyl group from SAM to the recipient nucleophile is thermodynamically favorable, thereby driving the methyl transfer reaction essentially to completion.
- N-methyltransferases
- One class of methyltransferases of interest is the N-methyltransferases.
- The following N-methyltransferases have at least 59% amino acid sequence identity, thus making the family particularly well suited for shuffling: putative TDP-N-dimethyldesosamine-N-methyltransferase (U77459; Saccharomyces erythraea), methyltransferase (AJ002638; S. antibioticus), N,N-dimethyltransferase (AF079762; S. venezuelae), N-methyltransferase (X81885; S. fradiae).
- This family of enzymes usually methylates the amine group of the amino deoxy sugars attached to complex natural products.
- O-methyltransferases
- Also of interest are the O-methyltransferases, several families of which are known.
- The following family of methyltransferases can methylate the hydroxyl groups of complex natural products: 31-demethyl-FK506 methyltransferase (U65940; Streptomyces sp), methyltransferase (X86780; Streptomyces hygroscopicus), carbomycin 4-O-methyltransferase (D30759; Streptomyces thermotolerans), and O-methyltransferase (M93958; Streptomyces mycarofaciens). These family members are greater than 45% identical at the amino acid level.
- d. Amidases
- The invention also provides recombinant libraries of amidases.
- This family of enzymes may be used to introduce amide groups into organic molecules.
- the reverse of the amidase reaction converts carboxylic acid groups into a carboxylic acid amide.
- One such family that is suitable for use in the methods of the invention includes the following amidases, which are at least 55% identical at the amino acid level: N-acetyl-anhydromuramyl-L-alanine amidase (AF082575; Pseudomonas aeruginosa), N-acetyl-anhydromuramyl-L-alanine amidase (U40785; Enterobacter cloacae), AmpD protein (X15237; E.
- Phosphotransferases
- The addition of a phospho group onto an existing functional group of a lead compound or other organic molecule is also of interest.
- the invention provides libraries of recombinant phosphotransferases that are useful for obtaining phosphorylated organic molecule derivatives.
- The macrolide and peptide phosphotransferase family, members of which have at least 36% amino acid sequence identity, can be subjected to recombination (e.g., macrolide 2'-phosphotransferase I (D16251; E.
- Enzymes capable of catalyzing oxidation-reduction reactions are important for oxidizing functional alcohols to aldehydes/ketones or reducing aldehyde/ketone groups to alcohols in organic compounds. These newly created groups can then be further modified by other classes of enzymes as described.
- One such family suitable for shuffling is that of lactate dehydrogenase, which converts a ketone to an alcohol and whose members share >80% amino acid sequence identity (Y00711, Homo sapiens; U07181, Rattus norvegicus; 77022A, Sus scrofa domestica; L79954, Trachemys scripta, etc.).
- Alcohol dehydrogenase is another enzyme family, which oxidizes an alcohol group to an aldehyde.
- Suitable genes within this enzyme family are readily available for shuffling (M84409, Homo sapiens; L15704, Peromyscus maniculatus; 156882, Struthio camelus; P80222, Alligator mississippiensis, etc.). Shuffling of these two families of enzymes can change their substrate specificity towards more complex organic compounds.
- The invention provides, in additional embodiments, methods for obtaining a library of organic molecule derivatives. These methods involve contacting an organic molecule (a substrate) with a library of recombinant derivatizing enzymes and other necessary reactants to form the library of organic molecule derivatives.
- the derivatizing enzymes as described above, catalyze a reaction such as: a) modification of one or more functional groups present on the organic molecule; b) addition of a chemical moiety onto one or more functional groups present on the organic molecule; or c) introduction of a new functional group onto the organic molecule.
- Organic molecules of interest for derivatization include, for example, those that have pharmacological activity, herbicide or pesticide activity, and the like.
- organic molecules of interest include natural products, such as antibiotics (including, for example, polyketides, steroids, non-ribosomal peptide antibiotics, and the like).
- Steroids, for example, are an extremely widely used basic structure for drugs, whereby the substituents on the rings target the drug to many different therapeutic targets. Most of these are derived from natural sources and screened for efficacy.
- Substituents observed on steroid drugs include hydroxyls, methoxy, alkoxy, glycosylations, sulfations, halogenations, double and triple bonds, carbonyls, and the like.
- The chemical derivatization of the steroid ring structure is readily achieved at a few well described sites or by modification of the naturally occurring structures or non-naturally occurring variants thereof.
- Cyclic glycopeptides and macrolides such as vancomycin and erythromycin are also chemically difficult structures that can be modified by the application of shuffled enzyme libraries.
- Many structures isolated from nature and described in the literature, and in company vaults, have interesting bioactivities but fail in other regards. Toxicity, bioavailability, solubility, pharmacokinetics, and lack of selectivity are some of the reasons drug candidates are unable to become drugs.
- Application of the shuffled libraries can be used to improve these and other characteristics.
- Prostaglandins, alkaloids, anthraquinones are other families of molecules which have many biologically active members. These are also good candidates for improvement with shuffled enzyme libraries.
- tubocurarine chloride alcuronium chloride
- pancuronium bromide pancur
- choleretic and cholekinetic drugs including, for example, hymecromone, febuprol, chenodeoxycholic acid and ursodeoxycholic acid.
- Fluocortolone, paramethasone, dexamethasone, betamethasone, cortisone, hydrocortisone, prednisone, prednisolone, triamcinolone acetonide, triamcinolone, methylprednisolone and prednylidene are among the glucocorticoids that are suitable for derivatization.
- Corticosteroids of interest also include, for example, prednicarbate, hydrocortisone aceponate, fluocortin butyl, loteprednol etabonate, and the like.
- the substrates are contacted with the members of the library of recombinant enzymes.
- the enzymatic reactions can be performed in numerous ways, including the use of whole cell biotransformation, permeabilized cells, cell lysate, and purified protein, for example.
- Whole cell biotransformation occurs when the substrate (e.g., an organic molecule) is exposed to cells containing the library of recombinant derivatizing enzymes.
- the library can be expressed as a surface protein on a replicable genetic package, e.g., phage or yeast display, or as a secreted protein that interacts with the substrate in solution.
- the enzymes can also be expressed inside the cell, in which case the substrate will diffuse into the cell before the reaction occurs.
- the resulting product of the derivatizing enzyme activity is isolated from the cells by methods known to those of skill in the art, including, for example, centrifugation, precipitation, extraction with organic solvents, and filtration.
- the cells that express the library can be permeabilized by addition of a number of well known permeabilizing agents such as polymyxin B sulfate.
- The level of permeabilizing agent can be adjusted to allow substrate and product to diffuse freely to the enzymes of the library and out of the cell again. At higher levels of permeabilizing agent the protein may be released into solution.
- the compounds of interest will be isolated as for whole cells.
- The library can be used as a cell lysate, whereby the cells expressing the library are broken under well known lysis conditions, which include addition of detergent, PMBS and lysozyme, or sonication.
- The cell debris may be removed before reaction by centrifugation, though this may not be necessary.
- Substrate is then added to the lysate and, after incubation at a defined temperature for a defined length of time, the product is extracted as before and analyzed as described below.
- The recombinant derivatizing enzymes encoded by the library can be purified by many well known techniques before screening or use to make derivatives of organic molecules. Such methods include, for example, gel filtration, ion exchange, affinity, or hydrophobic chromatography to yield either partially or fully purified protein. Many other purification methods are known to those of skill in the art. The purified protein is then exposed to the substrate under conditions that favor enzyme activity.
- the reaction conditions used for the transformation are optimized for maximal enzymatic turnover by standard methods, which include the use of optimal salt levels, buffer, temperature, and length of reaction.
- the substrate, and any other substrates consumed in the enzymatic reaction are preferably used at a concentration that promotes a high turnover rate.
- The contacting of an organic molecule and other reactants with a recombinant derivatizing enzyme can be done using the entire library of enzymes at once, or with pools of recombinant enzymes from the library, or with a single recombinant enzyme in each reaction. If a pool is used, the pool can be deconvoluted to isolate the particular clone that exhibits a desired activity once an active pool has been identified using the described methods. For example, colonies that express each member of the library of recombinant derivatizing enzymes can be placed in microtiter plates or other suitable container and subjected to high throughput screening.
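- The pool-then-deconvolute strategy can be sketched as a two-stage screen. The Python example below is a minimal illustration under stated assumptions: the pool assay is taken to be sensitive enough to detect a single active clone in a pool, the assay callbacks stand in for whatever readout is actually used, and the clone numbers, pool size, and set of active clones are hypothetical.

```python
def make_pools(clone_ids, pool_size):
    """Split the clone list into consecutive pools of at most pool_size clones."""
    return [clone_ids[i:i + pool_size] for i in range(0, len(clone_ids), pool_size)]


def deconvolute(clone_ids, pool_size, pool_is_active, clone_is_active):
    """Screen pools first, then test individual clones only in active pools."""
    hits = []
    for pool in make_pools(clone_ids, pool_size):
        if pool_is_active(pool):
            hits.extend(clone for clone in pool if clone_is_active(clone))
    return hits


# Hypothetical 96-clone plate in which clones 17 and 58 carry the desired activity.
clones = list(range(96))
active_clones = {17, 58}
assay_count = {"n": 0}


def pool_assay(pool):
    """Stand-in for a pooled activity assay; counts each assay performed."""
    assay_count["n"] += 1
    return any(clone in active_clones for clone in pool)


def clone_assay(clone):
    """Stand-in for a single-clone activity assay; counts each assay performed."""
    assay_count["n"] += 1
    return clone in active_clones


hits = deconvolute(clones, 8, pool_assay, clone_assay)
```

- With these hypothetical numbers, 12 pool assays plus 16 single-clone assays identify both active clones, compared with 96 assays for testing every clone individually.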
- the members of the library of recombinant enzymes are immobilized on a solid support prior to contacting with the other reactants.
- the recombinant polynucleotides that encode the enzymes can be introduced into an expression vector that also includes a coding sequence for a tag, such that the recombinant derivatizing enzymes are expressed as a fusion protein with a tag.
- a tag can be attached to the derivatizing enzymes after their expression.
- The tag is typically a member of a binding pair for which a corresponding member is readily obtainable and immobilizable on a solid support.
- the recombinant enzyme can be expressed as a fusion with biotin, which can then be immobilized by binding to streptavidin.
- Suitable binding pairs include, for example, maltose binding protein and amylose, histidine tags and an immobilized metal ion, glutathione-S-transferase and reduced glutathione, streptavidin binding tags and streptavidin, epitope tags (e.g., E-tag, myc-tag, HAG-tag, His-tag) and corresponding antibodies, chitin binding domains and chitin, S-tag and RNase minus S-peptide mutant, cellulose binding proteins and domains and cellulose, thioredoxin and DsbA and a thiol compound (e.g.
- poly-cationic tags (e.g., poly-arginine) and a poly-anion column; IgG and IgG-derived peptides and protein A, protein G, and the like; calmodulin binding peptide and calmodulin; histactophilin and immobilized metal chelate chromatography.
- the member of the binding pair to which the tag attached to the enzymes binds is preferably attached to a solid support.
- Solid supports suitable for use are known to those of skill in the art.
- a solid support is a matrix of material in a substantially fixed arrangement.
- Exemplary solid supports include glasses, plastics, polymers, metals, metalloids, ceramics, organics, etc.
- Solid supports can be flat or planar, or can have substantially different conformations.
- the substrate can exist as particles, beads, strands, precipitates, gels, sheets, tubing, spheres, containers, capillaries, pads, slices, films, plates, dipsticks, slides, etc.
- Magnetic beads or particles, such as magnetic latex beads and iron oxide particles, are examples of solid substrates that can be used in the methods of the invention.
- Magnetic particles are described in, for example, US Patent No. 4,672,040, and are commercially available from, for example, PerSeptive Biosystems, Inc. (Framingham MA), Ciba Corning (Medfield MA), Bangs Laboratories (Carmel IN), and BioQuest, Inc. (Atkinson NH).
- the substrate is chosen to maximize signal-to-noise ratios (primarily by minimizing background binding), for ease of washing, and for cost.
- Separation of the recombinant enzymes from other cellular components, or from reactants and the like, can be effected, for example, by removing a bead or dipstick from a reservoir, emptying or diluting a reservoir such as a microtiter plate well, or rinsing a bead (e.g., beads with iron cores may be readily isolated and washed using magnets), particle, chromatographic column, or filter with a wash solution or solvent.
- the separation step will sometimes include an extended rinse or wash or a plurality of rinses or washes.
- the wells may be washed several times with a washing solution, which typically includes those components of the reaction mixture that can interfere with subsequent screening of the organic molecule derivatives, such as salts, buffer, detergent, nonspecific protein, etc.
- the libraries of recombinant derivatizing enzymes provided by the invention are useful not only to obtain libraries of organic molecule derivatives, but also as a source from which one can identify a recombinant enzyme that catalyzes a particular reaction of interest. For example, once a particular organic molecule derivative is identified as having a desired property, one can identify a particular recombinant enzyme from the enzyme library that can catalyze the formation of the particular derivative.
- the libraries of recombinant derivatizing enzymes are useful for the production of combinatorial libraries of organic molecule derivatives, which are in turn screened to identify those that exhibit a desired activity.
- the product of the screening is often a compound that had not previously been made.
- the libraries of recombinant enzymes provide a source from which one can identify an enzyme that catalyzes a particular known modification of an organic molecule.
- the library is generally subjected to screening to identify those derivatives that are of particular interest.
- a bioassay that is designed to allow detection and/or quantitation of the desired activity.
- desired biological activity including, for example, cell toxicity, genotoxicity, and the like
- desired bioavailability including properties such as plasma half-life, renal clearance, and the like
- desired physicochemical property, including properties such as water solubility, lipid solubility, solubility in organic solvent (e.g., n-octanol), pH stability (e.g., in the low pH environment of the stomach), temperature stability, resistance to intestinal enzymes, resistance to hepatic enzymes, resistance to plasma enzymes, tissue permeability (e.g., dermal, mucosal, and the like), blood-brain barrier permeability, and other desired properties that can be achieved by derivatization.
- Screening for any of these properties can be conducted randomly, e.g., without regard to the structures of the compounds, or can be preceded by analysis of the structures of the compounds in the library to identify those that have a particular structure of interest.
- structural analysis can be employed to identify the structural features imparted by the library of recombinant derivatizing enzymes.
- the recombinant derivatizing enzymes present in the library are expected to chemically modify a given substrate in a predictable fashion. For example, a glycosyltransferase will transfer a sugar moiety onto an amine or hydroxyl of the substrate. This will lead to predictable changes in the physical behavior of the molecule, which can be utilized for screening.
- glycosyltransferases will place a sugar onto the substrate, methyltransferases will add a methyl group, P450's will tend to add a hydroxyl group, etc.
- a kinase library would transfer a phosphate group onto the substrate and specific phosphate tests will detect the presence of product.
- a number of analytical screening tools are available for determining the structure of compounds in a combinatorial library. For example, a number of methods are known that are capable of detecting low concentrations of compounds in a high throughput format, including flow analysis NMR and mass spectrometry. These analytical tools, or others including UV/Vis and IR spectroscopy, fluorescence spectroscopy, luminescence, and the like, can be used to both detect and quantify the novel compounds produced in the enzymatic reactions.
- One hundred percent turnover of the substrate to product is not expected in a library screen and so the analytical techniques are preferably set up to detect the specific changes produced by the enzymatic activity.
- the presence in a library of recombinant enzymes of an enzyme that has methyltransferase activity on a particular substrate of interest could be detected by observation of an increase of 14 amu in the mass spectrum after contact with the enzyme.
- the changes in the chemical structure of the substrate caused by the library can often be specifically monitored and detected. These can then be correlated to the member of the library of recombinant enzymes that catalyzed the particular reaction.
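The mass-shift reasoning above can be sketched in a few lines of Python. This is only an illustration: the +14 amu value for methylation is the one quoted in the text, while the other nominal shifts (hydroxylation, phosphorylation, glucosylation) are standard nominal values added here as assumptions, and the function names and the 0.5 amu tolerance are hypothetical.

```python
# Nominal mass shifts (amu) expected for common enzymatic derivatizations.
# The methylation value (+14) is quoted in the text; the remaining entries
# are standard nominal values included here as illustrative assumptions.
MASS_SHIFT_AMU = {
    "methylation": 14,      # -H replaced by -CH3
    "hydroxylation": 16,    # insertion of one oxygen atom
    "phosphorylation": 80,  # addition of HPO3
    "glucosylation": 162,   # hexose added with loss of water
}


def expected_product_mass(substrate_mass, reaction):
    """Return the nominal product mass for one derivatization step."""
    return substrate_mass + MASS_SHIFT_AMU[reaction]


def find_candidate_reactions(substrate_mass, observed_masses, tolerance=0.5):
    """List the reactions whose expected product mass appears in the spectrum.

    `tolerance` (amu) is a hypothetical matching window, not a value
    taken from the source.
    """
    hits = []
    for reaction, shift in MASS_SHIFT_AMU.items():
        target = substrate_mass + shift
        if any(abs(m - target) <= tolerance for m in observed_masses):
            hits.append(reaction)
    return hits


if __name__ == "__main__":
    # A substrate at 734 amu with a new peak 14 amu higher suggests methylation.
    print(find_candidate_reactions(734, [734.0, 748.1]))
```

Because complete turnover is not expected in a library screen, a check of this kind looks for the appearance of the shifted peak rather than the disappearance of the substrate peak.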
- Suitable labels include, for example, radiolabels such as ³H, ¹⁴C, ³²P, and the like. Labeling can be achieved using radioactive co-substrates such as ³H₃-methyl S-adenosyl methionine, whereby only the methylated product of the reaction will be labeled. Other labels can also be used; many are known in the art. For example, glycosylation can be detected by use of a sugar molecule that includes a label.
- the action of the shuffled library upon the substrate is expected to provide a product that is more stable than the substrate towards external stress, such as extremes of pH, or that has increased solubility in a particular solvent.
- This change in behavior can also be monitored by suitable analytical or bioassay methods.
- the detection of the newly formed product may require separation of the product from the substrate by standard chromatographic methods such as TLC, HPLC, CE, or GC. This can be followed by spectroscopic or other (e.g., flame ionization, mass spectrometry) methods to detect the formation of a novel compound of interest.
- This Example describes how one can generate a library of recombinant glycosyltransferases and use the enzymes for the production of desvancosamine vancomycin.
- Amycolatopsis orientalis ssp. orientalis strains ATCC 43490 and NRRL 18098 are obtained from ATCC and NRRL. Initial cultures on agarose petri dishes are prepared according to the supplier's recommendation. Liquid cultures are grown for two to five days in TSB at 25°C-28°C. The genomic DNA is extracted according to a standard procedure (Ausubel et al. (1987) Current Protocols in Molecular Biology, st Edn., John Wiley & Sons, Inc., NY).
- PCR by add-on primer
- PCR is performed using genomic DNA, 1 pmol of gene-specific primer, 200 µM dNTPs, 2 units Deep Vent Polymerase and 0.2 units of its variant lacking 5'-3' exonuclease activity in the presence of 1.5 M betaine and 1-3.5 mM MgSO4 in a 50 µl volume according to the instructions of the enzyme supplier (New England Biolabs). In all cases, hot start using wax beads (M ⁇ P) is employed.
- the cycles are set to the following scheme: 95°C for 5 min initially; 5 cycles: 95°C 45 sec, 76°C 1 min 20 sec; 5 cycles: 95°C 45 sec, 75°C 1 min 20 sec; 5 cycles: 95°C 45 sec, 74°C 1 min 20 sec; 10 cycles: 95°C 45 sec, 73°C 1 min 20 sec; 10 cycles: 95°C 45 sec, 73°C 1 min 20 sec. All primers are designed according to the sequence entries U84349 and U84350. For the amplification of gtfA, the primers gtfA.For and gtfA.Rev are used.
- the primers gtfB.For and gtfB.Rev are used for the amplification of gtfB.
- the primers gtfC.For and gtfC.Rev are used for the amplification of gtfC.
- the primers gtfD.For and gtfD.Rev are used for the amplification of gtfD.
- the primers gtfE.For and gtfE.Rev are used for the amplification of gtfE (Table 1).
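The stepped cycling scheme given above for the gtf amplifications can be written down as data, which makes it easy to check the total cycle count and the programmed hold time. The sketch below is an illustration only: the stage values are copied from the text (including the 73°C stage, which the source lists twice), while the variable and function names are hypothetical and ramp times between temperatures are ignored.

```python
# Cycling scheme as listed in the text.
# Each stage is (number_of_cycles, [(temperature_C, seconds), ...]).
INITIAL_DENATURATION = (95, 5 * 60)  # 95 C for 5 min
STAGES = [
    (5, [(95, 45), (76, 80)]),
    (5, [(95, 45), (75, 80)]),
    (5, [(95, 45), (74, 80)]),
    (10, [(95, 45), (73, 80)]),
    (10, [(95, 45), (73, 80)]),  # listed twice in the source; kept as written
]


def total_cycles(stages):
    """Sum the cycle counts over all stages."""
    return sum(n for n, _ in stages)


def total_hold_seconds(initial, stages):
    """Programmed hold time in seconds, excluding ramp times."""
    cycling = sum(n * sum(sec for _, sec in steps) for n, steps in stages)
    return initial[1] + cycling


if __name__ == "__main__":
    print(total_cycles(STAGES))                              # 35
    print(total_hold_seconds(INITIAL_DENATURATION, STAGES))  # 4675
```

Under these assumptions the program runs 35 cycles with 4675 seconds of programmed hold time, a little under 78 minutes before ramping is taken into account.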
- the resulting PCR products are digested with NdeI and EcoRV.
- the digested PCR product that corresponds to the gtf gene is purified by agarose gel electrophoresis and QIAEX II (Qiagen).
- 2. Properties and structure of the vector pCKZEBB
- the vector pCKZEBB is derived from pAK400 (Krebber et al. (1997) J. Immunol. Meth. 201: 35-55).
- the following features of pAK400 are kept.
- the lacIq gene is kept for repression of the lac operon
- the transcriptional terminator (tHP) between the lacIq gene and the lac promoter/operator (lac p/o) is kept to terminate read-through transcription from the lacI promoter into the lac promoter-controlled operon, reducing basal non-induced expression
- the lac promoter/operator is kept for transcription initiation and transcription control
- the T7g10 leader from T7 phage gene 10 in front of the target gene start codon is kept to enable strong translation initiation from the ATG start codon in the NdeI restriction site.
- the lpp transcriptional terminator (tlpp) is encoded, followed by the f1 origin of replication to allow single-stranded DNA production, followed by the chloramphenicol resistance gene (camR), and the ColE1 origin for double-stranded DNA replication.
- In pCKZEBB, a lac promoter/operator-controlled polycistronic message replaces the lac promoter/operator-controlled monocistronic message in pAK400.
- the lac promoter-transcribed operon is located between the unique NdeI and HindIII sites of the pAK400 vector.
- In pCKZEBB, a variant of the lacZ gene (start codon ATG incorporated in NdeI site, internal NdeI removed, EcoRV site added to end of gene in front of stop codon, resulting EcoRV lacZ piece inverted in vector) is inserted as a stuffer fragment in the NdeI-EcoRV target gene cloning site. This lacZ fragment will be replaced by the target glycosyltransferase genes.
- A biotinylation tag is encoded (aa sequence), followed by the translational coupling tag derived from the end of the trpB gene. Both tags are fused in frame to the target glycosyltransferase gene when it replaces the lacZ stuffer fragment.
- the A nucleotide of the stop codon of the translational coupling tag constitutes part of the translational start codon of a green fluorescent protein-encoding gene (GFP; Crameri et al. (1996) Nature Biotechnol. 14: 315-319).
- the GFP gene is followed by the birA gene PCR cloned including a ribosomal binding site from BL21(DE3).
- a map of pCKZEBB is shown in Figure 19, and the nucleotide sequence of the vector is shown as SEQ ID NO: 19.
- E. coli transformed with pCKZEBB do not turn green fluorescent when grown on 30 ⁇ g/ml chloramphenicol and 1 mM IPTG.
- After the stuffer is replaced by a target gene, A) the IPTG-induced expression of the target gene turns the plasmid-harboring bacteria green fluorescent by translational coupling to the GFP gene (Oppenheim & Yanofsky (1980) Genetics 95: 785-795), and B) the target gene product will be biotinylated in vivo at the biotinylation tag (Schatz (1993) Bio/Technology 11: 1138-1143) via the birA-derived biotin holoenzyme ligase (Smith et al. (1998) Nucl. Acids Res. 26: 1414-1420).
- E. coli TG1 electrocompetent cells (Stratagene) are electroporated with the ligation and plated on LB-agar plates containing 30 µg/ml chloramphenicol and 1 mM IPTG and grown overnight at 37°C. Green fluorescent colonies showing different extents of fluorescence are picked and plasmid DNA is prepared.
- glycosyltransferase genes are amplified from the resulting plasmids, including some vector derived flanking regions by primers CK.For3 and H3.Rev using a polymerase according to the manufacturer's recommendations.
- the PCR is purified by Qiaquick columns (Qiagen).
- PCR products derived from the plasmids are digested with DNase I (Boehringer). The reaction is stopped on dry ice and the fragments in the desired size range are isolated from 2% agarose gels using glass filter disks (Whatman) and dialysis membranes (Spectrapor) (Stemmer (1994) Proc. Natl. Acad. Sci. USA 91: 10747-10751 and Stemmer (1994) Nature 370: 389-391).
- Electrocompetent E. coli TG1 is transformed with the ligation mix and, after 1 hour of shaking at 37°C, plated on LB-agar containing 30 µg/ml chloramphenicol and 1% glucose and grown overnight at 37°C.
- Colonies are picked into LB-Cam-Glucose and grown overnight at 37°C to generate the master plate. From the master plates, colonies are arrayed onto LB-Cam-IPTG-agar and the plates are incubated overnight at 37°C. Green fluorescent colonies are identified by exposure of the plate to 365 nm ultraviolet light. The respective green fluorescent colonies from the master plate are re-arrayed into 96 well plates, each well filled with 100 µl 2YT-Cam30-1% glucose, and grown overnight at 37°C. 50 µl of culture is transferred to 1 ml of 2YT-Cam30-1 mg/ml biotin and grown for 7 h at 16°C. Then 50 µl of 100 µM IPTG is added and the cultures are grown overnight at 16°C.
- the cultures are centrifuged (4000 rpm for 15 minutes) to pellet the cells.
- the cell pellets are washed with 500 µl of 50 mM ammonium formate (pH 7.4) and pelleted once more.
- the cells are resuspended in 300 µl lysis buffer (10 µL Ready to Lyse lysozyme (Epicentre), 2 µL RNAse A (Qiagen), 2 µL DNAse I (Boehringer), 2 µL 1 M MgSO4, in 10 ml of 1 mg/ml Polymyxin B sulfate (Sigma), 2 mM DTT in 50 mM ammonium formate pH 7.4) and agitated at ambient temperature for thirty minutes. The lysate is then clarified by centrifugation (15 minutes at 4000 rpm).
- Reaction mixture (80 µL) is added to the purified proteins on the beads and the beads are agitated at ambient temperature overnight.
- The reaction mixture contains 150 µM vancomycin aglycone (synthesized as described in J. Chem. Soc. Chem. Commun. (1988) 1306-1307), 500 µM UDP glucose, and 2 mM DTT in 50 mM ammonium formate pH 7.4.
- the reactions are quenched by addition of 1 volume of methanol and the mixture is centrifuged (5 minutes at 2000 rpm). Supernatant (100 µL) is withdrawn to a new 96 well plate and subjected to mass spectrometry.
- the quenched reaction mixture (10 µl) is injected into a triple quadrupole electrospray mass spectrometer set in the positive mode.
- Molecular ions are allowed to pass through the first quadrupole (1143 amu for vancomycin aglycone, 1305 amu for desvancosamine vancomycin) and subjected to collision in the second quadrupole before peak detection of the daughter ions at 100 amu in the third quadrupole. The integrated peak areas obtained from this process are directly proportional to product formation. This determines the relative fitness of the library clones in the production of desvancosamine vancomycin.
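Since the integrated peak area of the 1305 to 100 amu transition is described as proportional to product formation, ranking clones reduces to comparing those areas. The following sketch assumes the peak areas have already been exported from the instrument as one number per well; the function names, well labels, and example values are hypothetical, and only the m/z constants come from the text.

```python
# m/z values taken from the text; everything else is illustrative.
SUBSTRATE_MZ = 1143   # vancomycin aglycone (first quadrupole)
PRODUCT_MZ = 1305     # desvancosamine vancomycin (first quadrupole)
DAUGHTER_MZ = 100     # daughter ion detected in the third quadrupole


def relative_fitness(product_peak_areas):
    """Scale each clone's product peak area to the best clone (0.0 to 1.0).

    `product_peak_areas` maps a clone or well label to the integrated area
    of the PRODUCT_MZ -> DAUGHTER_MZ transition.
    """
    best = max(product_peak_areas.values(), default=0.0)
    if best <= 0:
        return {clone: 0.0 for clone in product_peak_areas}
    return {clone: area / best for clone, area in product_peak_areas.items()}


def rank_clones(product_peak_areas):
    """Return clone labels ordered from highest to lowest product signal."""
    return sorted(product_peak_areas, key=product_peak_areas.get, reverse=True)


if __name__ == "__main__":
    areas = {"A1": 200.0, "A2": 50.0, "A3": 0.0}  # hypothetical peak areas
    print(relative_fitness(areas))
    print(rank_clones(areas))
```

Normalizing to the best clone on the plate is one possible choice; normalizing to a parental control well would serve equally well if such a control is included.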
- step H using multiple genes that encode variants of a particular derivatizing enzyme, using single genes obtained from a library, using single genes shuffled with wild-type genes for backcrossing, and with multiple genes, each of which encodes an enzyme having a different activity.
- the UDP-glucose in step G is replaced by other
- the MS parameters in step H are adapted to detect the predicted molecular ions.
- This Example describes the generation of a library of recombinant O-methyltransferases (OMTase) and the use of enzymes from the library to synthesize derivatives of clarithromycin (6-O-methyl erythromycin).
- a family of erythromycin analogs having a 6-methoxy group have been shown to have useful pharmaceutical properties. These compounds are presently prepared by a multi-step chemical methylation of erythromycin A and its analogs (Figure 11).
- An enzyme capable of selectively transferring an activated methyl group to the 6-hydroxyl group would allow for a one step high yield production of this class of erythromycin analogs in vivo or as a single bioconversion in vitro.
- This Example describes an approach for obtaining such methyltransferases.
- each of the subfamilies is shuffled alone, as well as shuffling the entire family together. This is accomplished using several shuffling formats that are designed to effect the recombination of genes of both high and low sequence identity.
- SAM: S-adenosylmethionine
- MTs: methyltransferases
- SAM carries an activated methyl group that is efficiently transferred to nucleophiles having a broad range of chemical reactivity. Transfer of the activated methyl group from SAM to the recipient nucleophile is thermodynamically favorable, thereby driving the methyl transfer reaction essentially to completion (Figure 12).
- a family of seven genes is known that encode SAM-dependent OMTases specific for secondary alcohols on carbomycin, midecamycin, saframycin, rapamycin, rifamycin, and FK506 (Figure 13). A comparison of these substrate nucleophiles with the 6-hydroxyl of erythromycin A suggests that only minor adjustments in local specificity would be required for the parent OMTases to accept erythromycin as substrate.
- Another gene of interest is that which encodes ERYG, which O-methylates the mycarose moiety of erythromycin C, resulting in the synthesis of erythromycin A.
- EryG shares 54% identity at the DNA level with rapQ, perhaps providing an additional subfamily of OMTases containing tertiary alcohol OMTase activity (Figure 14).
- the genes to be shuffled are synthesized either from genomic DNA or from synthetic oligonucleotides by the PCR. These genes are then cloned into a suitable vector for expression.
- the complete sequence of the gene encoding the carbomycin-4-OMTase is not known, but one can clone the gene or the partial sequence can be shuffled with the full sequences of the other OMTases.
- Libraries of SAM-dependent OMTases are generated. These libraries are screened against erythromycin A and its analogs for 6-OMTase activity. The identified clones are pooled and evolved further to improve the enzyme to a practical level of activity.
- 10⁴-10⁵ clones from the family-shuffled library are screened to identify those that have deserythromycin 6-OMTase activity.
- Cell cultures are grown in the presence of deserythromycin A, and the supernatants of these cultures are then removed and assayed for the presence of 6-O-methyl deserythromycin A oxime.
- the OMTase genes from the identified clones are isolated, pooled, shuffled, and then screened for increased deserythromycin A 6-OMTase activity. Additional cycles of shuffling and screening will continue until the enzyme activity has reached a level suitable for production of 6-O-methyl deserythromycin.
- the shuffled library can be screened for 6-OMTase activity against erythronolide B, deserythromycin A, erythromycin A, and their oxime derivatives. While it is possible that no deserythromycin A oxime 6-OMTase activity will be detected in the initial library, clones having other 6-OMTase activity may exist. These clones can then be used in further rounds of shuffling to further tailor the 6-OMTase specificity. For example, if activity was detected for erythromycin A, subsequent libraries can be screened first for activity for deserythromycin A, and finally for the deserythromycin oxime. In this way only subtle changes in specificity are expected from each new library.
- the genes encoding the open reading frames for the midecamycin 3'-O-methyltransferase (mdmC), the saframycin O-methyltransferase (safC), the rapamycin 31-O-methyltransferase (rapI), and the FK506 31-O-methyltransferase (fkbM) are isolated and cloned into an appropriate E. coli expression vector (pET22b(+)). These genes, which range from 50-80% identical, are then shuffled by family shuffling to generate a library of genes encoding chimeric O-methyltransferases (OMTase). The library is cloned back into the expression vector and expressed in an appropriate E. coli host (BL21(DE3)). This library can now be screened for chimeric enzymes having new properties such as a new specificity for target methylation.
- OMTase activity can be measured in high throughput by using an assay that measures the transfer of the radiolabeled methyl group of (³H) S-adenosylmethionine to a desired acceptor molecule (see Figure 15).
- the assay is based on the transfer of the labeled methyl group from a highly charged molecule (SAM) to a more hydrophobic molecule (Figure 12).
- The reaction is extracted with an organic solvent such that unreacted SAM remains in the aqueous phase and the methylated substrate is selectively extracted into the organic phase.
- the organic phase can then be measured for its content of radioactivity.
- the advantages of this assay are that it is generally applicable to extractable substrates, it is very high-throughput, and it can be used to screen for activity against a pool of compounds simultaneously. The process is as follows.
- Streptomyces lividans is a particularly suitable host for at least two reasons. First, it is transformed with high efficiency by plasmid DNA isolated from E. coli. Second, it is quite permeable to erythromycin and its analogs, so whole cells rather than lysates can be assayed. Alternatively, one can use a high throughput format for measuring enzyme activities from Escherichia coli or Bacillus subtilis cell extracts.
- Purified enzyme or cell lysate is added to an assay mixture of 50 mM phosphate buffer, pH 7.5, containing 0.4 mM MgSO4, 0.1 mM DTT, 0.1 mM (³H) S-adenosylmethionine, and 1-10 mM of the target substrate(s). After incubation, the reaction is quenched by extraction with ethyl acetate. A sample (50 µl) of the organic phase is removed, mixed with scintillant (150 µL) and measured for radioactivity using a 96 well scintillation counter. Clones from samples having radioactivity higher than a control sample having no enzyme added are considered positive and can be further investigated in more quantitative assays.
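The positive/negative call described above (organic-phase radioactivity higher than a no-enzyme control) can be sketched as a simple threshold on plate-reader counts. This is an illustration under stated assumptions: the counts are taken to be one number per well, the no-enzyme control is averaged over one or more wells, and the optional `fold` margin is a hypothetical parameter that the source does not specify.

```python
from statistics import mean


def call_positives(cpm_by_well, control_cpm, fold=1.0):
    """Return the wells whose counts exceed the no-enzyme control.

    cpm_by_well: mapping of well label -> scintillation counts for the
                 organic phase of that well.
    control_cpm: list of counts from no-enzyme control wells.
    fold:        hypothetical safety margin; 1.0 reproduces the plain
                 "higher than control" rule stated in the text.
    """
    threshold = mean(control_cpm) * fold
    return sorted(well for well, cpm in cpm_by_well.items() if cpm > threshold)


if __name__ == "__main__":
    counts = {"A1": 120, "A2": 450, "B1": 95}   # hypothetical readings
    controls = [100, 110]                        # hypothetical no-enzyme wells
    print(call_positives(counts, controls))           # plain rule
    print(call_positives(counts, controls, fold=2))   # stricter margin
```

Wells flagged this way are only candidates; as the text notes, they go on to more quantitative assays.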
- Clarithromycin is 6-O-methyl erythromycin.
- the current process for the preparation of clarithromycin is a seven step chemical methylation of erythromycin.
- An enzyme capable of carrying out this chemistry in one step could provide a means of preparing clarithromycin by fermentation or biotransformation (see Figure 16).
- the OMTase library is screened for erythromycin 6-O-methylase activity.
- the shuffled OMTase library is plated out on solid medium to separate individual clones. Individual colonies are picked into 96 well plates containing LB medium (200 µl) and ampicillin (100 µg/ml). The plates are grown at 30°C for ten hours or until the cultures have reached an optical density of 0.7.
- Isopropylthiogalactoside (IPTG) is added to 0.1 mM to induce expression of the MTases, and the cells are incubated for an additional 3 hours. The plates are centrifuged and the supernatant discarded. The cell pellet is resuspended in a lysis buffer (200 µl) of 50 mM phosphate buffer, pH 7.5, containing 1 mM EDTA, 1 mM DTT, 2 µg/ml of polymyxin B sulfate, and 1 mg/ml of T4 lysozyme. The reaction is incubated for 15 minutes at 30°C.
- a sample from each well (20 µl) is transferred using a 96 head liquid handling station, such as the Multimek™, to a 96 deep well plate containing clarithromycin synthase assay buffer (280 µl).
- the buffer is 50 mM phosphate buffer, pH 7.5, containing 0.4 mM MgSO4, 0.1 mM DTT, 0.1 mM (³H) S-adenosylmethionine, and 1 mM erythromycin.
- the reaction is incubated at 30°C for one hour.
- Ethyl acetate (300 µL) is added to each well, the plate is shaken vigorously and centrifuged, and a sample (50 µL) of the upper organic phase is removed and added to a plate containing scintillant (150 µL). The plate is then read using a plate scintillation counter. Any sample having radioactivity in the organic phase higher than that from samples harboring the parental genes or no MTase gene likely contains an enzyme that transfers a methyl group to erythromycin. Since there are five potential hydroxyl groups on erythromycin to which a methyl group might be transferred, it is necessary to discern whether it was transferred to the 6-hydroxyl.
- Secondary assay for clarithromycin synthase
- the secondary assay for clarithromycin synthase activity is based on chemical modification with phenyl boronate and analysis by mass spectrometry.
- Erythromycin can be O-methylated in five positions, on the 6, 11, or 12 positions of the macrolide ring, or on either the cladinose or the desosamine moieties.
- Phenyl boronate binds specifically to cis diols, such as the 11,12 diol of erythromycin. Thus if phenyl boronate binds to the enzymatically methylated erythromycin, the methyl group cannot be located at the 11 or the 12 position.
- To determine whether the modified erythromycin is clarithromycin the following assay is performed.
- Enzymatic methylation of erythromycin is performed as described above except the SAM used for the modification is not radiolabeled and the cell extract is from a cell showing a positive radioactivity assay.
- the organic phase is analyzed by two dimensional mass spectroscopy (MS/MS), in which the parent ion is fragmented to submolecular fragments (see Figure 17).
- Clarithromycin has a positive ion molecular weight of 748.48, with the positive charge being due to the protonation of the amine of the desosamine moiety.
- Both the cladinose and the desosamine can be separated from the macrolide ring; however, only molecules containing the desosamine moiety are detected, since they carry the amine. Fragmentation of the 748.48 ion results in two distinctive new ions, 590.4 and 158.12.
- the 590 ion is 6-O-methyl deserythromycin A (clarithromycin lacking the cladinose moiety).
- the 158.12 ion is dehydro desosamine, the result of the elimination of the 5-hydroxyl group of the macrolide ring.
- An MS/MS spectrum of the 748.48 peak having the 590 and the 158 ions is distinctive of erythromycin derivatives methylated on the macrolide ring, i.e., at the 6, 11, or 12 positions. If the sample shows this spectrum, then it is further analyzed to determine if it is methylated at the 6 position.
- the organic extract is treated with an excess of phenylboronate under neutral conditions and then analyzed by mass spectroscopy. Only if the modification is at the 6 position will the 11 and 12 positions be free to form an adduct with the phenylboronate.
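The two-stage logic of the secondary assay (MS/MS fragment pattern first, phenylboronate adduct second) can be summarized as a small decision function. The sketch below uses only the ion values quoted in the text (748.48, 590.4, 158.12); the function names, the 0.3 amu matching tolerance, the returned labels, and the idea of passing the adduct observation as a boolean are all assumptions made for illustration.

```python
# Ion values quoted in the text for clarithromycin and its fragments.
PARENT_MZ = 748.48
RING_METHYLATED_FRAGMENTS = (590.4, 158.12)


def has_ion(spectrum, target, tol=0.3):
    """True if any peak in `spectrum` lies within `tol` amu of `target`."""
    return any(abs(mz - target) <= tol for mz in spectrum)


def classify_methylation(parent_mz, fragments, boronate_adduct_observed, tol=0.3):
    """Classify a methylated erythromycin sample following the text's logic.

    `tol` is a hypothetical matching window, not a value from the source.
    """
    if abs(parent_mz - PARENT_MZ) > tol:
        return "parent ion does not match mono-methylated erythromycin"
    if not all(has_ion(fragments, f, tol) for f in RING_METHYLATED_FRAGMENTS):
        return "not identified as methylated on the macrolide ring"
    # Ring-methylated: a phenylboronate adduct needs the free 11,12 cis diol,
    # so adduct formation leaves position 6 as the methylation site.
    if boronate_adduct_observed:
        return "6-O-methyl (clarithromycin candidate)"
    return "11-O- or 12-O-methyl"


if __name__ == "__main__":
    print(classify_methylation(748.48, [590.4, 158.1], True))
    print(classify_methylation(748.48, [590.4, 158.1], False))
```

In practice the adduct observation would itself come from a mass spectrum of the boronate-treated extract; it is reduced to a boolean here to keep the decision logic visible.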
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14884899P | 1999-08-12 | 1999-08-12 | |
US148848P | 1999-08-12 | ||
US63730900A | 2000-08-11 | 2000-08-11 | |
PCT/US2000/022080 WO2001012817A1 (en) | 1999-08-12 | 2000-08-11 | Evolution and use of enzymes for combinathorial and medicinal chemistry |
US637309 | 2000-08-11 |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1208209A1 true EP1208209A1 (en) | 2002-05-29 |
Family
ID=26846224
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP00959219A Withdrawn EP1208209A1 (en) | 1999-08-12 | 2000-08-11 | Evolution and use of enzymes for combination and medicinal chemistry |
Country Status (7)
Country | Link |
---|---|
EP (1) | EP1208209A1 (en) |
JP (1) | JP2003529328A (en) |
KR (1) | KR20020022808A (en) |
CN (1) | CN1378598A (en) |
AU (1) | AU7057500A (en) |
CA (1) | CA2380948A1 (en) |
WO (1) | WO2001012817A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE60213826T3 (en) * | 2001-03-19 | 2013-10-17 | President And Fellows Of Harvard College | DEVELOPMENT OF NEW MOLECULAR FUNCTIONS |
AU2002248051A1 (en) * | 2002-04-09 | 2003-12-19 | Genofocus Co., Ltd. | Stabilized biocatalysts and methods of bioconversion using the same |
JP2006129836A (en) * | 2004-11-09 | 2006-05-25 | Kobe Univ | Method for efficiently producing compound library by biocombinatorial chemistry |
CN101724673B (en) * | 2008-10-29 | 2013-10-16 | 上海来益生物药物研究开发中心有限责任公司 | Method for preparing methyl vancomycin |
WO2018157150A1 (en) | 2017-02-27 | 2018-08-30 | Duke University | In vivo protein n-acylation |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1104459A1 (en) * | 1998-08-12 | 2001-06-06 | Maxygen, Inc. | Dna shuffling of monooxygenase genes for production of industrial chemicals |
-
2000
- 2000-08-11 JP JP2001516904A patent/JP2003529328A/en not_active Withdrawn
- 2000-08-11 KR KR1020027001884A patent/KR20020022808A/en not_active Application Discontinuation
- 2000-08-11 WO PCT/US2000/022080 patent/WO2001012817A1/en not_active Application Discontinuation
- 2000-08-11 CA CA002380948A patent/CA2380948A1/en not_active Abandoned
- 2000-08-11 EP EP00959219A patent/EP1208209A1/en not_active Withdrawn
- 2000-08-11 CN CN00811801A patent/CN1378598A/en active Pending
- 2000-08-11 AU AU70575/00A patent/AU7057500A/en not_active Abandoned
Non-Patent Citations (1)
Title |
---|
See references of WO0112817A1 * |
Also Published As
Publication number | Publication date |
---|---|
WO2001012817A1 (en) | 2001-02-22 |
CA2380948A1 (en) | 2001-02-22 |
CN1378598A (en) | 2002-11-06 |
KR20020022808A (en) | 2002-03-27 |
JP2003529328A (en) | 2003-10-07 |
AU7057500A (en) | 2001-03-13 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20020312 |
| | AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
| | AX | Request for extension of the european patent | Free format text: AL;LT;LV;MK;RO;SI |
| | RIN1 | Information on inventor provided before grant (corrected) | Inventor name: DAVIS, S., CHRISTOPHER; Inventor name: KREBBER, CLAUS; Inventor name: HOWARD, RUSSELL; Inventor name: SELIFONOV, SERGEY, A.; Inventor name: DELCARDAYRE, STEPHEN |
| | 17Q | First examination report despatched | Effective date: 20030521 |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
| | 18D | Application deemed to be withdrawn | Effective date: 20031001 |